About Mediachain
- Why is Mediachain useful?
- What problem does Mediachain solve?
- What is the design philosophy behind Mediachain?
- What are the features of Mediachain?
- Case studies
Why is Mediachain useful?
Mediachain creates a single logical space, organized by topic or application, for multiple participants to publish and discover data without a central point of control or failure.
In other words, Mediachain allows multiple participants to collaborate on indexes of data in a completely decentralized way.
What problem does Mediachain solve?
Data coordination in decentralized systems is a difficult problem. Early peer-to-peer systems such as BitTorrent relied on centralized trackers to provide indexing and discovery services, which created central points of control and failure for finding the underlying data.
Blockchain systems such as Bitcoin and Ethereum have demonstrated the viability of completely decentralized indexes for the transactions on their platforms, but they are inappropriate solutions for general-purpose data storage. Both systems have mechanisms for storing arbitrary data (OP_RETURN in Bitcoin, for example), yet in terms of cost, throughput, and capacity they are poor storage solutions even for pointers to the data generated by a typical media application.
By relaxing the requirement for global consensus, which those systems use to establish ordering and prevent double spends, Mediachain offers a novel implementation of a completely decentralized, peer-to-peer database.
What is the design philosophy behind Mediachain?
Mediachain makes the following assumptions:
- Because double spends aren’t relevant to metadata, global consensus is not necessary or meaningful
- Data can be sharded by topic or application into “namespaces”
- While certain statements have a causal/temporal relationship, there is no inherent total ordering for a majority of the statements
- Our domain can be thought of as a partially ordered set (or even an unordered set of partially ordered sets)
Since Mediachain doesn’t require a single, linearly ordered view of the world, it can take advantage of a CRDT (conflict-free replicated data type), which allows replicas to reach an eventually consistent state without the need for consensus.
In sum, Mediachain is a distributed database that supports upserts connecting statements to one or more domain-specific identifiers. An upsert is an idempotent insert/update operation that does not require knowledge of the current state of the database. This allows operations to execute concurrently without requiring locks or ordering.
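The idempotent upsert and consensus-free merge described above can be illustrated with a toy grow-only set. This is a minimal sketch, not Mediachain's actual implementation: the `StatementStore` class, its methods, and the identifier and statement strings are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class StatementStore:
    """Toy replica: maps a domain-specific identifier to a set of statement ids.

    Hypothetical illustration only; not Mediachain's API.
    """

    def __init__(self):
        self.index = defaultdict(set)

    def upsert(self, identifier, statement_id):
        # Set insertion is idempotent: repeating it leaves the state unchanged,
        # and the caller never needs to read the current state first.
        self.index[identifier].add(statement_id)

    def merge(self, other):
        # Set union is commutative, associative, and idempotent, so replicas
        # converge regardless of the order in which they exchange state.
        for identifier, statements in other.index.items():
            self.index[identifier] |= statements


a = StatementStore()
b = StatementStore()
a.upsert("isrc:EXAMPLE0001", "stmt-1")
a.upsert("isrc:EXAMPLE0001", "stmt-1")  # duplicate upsert has no effect
b.upsert("isrc:EXAMPLE0001", "stmt-2")  # concurrent write on another replica

a.merge(b)
b.merge(a)
assert a.index == b.index  # both replicas hold the same statements
```

Because neither `upsert` nor `merge` depends on ordering, the two replicas can accept writes concurrently and still end up in the same state once they have exchanged data.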
Rich relations between objects can be expressed with Merkle DAGs (IPLD) in object content, allowing the application layer to evolve according to user needs. Mediachain provides the coordination layer to discover, merge, and connect multiple DAGs generated by independent individuals into a single, collaborative DAG.
In addition, Mediachain offers rich query support and the ability to discover datasets through a decentralized directory.
What are the features of Mediachain?
Mediachain enables writing and discovery of data in a standard, secure, collaborative, and decentralized way.
Data is content addressed and signed
All data in Mediachain is content-addressed and signed. Every object in the system is location-independent and self-certifying: because data integrity is cryptographically verifiable, data can be replicated and served by multiple untrusted participants while remaining tamper-proof and trustworthy. Content addressing also makes all data in the system immutable. Because all statements are signed, the identity of the contributor can be verified, and a trail of provenance is maintained when data is reused.
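To make the self-certifying property concrete, here is a minimal sketch of content addressing using a SHA-256 hash over a canonical JSON encoding. The encoding and the bare hex digest are simplifying assumptions for illustration; they are not Mediachain's wire format, and signing is omitted because it would need a cryptography library outside the standard library. The function names and the sample record are hypothetical.

```python
import hashlib
import json


def content_address(obj):
    # Serialize deterministically so equal objects always hash to the same address.
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def verify(obj, address):
    # Anyone holding the address can check data received from an untrusted peer.
    return content_address(obj) == address


record = {"title": "Example artwork", "artist": "Example artist"}
address = content_address(record)

assert verify(record, address)

# Any modification produces a different address, so tampering is detectable.
tampered = dict(record, title="Changed title")
assert not verify(tampered, address)
```

Since the address is derived from the content alone, it does not matter which peer served the bytes; a mismatch is detected locally.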
Data is natively linked and interoperable
Mediachain conforms to the IPLD spec, enabling rich relations between objects to be expressed with Merkle DAGs. This allows the application layer to evolve in expressiveness according to user needs. Data in the system can be referenced or extended through content-addressed links in a rich yet non-destructive manner.
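The idea of extending data through hash links without altering the original can be sketched as follows. This is an illustrative toy, not the IPLD data model: the in-memory `store`, the `extends` field name, and the `put` and `resolve` helpers are all hypothetical.

```python
import hashlib
import json


def put(store, obj):
    """Store an object under the SHA-256 hash of its canonical JSON encoding."""
    encoded = json.dumps(obj, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = obj
    return key


def resolve(store, key):
    """Follow 'extends' links back to the root and merge fields, oldest first."""
    chain = []
    while key is not None:
        obj = store[key]
        chain.append(obj)
        key = obj.get("extends")
    merged = {}
    for obj in reversed(chain):
        merged.update({k: v for k, v in obj.items() if k != "extends"})
    return merged


store = {}
original = put(store, {"title": "Example artwork"})
# The extension points at the original by hash instead of modifying it.
extension = put(store, {"tags": ["portrait"], "extends": original})

assert store[original] == {"title": "Example artwork"}  # original is untouched
assert resolve(store, extension) == {"title": "Example artwork", "tags": ["portrait"]}
```

A reader who only trusts the original publisher can keep using `original`, while one who accepts the contribution can resolve `extension` instead.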
Lower costs through shared stewardship
Because all data in Mediachain is content-addressed, anyone can replicate and serve the datasets of others in a secure way. As with BitTorrent, this reduces costs for participants and increases the bandwidth available for accessing the data. For example, a group of museums can replicate each other’s datasets, thereby contributing resources to the global cultural commons, while being confident about the integrity of the data.
A single logical space
Mediachain implements hierarchical namespaces to enable one or more participants to collaboratively organize statements by topic without requiring any external indexing or coordination. Just as a traditional database allows multiple users to publish to the same table, multiple users can publish to a namespace in Mediachain in a completely decentralized way.
This mechanism allows developers using Mediachain to build dynamic decentralized applications with user-generated content in a way that other systems cannot.
Mediachain offers flexible permission modes for writing to namespaces, including:
- permissionless
- consortium
- custom governance through smart contracts
A decentralized directory facilitates dataset discovery by allowing users to look up namespaces in the system without needing a central party.
Some of the namespaces currently live in the Mediachain network:
$ mcclient listNamespaces
images.500px
images.dpla
images.flickr
images.pexels
mediachain.schemas
museums.brooklynmuseum.artists
museums.brooklynmuseum.collections
museums.brooklynmuseum.exhibitions
museums.brooklynmuseum.geographicallocations
museums.brooklynmuseum.museumlocations
museums.brooklynmuseum.objects
museums.cooperhewitt.objects
museums.moma.artists
museums.moma.artworks
museums.rijksmuseum.artworks
museums.tate.artists
museums.tate.artworks
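Assuming the dots in the names above act as hierarchy separators (an inference from the listing, not a documented rule), selecting everything under a parent namespace could look like the sketch below. The `in_namespace` helper is hypothetical and is not part of `mcclient`.

```python
def in_namespace(name, prefix):
    """True if `name` equals `prefix` or is nested beneath it."""
    return name == prefix or name.startswith(prefix + ".")


# A subset of the namespaces listed above.
namespaces = [
    "images.500px",
    "images.dpla",
    "museums.moma.artists",
    "museums.moma.artworks",
    "museums.tate.artists",
]

moma = [n for n in namespaces if in_namespace(n, "museums.moma")]
assert moma == ["museums.moma.artists", "museums.moma.artworks"]
```

Matching on `prefix + "."` rather than the bare prefix avoids accidentally treating a sibling such as a hypothetical `museums.momax` as a child of `museums.moma`.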
Developers looking to reuse data in Mediachain have a single, standard API to interact with the thousands of datasets in the network, enabling completely decentralized discovery and collaboration on data.
Data synchronization
Participants can manually poll namespaces to synchronize data or subscribe for real-time notifications via pubsub.
Case studies
Shared cultural heritage
Mediachain lowers the cost of participating in open access for institutions like museums and libraries. It offers a common technological framework, so institutions need neither build a proprietary system nor rely on a third-party custodian, and it fosters a vibrant community of like-minded contributors.
In today’s centralized open access ecosystem, cultural heritage data is usually published through proprietary APIs or as static data dumps on GitHub. The datasets live in disparate silos, unaware of each other’s existence. There is no way to query across datasets when searching for the same artwork, works by the same artist, or collections spanning the same movement. Developers hoping to use open data must accept the burden of interacting with multiple incompatible APIs. There is also no standard, collaborative way to extend or supplement a dataset.
Mediachain enables metadata interoperability and preservation, functioning as a decentralized data co-op where information about creative works from multiple institutions and contributors can be shared, discovered, linked, and extended. Instead of needing to create or consume dozens of disparate APIs and data formats, organizations and developers can publish and access data through a single interface.
Data from one organization can interlink with data from others. If two sources have information about the same creative work, they can be linked using a single identifier while preserving attribution for all sources. The participants can continue referencing and querying the data using their proprietary ID.
Developers and researchers can extend data in the system in a Git-like fashion, without destructively altering the original dataset. Source institutions can later choose to audit contributions and decide how to make use of them, or simply let them co-exist independently.
Cryptographic identity and reputation enable institutions to verify each other, maintain attribution, and filter contributions from credible sources.
Because Mediachain is open source and decentralized, all data is guaranteed to remain free and open, with no central point of control, yet with tremendously increased access and reuse potential.
Learn more in this blog post.
Decentralized global rights database for music
No central database exists to keep track of information about music. This affects the entire music supply chain, and most critically, digital service providers (DSPs) who are trying to pay artists and rights owners. Today, in many cases they simply don’t know who to pay, to the great detriment of artists, music services and consumers alike.
The “Global Repertoire Database” initiative was started in 2011 to aggregate ownership data in a central database, but infamously failed in 2014 after millions of dollars of investment. Sure enough, no party wanted to cede control to a new centralized entity and the resulting political friction ended in failure.
Mediachain offers a scalable, decentralized solution for a global rights database for music: a single place to publish all information about who made what song, without having to trust a third-party organization.
Mediachain enables multiple participants to share data in a single logical space. Because Mediachain is decentralized, there is no need to appoint a central gatekeeper, which avoids the political friction of centralized approaches like the Global Repertoire Database.
Mediachain enables data from two or more organizations to become interoperable and linked, and be queried with a common identifier for a song, such as ISRC, ISWC or other internal fields or keywords.
Mediachain enables modular and lightweight translation of data, so data from interested parties can become interoperable, while preserving their proprietary data formats. A single participant can simply publish data in their own format on day one, and the network becomes incrementally compatible over time, without requiring agreement about a mandatory new data standard up front.
All data published to the system is cryptographically signed by the contributor. This ensures data is securely verifiable and attributed, and can be filtered by identity. Organizations can query for data from whitelisted entities and ignore data from unknown participants.
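Filtering by identity can be sketched as a simple allow-list check. The example assumes signatures have already been verified upstream, so the `publisher` field can be trusted; the record layout, the publisher ids, and the `from_trusted` helper are hypothetical and do not reflect Mediachain's statement format.

```python
def from_trusted(statements, trusted_publishers):
    """Keep only statements whose (already verified) publisher is on the allow-list."""
    return [s for s in statements if s["publisher"] in trusted_publishers]


statements = [
    {"publisher": "label-a", "body": {"title": "Song one"}},
    {"publisher": "unknown-peer", "body": {"title": "Song two"}},
    {"publisher": "label-b", "body": {"title": "Song three"}},
]

trusted = {"label-a", "label-b"}
accepted = from_trusted(statements, trusted)

assert [s["body"]["title"] for s in accepted] == ["Song one", "Song three"]
```

Statements from unknown participants remain in the network; each consumer simply chooses which identities to accept at query time.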
Because preventing “double spending” is unnecessary for media metadata, Mediachain avoids the cost, throughput, and capacity limits of traditional blockchains. It scales efficiently to accommodate the volume of the music repertoire while preserving the complex relationships and revision history of structured media metadata.
Because Mediachain is open source and decentralized, all participants remain in control of their data and there is no central point of failure.
Learn more in this blog post.
Scalable storage solution for decentralized media applications
You’re building a decentralized app that generates a large volume of data, or needs to store and process user-generated content off-chain.
On-chain storage is only appropriate for the simplest test cases, and other off-chain storage solutions are generally limited in their expressiveness. Mediachain provides scalable, long-term storage for your dApp data with a standard API, and allows your data to be reused by others, creating more value for your users.
- Ujo
- Userfeeds
- TAP