The NFT Metadata Scalability Problem
In my previous NFT platform post, I gave some examples of why on-chain metadata is required for future NFT behaviors. Many new token standards under consideration require additional metadata, such as the Subscription Token Standard, Entangled Tokens, the Micropayments Standard, Re-Fungible tokens, etc. GameFi will most likely push the boundary of on-chain metadata. NFTs could represent metaverse player characters and items, and their behaviors could be governed by on-chain NFT state transitions. Increasingly complex on-chain behaviors require an increasingly high volume of metadata.
A standard implementation of ERC721 contains the following data on-chain: owner address, name, and symbol. The state transition function depends only on the variable `owner address`. A common extension includes an off-chain metadata specification, where each token id also carries a `tokenURI`. The URI is a link to some off-chain bytes. These off-chain bytes are usually required to be a JSON file that contains key-value pairs such as `description`, `image_url`, etc. Those key-value pairs are metadata.
NFT state transitions operate on on-chain metadata. For example, suppose we want to program an in-game NFT representing a property such that it is only transferable if it has a house built on it. This simple verification check cannot be programmed against off-chain metadata such as `has_house: bool`. It is possible to get around this. For example, if there is an on-chain Merkle hash representing the off-chain metadata, the state transition could take a Merkle proof as an input argument. This is complicated. It introduces a data availability issue for the off-chain metadata, and any metadata update would still require an on-chain update of the Merkle hash. With on-chain metadata, programming a state transition is straightforward.
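The contrast between the two approaches can be sketched in a few lines of Python. This is a toy model rather than contract code: `PropertyNFT`, `transfer`, and `verify_merkle_proof` are hypothetical names, and the Merkle variant assumes SHA-256 with sorted-pair hashing purely for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PropertyNFT:
    """Toy on-chain record: an owner plus one piece of on-chain metadata."""
    owner: str
    has_house: bool = False


def transfer(token: PropertyNFT, sender: str, recipient: str) -> None:
    """State transition that reads on-chain metadata directly."""
    if token.owner != sender:
        raise PermissionError("sender does not own the token")
    if not token.has_house:
        raise ValueError("property is not transferable without a house")
    token.owner = recipient


def hash_pair(a: bytes, b: bytes) -> bytes:
    # Sorted-pair hashing (an assumed convention) so proofs need no left/right flags.
    return hashlib.sha256(min(a, b) + max(a, b)).digest()


def verify_merkle_proof(leaf: bytes, proof: Sequence[bytes], root: bytes) -> bool:
    """Off-chain variant: rebuild the root from a metadata leaf and its sibling hashes."""
    node = hashlib.sha256(leaf).digest()
    for sibling in proof:
        node = hash_pair(node, sibling)
    return node == root
```

With on-chain metadata the check is a single `if` inside `transfer`. With off-chain metadata the caller has to fetch the leaf and its sibling hashes from wherever the metadata is hosted, and the chain still has to store and update `root`.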
On-chain metadata has not been popular because of cost and existing behaviors. Storing data on-chain is expensive on smart contract platforms. The off-chain JSON metadata standard is a workaround. It introduces a user experience problem and a third-party service dependency. There is only one popular NFT usage pattern today, and that is ERC721. It uses a limited set of metadata, e.g. `owners`, `approved_address`, and `approved_operator`. Future NFTs will be an ever-increasing variety of mini-programs that require metadata well beyond what we are seeing today. The usage of on-chain metadata will grow by orders of magnitude as NFT usage grows in both size and complexity.
A blockchain has limited space for on-chain metadata. Every blockchain is limited by at least two factors: blockspace and state size. Blockspace corresponds to how much data the blockchain can fit through its various components, e.g. gossip network, mempool, execution, and consensus algorithm, to be recorded in blocks in perpetuity. Blocks contain transactions, and transactions change and append to the state. If there is a steady stream of append-only metadata transactions, the state could grow indefinitely. Both historic blocks and state take up local storage. “The state size limit is the limit on how much metadata can be accumulated over time.”
Some network participants, usually the validators, have to maintain the active state of the blockchain, because they need to read and write to the state to process transactions. This sets a limit on the state size: it is bounded by the amount of storage available to a validator node. Even if a chain places high-end hardware requirements on its validators, the single-machine state size is about a few hundred gigabytes to a few terabytes, depending on implementation details.
On-chain metadata accumulates into the state. A blockchain could have unlimited throughput in all other components, but its accumulated state cannot go beyond its single machine limit.
There are two solutions to expand the state size beyond a single machine. One is state expiry and the other is state sharding. The first strategy prunes old data from the state. The strategy has to allow expired data to be recovered. One way to do that is to let anyone reactivate the old state by providing the data and a validity proof. This approach has a few drawbacks. It introduces a dependency on a third-party service to hold the expired data, it does not scale execution, it does not scale throughput, and it is not user friendly. The second strategy is sharding. When the state grows beyond what can be handled by a single machine, the system uses more machines. The key challenge is how to shard the state so that each shard can process transactions in parallel.
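As a rough illustration of the state expiry idea, the sketch below prunes a value from the active state and lets anyone revive it by supplying matching data. It is a simplification under stated assumptions: `ExpiringState` is a hypothetical class, and a per-key SHA-256 commitment stands in for the validity proof, whereas a real design would more likely verify a witness against a state root.

```python
import hashlib
from typing import Dict


class ExpiringState:
    """Toy key-value state where expired entries keep only a hash commitment."""

    def __init__(self) -> None:
        self.active: Dict[str, bytes] = {}
        self.expired_commitments: Dict[str, bytes] = {}

    def expire(self, key: str) -> bytes:
        """Prune a value from the active state and return it for off-chain archiving."""
        value = self.active.pop(key)
        self.expired_commitments[key] = hashlib.sha256(value).digest()
        return value

    def revive(self, key: str, value: bytes) -> None:
        """Anyone may reactivate a value if it matches the stored commitment."""
        commitment = self.expired_commitments.get(key)
        if commitment is None:
            raise KeyError(f"{key!r} has no expired commitment")
        if hashlib.sha256(value).digest() != commitment:
            raise ValueError("supplied data does not match the commitment")
        del self.expired_commitments[key]
        self.active[key] = value
```

The value returned by `expire` has to live somewhere off-chain until someone calls `revive`, which is the third-party dependency mentioned above.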
Cost of Onchain Metadata
There are fundamental reasons why on-chain metadata is always going to be somewhat expensive. The ecosystem has many convoluted discussions about high transaction fees. Every chain claims that it can lower fees to a negligible level. Each virtual machine has different rules on how to charge fees. Regardless of those internal accounting mechanisms, the hard truth is that there are fundamental costs to on-chain data storage. They fall into two major categories: resource cost and market premium.
Market premium is due to supply and demand. When an open blockchain’s virtual machine has an abundance of free capacity, the fee would match the costs that network operators incur in running and securing the network. When there is high demand for the limited capacity, there is going to be a premium surcharge.
Network participants commit resources to be part of the network. They are rational actors who expect to at least recoup their investment. Different blockchain mechanisms require varying levels of commitment. Broadly speaking, the costs are server cost and capital cost. Server cost includes the cost of hardware, network, electricity, physical property, and maintenance. Capital cost is not always needed, but it is an essential part of proof of stake systems.
Mechanism | Server Cost | Capital Cost |
---|---|---|
Proof of Work | high | n/a |
Proof of Authority | low | n/a |
Proof of Stake | low | high |
Rollup Chain (Proof of Stake DA) | low | medium |
Proof of work chains require many expensive mining servers to maintain the network. Hardware and electricity costs are high. That cost has to be passed on to the chain users. There are many estimates that put the cost of a 51% attack on Bitcoin or Ethereum in the tens of billions of dollars. Assume that the amortized cost is $1 billion annually, the on-chain state accumulation limit is 100 GB, and each data point is 100 bytes. The per data point cost would then be $1.00. This level of fee would completely eliminate any of the meaningful NFT use cases that we spoke of earlier. Note that this ballpark estimate ignores many of the intricate details of a blockchain’s fee structure. However, it is not far off from real world data. For example, storing 128 bytes in Ethereum with the ETH price at $1000 would cost $2.4.
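The ballpark arithmetic can be written out explicitly. The first calculation uses only the figures from the paragraph above and spreads one year of cost over the full 100 GB. The Ethereum cross-check relies on inputs the text does not state: 32-byte storage slots, roughly 20,000 gas to write a new slot, and a gas price of 30 gwei, which is an assumption chosen here because it lands on the quoted figure.

```python
GB = 10**9  # decimal gigabyte, assumed


def cost_per_data_point(annual_cost_usd: float, state_limit_bytes: int, bytes_per_point: int) -> float:
    """Spread a yearly security budget over every data point the state can hold."""
    data_points = state_limit_bytes / bytes_per_point
    return annual_cost_usd / data_points


# Figures from the text: $1 billion per year, 100 GB of state, 100-byte data points.
pow_cost = cost_per_data_point(1e9, 100 * GB, 100)
print(f"PoW estimate: ${pow_cost:.2f} per data point")

# Ethereum cross-check (assumed inputs, see the lead-in).
slots = 128 // 32                   # 128 bytes occupy four 32-byte slots
gas = slots * 20_000                # assumed gas to write four new slots
eth_cost_usd = gas * 30e-9 * 1000   # gas * (ETH per gas at 30 gwei) * (USD per ETH)
print(f"Ethereum estimate: ${eth_cost_usd:.2f} for 128 bytes")
```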
Proof of stake chains have a high capital requirement. The server cost of a PoS chain depends on how many nodes need to maintain the state to process and validate transactions. This is not going to be significant. Even if there are 10,000 validator nodes with each costing $2000 per year, the total cost is only $20 million. The capital cost is much higher. For example, Ethereum has $20 billion of ETH staked for the Beacon Chain. Other prominent chains are also in the range of billions or above, e.g. Solana at $15 billion, Near at $2 billion, and Polkadot at $5 billion. Assume that the staked capital expects a 5% annual return. The annual cost is then $1 billion for Ethereum. This cost is similar to proof of work. It might be more environmentally friendly to switch from proof of work to proof of stake, but the ultimate cost that the chain users have to bear might be similar because the staked capital needs to be compensated. In the short run, stakers could be rewarded by network value inflation and growth. In the long run, network users have to pay for the cost of securing capital.
Proof of authority chains could have a low fundamental cost. Let’s say the chain only has 7 validators and each server costs $10,000 annually. The total cost is only $70,000. There is no additional capital cost. The drawback is centralization risk. The node operators are fixed, and these node operators could be hacked [1].
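The proof of stake and proof of authority estimates follow the same pattern: an annual server bill plus the return expected on any staked capital. A small sketch using the figures from the text, where `annual_cost` and `AnnualCost` are hypothetical helper names:

```python
from typing import NamedTuple


class AnnualCost(NamedTuple):
    server_usd: float
    capital_usd: float

    @property
    def total_usd(self) -> float:
        return self.server_usd + self.capital_usd


def annual_cost(nodes: int, server_cost_per_node_usd: float,
                staked_capital_usd: float = 0.0, expected_return: float = 0.0) -> AnnualCost:
    """Yearly cost = servers plus the return that staked capital expects to earn."""
    return AnnualCost(nodes * server_cost_per_node_usd, staked_capital_usd * expected_return)


# Proof of stake: 10,000 validators at $2000 each, $20 billion staked at 5%.
pos = annual_cost(10_000, 2_000, staked_capital_usd=20e9, expected_return=0.05)
# Proof of authority: 7 validators at $10,000 each, no staked capital.
poa = annual_cost(7, 10_000)

print(f"PoS: servers ${pos.server_usd:,.0f}, capital ${pos.capital_usd:,.0f}")
print(f"PoA: servers ${poa.server_usd:,.0f}, capital ${poa.capital_usd:,.0f}")
```

In the proof of stake case the capital term is about fifty times the server term, which is why the cost ends up comparable to proof of work.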
Rollup chains have a very low cost for validity, while their cost of data availability (DA) varies depending on the trust model. A rollup chain has a few key costs: the server cost of sequencers, state transition validity, and data availability. The server cost of sequencers is negligible because the network operates with a 1/n security assumption. It only needs a single honest sequencer, or none at all if users are allowed to submit transactions to the main chain. The cost of state transition validity is also negligible because it is a fixed cost that can be amortized over thousands or tens of thousands of transactions. The key cost that scales linearly with the amount of on-chain data is data availability, so the fundamental cost of a rollup chain is mostly data availability. The cost of data availability depends on how that data is secured. For example, if it is secured by proof of authority, it is cheap. A data availability committee (DAC) is just another name for a data availability layer secured by proof of authority.
It is harder to make a ballpark estimate of how the fundamental cost of dedicated DA chains translates into on-chain data cost. For example, say a rollup chain wants to accumulate 100 GB of on-chain data. Let’s assume that it corresponds to 1 TB of transaction data that has to be posted to the data availability layer. It is important to note that this 1 TB could be split up and erasure coded, with each data chunk assigned to fewer than 10 storing nodes. The chunks do not need to be replicated hundreds of times to guarantee persistence, even in a completely open, decentralized network. Depending on the desired level of security, this could be secured by $1 billion or $10 million. That is the deciding factor in how costly this component is. It is important to note that for a fixed stake, say $1 billion, the more data it is pledged against, the less secure the individual data chunks. The reasoning is simple. The stake has to be spread across more data nodes, so the cost of getting slashed for each node is smaller. This lowers the cost for an attacker to bribe the necessary nodes to censor or delete a particular data chunk.
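The dilution argument can be made concrete with a deliberately crude model. The assumptions here are mine, not the source's: the stake is split evenly across storing nodes, every node holds the same amount of data, and an attacker must outbid the slashable stake of every node holding a given chunk. The function name and the 1 TB per-node capacity are hypothetical.

```python
import math

TB = 10**12  # decimal terabyte, assumed


def stake_behind_chunk(total_stake_usd: float, total_data_bytes: int,
                       node_capacity_bytes: int, nodes_per_chunk: int) -> float:
    """Slashable stake an attacker must outbid to remove one chunk (toy model)."""
    # Nodes needed to hold every copy of every chunk, at least one set of holders.
    nodes = math.ceil(total_data_bytes * nodes_per_chunk / node_capacity_bytes)
    nodes = max(nodes, nodes_per_chunk)
    stake_per_node = total_stake_usd / nodes
    return nodes_per_chunk * stake_per_node


for data_tb in (1, 10, 100):
    at_risk = stake_behind_chunk(1e9, data_tb * TB, 1 * TB, nodes_per_chunk=8)
    print(f"{data_tb:>3} TB pledged -> ${at_risk:,.0f} at risk per chunk")
```

Under these assumptions, pledging the same $1 billion against a hundred times more data leaves a hundredth of the stake behind each chunk. Real erasure-coded designs differ in how many chunks an attacker must remove, so the numbers only indicate the direction of the effect.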
Low Cost, Scalable Onchain Metadata
Future NFT use cases will demand low cost, horizontally scalable on-chain metadata and state transitions operating on that metadata. These use cases will be especially prevalent in metaverse-type applications.
Deploying NFTs as smart contracts on a sharded blockchain is suboptimal. First, a general purpose sharded blockchain usually does not allow smart contracts to control shard location. Shard control is useful because it could take advantage of the simplicity of the NFT data model to shard the state and the corresponding state transitions. For example, it could split the NFT state into primary and secondary shards. The primary shard handles state transitions that need to be synchronized with the NFTs. The secondary shard only handles self-contained state transitions.
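One possible reading of the primary and secondary split is a router that sends synchronized transitions to a single primary shard and spreads self-contained ones across secondary shards by token id. This is a sketch of that reading, not a description of an existing system; `Transition`, `ShardKind`, `route`, and the `needs_sync` flag are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ShardKind(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


@dataclass(frozen=True)
class Transition:
    token_id: int
    needs_sync: bool  # True for transitions that must be synchronized, e.g. a trade


def route(transition: Transition, num_secondary_shards: int) -> Tuple[ShardKind, int]:
    """Pick a shard: one primary shard, or a secondary shard chosen by token id."""
    if num_secondary_shards < 1:
        raise ValueError("need at least one secondary shard")
    if transition.needs_sync:
        return (ShardKind.PRIMARY, 0)
    # Self-contained updates for the same token always land on the same shard,
    # so different tokens can be processed in parallel.
    return (ShardKind.SECONDARY, transition.token_id % num_secondary_shards)


print(route(Transition(token_id=42, needs_sync=True), 4))
print(route(Transition(token_id=42, needs_sync=False), 4))
```

A smart contract on a general purpose sharded chain usually cannot make this routing decision itself, which is the limitation described above.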
Second, existing general purpose proof of stake chains are expensive. Those costs are fundamental to securing the network, and hence they will not go down in the future. Programming on-chain metadata should be cheaper than transactions that involve value exchange. A general purpose chain is not likely to offer such low cost transactions because it would require different shards to have different cryptoeconomic security levels.
Third, working with smart contracts does not lead to a great developer experience. Game developers who want to integrate NFTs should not need to learn the intricacies of writing and deploying smart contracts. Smart contracts are not yet mature, and bugs can lead to direct loss of funds. The developer experience of programming and using NFTs should be through APIs and SDKs.
A proof of stake, sharded NFT blockchain is a straightforward way to remedy the challenges mentioned above. The NFT data model is much easier to shard than a general purpose virtual machine. The sharding design could accommodate both NFT-specific and cross-shard transitions. The network could put different cryptoeconomic requirements on different shard types to fine-tune the tradeoff between cost and security.
It should be noted that a proof of stake, sharded NFT chain could also be implemented as a sharded rollup chain. The core of the two chains is the same. They could have the same sharding model, the same set of transaction capabilities, and exactly the same APIs and SDK. The key difference is in how they maintain data availability and state validity. In the case of a proof of stake chain, each shard is assigned a set of staked validators that guarantee both data availability and state validity. In the case of a rollup chain, state validity comes from validity proofs, and data availability must be handled by a separate network.
A sharded rollup chain is a valid approach to delivering low cost, horizontally scalable on-chain metadata. It should be noted that data availability solutions and validity proofs are not as mature as proof of stake technology. Proof of stake is still a complex technology, but it is fairly well understood and is already in widespread production use. There are already many mainnet proof of stake blockchains built on cosmos-sdk and tendermint. On the other hand, dedicated data availability chains are still a work in progress [Celestia, Ethereum data shards]. For validity proofs, an application specific rollup chain would have to write its own validity proof system. A generic zk proof system such as StarkEx only supports limited use cases, e.g. ERC20, ERC-721, ERC-1155. The platform would require a zk development framework that allows users to easily write on-chain verifiers and generate off-chain proofs for arbitrary programs.
Final Remarks
I believe in a world where future use cases will be many orders of magnitude beyond what we are seeing today. They will be mini-programs that govern interactions in gaming and in media. These programs will require on-chain operations that can only be satisfied by a horizontally scalable solution.
On-chain data is always going to be relatively expensive because its availability and validity have to be guaranteed by some kind of scarce resource. That scarce resource could be staked tokens, hash rate, or trust in a central authority. Proof of stake is probably the best compromise, but the cost of capital is going to be passed on to the users. It could still be prohibitively expensive for some NFT use cases. The NFT blockchain should recognize that different behaviors have different security requirements. For example, updating the residence zip code of a metaverse character should not be as expensive as trading an NFT. There will be dedicated NFT chains that allow developers to control those parameters and adjust the fee structure to match their NFT use cases.
Footnotes
1. The Ronin bridge is an example.