A non-fungible token (NFT) represents ownership of a non-interchangeable unit of data. The NFT is a powerful primitive because it enables natively digital ownership. Existing use cases of NFTs are still at the beginning of this evolution. The ERC721 standard has only a few functions, which operate primarily on the ownership variable, allowing users to create, own, and trade their digital assets. Future NFTs will be programs over a more diverse and sophisticated set of state variables. For example, an NFT could represent a playable character in a metaverse. The player’s history, status, and relationships could be represented by the NFT state and state transitions. The NFT’s program defines how the character could or could not interact with its environment.
NFT platforms are the decentralized infrastructure where NFTs are minted, updated, and traded. Platforms are different from NFT marketplaces. Users buy and sell NFTs in marketplaces, and marketplaces are built on top of NFT platforms. Examples of NFT marketplaces are OpenSea, NBA Top Shot, and Autograph. The core of an NFT platform is a blockchain, the decentralized database where NFTs are defined and created. An NFT platform could include more offerings, such as SDKs, bridges, and managed services. Examples of NFT platforms are Ethereum, Solana, ImmutableX, and Stargaze.
In this post, I describe the NFT data model as a foundation to understand NFT platforms. I then compare and discuss the features of NFT platforms: blockchain type, programmability, scalability, media storage and delivery, and developer experience.
NFT Data Model
NFTs are simple, specialized state machines. A state machine has state and transition functions. The state of an NFT consists of three types of data: an owner, metadata, and the unique media content. Every NFT has an owner. Although it is not required, most NFTs are uniquely associated with some media content. An NFT also has a set of state transitions that define its capabilities.
Let’s make these ideas concrete, taking ERC721 as an example. A typical ERC721 implementation contains the following on-chain data: token_id, owner address, name, and symbol. All the public and private functions depend only on the variable owner address. A common extension includes an off-chain metadata specification, where an NFT also contains a token_url. The URL is a link to some off-chain bytes. These off-chain bytes are usually required to be a JSON file that contains key-value pairs such as description, image_url, etc. These key-value pairs are metadata. It should be noted that the on-chain data points name and symbol are also metadata. The third data type is the actual media content. It is referenced as a special off-chain metadata entry under the image_url key. Its value is yet another link, pointing to the off-chain location where the media content is stored.
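The three data types can be sketched in a few lines of Python. This is only an illustration of the layout described above, not an ERC721 implementation: the field names follow the prose, and the addresses, the `ipfs://` links, and the JSON contents are hypothetical placeholders.

```python
import json
from dataclasses import dataclass


@dataclass
class OnChainToken:
    # On-chain data, named as in the text above (not taken from any contract).
    token_id: int
    owner_address: str
    name: str       # metadata that happens to live on-chain
    symbol: str     # metadata that happens to live on-chain
    token_url: str  # first hop: link to the off-chain metadata file


# Hypothetical off-chain JSON file that token_url would resolve to.
OFF_CHAIN_METADATA = json.loads("""
{
  "description": "A plot of land in an example game",
  "image_url": "ipfs://<content-hash>/plot.png"
}
""")

token = OnChainToken(
    token_id=1,
    owner_address="0xOwner",
    name="ExamplePlots",
    symbol="PLOT",
    token_url="ipfs://<metadata-hash>/1.json",
)


def media_location(metadata: dict) -> str:
    """Second hop: the media itself is only referenced from the metadata."""
    return metadata["image_url"]


print(token.owner_address, "->", media_location(OFF_CHAIN_METADATA))
```

Reaching the media takes two hops from the chain (token_url, then image_url), and only the first link is stored on-chain.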
The distinction between on-chain variables and off-chain metadata is important because off-chain data is not programmable. For example, consider an in-game NFT representing a property that is only transferable if it has a house built on it. This simple verification check could not be coded against the off-chain metadata has_house: bool [1]. Beyond programmability, off-chain metadata introduces unnecessary dependencies. The data’s availability is not guaranteed by the platform but by a third-party service provider. For example, one popular solution is to pay for Pinata’s pinning service to ensure that the media content is available through an IPFS query.
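To see why the location of has_house matters, here is a minimal sketch of the property example, assuming the flag is stored as on-chain metadata that the transition functions can read. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyNFT:
    token_id: int
    owner: str
    # On-chain metadata: readable and writable by the transition functions.
    metadata: dict = field(default_factory=dict)


def build_house(nft: PropertyNFT, caller: str) -> None:
    """Transition: the owner builds a house on the property."""
    if caller != nft.owner:
        raise PermissionError("only the owner can build")
    nft.metadata["has_house"] = True


def transfer(nft: PropertyNFT, caller: str, new_owner: str) -> None:
    """Transition: transfer is allowed only once a house has been built."""
    if caller != nft.owner:
        raise PermissionError("only the owner can transfer")
    if not nft.metadata.get("has_house", False):
        raise ValueError("property is not transferable until a house is built")
    nft.owner = new_owner
```

If has_house lived only in an off-chain JSON file, the check inside `transfer` would have nothing to read.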
Off-chain media content is not programmable either. For example, suppose someone wants to program an audio NFT such that the first 1000 listeners get rewarded. The NFT’s on-chain state machine only contains a reference to its media. The media is stored and delivered by a third-party service, and the service does not report back to the state machine about who uses the media or how.
Comparing NFT Platforms
Future NFTs will be mini-programs that do not just operate on ownership data but also on arbitrary metadata and the usage behaviors of associated media content. As an example in gaming, monster cards could be combined to form a more powerful card. Those rules will be on-chain transition functions. As an example in the creator economy, a song represented by an NFT could be monetized directly if the song becomes popular and garners millions of streams.
I will discuss the key aspects of how to evaluate NFT platforms: blockchain type, programmability, scalability, media storage and delivery, and developer experience.
Blockchain Type
An NFT platform is a place where the NFT data model could be deployed. There are two common patterns to accomplish that. The first pattern is via smart contracts. The second pattern is to build application-specific blockchains that natively encode the NFT state machine.
Any general-purpose L1 or L2 chain could be used as an NFT platform. Examples are Ethereum, Solana, Flow, Polygon, Optimism, and Starknet. These blockchains do not natively understand the concept of NFTs. Developers implement the state machine that encapsulates the NFT data model. In practice, the community of a blockchain agrees on some standards that correspond to basic functionalities of NFTs.
The second type of NFT platform is a decentralized protocol that is purpose-built for NFTs. The core state machine of a general-purpose blockchain is a virtual machine. NFT-specific chains instead implement the NFT data model directly. NFT behaviors are explicitly coded as state transition functions in the core protocol. ImmutableX is an example of a purpose-built state machine. Developers do not write the NFT data model in smart contracts; NFT functionalities are defined as core concepts by the protocol.
Smart contract NFTs inherit the characteristics of the host blockchain. The blockchain does the heavy lifting of providing a general-purpose, deterministic state machine. An NFT is a small program that runs inside that virtual machine.
ImmutableX is an application-specific, zero-knowledge (zk) rollup chain built on the StarkEx platform and a custom data availability solution. The chain sequences and processes transactions in batches. The hash of the final state of each batch and a zk validity proof are posted to Ethereum. The ImmutableX platform relies on STARK validity proofs to guarantee the validity of NFT state transitions. However, the protocol stores raw transactions in a custom network that is run by a data availability committee, which is equivalent to proof of authority.
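The batching flow can be illustrated with a toy sequencer. This sketch is not how StarkEx or ImmutableX is implemented: a plain SHA-256 hash of a JSON encoding stands in for the state commitment, the validity proof is omitted entirely, and every name here is hypothetical.

```python
import hashlib
import json


def state_root(state):
    """Hash a canonical encoding of the state; a stand-in for the real commitment."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def apply_transfer(state, tx):
    """Apply one transfer if the sender owns the token; return True on success."""
    if state.get(tx["token_id"]) != tx["sender"]:
        return False
    state[tx["token_id"]] = tx["recipient"]
    return True


def process_batch(state, batch):
    """Sequence a batch off-chain and return (new_state, root_to_post_on_L1)."""
    new_state = dict(state)
    for tx in batch:
        apply_transfer(new_state, tx)  # invalid transfers are simply skipped
    return new_state, state_root(new_state)


state = {"1": "alice", "2": "bob"}
batch = [
    {"token_id": "1", "sender": "alice", "recipient": "carol"},
    {"token_id": "2", "sender": "mallory", "recipient": "mallory"},  # rejected
]
new_state, root = process_batch(state, batch)
print(new_state, root[:16])
```

Only the short root reaches the L1. The raw transactions in `batch` are what the data availability committee would be trusted to keep.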
Programmability
Consider a developer who wants to program an NFT to represent a metaverse character. The character’s skill development path should be governed by a set of rules. The character has to first attend woodworking training to gain certification. The certification unlocks her ability to enter quests.
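The skill path above is a small state machine, and a rough Python sketch makes the ordering rules explicit. The class, function, and quest names are hypothetical; a real platform would express the same rules in its own scripting environment.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterNFT:
    owner: str
    training_completed: set = field(default_factory=set)
    certifications: set = field(default_factory=set)
    quests: list = field(default_factory=list)


def attend_training(character: CharacterNFT, skill: str) -> None:
    """Transition: record that the character finished training in a skill."""
    character.training_completed.add(skill)


def certify(character: CharacterNFT, skill: str) -> None:
    """Transition: certification requires completed training."""
    if skill not in character.training_completed:
        raise ValueError(f"{skill} training is required before certification")
    character.certifications.add(skill)


def enter_quest(character: CharacterNFT, quest: str, required_skill: str) -> None:
    """Transition: a quest is unlocked only by the matching certification."""
    if required_skill not in character.certifications:
        raise ValueError(f"{required_skill} certification is required for {quest}")
    character.quests.append(quest)
```

Each rule is a guard at the top of a transition function, so an out-of-order call fails and leaves the state unchanged.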
Future NFT use cases will expand greatly beyond the ERC721 standard. There are already many proposed standards on the Ethereum platform, e.g. Permit, Micropayments, Subscription, Entangled Token, Multi Tokens, Re-Fungible Token, etc.
There will also be custom use cases that won’t be standardized. For example, a monster card game is encoded by NFTs. Cards could be combined. Cards could get updated based on previous battles. Each card’s metadata is recorded on-chain. The cards’ state transitions are different for each card game. For a second example, a developer wants to program an NFT to represent a sword that is interoperable across many games. There are specific rules for how the sword could get upgraded. Special gemstones could be melted into the sword, or special monster kills could enhance the sword’s characteristics. The rules are written as state transition functions, and on-chain metadata is used to record the sword’s current status. For a third example, a developer wants to program an NFT sword to be non-transferable once the sword gains a certain status. This is similar to a soulbound token, with the exception that the NFT is soulbound to an in-game character, which could be another NFT. There will be countless such examples.
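The sword examples can be combined into one hedged sketch: gemstones raise the sword’s power, and once a threshold is reached the sword becomes bound to the character that holds it. The threshold, the bonus values, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

LEGENDARY_POWER = 10  # hypothetical status threshold


@dataclass
class SwordNFT:
    owner: str  # id of the character NFT that currently holds the sword
    power: int = 1
    gems: list = field(default_factory=list)
    bound_to: Optional[str] = None  # set once the sword becomes soulbound


def melt_gemstone(sword: SwordNFT, gem: str, bonus: int) -> None:
    """Transition: upgrade the sword and bind it once it reaches the threshold."""
    sword.gems.append(gem)
    sword.power += bonus
    if sword.power >= LEGENDARY_POWER and sword.bound_to is None:
        sword.bound_to = sword.owner


def transfer(sword: SwordNFT, new_owner: str) -> None:
    """Transition: transfers are rejected after the sword is soulbound."""
    if sword.bound_to is not None:
        raise PermissionError(f"sword is soulbound to {sword.bound_to}")
    sword.owner = new_owner
```

Both rules depend on the on-chain fields power and bound_to, which is the kind of metadata ERC721’s ownership-only functions never touch.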
There are also use cases that require the NFT program to operate on media usage data. For example, a developer wants to program an audio NFT such that the first 1000 listeners get rewarded. For a second example, a developer wants to program a video NFT such that the NFT owner could claim a reward whenever the video gains 1000 additional views. This could arise when a creator works with a sponsor to create a video that embeds a short message from the sponsor, and the sponsor contributes to a reward. If the NFT state only contains a reference to its media, the two previous examples cannot be programmed. When the media is stored and delivered by a third-party service, the NFT state machine cannot learn who uses the NFT media or how.
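Here is a minimal sketch of the audio example, under the strong assumption that the delivery layer can call a transition function each time a listener plays the track. How those play events would be authenticated is out of scope here; the names and the reward amount are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AudioNFT:
    owner: str
    reward_slots: int = 1000     # only the first N distinct listeners are paid
    reward_per_listener: int = 1
    rewarded: dict = field(default_factory=dict)  # listener -> reward paid


def record_listen(nft: AudioNFT, listener: str) -> int:
    """Transition driven by a usage event; returns the reward for this listen."""
    already_paid = listener in nft.rewarded
    slots_exhausted = len(nft.rewarded) >= nft.reward_slots
    if already_paid or slots_exhausted:
        return 0
    nft.rewarded[listener] = nft.reward_per_listener
    return nft.reward_per_listener
```

The function itself is trivial. The hard part is that `record_listen` needs a trustworthy caller, which a third-party host that never talks to the chain cannot provide.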
NFT platforms need to satisfy two requirements to allow for these future use cases. First, the platform provides a scripting environment for programming user-defined state transition functions. Second, the scripting environment has access to arbitrary on-chain metadata and metadata associated with media usage.
Smart contract NFT platforms trivially allow for on-chain metadata and scripting environments. The key challenge for smart contract NFTs is that on-chain metadata is expensive. This will be discussed more in the scalability subsection. Another challenge is that smart contract platforms do not have access to media usage data. This feature could be built separately to complement a smart contract platform. At a minimum, it requires coordination between the NFT state machine and the content delivery network. This will be discussed more in the media storage and delivery subsection.
ImmutableX does not allow user-defined NFT behaviors. It only supports basic ERC721 behaviors, so it does not need to allow for user-defined programming. Furthermore, it depends on StarkEx to generate and verify validity proofs. StarkEx only supports basic token operations that are defined in ERC-20, ERC-721, and ERC-1155. StarkEx cannot verify state transitions of custom state machines, even if those state machines were written as specialized Cairo programs. The choice of a zk rollup makes adding user-defined state transitions more complicated. It would require the zk validity proof system to interpret and validate user-defined scripts, because all the NFT-level state transitions have to be aggregated into a single zk validity proof. Beyond scripting, ImmutableX would have to add support for generic metadata. It is hard to envision ImmutableX adding support for media storage and delivery.
Stargaze is supposed to be a purpose-built application blockchain, but it is built as a blockchain with a general-purpose virtual machine. The NFTs are implemented as smart contracts compiled to WebAssembly. The key difference is that developers are not allowed to deploy smart contracts directly. User-defined smart contracts are posted on a forum and voted into deployment by a governance protocol.
Scalability
The discussion of blockchain scalability should be application-specific. The most common measure of blockchain scalability is transactions per second (TPS), which incorporates data from block time, block size, and transaction size. However, applications have different kinds of transactions. For example, some applications require a lot of transactions that create new key-value pairs on-chain. These transactions increase the state size incrementally. The blockchain has to deal with a storage bottleneck because transaction processors have to maintain the state in local storage. In a different example, payment applications mostly create transactions that are fungible token transfers. The transactions are small, require few compute resources to process, and do not accumulate into a large state. TPS is an excellent metric for this type of application. For a third example, if transactions are computationally expensive, the bottleneck might be the processing power of validators. Different transaction types push a blockchain’s boundaries in different ways. The bottlenecks could come from physical limitations in storage, CPU, or network. Or they could be imposed by design choices such as gossip, consensus, and data availability mechanisms.
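As a rough illustration of how TPS follows from block time, block size, and transaction size, the sketch below assumes every block is full and ignores per-block overhead. The numbers are hypothetical and do not describe any particular chain.

```python
def max_tps(block_size_bytes: int, avg_tx_size_bytes: int, block_time_seconds: float) -> float:
    """Upper bound on throughput when every block is completely full."""
    txs_per_block = block_size_bytes // avg_tx_size_bytes
    return txs_per_block / block_time_seconds


# Small transfers: 250-byte transactions in 1 MB blocks every 2 seconds.
print(max_tps(1_000_000, 250, 2.0))    # 2000.0

# Metadata-heavy updates: 5 KB transactions in the same blocks.
print(max_tps(1_000_000, 5_000, 2.0))  # 100.0
```

The same chain looks twenty times slower once transactions carry more data, which is why a single TPS figure says little about metadata-heavy NFT workloads.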
Future use cases of NFTs will increase the demand for on-chain metadata and user-defined transactions that operate on that metadata. These use cases might push the limits of blockchain design choices. I will not go into details of the limitations of those design choices in this post. Instead, I will discuss the physical limitations that impact all blockchains regardless of how they implement the mempool, gossip, consensus, and blockchain data model. Processing nodes might be able to avoid the bottleneck in network bandwidth or CPU resources. Network throughput could reach 100 MB/s, and it is possible that hardware and software solutions could allow a single machine to handle that data rate. Storage is a hard limit. NFT use cases will incrementally put more and more metadata on-chain. Programmability requires the state to be readable and writable. The state grows indefinitely. If the accumulation rate is just 1 MB/s, the state grows by roughly 30 TB in one year. The rest of the blockchain components might be able to handle transactions that sum to 100 MB/s, but if even just 1% of those transactions are metadata updates, a non-sharded blockchain will be bottlenecked by storage.
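The storage estimate can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and a 365-day year.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds
MB = 10**6
TB = 10**12


def yearly_state_growth_tb(rate_mb_per_s: float) -> float:
    """State added in one year, in TB, at a constant accumulation rate."""
    return rate_mb_per_s * MB * SECONDS_PER_YEAR / TB


total_throughput_mb_per_s = 100
metadata_fraction = 0.01  # 1% of traffic is metadata updates
metadata_rate = total_throughput_mb_per_s * metadata_fraction  # 1 MB/s

print(yearly_state_growth_tb(metadata_rate))  # about 31.5 TB per year
```

At that rate a single machine has to add tens of terabytes of fast, randomly accessible storage every year, and the total never shrinks.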
There are two common solutions to expand the state size beyond a single machine: state expiry and state sharding. The first strategy prunes old data from the state. The strategy has to allow expired data to be recovered. The main drawback is dependence on third-party services to hold the expired data. The second strategy is sharding. When the state grows beyond what could be handled by a single machine, the system uses more machines. This approach does not just scale storage; it increases throughput for network and CPU resources as well.
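The sharding strategy can be sketched as hash-based routing of state keys to machines. This is a generic illustration, not the design of any chain mentioned in this post: each in-memory dictionary stands in for the storage of one machine, and the key format is hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a state key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedState:
    """Key-value state split across several stores, one per machine."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


state = ShardedState(num_shards=4)
for i in range(100):
    state.put(f"token-{i}", {"owner": "alice"})

print([len(shard) for shard in state.shards])  # keys spread over four stores
```

Each machine holds only its own slice of the state. A transition that touches keys in two different shards needs cross-shard coordination, which this sketch leaves out.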
As NFT usage of on-chain metadata increases over time, blockchains will have to deal with the problem of large state. Ethereum approaches the problem with both a state expiry solution and sharding. Some blockchains are built as sharded chains, e.g. Near, Polkadot, and Sui. But some blockchains are not sharded, e.g. Solana, Polygon, and ImmutableX. It is worth noting that a layer 2 chain such as ImmutableX could be sharded as well. It could shard its chain state and have different sequencing nodes for each shard. Each shard could still use the same non-sharded L1 layer to maintain state and verify validity proofs.
There is no existing NFT platform that guarantees low-cost, horizontally scalable on-chain metadata. The state of non-sharded blockchains will be limited to a single machine. Even for smart contract NFTs on sharded blockchains, it is difficult to fully harness the scalability of sharding.
Sharded blockchains usually do not expose sharding control to smart contracts. For NFT smart contracts to fully leverage sharding, they have to adapt to the specific sharding architecture of the host chain so NFTs could interoperate across shards. A lot of the popular NFT smart contract platforms are blockchains that are optimized for financial applications. On-chain data is too expensive for the purpose of NFT programmability.
One solution is to build a sharded, NFT-specific blockchain. A blockchain of this kind could shard the chain according to the NFT data model and behavior. A lot of NFT behaviors are self-contained or limited to interactions with closely related NFTs. These patterns are easily parallelizable. The chain should tailor the security requirements and cost structure to NFT use cases.
Media Storage and Delivery
Existing NFT platforms are designed to support collectible NFTs and do not handle media storage and delivery. It is important to note that on-chain media storage is not feasible. The generally accepted practice is that the NFT contains a reference to the media. The media is stored and delivered through third-party solutions that do not communicate with the state machines that manage the NFTs.
An NFT platform that integrates the handling of media storage and delivery offers two major benefits. The first benefit is convenience and decentralization. NFT developers do not need to go to a centrally operated third-party service provider to host the media. The second benefit is NFT programmability. When users interact with third-party services to retrieve the media, the NFT platform cannot exercise any control over how that media is consumed or react to the usage behavior.
This feature could be seen as an incentivized, peer-to-peer content network. The nodes participating in the delivery network interact directly with client applications. They fulfill media requests. These content nodes want to be paid for their work. The content nodes report back to a blockchain about their work and get rewarded there.
This component could be built on top of smart contract NFT platforms as well. The blockchain network and the delivery network are connected via usage report transactions. The content nodes submit transactions to claim rewards. In theory, these user defined functions could be defined on any smart contract platform. The key challenge is that usage reports might be large transactions and require heavy use of on-chain metadata.
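The link between the delivery network and the chain can be sketched as a ledger that accepts usage reports and credits the reporting node. This sketch assumes reports are honest, which a real design could not: verifying that a node actually served the bytes it claims is the hard part and is not shown. The report fields and the reward rate are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageReport:
    node_id: str       # content node that served the media
    token_id: int      # NFT whose media was delivered
    bytes_served: int


class DeliveryLedger:
    REWARD_PER_MB = 2  # hypothetical reward rate, in token units

    def __init__(self):
        self.bytes_by_token = {}  # usage data visible to NFT programs
        self.node_rewards = {}    # balances the content nodes can claim

    def submit_report(self, report: UsageReport) -> int:
        """Usage report transaction: record the usage and credit the node."""
        if report.bytes_served <= 0:
            raise ValueError("report must cover a positive number of bytes")
        self.bytes_by_token[report.token_id] = (
            self.bytes_by_token.get(report.token_id, 0) + report.bytes_served
        )
        reward = (report.bytes_served // 1_000_000) * self.REWARD_PER_MB
        self.node_rewards[report.node_id] = (
            self.node_rewards.get(report.node_id, 0) + reward
        )
        return reward
```

Every report writes to on-chain state twice, once for the token’s usage and once for the node’s balance, which shows how quickly this feature would consume on-chain metadata.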
I am not aware of anyone working on such a solution as of the writing of this post.
Developer Experience
Existing NFT platforms are not user friendly. Choosing a platform usually requires in-depth knowledge of decentralized infrastructure. Developers have to learn the best practices of smart contract systems because security is a must-have. Developers have to integrate with NFT marketplaces. They have to provide guidance to users on how to manage wallets, acquire cryptocurrencies, and navigate blockchain transactions. Developers might also need to develop purpose-built user interfaces so users could access and manage NFTs in contexts that are not already covered by existing marketplaces.
ImmutableX alleviates one aspect of the challenges listed above. ImmutableX allows developers to interact with its chain through API calls. Developers do not need to write smart contracts to define NFT behaviors because all the available behaviors are predefined. The platform also provides users with a dedicated marketplace UI to interact with its NFTs.
Developers should have the option to implement NFT behaviors with managed services. For example, a game developer could simply sign up for a service and acquire an API key to get started on implementing NFT functionalities. It could be as easy as integrating an SMS texting service using Twilio. The platform could also enable game users to interact with NFTs without requiring them to navigate a web3 wallet. Both developers and users should always have the option to reclaim custody of their assets. Developers should also always have the option to interact directly with the decentralized layers should they choose to go that route.
Footnotes
1. It is possible to get around this with a complex setup that Merkleizes the off-chain metadata state and has the transition function take an additional argument in the form of a Merkle proof. This makes a simple function very complicated. It also introduces a data availability problem for the off-chain data because the function caller has to construct the Merkle proof.