I discuss three categories of L2 solutions: state channels, sidechains, and transaction aggregation. State channels are simple but limited. I do not think sidechains should be considered L2 solutions, because a sidechain has to manage a security model of its own that is comparable in scope to running yet another L1 chain. Transaction aggregation has the most variety and is getting the most attention in research and development.
State Channel
A state channel is the first L2 technique used to conduct secure peer-to-peer interaction. A payment channel is the easiest form to understand. For example, two parties A and B set up a channel by each committing 100 dollars. The setup is on chain. If A sends 10 dollars to B, A signs off on an updated state \((A=90, B=110, nonce=1)\). As long as B receives this state with A's signature, B can be sure that the payment is finalized. Any payment can take this form. After the 1000th payment, the state could look like \((A=20, B=180, nonce=1000)\). As long as both of them have signed this state, and each party is sure that she has not signed anything else with a nonce greater than or equal to 1000, they can cryptographically trust that all the previous payments are correct. Whenever either party wants to close the channel, she submits an onchain transaction with the latest signed state. The idea of a payment channel is easily extended to a state channel that allows arbitrary state transitions.
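The channel bookkeeping above can be sketched in a few lines. This is a minimal illustration under stated assumptions: real digital signatures are stubbed out by a plain state hash, and `ChannelState` and `pay` are hypothetical names, not any real channel implementation.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelState:
    balance_a: int
    balance_b: int
    nonce: int

    def digest(self) -> str:
        # The hash of the state that both parties would sign offchain.
        payload = f"{self.balance_a}:{self.balance_b}:{self.nonce}"
        return hashlib.sha256(payload.encode()).hexdigest()


def pay(state: ChannelState, amount: int, a_to_b: bool = True) -> ChannelState:
    # A payment is just a new state with a higher nonce; the total channel
    # balance is conserved and neither side's balance can go negative.
    delta = -amount if a_to_b else amount
    new = ChannelState(state.balance_a + delta, state.balance_b - delta, state.nonce + 1)
    assert new.balance_a >= 0 and new.balance_b >= 0, "insufficient channel balance"
    return new
```

Starting from `ChannelState(100, 100, 0)`, a 10-dollar payment from A yields `ChannelState(90, 110, 1)`; whoever holds the highest-nonce signed state can safely use it to close the channel.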
The key advantage of a state channel is instantaneous p2p interaction without any onchain execution. There are some limitations as well. First, a state channel has a setup cost; transactions can only happen between parties who have already set up a channel. Second, a state channel locks up funds, and the locked funds also cap the transaction volume between the two parties. Third, participants have to keep track of the channel states. If a party loses the latest state and the counter-party refuses to provide a copy, the counter-party could use a previous state to close the channel. Fourth, a state channel is limited to two parties.
State channels can be extended to a hub model. In the hub model, each user sets up a state channel with the same central operator X. Any transaction between A and B is then conducted through X. This enables pairwise transactions between any participants that have a channel open with X. It also introduces a different issue: centralization risk. If X goes offline, the entire system is down, and X could censor certain transactions. Connext Vector implements such a protocol. This technique is used for micropayments in The Graph.
The hub model can be extended further still to a network model. Instead of one hub, there are many interconnected hubs. With the help of the network, a user can route a payment through the hubs to any other counter-party. This is the basic idea behind the Lightning Network.
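The routing idea can be sketched as a breadth-first search over the channel graph. This is an illustrative sketch only: `find_route` and the channel list are hypothetical names, and real Lightning routing additionally accounts for fees, capacity, and privacy.

```python
from collections import deque


def find_route(channels, src, dst):
    """BFS over open channels to find a path of hubs from src to dst."""
    # Build an undirected adjacency map from the list of open channels.
    graph = {}
    for a, b in channels:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the predecessor chain back to src to recover the path.
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for neighbor in sorted(graph.get(node, ())):  # sorted for determinism
            if neighbor not in prev:
                prev[neighbor] = node
                queue.append(neighbor)
    return None  # no route: the counter-party is unreachable
```

For example, with channels A-H1, H1-H2, and H2-B, a payment from A to B would be relayed along the path A → H1 → H2 → B.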
Sidechain
A sidechain functions effectively as an independent L1 chain. It runs in parallel with the mainchain and has to operate its own consensus mechanism. If the sidechain's security is compromised, the mainchain is not impacted, but the mainchain also does not provide any guarantee to the users of the sidechain. Activities on the sidechain are independent of those happening on the mainchain until they are bridged back to the mainchain. For almost all intents and purposes, a sidechain behaves like an independent L1 chain with bridges.
In general, I would not consider sidechains a layer 2 solution. However, Polygon's proof of stake (PoS) chain is often mentioned as a layer 2 solution; in fact, many people consider Polygon to be one of the most successful L2 solutions. I disagree with that categorization and put the Polygon PoS chain in the sidechain category.
Polygon PoS
Polygon PoS is a blockchain built on top of the Tendermint consensus engine. Validators must stake their MATIC tokens on Ethereum and run a full node. The validator set is maintained in Ethereum smart contracts, and a Merkle hash of the Polygon sidechain state is posted to Ethereum as a checkpoint. Assets move between Ethereum and Polygon through two-way bridges. Polygon offers two: the PoS bridge relies on the validator set to confirm transfers, while the Plasma bridge uses a fraud proof mechanism to secure them. It is important to note that the use of Plasma here refers only to the bridge. The EVM activities inside Polygon are not aggregated into a Plasma checkpoint; they are aggregated into a state hash checkpoint. If the PoS validators go rogue, users cannot challenge the Polygon transactions on L1.
This PoS chain differentiates itself from other L1 chains, e.g. NEAR Protocol or Avalanche, by maintaining its validator state and checkpointing its L2 state on Ethereum. These differences allow Polygon to call itself a committed sidechain instead of just another EVM compatible blockchain. Polygon maintains its validator set and their stakes in a smart contract, and its validators periodically checkpoint the Merkle hash of the chain's state onto Ethereum. Instead of keeping these two pieces of system-level state on Ethereum, Polygon could just maintain them as part of its L2 state via the validator set. Using smart contracts to store this information has some value. First, the fork choice and the rule of finality are backed by Ethereum, which eliminates the need to design and implement those mechanisms; L2 transactions are final as soon as those checkpoints are accepted on L1. Second, there is a data availability guarantee for the validator set and the state checkpoints. However, these are only marginal benefits. The key security guarantee is still the proof of stake algorithm, and the key data availability guarantee is the set of validators keeping a copy of all the historical transactions.
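To illustrate what a checkpoint commits to, here is a minimal Merkle root computation over a batch of transactions. This is an assumption for illustration only; Polygon's actual checkpoint encoding and hashing scheme differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list) -> bytes:
    """Merkle root of a list of byte strings, duplicating the last node on odd levels."""
    if not leaves:
        return _h(b"")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by duplicating the last node
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

A single 32-byte root posted onchain commits to the entire batch: changing any one transaction changes the root, so the checkpoint is tamper-evident even though the transactions themselves stay offchain.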
Polygon as a committed sidechain is only marginally more secure than a hypothetical Polygon PoS that is set up as a completely independent L1 chain.
Transaction Aggregation
I use the phrase transaction aggregation as the category for the many L2 solutions that are differentiated by their choices along two dimensions: the verification mechanism and the data availability (DA) solution. The table below illustrates how to further categorize these different approaches. I would not be surprised if this table gains additional columns or rows in the near future.
| | zk | fraud proof |
|---|---|---|
| onchain tx data | zk rollup (Aztec, zkSync, StarkNet) | optimistic rollup (Optimism, Arbitrum) |
| offchain tx data | validium | plasma (OMG) |
| user choice | volition (ImmutableX) | n/a |
| data availability committee | (ImmutableX) | n/a |
| data availability L1 chains | (Celestia, Polygon Avail) | n/a |
Note that some categories have specific names, e.g. zk rollup, validium, and plasma. Some categories have examples but no category name, e.g. Celestia uses zk proofs and another L1 blockchain for data availability. And some categories have neither working examples nor a name; no one is working on those solutions.
All of these L2 solutions process and aggregate individual transactions outside of the mainchain. Aggregation works as follows. Say the starting L2 state is S0, and the hash[^1] of this L2 state is stored on the mainchain. The L2 chain receives 1000 transactions. The L2 engine processes all of them, performing state transitions, so the newest L2 state is S1000. The hash of S1000 is posted to the mainchain. There are two questions the L2 protocol has to make clear: how do users know that the hash of state S1000 is valid, and how do users reconstruct the state S1000 that corresponds to the hash? First, an L2 protocol has to provide a verification mechanism for the state transitions; the most popular mechanisms are zero knowledge proofs and fraud proof challenges. Second, the L2 protocol has to allow users to reconstruct state S1000 from state S0. For that to happen, the original transaction data have to be available to everyone[^2].
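The aggregation flow can be sketched as follows, using a toy balance-map state and a canonical-JSON hash standing in for a real state commitment such as a Merkle root. All names here are illustrative.

```python
import hashlib
import json


def state_hash(state: dict) -> str:
    # Canonical-JSON hash standing in for a real state commitment.
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def apply_tx(state: dict, tx: dict) -> dict:
    # Toy transfer: move `amount` from tx["from"] to tx["to"].
    state = dict(state)
    assert state.get(tx["from"], 0) >= tx["amount"], "insufficient balance"
    state[tx["from"]] -= tx["amount"]
    state[tx["to"]] = state.get(tx["to"], 0) + tx["amount"]
    return state


def aggregate(state: dict, txs: list):
    """Process txs offchain; only the final state hash is posted to the mainchain."""
    for tx in txs:
        state = apply_tx(state, tx)
    return state, state_hash(state)
```

Anyone holding the original transactions can replay them from S0 and check that the recomputed hash matches the one posted onchain, which is exactly why the transaction data must stay available.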
The choice of verification mechanism is usually between a zero knowledge proof and a fraud proof challenge, though I do not rule out different techniques being developed in the future. A zk proof cryptographically ensures that state transitions are valid. A fraud proof challenge mechanism incentivizes users or watchers to validate the posted state transitions: there is a waiting period before a newly posted state becomes finalized, during which anyone can validate it, and if they see something wrong, they post a fraud proof to revert the state and claim a reward. Zk proofs have the advantage that the posted state is finalized instantaneously, and they can offer additional privacy features as well. However, zk proofs require heavy computation in both generation and verification. Zk proof constructions are difficult to understand and audit, and the tooling and development experience of zk frameworks are still a work in progress. See the zk rollup discussion below for more details.
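A toy model of the fraud proof challenge window, measured in block numbers, may make the mechanics concrete. The class and the 100-block window are assumptions for illustration, not any specific protocol's design or parameters.

```python
from dataclasses import dataclass

CHALLENGE_PERIOD = 100  # blocks; an illustrative window, not a real protocol's value


@dataclass
class PostedState:
    state_hash: str
    posted_at_block: int
    reverted: bool = False


class OptimisticBridge:
    """Toy model of optimistic finalization guarded by a fraud proof window."""

    def __init__(self):
        self.states = []

    def post_state(self, state_hash: str, block: int) -> None:
        # Anyone may challenge this state until the window closes.
        self.states.append(PostedState(state_hash, block))

    def challenge(self, index: int, fraud_proof_valid: bool, block: int) -> bool:
        s = self.states[index]
        # A challenge succeeds only inside the window and with a valid proof;
        # the successful challenger would claim a reward.
        if block <= s.posted_at_block + CHALLENGE_PERIOD and fraud_proof_valid and not s.reverted:
            s.reverted = True
            return True
        return False

    def is_final(self, index: int, block: int) -> bool:
        s = self.states[index]
        return (not s.reverted) and block > s.posted_at_block + CHALLENGE_PERIOD
```

The waiting period is exactly why optimistic rollup withdrawals are slow: a posted state is only trusted once the window has elapsed without a successful challenge.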
The spectrum of choices in data availability (DA) exists because onchain data is expensive. How to make transaction data available is a tradeoff between the data availability guarantee and cost. If the transaction data is not available, users cannot verify the state transitions and cannot access their accounts. Calldata[^3] costs 16 gas per byte in Ethereum. Assuming a gas price of 60 gwei and $3000/ETH, 1kb of data costs close to $3 USD. At this price, both zk and optimistic rollups are limited in overall throughput and carry high transaction costs. Even with compression, a zk rollup transaction still takes 10-15 bytes of calldata, so each transaction would cost about $0.03 at a minimum. In practice, it is much higher than that due to fixed gas costs for each aggregation.
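The arithmetic above can be captured in a small helper. The function name and defaults are mine; 16 gas per byte is the post-EIP-2028 cost for nonzero calldata bytes.

```python
GAS_PER_CALLDATA_BYTE = 16  # nonzero calldata byte cost since EIP-2028


def calldata_cost_usd(num_bytes: int, gas_price_gwei: float = 60, eth_usd: float = 3000) -> float:
    """USD cost of posting `num_bytes` of calldata onchain at the given prices."""
    gas = num_bytes * GAS_PER_CALLDATA_BYTE
    eth = gas * gas_price_gwei * 1e-9  # gwei -> ETH
    return eth * eth_usd
```

`calldata_cost_usd(1024)` comes out to about $2.95, and a 12-byte compressed rollup transaction to about $0.035, matching the rough figures in the text.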
The alternatives to onchain data give rise to a variety of data availability strategies. It should be noted that offchain storage was widely considered by the community before the recent rise in popularity of rollups, which use onchain data. There is also an emerging trend of third-party and hybrid solutions for data availability.
It should be noted that both zk and fraud proof verification are considered just as secure as L1. If the data is also onchain, i.e. optimistic and zk rollups, it is generally accepted that they have the same level of security as L1. By offloading data availability away from the L1 chain, security is compromised to an extent. That does not mean it is insecure; one has to evaluate each data availability solution on its own.
Optimistic Rollup (Optimism, Arbitrum)
A rollup moves computation offchain, but all the original transaction data are kept onchain. For each set of transactions that a rollup protocol aggregates, a single summary transaction is submitted to the L1 chain. This summary transaction contains the most up-to-date state of the rollup chain, and its calldata field contains all the original transactions. The transactions can be greatly compressed if the rollup is a special purpose protocol; if the protocol is general purpose, the compression level is lower.
Both Optimism and Arbitrum implement a full-featured EVM inside their rollup chains.
Optimism and Arbitrum differ in implementation details. They implement different virtual machines, even though both provide tooling to support Solidity smart contracts. Their fraud proof verification mechanisms are also different: Optimism uses a single-round fraud proof and Arbitrum a multi-round one. See other posts that discuss and compare both projects (post1, post2).
Zk Rollup (StarkNet, Zksync, Aztec)
The development of zk rollups is behind optimistic rollups. The main reason is that zero knowledge computing platforms and tooling are still early. It is still hard to generate zk proofs for general purpose computation; hence it is hard to build a full-featured EVM zk rollup chain. The hope is that the technology will improve over time, and zero knowledge platforms will become sufficiently mature and general purpose soon.
Many people believe that zk rollups will eventually be the better technology of the two flavors. A zk rollup has some advantages over the optimistic flavor: it has a fast withdrawal period, and it allows a higher level of data compression because verification data can be collapsed into the zk proof.
StarkNet is a zk rollup chain deployed on Ethereum. It is the flagship project of StarkWare, a company that also develops Cairo, a zero-knowledge-friendly language and platform that aims to be the tooling for generating validity proofs for general computation. StarkNet does not natively support the EVM yet, but it supports smart contracts written in Cairo. It is not hard to see that Solidity could be transpiled into high level Cairo or compiled into Cairo compatible byte code. StarkNet allows L1 and L2 smart contracts to interact. The interaction is a form of message passing that takes advantage of the fast finality of zk rollups: unlike an optimistic rollup, as soon as the L2 layer posts the aggregated state transition to L1, those state transitions are considered final. The message calls can be posted in the same L1 transaction as the L2 state hash and zk proof update.
There are two other zk rollup solutions. Zksync is already on mainnet; Aztec is only on testnet. Similar to StarkNet, they do not offer a general purpose EVM yet, and both only support ETH and ERC20 token transfers so far. I have not spent the time to understand them in detail.
Plasma
Plasma was once touted as the most likely candidate to scale Ethereum. It was first proposed in this paper, followed by a few additional iterations: Minimum Viable Plasma (MVP), Plasma Cash, and then More Viable Plasma.
Plasma never took off, for a few reasons[^4]. First, because Plasma stores data offchain, it needs a solution for data availability. The proposed solutions were a centralized operator, a PoS system, or a PoA system. Because the operator(s) have control over the data, most proposed implementations also give them sole control over submitting aggregations to the chain. Together, the data availability layer and the aggregation layer make a Plasma chain behave effectively as a sidechain: the overall system is only as secure as the PoS layer or the centralized operator. Second, Plasma has a rather complex exit game. This problem is compounded because transaction data is only available to the validator set or the centralized operator in most implementations. Lastly, other techniques for data availability became popular. Teams that were working on Plasma solutions have one way or another pivoted to a variant that uses a similar fraud proof but a different data availability strategy.
Plasma was set aside for good reasons. When it was first proposed, the importance of data availability was somewhat obscured; the blockchain community was mostly focused on developing consensus algorithms to ensure the correctness of computation. It is much clearer now that we can prove correctness with a cryptographic proof or a challenge mechanism. The harder part is ensuring that everyone has access to the data needed to verify state transitions or generate fraud proofs. In many ways, the issues encountered in building a Plasma system naturally gave rise to the variety of data availability solutions today.
Data Availability L1 Chains
Celestium is an L2 solution that uses Celestia for data availability. Celestia is an independent data availability blockchain: a proof of stake chain that uses a data availability sampling technique to ensure data availability. It has not launched its mainnet yet, and Celestium has not publicly stated whether it will use zk or fraud proofs as its verification mechanism.
Polygon is also working on a general purpose data availability layer, known as Polygon Avail. It is based on the Cosmos SDK, data availability sampling, and KZG polynomial commitments.
The development of Celestia and Polygon Avail aims to fill a void. Of the two dimensions in the categorization table above, the verification scheme dimension is fairly well settled: the community knows that a fraud proof mechanism is a matter of implementation, and it is improving the zero knowledge proof ecosystem. Once the zk proof stack is mature, the community expects zk rollups to become the go-to verification scheme. For the data availability dimension, onchain data is the simplest solution, but it is also the most expensive. Ethereum is aiming to fix that, thus becoming an effective data availability chain. Polygon Avail and Celestia are attempting to accomplish the same thing.
Volition
Volition follows a simple idea: users choose where their transaction data goes and pay for the corresponding level of data availability. Onchain data costs the most; offchain data costs nothing because there is no guarantee that the transaction data is recoverable. The existence of this category demonstrates that data availability is a tradeoff between cost and availability guarantee.
ImmutableX allows users to choose whether transaction data goes onchain or is instead validated by a data availability committee.
ImmutableX
ImmutableX is not a generic EVM platform. The protocol is designed specifically for NFTs. The protocol makes it easy to transfer, trade, and mint NFTs.
ImmutableX aggregates ImmutableX protocol (L2) transactions and generates a zk proof. Its proof generation engine is developed in partnership with StarkWare, using the StarkEx prover and verifier.
ImmutableX uses a hybrid DA solution. It allows users to choose to post the transaction data onchain, or to use an offchain mode. The offchain mode uses a Data Availability Committee (DAC): all members of the committee have to sign off that they have received a copy of the data. It is a 1-of-n security model, in that only one of the committee members needs to be honest for the data to stay available.
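A minimal sketch of the DAC sign-off check, with HMAC standing in for real digital signatures; the member names, keys, and helper functions are all hypothetical, not ImmutableX's actual scheme.

```python
import hashlib
import hmac

# Hypothetical per-member secret keys, standing in for real signing keys.
KEYS = {"member1": b"k1", "member2": b"k2", "member3": b"k3"}


def sign(member: str, digest: bytes) -> str:
    # HMAC stands in for a real digital signature over the data digest.
    return hmac.new(KEYS[member], digest, hashlib.sha256).hexdigest()


def committee_accepts(signatures: dict, digest: bytes) -> bool:
    # A batch is accepted only when EVERY member attests it holds the data;
    # recovering the data later needs just one honest member (1-of-n).
    return all(signatures.get(m) == sign(m, digest) for m in KEYS)
```

Note the asymmetry: acceptance requires all n signatures, but availability afterwards rests on any single honest member serving the data, which is what makes the model 1-of-n.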
References
- A blog post about data availability landscape
- A reddit thread on shared security
- Vitalik’s writeup on rollup
Footnotes
[^1]: One example is a Merkle hash, but it could be a generic polynomial or even an inner product commitment as well. See polynomial commitment.
[^2]: There is one key difference between the zk and fraud proof approaches. If the original transactions contain verification-only information, e.g. signatures, that information can be dropped: a zk proof on the state transitions already retains the cryptographic evidence that they are all valid, so the only data that needs to be made available is the data that allows reconstruction of the state transitions from S0 to S1000. If verification is through fraud proofs, all the original transaction data have to be available.
[^3]: Note that calldata is not stored in EVM memory and is not accessible by smart contracts. It is already much cheaper than bytes stored in EVM memory.
[^4]: See post1 and post2 for more details.