Preliminaries

In this blog post, I discuss down my observations on the throughputs of some of the major blockchains: Ethereum 2, Cosmo, Polkadot, and Solana.

I choose to use byte throughout, byte/s, as a key metric to comparing different blockchains instead of transactions, tx/s. Most of of scalability comparisons focus on transaction throughput. See ex1, ex2, or ex3. Transactions per second could be a misleading metric because different data models and layer 2 solutions could lead to dramatically different byte size per transaction. For example, Ethereum transactions averaged to about 500 bytes. If a blockchain only uses a 8 byte address space and take advantage of aggregate signatures to limit signature size to 2 byte, a transaction would be 10 bytes. For another example, state channels could also be used to put data off-chain to achieve scalability. Byte is a more generic measure to compare blockchain throughputs.

Data on a public ledger could be used to encode vastly different messages, ranging from simple token transfers to arbitrary computations. Transaction data in Bitcoin is primarily used to do accounting for unspent transaction outputs. A general purpose blockchain such as Ethereum could use data to encode computations via op codes and contract states. Measuring a blockchain’s throughput in bytes allows us to ignore the blockchain’s computing paradigm, allowing us to focus on how its consensus mechanism and p2p network architecture limit what a blockchain could do.

Bitcoin’s block size is 1MB and there about 150 blocks per day. Bitcoin’s data throughput is about 1.5 KB/s. Ethereum 3.5 KB/s. The two blockchain’s data rates are remarkably similar. See Ethereum Average Block Size and Ethereum Blocks Per Day.

Different data rates support different applications. For example, much of the decentralized finance applications are similar in that they record infrequent buying and selling activities of some blockchain assets. Assume that I make 100 transactions a year, with each transaction being about 100 bytes. I only need to 10KB of data. If there are 10 million people using the platform, 100 GB/year is required. That is about 4 KB/s. Ethereum could support this type of applications as of today. In fact, defi applications are thriving in Ethereum today.

An application that uses more data would be a widely payment system. Visa is the probably the most frequently cited example showing the inadequate throughput of Bitcoin and Ethereum. Assuming Visa payment system processes 10,000 tx/s, it would be about 1 MB/s. This far exceeds the current 1-4 KB/s range in Ethereum. But it is insurmountable. We are likely to get 10-50x improvement in Eth2. If we get another performance boost through layer 2 technologies, it is not inconceivable that we will get to the payment system throughput requirement. I will discuss this later in the post.

Gaming, messaging, social networking, or online advertising system have a much higher data throughput requirement. I would want to envision a decentralized internet where my data is not owned by a centralized private company. Let us assume that an average American uses 100 KB per day in emails, textings, and social media platforms. Assume 300 million people. The platform would have to support 300 MB/s. That sounds a lot, but there are a lot of enterprise applications that easily process data volume much greater than that. It will be hard for any blockchain to support this type of data throughput. I will look into details later in this blog post to explain why. Applications in this category will likely use off-chain data processing and storage techniques. It is a useful reminder that decentralization does not all data and computation have to be pushed on-chain.

Eth2

Eth2 has by far the most detailed designs and open discussions of the all the major blockchains. The proposed specs shed the most light on what blockchain platforms will look like in the near future.

Eth2 is a sharded blockchain. The initial target is 64 shards. The other important proposed targets are block size of 128 KB and a block time of 12 seconds. See mainnet spec, beacon spec, and note on shard chain proposal. The two key metrics that determine data throughput is the data capacity of one shard and the upper limit of the number of shards that the Eth2 could support.

Shard Capacity

The targeted throughput of one shard is roughly 10 KB/s. This target is already well calibrated for various theoretical and practical reasons. Vitalik has a good post explaining how the practical details of internet bandwidth, computing power, and storage dictate the limit the data throughput of a single shard. I do not fully agree with Vitalik’s stated limit on bandwidth because requiring a gigabit bandwidth does not increase centralization risk. There are sufficient people who have residential gigabit or are able to leverage cloud resources. The accumulated size the chain state is a hard limit. It is not inconceivable that network participants are required to have large storage devices. The users could have 10 TB of SSD storage to store active data. For simplicity of argument, assume that the most recent blocks from the last 3 years are active data, and older blockers could be put into deep, networked storage. This would lead to a throughput of 100 KB/s. With such a large state db, where the hot data spans 10 TB, the best we could do to provide fast access to the state db is through memory mapped indices. Even assuming that the nodes are required to have 64 GB of main memory, majority of reads and writes would still require the system to swap pages. Even if all other validation steps are instantaneous, reading state itself slows down the event processing. There is sufficient evidences in the last decade of enterprise experiences from data queries and stream processing frameworks that a database processing node with 10 TB of hot data is unlikely to perform well. If the chain accumulates about 1 TB of data over 3 years, the throughput is about 10 KB/s.

It should be noted that all of those limits could be increased. Instead of thinking the participating validator as a single machine, we could design that to be a processing cluster. The validator runs a cluster of workers to have both distributed storage and stream processing. However, this would dramatically increase centralization risk. The participants are likely to be enterprise users or highly motivated group of individuals. It would be easier for them to find each other and collude to compromise the integrity of the network.

It cannot be overstated the importance of decentralization. If data is stored and processed only by a few large well-known participants, the overall ecosystem will emerge to look similar to today’s internet. Decentralization requires the barriers of entry on participating in core network to be as small as possible. This requirement leads and some heuristics about shard throughput. For more details, see a post on scalability dilemma

It is also important to point out that the limit on per shard throughput is not due to specifics of consensus mechanisms. A consensus mechanics limits the latency (block time) of when data is processed, but it does not limit the targeted block size. For a fixed block time, block size is limited by the data throughput of the validating nodes.

One misunderstanding about overcoming the per shard capacity is that one could further shard the shards. That is a misguided solution. A transaction processing node in the network has an upper limit in its ability to hold a portion of the chain’s state and process the assigned transactions. The appropriate partitioning of the network lead to a sharding design. The network could add more shards or split the shards further, but the network does not uniformly benefit further from a two level sharding design unless there are specific justified transaction network topologies. For example, a layer 1 shard could correspond to countries, and a layer 2 shard could correspond to cities within countries. This type of network design is not likely to be general purpose. I will talk about transaction network topology in the next section.

Sharding Limit and Shard Network

The primary shard is called the Beacon Chain, which supports the consensus mechanism. There is fixed amount of work that the Beacon Chain has to perform to keep tab of validators, proposer committees, block headers, etc. If shards do not talk to each other, the per shard data on the Beacon Chain is minimum. The Beacon Chain should be able to support thousands of shards if the only data being put on Beacon is block headers and other administrative accountings. However, cross-shard communication is a requirement for Eth2.

The size of Beacon Chain scales linearly with the number of transactions that have cross-shard communications. Cross-shard linkage is still a work in progress. See roadmap, sharding research note, and research discussion. One implementation of cross-shard transactions could be aggregating log receipts in Beacon. Beacon would grow with the number of cross-shard transactions. If the fraction of cross-shard transactions is relatively stable, there would be a natural limit to the number of shards that Beacon could support. If the majority of the transactions are cross-shard, effectively Beacon has to make available all of those data. Jamming all the sharded data into one shard would quickly overwhelm the Beacon shard. The number of shards that Beacon could support is inversely proportionate to the fraction of transactions that are cross-shard. For example, if 1% of the transactions are cross-shard, we could expect Beacon to support 100 shards.

Combining the heuristics of one shard throughput and shard number limit, it would suggest that a reasonable byte throughput upper bound is 1 MB/s if the cross-shard rate is 1%. If the cross-shard rate is 0.1%, and we would get 10 MB/s. Referencing our discussions earlier, it would mean that a global payment system could be implemented on chain. But we won’t see micropayment system or social networks implemented directly on Eth2. Those applications will require layer 2 scaling solutions.

The bottleneck of cross-shard communications puts an upper bound on the connectedness of the sharded topology. Shard to shard communications could not exceed the limit of a per shard capacity. For example, a global payment system is implemented on chain. Each zipcode’s internal payments are placed in one shard. If the payment going inside and outside of that zipcode far exceeds the remaining capacity of the shard, the payment system gets overloaded. An application designer has to identify the eventual network architecture and understands if the bottleneck could be eliminated by grouping those users to a shard. It is possible that transaction topology is sufficiently connected that there is no sharding solution available given the per shard throughput. For those who are interested, I wrote a paper a long time ago on network bottleneck that had more details on how to identify and calculate network hotspots.

Cosmo

Cosmos is a decentralized network of independent parallel blockchains. Each cosmos-backed blockchain is secured by the Tindermint Byzantine Fault Tolerant (BFT) consensus. Application developers use the Cosmo SDK to write their own data model and state transition logic. A blockchain could exist completely independent of the rest of the Cosmo network. Each Cosmos blockchain is called a zone, and there are special Cosmos chains called hubs that coordinate cross-chain communications through IBC. Each zone has its own governance to decide how the validations are performed. The hub does not validate any transactions coming from zones.

Each zone is essentially is a parallel chain. The throughput of each zone falls under a similar set of scalability requirements of a single Ethereum shard, 10 KB/s. If transactions do not have to be cross-zone, we could create as many parallel zones as possible for different applications. As long as there is no single application that exceeds the capacity of a single zone, there is no limit on throughput.

However, an example application such as a global payment system would require many cross-chain communications. Such an application would have to designed with many zones and many hubs. Because cross-chain communications could be designed in any zone-hub connection, a payment application could divide users appropriately into different interconnected zone. The theoretical limit would be whether the overall network could support all the cross-chain links. For an extreme example, if users in the payment network form a complete network and there are frequent cross-links, even if each person takes up all the capacity of a dedicated zone and hub, the number of cross-links would overwhelm the dedicated hub. Most real life networks would be much more sparse. Still, application developers would have to be clever about managing sharding the users to different zones. This is no different than designing a sharded database in centralized web technologies. The key difference is that per zone capacity is small, and each zone’s cross-chain throughput could only take up a fraction of that.

Polkadot

Polkadot uses a master Relay Chain to coordinate consensus. Transactions are finalized when they are ordered by the master chain. The system has a universal set of validators that are assigned to validate each of the parachains. Validators do not store state information, but they store block headers and block proofs. Validators are capable of ensuring that the blocks are valid without the need to build out the chain’s history or current state. Each parachain has a set of Collators. Collators maintain the full data that is used to construct the state transitions for their assigned parachain. Collators generate the blocks and provide validity proofs of the generated blocks.

The per node storage and computing throughput of Collators limits the throughput of a parachain. Each collator has to store the full parachain. This means that each parachain faces the same shard limit as a single Eth2 shard assuming Collators use similar hardwares as Eth2 validators. Validators collectively ensure block proofs are available through erasure coding. Assuming block proof is 1000th of the block size, then it is unlikely that storing block proofs will be a bottleneck because each validator would only need to store 1000th many bytes to a Collator, multiplying the shard number, which is 100. Hence a parachain would process about 10 KB/s. In Polkadot’s proposal, it suggests that Collators are expected to run with enterprise level hardwares and network connections. We could expect the actual throughput to be multiple of that, gaining that at the expense of higher centralization risk.

Polkadot enables direct cross-chain interaction through cross-chain message passing (xcmp). The content of cross chain messages do not get to Relay Chain, but the metadata of the messages are included in the Relay Chain. Cross-chain messaging implies two limits. First, the cross-chain messages cannot exceed the capacity of a single parachain. Second, the combined metadata of all cross-chain messages on all parachains cannot exceed the capacity of Relay Chain. This is similar to Eth2 Beacon Chain. The exception is that Beacon Chain might include the actual cross-shard messages, and Polkadot Relay Chain only include metadata. The fundamental limit is similar: the combined cross-shard traffic capacity is the throughput of one single shard. For this reason and probably also due to other nuances, the number of parachain is limited to 100. I would expect the Polkadot throughput to be around 1-20 MB/s.

Solana

Solana is not proposed to be a sharded blockchain, at least not as of the writing of this post. Solana aims to push the blockchain’s throughput for a single machine. It has a high hardware requirements for validator. It recommends validator to have 16 cores, 256 GB of memory, TBs of SSD disk storage, and gigabit network.

Solana aims to obtain a transaction throughput that matches a network’s gigabit speed, 125 MB/s. It combines a number of techniques to remove network, consensus, and storage bottlenecks. The network has a concept of slot and epoch. An slot is about 400 ms, and an epoch is 432,000 slots, which is roughly 2-3 days. A leader is deterministically chosen every epoch. Its transaction forwarding protocol Gulf Stream ensures that the network could quickly route the transactions to the leader. The leader uses proof of history to order all the transactions. Tower BFT allows validators to quickly vote on the blocks. Fork choice rule is a result of proof of history and tower BFT. Furthermore, its turbine gossip protocol allows the out-going messages from the leader to reach validators without breaking the 125 MB/s network bottleneck. Furthermore, Solana proposes an archiving system to allow validator nodes to shed historical data to a storage network. These innovations are clever. It is believable that they could achieve the stated throughput at the network, consensus, and storage layer. That is, this throughput is possible if the state machine’s operations from processing each tx are trivial operations.

Processing transactions requires reading from and writing to the state. At 125 MB/s, it is about 4 PB per year. In theory, all of this data could be stateful updates. For example, if each transaction is a simple transaction writing key-value pair where the key is random and the value is strictly increasing of all existing values. The state size would grow quickly and easily exceed the limit of a single machine. Solana should not be able to handle the large state without distributing the reading and writing of the state, which are in the petabyte scale. The 125 MB/s throughput is only possible if almost all of the transaction data does not lead to stateful updates. This data processing problem is not unique to blockchain. Many enterprise companies have built data streaming processing framework to tackle similar problems. A processing system has to be able to split the work and coordinate stateful updates.

As mentioned before, Eth2’s data throughput is not limited by the shortcoming of its consensus mechanism. The 15 seconds block time in Eth1 is due to proof of work. The difficulty is adjusted so that blocks are created roughly once over that period. There is empirical research to show that 12 seconds delay is needed for the network to propagate and verify the transactions in a large decentralized system such as Bitcoin and Etheruem. The gossiping mechanism for propagating proposed blocks is slower than Solana’s approach. That slows down blocktime but it is not a bottleneck for the overall data throughput. There are a number of heuristics that could reduce the targeted block time: smaller blocks, separate block headers and block contents (see this post), etc. The true limit of data throughput is state size and transaction history, which is a result of storage requirement.

Solana proposes key innovations to reduce bottlenecks around network, consensus, and storage. I do not believe that Solana could handle the theoretical 125MB/s throughput for arbitrary transaction data. Let’s assume that a Solana validator could safely support a state size of 3 TB and that is accumulated in 3 years, and on averaged, 10% of the transaction data become the state. Solana would be able to support about 1 MB/s. That is still really impressive throughput.

Final Words

The throughput of a single shard blockchain is limited. This limit could be increased if each validator effectively runs a distributed computing and a storage cluster, making one shard a network of clusters. Assuming that per shard throughput is fixed, a network of shards increases throughput linearly. A sharded blockchain could be bottlenecked by cross-shard communications. If an application could tolerate a network topology where there are small complete networks encapsulated within shards and there are only sparse cross-shard linkage, the application could be supported on chain. A sharded blockchain could support throughput up to 1-20 MB/s while allowing reasonable cross-shard linkages. This would be sufficient to support a global payment system. Data heavy applications such as gaming, messaging, and social media will likely have to leverage layer 2 techniques.



Related Posts


Published

Tags

Contact