What Performance Bottlenecks are Rollups Facing After the Upgrade in Cancun?

Why is the cost of L2 gas not decreasing but increasing? This article is sourced from Keone Hon, co-founder of Monad, and was compiled and written by Odaily.

Summary:
In this article, Keone Hon discusses the performance of Rollups after the Cancun upgrade. He explains how the theoretical TPS upper limit for Rollups is calculated and why transaction fees on some Layer2 networks (such as Base) are still high, even after the upgrade. Keone also outlines the bottlenecks Rollups face and potential improvements.

Table of Contents:
1. Data Availability (DA)
2. Gas Limit of Rollup
3. Issue 1: Bottleneck of Execution Throughput
4. Issue 2: Hidden Dangers of State Growth
5. Why is Hardware Optimization Not Enough?
6. Community Interaction

On March 26th, Keone Hon, co-founder of Monad, published an in-depth post on Rollup performance on his personal X account. In it, Keone explains how the theoretical TPS upper limit for Rollups is calculated after the Cancun upgrade, and why transaction fees on some Layer2 networks (such as Base) remain high, reaching several dollars. He also outlines the bottlenecks Rollups face and potential improvements.

The following is the original content by Keone, translated and supplemented by Odaily for the convenience of readers.

There have been discussions in the market regarding the bottlenecks of Rollup execution and gas limitations, which involve not only Layer1 but also Layer2. I will discuss these bottlenecks in the following text.

With the introduction of the Blob data structure (EIP-4844) in the Cancun upgrade, the data availability (DA) of Ethereum has been significantly improved, and the data synchronization transactions of Layer2 no longer need to compete in the same fee market as regular Layer1 transactions.

Currently, blob capacity is approximately three 125 KB blobs per block (every 12 seconds), which is about 375 KB per block, or 31.25 KB per second. Assuming that a single transaction is approximately 100 bytes, the TPS shared by all Rollups is approximately 300.

Of course, there are some important points to note here.
First, if Rollup adopts better transaction data compression technology to reduce the size of a single transaction, the TPS can increase.
Second, theoretically, Rollup can continue to use calldata synchronization in addition to Blob synchronization (the old solution before the Cancun upgrade), although this would introduce additional complexity.
Third, different ZK-rollups have differences in the way they release states (especially zkSync Era and Starknet), so the calculation methods and results will also vary for these Rollups.
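
As a rough check on the figures above, the shared TPS ceiling can be reproduced in a few lines of Python. This is a sketch under the article's stated assumptions (three 125 KB blobs per 12-second block, about 100 bytes per transaction). The function name and the 50-byte "compressed" size are hypothetical and chosen only for illustration.

```python
def shared_rollup_tps(blobs_per_block, blob_size_bytes, block_time_s, tx_size_bytes):
    """Upper bound on the TPS shared by all Rollups, limited only by blob space."""
    bytes_per_second = blobs_per_block * blob_size_bytes / block_time_s
    return bytes_per_second / tx_size_bytes


# Figures from the article: 3 blobs of 125 KB every 12 seconds, 100-byte transactions.
baseline = shared_rollup_tps(3, 125_000, 12, 100)

# First caveat: better compression shrinks each transaction and raises the ceiling.
# The 50-byte size is a hypothetical value, not a measured one.
compressed = shared_rollup_tps(3, 125_000, 12, 50)

print(f"baseline: {baseline:.1f} TPS, with 50-byte transactions: {compressed:.1f} TPS")
```

The baseline lands slightly above 300 TPS, matching the rounded figure in the text, and halving the transaction size doubles the ceiling.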

Recently, there has been significant attention on Base due to the surge in gas fees, with the cost of a regular transaction on the network reaching several dollars.

Why did Base’s gas fees decrease for a period of time after the Cancun upgrade, only to return to or exceed their pre-upgrade level? The reason is that blocks on Base have a gas limit, which is set by a parameter in its code.

The gas parameter currently used by Base is the same as Optimism’s, which means that there is a total gas limit of 5 million gas for each Layer2 block (every 2 seconds). When demand on the network (the gas needed by pending transactions) exceeds supply (block space), the fee mechanism allocates block space by price, leading to a surge in gas fees on the network.
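
To put the limit in perspective, the arithmetic can be sketched as follows. The 5 million gas and 2-second figures come from the text. The 21,000 gas cost of a plain ETH transfer is the standard EVM value and is used here only as an assumed lower bound on per-transaction gas; real Layer2 transactions (swaps, mints) typically use far more.

```python
GAS_LIMIT_PER_BLOCK = 5_000_000  # from the article
BLOCK_TIME_S = 2                 # from the article
SIMPLE_TRANSFER_GAS = 21_000     # assumption: cheapest possible EVM transaction


def gas_per_second(gas_limit, block_time_s):
    """Execution budget of the chain, in gas per second."""
    return gas_limit / block_time_s


def max_tps(gas_limit, block_time_s, gas_per_tx):
    """Transactions per second if every transaction used exactly gas_per_tx."""
    return gas_limit / gas_per_tx / block_time_s


budget = gas_per_second(GAS_LIMIT_PER_BLOCK, BLOCK_TIME_S)
ceiling = max_tps(GAS_LIMIT_PER_BLOCK, BLOCK_TIME_S, SIMPLE_TRANSFER_GAS)
print(f"{budget:,.0f} gas/s, at most {ceiling:.0f} simple transfers per second")
```

Even in this best case the execution ceiling is well below the roughly 300 TPS allowed by blob space, which is consistent with the gas limit, rather than data availability, being the binding constraint here.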

Why doesn’t Base increase this gas limit? Or in other words, why does Rollup need to set a gas limit?

In addition to the previously mentioned data availability limiting the TPS, there are two other main reasons: bottlenecks in execution throughput and hidden dangers of state growth.

Generally, EVM Rollups execute using a fork of Geth’s EVM, which means they have similar performance characteristics to the Geth client. The Geth client is single-threaded (meaning it can only process one task at a time) and stores its state, encoded as a merkle patricia trie (MPT), in LevelDB/PebbleDB. These are general-purpose databases that use another tree structure (the LSM tree) as the underlying data storage on solid-state drives (SSDs).

For Rollup, the most costly processes are “state access” (reading values from the merkle trie) and “state update” (updating the merkle trie at the end of each block). This is because the cost of reading from an SSD in a single query is around 40-100 microseconds, and the merkle trie data structure is embedded in another data structure (LSM tree), resulting in unnecessary additional queries.

This process can be imagined as searching for a specific file in a complex file system. You need to navigate from the root directory (the trie root node) to the target file (a leaf node). For each node visited along the way, a specific key in the LevelDB database needs to be queried, and within LevelDB, the actual data storage operations are executed through another data structure, the LSM tree. These additional steps make the entire data reading and updating process slow and inefficient.
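
The lookup path described above can be sketched with a toy trie kept in a key-value store. This is an illustration only, not Geth's implementation: `CountingKV` is a hypothetical stand-in for LevelDB/PebbleDB, SHA-256 and JSON stand in for the hashing and encoding a real client uses, and there is no path compression. The point is that one logical lookup turns into one database read per trie node on the path.

```python
import hashlib
import json


class CountingKV:
    """Hypothetical stand-in for a key-value database; counts every read."""

    def __init__(self):
        self._data = {}
        self.reads = 0

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        self.reads += 1
        return self._data[key]


def store_node(db, node):
    """Store a node under the hash of its encoding and return that hash."""
    encoded = json.dumps(node, sort_keys=True).encode()
    digest = hashlib.sha256(encoded).hexdigest()
    db.put(digest, encoded)
    return digest


def build_trie(db, items):
    """Store a nibble trie over equal-length hex keys; return the root hash."""
    if "" in items:  # all nibbles consumed: this is a leaf
        return store_node(db, {"value": items[""]})
    groups = {}
    for key, value in items.items():
        groups.setdefault(key[0], {})[key[1:]] = value
    children = {nibble: build_trie(db, group) for nibble, group in groups.items()}
    return store_node(db, {"children": children})


def lookup(db, root_hash, key):
    """Walk from the root to the leaf, one database read per node."""
    node = json.loads(db.get(root_hash))
    for nibble in key:
        node = json.loads(db.get(node["children"][nibble]))
    return node["value"]


db = CountingKV()
root = build_trie(db, {"a1f0": 10, "a1f7": 20, "b200": 30})
db.reads = 0
balance = lookup(db, root, "a1f7")
# Latency range of 40-100 microseconds per SSD read is taken from the article.
print(f"value={balance}, reads={db.reads}, "
      f"estimated {db.reads * 40}-{db.reads * 100} microseconds if every read hits the SSD")
```

In a real client each of those reads is itself served by the LSM tree, so the actual number of disk accesses can be higher than the node count shown here.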

In Monad’s design, we solve this problem through MonadDb. MonadDb is a custom database that stores the merkle trie directly on disk, avoiding the overhead of LevelDB. It supports asynchronous IO, which allows multiple reads to be processed in parallel, and it bypasses the file system.
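
The benefit of asynchronous IO can be illustrated with a generic Python sketch. This is not MonadDb's API or implementation: `read_node` is a hypothetical function that simulates one storage read with a fixed delay, and the 1 ms latency is scaled up from real SSD figures so the effect is easy to observe.

```python
import asyncio
import time

READ_LATENCY_S = 0.001  # hypothetical, scaled-up latency of one storage read


async def read_node(key):
    """Simulate one storage read that waits on the device."""
    await asyncio.sleep(READ_LATENCY_S)
    return f"value-of-{key}"


async def read_sequential(keys):
    """Issue reads one at a time: total time grows with the number of reads."""
    return [await read_node(key) for key in keys]


async def read_concurrent(keys):
    """Issue all reads at once: the waits overlap."""
    return await asyncio.gather(*(read_node(key) for key in keys))


keys = [f"node-{i}" for i in range(50)]

start = time.perf_counter()
sequential = asyncio.run(read_sequential(keys))
sequential_s = time.perf_counter() - start

start = time.perf_counter()
concurrent = asyncio.run(read_concurrent(keys))
concurrent_s = time.perf_counter() - start

print(f"sequential: {sequential_s * 1000:.1f} ms, concurrent: {concurrent_s * 1000:.1f} ms")
```

Both versions return the same values in the same order; only the time spent waiting differs, because the concurrent version overlaps the reads instead of queuing them.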

Additionally, Monad uses an “optimistic parallel execution” mechanism that allows multiple transactions to be executed in parallel and to read their state from MonadDb in parallel.

However, Rollups do not have these optimizations, resulting in bottlenecks in execution throughput.

It should be noted that the efficiency of the database has been optimized in the Erigon/Reth client, and some Rollup clients are built based on these clients (such as OP-Reth). Erigon/Reth uses a flat data structure, which reduces the query cost during reading to some extent. However, they do not support asynchronous reading or multi-thread processing. Additionally, the merkle root needs to be recalculated after each block, which is a slow process.

Like other blockchains, Rollups also limit their throughput to prevent rapid growth in their state.

A common argument in the market is that the concern about state growth is due to the potential increase in demand for solid-state drives (SSDs) if the state data grows significantly. However, I believe this is somewhat inaccurate. SSDs are relatively cheap (a high-quality 2TB SSD costs about $200), and in the nearly 10-year history of Ethereum, the entire state is only about 200GB. From a storage perspective, there is still a lot of room for growth.

The bigger concern is that as the state continues to grow, the time required to query specific state fragments will increase. This is because the current merkle patricia trie uses “shortcuts” when the condition of “a node has only one child node” is met, which reduces the effective depth of the trie and accelerates the query process. However, if the merkle trie becomes more crowded, the available “shortcuts” will become fewer.
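
The effect of losing "shortcuts" can be seen in a small simulation. This is a simplified model, not the exact merkle patricia trie: a run of single-child nodes is treated as free (in the real structure it still costs one extension node), keys are random 16-nibble strings, and the key counts are arbitrary. It counts how many nodes a lookup touches as the trie fills up.

```python
import random


def node_visits(suffixes, visited=0):
    """For each key, count the nodes a lookup touches in a path-compressed trie."""
    if len(suffixes) == 1:
        return [visited + 1]  # a single leaf holds the rest of the path
    groups = {}
    for suffix in suffixes:
        groups.setdefault(suffix[0], []).append(suffix[1:])
    if len(groups) == 1:
        # Only one child: the "shortcut" skips this step at no extra cost.
        return node_visits(next(iter(groups.values())), visited)
    depths = []
    for group in groups.values():
        depths.extend(node_visits(group, visited + 1))
    return depths


def average_visits(key_count, seed=0):
    """Average lookup depth over key_count random 16-nibble keys."""
    rng = random.Random(seed)
    keys = {f"{rng.getrandbits(64):016x}" for _ in range(key_count)}
    depths = node_visits(list(keys))
    return sum(depths) / len(depths)


for key_count in (100, 10_000, 100_000):
    print(f"{key_count:>7} keys: {average_visits(key_count):.2f} nodes per lookup")
```

The average depth grows as more keys are added, because fewer branches remain that can be collapsed. Each extra level is one more state read on every access.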

Overall, the hidden danger of state growth is essentially a problem of state access efficiency. Therefore, accelerating state access is the key to making state growth more sustainable.

Currently, Layer2 is still relatively centralized, with the network relying on a single sequencer to maintain the state and produce blocks. One might ask why the sequencer is not simply run on hardware with enough RAM to hold the entire state in memory.

There are two reasons for this.
First, it does not solve the data availability bottleneck of the Ethereum mainnet. Although in the current situation of Base, the surge in gas fees is not due to insufficient data availability performance on the mainnet, in the long run, this will become a major bottleneck for Rollup.

Second, it involves decentralization. Although the sequencer is still highly centralized, other roles participating in the network execution are also important. They need to be able to independently run nodes, replay the same transaction history, and maintain the same state.

The raw transaction data and state commitments on Layer1 are not sufficient to unlock the complete state. Any role that requires access to the complete state (such as merchants, exchanges, or automated traders) should run a complete Layer2 node to process transactions and keep an up-to-date copy of the state.

Rollups are still blockchains, and what makes blockchains interesting is their ability to achieve global coordination through shared global state. Powerful software is necessary for all blockchains, and hardware optimization alone is not enough to solve the problem.

After Keone published this article, key personnel from various top Layer2 projects interacted in the comments section.

zkSync co-founder Alex Gluchowski asked Keone how Monad differs in calculating the merkle root after each block.

Keone replied that there is an optimized algorithm for calculating the merkle root after each block.

Jesse Pollak, the head of Base, also explained why gas fees on Base increased instead of decreasing after the Cancun upgrade. He stated that EIP-4844 has significantly reduced the DA cost at the Layer1 level, and gas fees should have decreased. However, demand on the network has increased by more than 5 times, while the Base network has a gas limit of 2.5 million gas/s (5 million gas per 2-second block), so demand exceeds supply, resulting in an increase in gas fees.
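
The demand-versus-supply effect can be illustrated with an EIP-1559-style base fee update, in which the fee rises whenever a block uses more gas than a target and falls when it uses less. This is a minimal sketch: the update rule follows the general shape of EIP-1559, but the target, the adjustment denominator of 8, and the starting fee are illustrative values, not Base's actual configuration.

```python
def next_base_fee(base_fee, gas_used, gas_target, change_denominator=8):
    """EIP-1559-style update: move the fee in proportion to usage above or below target."""
    if gas_used == gas_target:
        return base_fee
    if gas_used > gas_target:
        delta = base_fee * (gas_used - gas_target) // gas_target // change_denominator
        return base_fee + max(delta, 1)
    delta = base_fee * (gas_target - gas_used) // gas_target // change_denominator
    return base_fee - delta


def simulate(base_fee, gas_used_per_block, gas_target, blocks):
    """Apply the update rule for a run of blocks with constant usage."""
    for _ in range(blocks):
        base_fee = next_base_fee(base_fee, gas_used_per_block, gas_target)
    return base_fee


# Illustrative numbers: blocks are consistently full at twice the target.
start_fee = 1_000
after_surge = simulate(start_fee, gas_used_per_block=200, gas_target=100, blocks=10)
print(f"base fee after 10 full blocks: {after_surge} (started at {start_fee})")
```

With sustained demand above the target the fee compounds block after block, so a lasting multiple-fold increase in demand against a fixed gas limit is enough to push fees up sharply, whatever happens to DA costs.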

