On February 6th, 2024, the Solana network suffered another “long-awaited” outage; the previous one occurred around February 25th, 2023. According to Matthew Sigel, Head of Digital Asset Research at VanEck, the outage was caused by a failure in the BPF (Berkeley Packet Filter) loader, the mechanism used to deploy, upgrade, and execute programs on Solana.
This may be related to an earlier SIMD (Solana Improvement Document) proposal that added an interceptor to block the use of metadata in BPF, since that metadata is no longer needed. The change came with the SIMD-0093 upgrade but contained a bug. The bug was discovered and a fix was prepared on the test network, but the fix had not yet been deployed. It is speculated that someone manually triggered the bug, causing the outage.
The community has long criticized Solana’s “outage” problem. Although the network has been relatively stable over the past year, it has a history of outages and network freezes. Here is a summary:
1. On February 6th, 2024, there was a failure in the BPF loader, causing an outage that lasted for 4 hours and 46 minutes.
2. On February 25th, 2023, Solana’s mainnet experienced performance issues and was unable to process user transactions. Solana later released a network upgrade improvement plan that included refining the upgrade process, forming a response team, and improving the restart process.
3. Around October 1st, 2022, the network experienced an outage due to node configuration errors.
4. Around August 3rd, 2022, funds were stolen on a large scale from Solana wallets. The cause was ultimately traced to a vulnerability involving centralized Sentry servers.
5. Around June 1st, 2022, there was a network restart due to a durable nonce vulnerability in transactions, resulting in an interruption of approximately 4.5 hours.
6. Around May 1st, 2022, the minting of a new NFT project triggered a flood of bot transactions, causing mainnet nodes to lose consensus and halting block production for up to 7 hours.
7. Around January 21st, 2022, amid significant market volatility, the network was flooded with transactions submitted by arbitrage bots, causing severe overload and up to 30 hours of downtime. However, the official classification at the time was “degraded performance”. Solana’s community later updated the mainnet to version 1.8.14 in an attempt to improve the network status.
8. Around September 14th, 2021, during the IDO of the decentralized social networking protocol Grape Protocol on the Raydium platform, many users sent large numbers of transactions through scripts, causing “memory overflow” and crashing validator nodes. The entire network was unable to produce blocks, resulting in up to 17 hours of downtime.
9. Around September 3rd, 2021, the network was unstable and experienced a performance decline for approximately 1 hour.
10. Around May 4th, 2021, network performance declined, leaving a large number of transactions unable to execute.
Looking back at this history, it is evident that surges in transactions have been the main cause of network outages. This may be related to Solana’s design. According to Hu Zhiwei, the Director of the Border Intelligence Research Institute, Solana treats consensus messages as a special type of transaction message transmitted between validator nodes. When the network is congested with messages, consensus messages cannot be delivered normally, and consensus fails.
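The congestion effect described above can be illustrated with a toy model in which consensus votes compete with ordinary transactions for a fixed per-slot capacity. This is only a sketch: the function names, the capacity figures, and the two-thirds quorum parameter are illustrative assumptions, not Solana's actual networking or consensus code.

```python
def votes_delivered(num_votes: int, num_spam: int, capacity: int) -> int:
    """Estimate how many votes get through when votes and other
    transactions share one channel of limited capacity (toy model).

    Assumption: messages are evenly mixed, so each kind is delivered
    in proportion to its share of total traffic.
    """
    total = num_votes + num_spam
    if total <= capacity:
        # No congestion: every message, including every vote, is delivered.
        return num_votes
    # Congestion: votes only get their proportional share of the capacity.
    return capacity * num_votes // total


def reaches_quorum(num_votes: int, num_spam: int, capacity: int,
                   quorum: float = 2 / 3) -> bool:
    """Check whether the delivered votes meet a hypothetical quorum fraction."""
    return votes_delivered(num_votes, num_spam, capacity) >= quorum * num_votes


if __name__ == "__main__":
    # Light load: all 100 votes arrive and the quorum is met.
    print(votes_delivered(100, 400, 1000), reaches_quorum(100, 400, 1000))
    # Heavy spam: only a handful of votes arrive and the quorum is missed.
    print(votes_delivered(100, 10_000, 1000), reaches_quorum(100, 10_000, 1000))
```

In this model, consensus fails not because validators disagree but because their votes are crowded out of the shared channel.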
At the same time, some of Solana’s features have been targeted and exploited, leading to network outages. For example, when write locks are taken on many important addresses, transactions that would otherwise be processed concurrently must execute sequentially, which greatly reduces processing capacity. Nodes also retain information about possible forks in order to handle them, which can lead to memory overflow, among other issues.
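The write-lock effect can be sketched with a small scheduler that packs transactions into parallel batches only when their write sets do not overlap. This is a simplified illustration under stated assumptions: the greedy first-fit strategy, the `schedule_batches` name, and the account labels are hypothetical and do not reproduce Solana's actual runtime scheduler.

```python
def schedule_batches(transactions):
    """Greedily group transactions into batches that can run in parallel.

    `transactions` is a list of (tx_id, accounts_written) pairs. Two
    transactions may share a batch only if their write sets are disjoint,
    mimicking an exclusive write lock per account.
    """
    batches = []  # each entry: (set of write-locked accounts, list of tx ids)
    for tx_id, writes in transactions:
        writes = set(writes)
        for locked, members in batches:
            if locked.isdisjoint(writes):
                # No lock conflict: run alongside the other batch members.
                locked.update(writes)
                members.append(tx_id)
                break
        else:
            # Conflicts with every existing batch: start a new one.
            batches.append((writes, [tx_id]))
    return [members for _, members in batches]


if __name__ == "__main__":
    # Four transactions touching different accounts fit in one parallel batch.
    independent = [(i, [f"acct{i}"]) for i in range(4)]
    print(schedule_batches(independent))

    # If each one also writes to the same "hot" account, they serialize.
    contended = [(i, ["hot", f"acct{i}"]) for i in range(4)]
    print(schedule_batches(contended))
```

With a single contended account, the number of batches equals the number of transactions, which is the sequential execution the paragraph above describes.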
In response to the recurring performance declines and outages caused by surges of spam transactions, Solana co-founder Anatoly Yakovenko has acknowledged the issue and said that “actual flow control” would be introduced to solve it. As for outages caused by factors such as transaction nonces or node configuration errors, Solana’s official team has promptly released patched versions for nodes to upgrade to.
Going roughly a year between outages may be both good and bad news, but the latest incident serves as a warning, especially as the Solana ecosystem grows in popularity. Network stability still requires significant attention and investment.