Dusk Network’s public testnet will launch as soon as possible.
Yesterday, we made the difficult decision to postpone Daybreak to a later date to give the development team time to identify and resolve the emerging issues with the network. As mentioned previously, last week we identified an issue with the state. In this report, we explain how the issue manifested and describe the strategies we are pursuing to resolve it quickly and efficiently.
Issue: Situation of the state
As we proceeded to test the network in a decentralized fashion, we noticed delays in the consensus that we had not encountered before. Because two unrelated issues that both affect the state manifested simultaneously, it was quite hard to pinpoint the root cause of this behavior. Only after close scrutiny of the network did the team discover that it had to do with both the generation and the persistence of the global state. In other words, every node thought the blockchain to be in a different state.
Why is this an issue? While every node was at the same block height and held the same transactions, it turned out that they were all referring to a different hash of the state of the blockchain. Every node was, in essence, running its own version of the blockchain. This issue has to be resolved before launching Daybreak.
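As a rough illustration of the symptom described above, the following Python sketch groups nodes by the state hash they report at each block height and flags any height where more than one hash appears. The function name `find_divergence`, the report format, and the sample hashes are all hypothetical; this is not how Rusk itself detects the problem.

```python
from collections import defaultdict

def find_divergence(reports):
    """Return the heights at which nodes disagree on the state hash.

    `reports` is an iterable of (node_id, block_height, state_hash) tuples.
    This report format is an assumption made for illustration only.
    """
    by_height = defaultdict(lambda: defaultdict(list))
    for node_id, height, state in reports:
        by_height[height][state].append(node_id)
    # A height is divergent when more than one distinct hash was reported.
    return {h: dict(states) for h, states in by_height.items() if len(states) > 1}

# Three nodes at the same height, each reporting a different (made-up) hash.
reports = [
    ("node-a", 100, "aa11"),
    ("node-b", 100, "bb22"),
    ("node-c", 100, "cc33"),
]
diverged = find_divergence(reports)
```

With the sample data, height 100 is reported as divergent because each node maps to its own hash, which mirrors the situation where every node effectively runs its own version of the chain.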
How could this have happened? An explainer:
The state of the blockchain is identified by a hash of all the data stored upon transaction execution since the genesis of the blockchain. In particular, the genesis block is created by including the bytecode of the compiled core contracts, such as the stake and the transfer contracts. However, a compiler retains some references to the environment in which the compilation happens in the resulting binary artifact.
The most common of these references is the directory path where the compilation happened. This path is different for each and every node. This means when the compilation happens on different machines, even if the data of the blockchain is absolutely the same, the identifier (i.e. the hash of the data and the bytecode of the genesis contracts) differs.
To give an example, if Alice compiles and launches her own node, then after she sends 3 DUSK to Bob, the hash of the state will be the hash of this transaction plus the hash of the compiled artifact (which will include, say, her home path somewhere, e.g. /home/alice/rusk). Bob, however, has compiled his node from a different path (say /home/bob/rusk). While the transaction shows no differences, the hash of the state of the blockchain differs between the nodes.
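The Alice and Bob example can be sketched in a few lines of Python. This is a toy model under stated assumptions: `compile_contract` is a hypothetical stand-in for a compiler that leaks the build directory into its output, SHA-256 is used only for illustration (the hash function used by Rusk is not specified here), and the transaction encoding is made up.

```python
import hashlib
from typing import Sequence

def compile_contract(source: bytes, build_dir: str) -> bytes:
    """Hypothetical compiler: embeds the build directory in the artifact."""
    return source + b"\x00build_dir=" + build_dir.encode()

def state_hash(genesis_artifact: bytes, transactions: Sequence[bytes]) -> str:
    """Hash the genesis bytecode followed by every executed transaction."""
    hasher = hashlib.sha256()  # assumption: any cryptographic hash shows the effect
    hasher.update(genesis_artifact)
    for tx in transactions:
        hasher.update(tx)
    return hasher.hexdigest()

source = b"transfer-contract-bytecode"   # same contract source on both machines
txs = [b"alice->bob:3 DUSK"]              # same transaction on both machines

# Each node compiles the genesis contracts in its own directory.
alice_state = state_hash(compile_contract(source, "/home/alice/rusk"), txs)
bob_state = state_hash(compile_contract(source, "/home/bob/rusk"), txs)

# If the artifact carried no environment data, both nodes would agree.
reproducible_a = state_hash(source, txs)
reproducible_b = state_hash(source, txs)

print("path-dependent build agrees:", alice_state == bob_state)
print("reproducible build agrees:  ", reproducible_a == reproducible_b)
```

The two path-dependent hashes differ even though the contract source and the transaction are identical, while the hashes computed from an artifact without embedded paths match. This is why a byte-for-byte reproducible genesis artifact matters for agreeing on the state.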
This discrepancy is exacerbated by an additional problem occurring during the persistence of this state. When each block is accepted, the Rusk virtual machine executes all transactions in order to update the global state. In the example above, the global state is changed by executing Alice’s transaction, so that Alice can no longer spend the 3 DUSK she sent to Bob, while Bob will own these DUSK. When this mutation happens, two core components of Rusk are activated: Canonical, which encodes the spendable DUSK in the Merkle Tree (called the note tree), and Microkelvin, which is appointed to retain the information about the notes.
In particular, the issue is with the recursive algorithm used by Canonical to store the notes in a memory area accessible by Microkelvin. As a result, while the contract state recognizes the correct position and amount of notes in the tree, it then fails to retrieve the actual data, again causing problems with the correct computation of the state.
Two strategies to resolve the issue
While the issues with the situation of the state are being tackled, we can’t go live with a decentralized network. As we are planning to go live as soon as possible, we are simultaneously working on two strategies for a prompt and robust resolution of said issues.
- Patch the issue in Canonical. It needs to be said that we have already deprecated the use of Canonical in favor of rkyv. Ironically, the reason we planned to go live with Canonical and switch to rkyv later in the release cycle was to test first with a relatively consolidated library we have been working with for a longer time. This strategy focuses on the short term, since it is our full intention to go live with rkyv on mainnet.
- Resolve the issue with a faster upgrade to rkyv. This is the stack we will ultimately adopt, as it brings massively improved performance, cheaper gas fees, lower complexity, and a much simpler smart contract layout. However, as this strategy involves a relatively new tech component, we need to be more vigilant about unforeseen side effects during our bi-weekly release cycles.
If you’d like to follow the ongoing progress on these strategies yourself, you can take a look at the issues on the Dusk Network GitHub.
We thank you for your patience and continued support. All of your questions will be answered during our AMA with Emanuele Francioni, on February 1st, at 4.00pm CET on our Discord.
We remain confident we can support a timely release and are proud of our developers who came up with a structured approach to identify and isolate the issue quickly, set up two strategies in parallel, and are pushing through these challenges at this very moment. We look forward to welcoming you to Daybreak soon!