Informal Systems

2023-08-17

818 Post Mortem

Jehan Tremback • 2023-08-17

There has recently been a lot of discussion about Cosmos Hub proposal #818. This was the first use of the equivocation slashing proposal type, which is intended to slash validators who double sign on consumer chains.

Unfortunately, a lot went wrong with proposal #818, and while no funds were ever at risk, it wasted a lot of time and energy. In this post we will go over what happened, what we should have done differently, and why this won’t happen again.

It’s important to note that equivocation slashing proposals are a temporary solution that will be retired once cryptographic equivocation verification is finished. When this is done, validators who double sign on consumer chains will be slashed immediately and automatically.

What happened

The double sign

On August 1st, a consumer chain called Neutron had an upgrade. This upgrade had some issues and resulted in a chain halt, so validators felt under pressure to upgrade quickly. During this process, two validators deleted a file on their Neutron nodes which is supposed to prevent CometBFT from double signing. While the upgrade was rushed, it needs to be said that if the validators in question had been following best practices, they wouldn’t have double signed.

The reason that double signs are not allowed is that by signing off on two alternate versions of history, a group of validators could try to pull off a double spend attack. For there to be any hope of this type of attack succeeding, validators holding 33% of the voting power must participate. The two validators involved had around 0.2% of the power, and there is no question that the double signs were accidental.

The debate

A little bit more than a week later, a community member did a query on their Neutron node and noticed that the two validators had double signed during the upgrade. This community member quickly put up an equivocation slashing proposal, proposal #818.

This proposal was contentious. Some voters thought that since ICS is new, and there was obviously no attack, validators should be cut some slack. Some thought that the rules are the rules, and changing them on the spur of the moment would set a bad precedent. This debate raged on until a new piece of information entered the arena.

The typo

Taking a closer look at the proposal, our team at Informal realized something: the proposal had an incorrect parameter. An equivocation slashing proposal takes a block height parameter, but this is intended to be a block height on the Cosmos Hub, not on the consumer chain. The creator of the proposal used the consumer chain block height. The correct usage is described in our documentation, but this is also an understandable mistake to make.

The result of this mistake is that even if the proposal passes, it will not result in a slash.

We evaluated what would happen if someone were to create a corrected proposal, and concluded that it would be a bad idea. Due to the length of time that had already passed since the double sign, the math of which delegators to slash would be thrown off. Some delegators who had undelegated after the double sign would get off with no slash, while those who remained would be slashed for more. This is because by the time a corrected proposal passed, the unbonding period for those undelegations would have been over for a while.
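To make the arithmetic concrete, here is a minimal sketch of how a late slash shifts the burden onto remaining delegators. It is a simplified model and not the Cosmos SDK’s actual code: the function `split_slash`, the 5% slash fraction, and all token amounts are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def split_slash(tokens_at_infraction, slash_fraction, still_unbonding, still_bonded):
    """Simplified, hypothetical model of how a slash for a past infraction is split.

    tokens_at_infraction: stake backing the validator when it double signed
    slash_fraction:       fraction of that stake to slash
    still_unbonding:      stake undelegated after the infraction whose
                          unbonding period has NOT yet ended
    still_bonded:         stake still delegated when the slash executes
    Returns (taken_from_unbonding, taken_from_bonded).
    """
    total = tokens_at_infraction * slash_fraction
    # Undelegations that are still locked are slashed at the normal rate.
    from_unbonding = still_unbonding * slash_fraction
    # Whatever is left of the total is taken from the stake that stayed bonded.
    from_bonded = min(max(total - from_unbonding, 0), still_bonded)
    return from_unbonding, from_bonded


# Hypothetical numbers: 1000 tokens at the infraction, a 5% slash,
# 400 tokens undelegated afterwards, 600 tokens still bonded.
rate = Fraction(5, 100)

# Case 1: the slash executes while the 400 tokens are still unbonding.
timely = split_slash(1000, rate, still_unbonding=400, still_bonded=600)

# Case 2: the slash executes after the unbonding period has ended,
# so the 400 undelegated tokens are out of reach.
late = split_slash(1000, rate, still_unbonding=0, still_bonded=600)

print("timely:", timely)
print("late:  ", late)
print("late rate on remaining delegators:", float(late[1] / 600))
```

In this model the timely slash takes 20 tokens from the unbonding stake and 30 from the bonded stake, which is 5% of each. The late slash takes all 50 tokens from the 600 that stayed bonded, or about 8.3%, while the delegators who left pay nothing.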

What went wrong, and how it could have been prevented

Problem: There was a debate about whether or not to slash

When we built the governance-gated slashing feature, it was intended to use governance as a filter to prevent consumer chains from going crazy and slashing everyone, since we did not yet have the code to cryptographically verify slashes. Our assumption was that the only slashes that would be voted down by governance would be ones that were obviously fake, and originated from malicious or malfunctioning consumer chain code. This is reflected in our writing about the feature, but we never explicitly stated what it should or shouldn’t be used for.

However, another valid view is that the governance-gating could be used as “training wheels” while validators get used to the added load of ICS, and possibly be used to forgive accidental slashes. This wasn’t our intention, but it’s not crazy. Many systems, including Ethereum and Anoma, try to mitigate the effect of slashing on accidental double signs.

What is clear is that the voting period for an equivocation slashing proposal is not the right time to answer these questions. We should have worked more to build clear consensus and understanding about what the feature was to be used for at the time that it was introduced. This could have been done with a “constitution” that would be ratified along with the acceptance of the feature by governance. This could have spelled out principles of what the feature was to be used for, as well as guidelines for what to do in various scenarios (accidental double sign, attempted attack, malfunctioning consumer chain).

Problem: Our documentation for how to create a slashing proposal was not as clear as it could have been, and our code did not reject obviously wrong proposals

You can read it and decide for yourself, but while our documentation discusses the theory and process for creating a slash proposal, it does not do a good enough job of providing step-by-step instructions. The issue in proposal #818 is that the proposal’s author used the consumer chain’s block height, not the provider chain’s block height. This is an easy mistake to make, and it could have been prevented by instructions like:

  1. Get the consumer’s time of double signing from the consumer’s evidence.

  2. Find out what the block height was on the provider at this time.

  3. Use this block height and time to create the equivocation slashing proposal.
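Step 2 above can be sketched as a binary search over provider block timestamps. This is only an illustration under stated assumptions: `provider_height_at` and `fake_block_time` are hypothetical names, the lookup of a block’s timestamp is passed in as a plain function rather than a real node query, and block times are assumed to increase with height.

```python
from typing import Callable


def provider_height_at(target_time: int, earliest_height: int, latest_height: int,
                       block_time: Callable[[int], int]) -> int:
    """Return the highest provider height whose block time is <= target_time.

    block_time is a caller-supplied lookup (hypothetical here) that returns
    the timestamp of the block at a given height, assumed to be increasing.
    """
    if block_time(earliest_height) > target_time:
        raise ValueError("target time is before the earliest available block")
    lo, hi = earliest_height, latest_height
    while lo < hi:
        # Round up so the loop always makes progress when lo + 1 == hi.
        mid = (lo + hi + 1) // 2
        if block_time(mid) <= target_time:
            lo = mid        # block mid is at or before the target time
        else:
            hi = mid - 1    # block mid is too late
    return lo


# Stand-in for a real node query: block h was produced at 1000 + 6 * h seconds.
def fake_block_time(height: int) -> int:
    return 1_000 + 6 * height


# Consumer evidence time falls 3 seconds after provider block 50 was produced.
double_sign_time = 1_000 + 6 * 50 + 3
height = provider_height_at(double_sign_time, 1, 10_000, fake_block_time)
print(height)
```

With a real chain, `block_time` would wrap whatever block query the operator already uses against a provider node, and the result would be the height to pass in step 3.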

In general, the UX to create an equivocation slashing proposal is not great, and that didn’t help either.

Additionally, our code could have done more validation of the parameters supplied in the proposal. This is not technically a risk, because it is not possible to supply a parameter that can crash the chain. But we could have done some validation for obvious mistakes, like a block height that was far lower than the current block height, as in proposal #818. This is not foolproof, and wouldn’t work if the block heights on provider and consumer were close, but we could have rejected extremely wrong values to catch obvious mistakes at proposal submission time.
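A sanity check of this kind might look like the following sketch. It is not the validation in the ICS code: `check_infraction_height`, the `max_age_blocks` threshold, and the heights in the example are all hypothetical. One plausible choice for the threshold, assumed here, is roughly the number of provider blocks produced in one unbonding period.

```python
from typing import List


def check_infraction_height(infraction_height: int, current_height: int,
                            max_age_blocks: int) -> List[str]:
    """Return a list of reasons the height looks wrong (empty if it looks plausible).

    max_age_blocks is a hypothetical tuning parameter: how far behind the
    current provider height an infraction may be before it is rejected.
    """
    problems = []
    if infraction_height <= 0:
        problems.append("infraction height must be positive")
    if infraction_height > current_height:
        problems.append("infraction height is in the future")
    elif current_height - infraction_height > max_age_blocks:
        problems.append(
            "infraction height is implausibly old; was a consumer chain height used?"
        )
    return problems


# Hypothetical heights, for illustration only.
current = 16_500_000
window = 300_000

print(check_infraction_height(current - 100_000, current, window))  # plausible
print(check_infraction_height(1_300_000, current, window))          # far too low
```

As the text notes, a check like this only catches extreme values. It cannot tell the two chains apart when their heights happen to be close.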

Problem: We did not have a system or procedure in place to create equivocation slashing proposals

When we designed the feature, we were thinking about preventing a worst case scenario in an attack. In this dramatic scenario, the submission of equivocation slashing proposals can be taken for granted. We did not think hard enough about ensuring that slash proposals were reliably created for any slash, even accidental ones which cause no disruption.

Creating the slashing proposals automatically with complete reliability is difficult, but we (or other teams in the space) could have had a procedure in place to create them manually immediately upon detection of a double sign. This would have ensured that they were created in a timely manner and completely correctly.

Why this won’t happen again

There are many technical improvements that we could make to governance-gated slashing to make it more foolproof and give it a better UX. However, we will be focusing on different improvements. Automatic cryptographically verified equivocation slashing is almost ready, and will be deployed in Gaia v13, which should be live in mid-October. At this point, equivocation slashing proposals will be removed.

UX and governance improvements to governance-gated slashing couldn’t be deployed much sooner than that (at least not without delaying Gaia v12, which contains the Liquid Staking Module), so devoting our time to those improvements at this point would not be wise.

While automatic verified slashing is in development and testing, we will work to prevent another 818-like situation if another validator accidentally double signs on a consumer chain. Our team is monitoring for double signing evidence, and we will submit equivocation slashing proposals immediately as we come upon such evidence. By taking on this task we will make sure that any slashing proposals submitted before the feature is sunset are submitted properly. That being said, anyone else can submit these proposals as well.