A draft proposal from October 2019 seeks to address a centralization risk in the Cosmos protocol by introducing “Proportional Slashing.” The concept, to “make a validator’s slash percent proportional to their share of consensus voting power,” accounts for large validators splitting their voting power across accounts by factoring in the voting power of all the other validators slashed in the same timeframe. In effect, the proposal discourages operators from splitting stake across accounts by punishing multiple validators that fault together more heavily than a single validator that faults alone.
The proposal addresses a valid problem with a novel solution, but it might not be the best solution, or at least not the best solution for the state of the Cosmos ecosystem today. In fact, if adopted, it would be safer to run Cosmos validators with Bison Trails than other providers. We detail why below.
Details of the Proportional Slashing Proposal
The proposal suggests that slashing change from a model where all stake is slashed equally, i.e. 5% for double signing and 0.01% for downtime, to a model where stake is slashed at a higher rate depending on how much stake goes offline simultaneously.
The proposal specifies that validators would be slashed based on the following formula:
slash_amount = k * ((power_1)^(1/r) + (power_2)^(1/r) + ... + (power_n)^(1/r))^r // where k and r are both on-chain constants
So now, for example, assuming k=1 and r=2, if one validator of 10% faults, it gets a 10% slash, while if two validators of 5% each fault together, they both get a 20% slash ((sqrt(0.05)+sqrt(0.05))^2).
For the rest of this document we will focus on double signing. The proposal outlines that a lower k value might be used for liveness faults, but the general effects of this proposal are the same for liveness and double signing; double signing just carries significantly higher penalties today.
For k = 1 and r = 2, validators that double sign in a short time period will have their slashing penalty be equal to:
penalty = (sqrt(voting power validator 1) + sqrt(voting power validator 2) + … + sqrt(voting power validator n))^2
A single validator with 10% of the network's voting power that faults alone would be slashed (sqrt(10))^2 = 10%.
Two validators with 5% each that fault together would each be slashed (sqrt(5)+sqrt(5))^2 = 20%.
Five validators with 0.5% each (2.5% of the network in total) that fault together would each be slashed (sqrt(0.5)+sqrt(0.5)+sqrt(0.5)+sqrt(0.5)+sqrt(0.5))^2 = 12.5%.
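The worked examples above can be reproduced with a few lines of Python. This is a minimal sketch of the formula as quoted from the proposal, not the proposal's actual implementation; the function name proportional_slash and the choice to express voting power as a fraction (0.05 for 5%) are our own assumptions for illustration.

```python
def proportional_slash(powers, k=1.0, r=2.0):
    """Slash fraction applied to each validator that faulted in the same window.

    powers: voting-power shares of the faulting validators, as fractions
            (0.05 means 5% of the network). Hypothetical input format.
    k, r:   the on-chain constants from the formula quoted above.
    """
    root_sum = sum(p ** (1.0 / r) for p in powers)
    return k * root_sum ** r


# The three scenarios discussed above, with k = 1 and r = 2.
examples = {
    "one validator at 10%": [0.10],
    "two validators at 5% each": [0.05, 0.05],
    "five validators at 0.5% each": [0.005] * 5,
}
for label, powers in examples.items():
    print(f"{label}: {proportional_slash(powers):.1%}")
```

With these inputs the script prints 10.0%, 20.0%, and 12.5%, matching the figures above. Note that the same 10% of voting power is penalized twice as heavily once it is split across two validators that fault together.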
What was the community’s response to the proposal?
The proposal generated healthy discussion in the Cosmos community forum. Matt Harrop of Figment Networks wrote a thoughtful response:
I don’t think this proposal would have the effect of flattening the voting power distribution. In fact, it might have the opposite effect, and drive further consolidation of stake to a smaller number of large, well capitalized entities.
Increasing slashing penalties for larger validators favors sophisticated and will [sic] capitalized entities who can afford to build infrastructures that are very unlikely to fault. Smaller validators are less able to invest in high quality infrastructure and technical operations, and are viewed as higher risk operations. A simple example is the use of HSM based key management, which even in a simple configuration reduces the risk of a double sign by a considerable degree. Many small operators are not able to afford (or choose not) to pay the cost of the physical infrastructure required, and instead use local software signing with plain text keys on disk. Because the risk of a slashing events can be mitigated by the larger operators, even with a higher cost of a fault their relative risk of loss may be lower than a smaller and less sophisticated operator. It is rational to select a 1% risk of a 10% loss over a 3% risk of a 5% loss (numbers for illustration only).
The need to introduce an anti sybil measure further disadvantages small operators. For many reasons it is difficult, perhaps impossible, to reason about the risk of correlated slashing faults between validators. Small operators are more likely to have similar infrastructure, deployed in similar configurations, with identical software stacks. Many small operators do not use hardware based key management, leaving them all vulnerable to similar risks, which will correlate across diverse cloud infrastructures. While small operators can make claims about their infrastructures and operational skills, they are less able to invest in things like 3rd party audits to verify claims. If a delegator can not confidently assess the risk of correlated faults causing larger slashing events, they will assign higher risk to smaller operators.
Finally, large operators will be better able to insure against slashing losses. As markets mature, sophisticated and well funded operators are likely to be able to acquire third party insurance against slashing, negating the increased slashing penalties. The cost structures inherent in offering financial products of this type advantage larger operators, despite larger slashing penalties. It will be more difficult for smaller operators to qualify for and afford such coverage, leaving them less able to compete. Well capitalized entities, such as centralized exchanges can self insure and provide full guarantees against loss. Very few operators other than large centralized exchanges have the available capital to offer meaningful guarantees of this nature.
In addition to the negative consequences of the anti sybil measure, it is also unlikely to actually work. The incremental cost to large operators to split their operation into a number of smaller validators is small, even if they do so on diverse infrastructure. Many large operators already operate hardware in multiple physical locations and spread cloud based operations across multiple providers. In the context of the high operating costs of these entities, the incremental cost to split their operations would be small.
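The illustrative comparison in the response above (a 1% risk of a 10% loss versus a 3% risk of a 5% loss) comes down to a simple expected-loss calculation. The sketch below uses only those illustrative numbers; the function name expected_loss and the labels are hypothetical, and real fault probabilities are not known.

```python
def expected_loss(fault_probability, slash_fraction):
    """Expected fraction of stake lost: chance of a fault times the slash size."""
    return fault_probability * slash_fraction


# Illustrative numbers only, taken from the quoted response.
lower_risk_higher_slash = expected_loss(0.01, 0.10)
higher_risk_lower_slash = expected_loss(0.03, 0.05)

print(f"1% risk of a 10% loss: {lower_risk_higher_slash:.2%} of stake expected")
print(f"3% risk of a 5% loss:  {higher_risk_lower_slash:.2%} of stake expected")
```

Under these assumptions the expected loss is 0.10% of stake in the first case and 0.15% in the second, which is why a delegator could rationally prefer the operator facing the larger slash.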
Does a platform like Bison Trails make it “exponentially worse” to be caught in a correlated fault?
No. In fact, this type of proposal likely makes it safer to run Cosmos validators with Bison Trails than with other providers. As the premier multi-cloud, geographically distributed platform with enterprise-grade security, orchestration, redundancy, and scaling, we can run many distinct validators on different cloud providers and in different locations. We build all of our infrastructure to the highest security and reliability standards: our security practices and architecture make it extremely unlikely that we equivocate or experience significant downtime that could lead to slashing.
This leaves our infrastructure less vulnerable to the broader range of cloud provider problems that could cause validator mistakes, and means we can build better tooling that automates failure states and protects our engineers from making human errors. We also run canary infrastructure across networks to test upgrades, changes in software, and more. We perform rolling upgrades and have software and systems that can quickly and safely roll back in the event of any issue.
The Cosmos Proportional Slashing proposal might hit smaller validators hardest. In order to run a successful validation business, a decent amount of stake is needed to cover costs and resourcing. Many smaller validators run simpler, less sophisticated setups. Those setups are often good for the needs of the network today. They don’t use HSMs and they congregate around popular cloud providers and regions.
Does Bison Trails support the proposal?
Theoretically, a proposal like this might actually strengthen the network by removing validators with less stable or secure infrastructure. It is almost certainly a good thing for well-resourced companies like Bison Trails. The problem this proposal is trying to solve is valid, and the proposal is a novel solution, but it might not be the best solution. It is a really strong forcing function for larger and smaller validators alike. Our opinion is that it is too early for something this strong and there might be alternatives that help the network achieve the same result over time. As a professional infrastructure provider, we’ve built a platform that enables us to avoid the fundamental issue the proposal seeks to address. But it’s much harder for individuals running a single node to do so.
Most professional validators that use remote HSM signing use the https://github.com/tendermint/kms library. While we don’t know everyone’s setup, this means it is likely that more than 33% of the network signs blocks with the same software. A critical bug in this open-source tooling that causes an equivocation would already cause a massive slashing event today; with this proposal the impact would be significantly worse.
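To get a rough sense of scale for the shared-software scenario, the sketch below compares today’s flat 5% double-sign slash with the proposed formula at k = 1 and r = 2. The validator counts and voting-power shares are hypothetical, and capping the result at 100% of stake is our own assumption; the formula as quoted does not say how values above 100% are handled.

```python
FLAT_DOUBLE_SIGN_SLASH = 0.05  # today's flat double-sign penalty, as noted above


def proportional_slash(powers, k=1.0, r=2.0):
    """Proposed slash fraction for validators that fault together.

    Assumption: the result is capped at 1.0 (100% of stake).
    """
    raw = k * sum(p ** (1.0 / r) for p in powers) ** r
    return min(raw, 1.0)


# Hypothetical correlated faults caused by a bug in shared signing software.
scenarios = {
    "4 validators at 1% each": [0.01] * 4,
    "11 validators at 3% each (33% total)": [0.03] * 11,
}
for label, powers in scenarios.items():
    proposed = proportional_slash(powers)
    print(f"{label}: flat {FLAT_DOUBLE_SIGN_SLASH:.0%} vs proposed {proposed:.0%}")
```

Under these assumptions, four small validators faulting together would each lose 16% of stake instead of 5%, and a fault spanning a third of the network would hit the assumed 100% cap.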
In the future we might support a proposal like this, but are unable to do so currently. The community and open-source tooling just aren’t in a place for us to endorse it. We care deeply about the Cosmos ecosystem and want it to flourish for a very long time.
What does Bison Trails do to minimize slashing?
To learn more about slashing and the work we do to minimize this risk, see this help article.