Babylon for Fast Stake Unbonding

101 Series

Sep 1, 2022

How does Babylon utilize Bitcoin security to dramatically reduce the stake unbonding time?

Background

Proof-of-Stake blockchains such as Cosmos zones allow fast finality for transactions but suffer from a long unbonding period (withdrawal delay) for stake, with serious effects on user experience and the liquidity of tokens. In the previous blog post, we identified long range attacks as the root cause of the withdrawal delay. To mitigate these attacks, PoS chains often require external observers (e.g., major stake-holders of the protocol) to reach consensus on the finality of blocks containing withdrawal requests through off-chain communication. As this social consensus involves active participation of human actors, it can take weeks, resulting in a long withdrawal delay (Figure 1).

Checkpointing to Bitcoin as a solution

Fortunately, it is possible to speed up and automate social consensus by using Bitcoin, the most secure blockchain in the world, as the medium of agreement on the finality of withdrawal requests. For this purpose, weak subjectivity checkpoints on PoS blocks are replaced with actual checkpoints of these blocks on Bitcoin (Figure 2). The order of these checkpoints determines the canonical PoS chain. Once the checkpoint of a PoS block becomes sufficiently deep in Bitcoin (e.g., k=6 blocks deep), the checkpoint cannot be reverted, implying that the checkpoints of any attack chain would appear later on Bitcoin, and can subsequently be ignored. Thus, to grant withdrawal requests, it is sufficient for the checkpoint of the PoS block containing the request to appear earlier than conflicting checkpoints on Bitcoin, and be k-deep for a sufficiently large k. As this takes on the order of hours rather than weeks, the withdrawal delay can be reduced from weeks to hours using Bitcoin (cf. the paper for technical details).

Can we directly send checkpoints to Bitcoin?

Let’s consider a Cosmos zone (which is a PoS blockchain) sending checkpoints to Bitcoin to mitigate long range attacks and reduce its withdrawal delay. The checkpointing can be realized by using the OP_RETURN opcode of Bitcoin, which allows posting 80 bytes of arbitrary data within an unspendable Bitcoin transaction output. Each checkpoint must include at least the hash of the PoS block to be checkpointed (32 bytes) as well as the signatures (each 32 bytes) that finalize the block. Here, the hash is used to identify the PoS block that is checkpointed. The signatures are needed to prevent the adversary from sending arbitrary hashes and pretending that it is checkpointing a Cosmos block on Bitcoin (we will see in the next section why this is a problem).

As each Cosmos block might contain a withdrawal request, the zone might have to send a new checkpoint for every new Cosmos block. Whereas the zone might have a new block every 10 seconds, there is a Bitcoin block only every 10 minutes, implying at least 60 blocks' worth of data to be checkpointed at each Bitcoin block. Then, even if the zone is run by a single validator, this implies a total size of at least 3840 bytes (60 checkpoints of 64 bytes each) to be checkpointed at every new Bitcoin block, requiring at least 48 OP_RETURN transactions per block, just for checkpointing a single Cosmos zone! At this point, we are not even considering a zone run by multiple validators, or checkpointing multiple PoS chains. In general, checkpointing Cosmos zones directly on Bitcoin might not be feasible for three major reasons:

  • It is necessary to checkpoint every Cosmos block containing a withdrawal request, which might be all of the blocks in a Cosmos zone.

  • It is necessary to post sufficiently many signatures with each checkpoint to prevent a weak adversary (one that does not control many validators) from sending arbitrary hashes and pretending that it is checkpointing a Cosmos block on Bitcoin. Unfortunately, Tendermint signatures used by the consensus of Cosmos zones are not aggregable and take up a lot of space.

  • There might be many Cosmos zones or other PoS chains sending checkpoints. In this context, the limited space of the Bitcoin blocks implies a fundamental scalability problem for the checkpointing solution.

To reduce the footprint of checkpoints and develop a scalable checkpointing solution that can support multiple Cosmos zones simultaneously, we need to aggregate checkpoints to fit them into the limited space offered by each Bitcoin block. Moreover, we need to make these checkpoints as compact as possible. This is where Babylon enters the game.
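The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name and parameters are illustrative, the byte sizes and block times are the ones quoted in the text, and it assumes (optimistically) that checkpoint data packs perfectly across OP_RETURN payloads and that one signature per validator is posted.

```python
import math

OP_RETURN_BYTES = 80         # payload per OP_RETURN transaction, as quoted above
HASH_BYTES = 32              # PoS block hash
SIGNATURE_BYTES = 32         # per-signature size used in the text's estimate
POS_BLOCK_SECONDS = 10       # example Cosmos block time from the text
BITCOIN_BLOCK_SECONDS = 600  # roughly one Bitcoin block every 10 minutes


def op_return_txs_per_bitcoin_block(num_validators: int = 1, num_chains: int = 1) -> int:
    """Lower bound on OP_RETURN transactions needed per Bitcoin block."""
    pos_blocks = BITCOIN_BLOCK_SECONDS // POS_BLOCK_SECONDS
    bytes_per_checkpoint = HASH_BYTES + num_validators * SIGNATURE_BYTES
    total_bytes = num_chains * pos_blocks * bytes_per_checkpoint
    return math.ceil(total_bytes / OP_RETURN_BYTES)


print(op_return_txs_per_bitcoin_block())                    # single validator, single zone
print(op_return_txs_per_bitcoin_block(num_validators=100))  # hypothetical 100-validator zone
```

With the default arguments this gives the 48 transactions from the text; the count grows linearly with both the number of validators and the number of chains, which is the scalability problem described in the list above.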

Babylon Architecture

Babylon is a PoS Cosmos zone that receives streams of checkpoints from multiple PoS chains as transaction data and posts a single checkpoint stream to Bitcoin on their behalf. The size of the checkpoints to Bitcoin from Babylon is minimized by using aggregate signatures for the Babylon validators and the frequency of these checkpoints is controlled by allowing the Babylon validators to change only once every epoch of many block times.

For illustration, the full Babylon architecture consists of three blockchains: client PoS chains (e.g., consumer Cosmos zones as specified in the figure of the first blog post), Babylon and Bitcoin (Figure 3). To minimize the trust placed in Babylon to accurately checkpoint PoS blocks, validators of the respective PoS chains (we focus on one chain for clarity) download the Babylon blocks, and observe whether their PoS checkpoint is contained in a Babylon block checkpointed by Bitcoin. This enables the PoS chains to detect discrepancies, for instance, if the Babylon validators create an unavailable block checkpointed by Bitcoin and lie about the PoS checkpoints being included in the unavailable block (Figure 7). The main components of the protocol are described as follows:

1) Checkpointing: Babylon proceeds in epochs during which the validator set of Babylon is fixed. Thus, only the last block of a Babylon epoch is checkpointed by Bitcoin. Each checkpoint consists of the hash of the block and a single aggregate BLS signature that corresponds to the signatures of the 2/3 of the validator set that signed the block for finalization. Babylon checkpoints also contain the epoch number.
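As a rough illustration of the checkpointing rule, the sketch below models a checkpoint record and the two conditions described above: the block is the last one of its epoch, and at least 2/3 of the epoch's validators signed it. All names are hypothetical. It assumes epochs have a fixed length in blocks, heights start at 1, validators carry equal weight, and the aggregate signature is treated as an opaque byte string (no BLS verification is performed).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BabylonCheckpoint:
    epoch: int
    block_hash: bytes
    aggregate_signature: bytes  # opaque placeholder, not verified in this sketch


def epoch_of(height: int, epoch_length: int) -> int:
    # Assumption: heights start at 1, so epoch 1 covers heights 1..epoch_length.
    return (height - 1) // epoch_length + 1


def is_last_block_of_epoch(height: int, epoch_length: int) -> bool:
    return height % epoch_length == 0


def has_quorum(signers: set, validators: set) -> bool:
    # Count only signers that belong to the epoch's (fixed) validator set.
    if not validators:
        return False
    return 3 * len(signers & validators) >= 2 * len(validators)


validators = {"v1", "v2", "v3"}
height, epoch_length = 200, 100
if is_last_block_of_epoch(height, epoch_length) and has_quorum({"v1", "v2"}, validators):
    checkpoint = BabylonCheckpoint(epoch_of(height, epoch_length), b"\x00" * 32, b"agg-sig")
    print(checkpoint.epoch)
```

Because the validator set cannot change within an epoch, one checkpoint per epoch is enough for a client to track who is entitled to sign the next one.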

PoS blocks can be assigned the timestamp of a Bitcoin block through the Babylon checkpoints. For instance, the first two PoS blocks on Figure 3 are checkpointed by Babylon blocks that are in turn checkpointed by the Bitcoin block with timestamp t_3. Hence, these PoS blocks are assigned the Bitcoin timestamp t_3.
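The timestamp assignment can be pictured as following two links: from a PoS block to the Babylon block holding its checkpoint, and from that Babylon block to the Bitcoin block that checkpoints it. The sketch below uses plain dictionaries for both links; the function name and block labels are made up for illustration, and the second mapping is assumed to already resolve each Babylon block to the Bitcoin block covering its epoch.

```python
def assign_bitcoin_timestamps(pos_to_babylon: dict, babylon_to_bitcoin_ts: dict) -> dict:
    """Map each PoS block to the Bitcoin timestamp it inherits via Babylon.

    PoS blocks whose Babylon block is not yet checkpointed on Bitcoin
    are left out of the result.
    """
    return {
        pos_block: babylon_to_bitcoin_ts[babylon_block]
        for pos_block, babylon_block in pos_to_babylon.items()
        if babylon_block in babylon_to_bitcoin_ts
    }


# Mirrors the example in the text: the first two PoS blocks inherit t_3.
pos_to_babylon = {"pos_1": "bab_1", "pos_2": "bab_2", "pos_3": "bab_5"}
babylon_to_bitcoin_ts = {"bab_1": 3, "bab_2": 3}  # 3 stands for timestamp t_3
print(assign_bitcoin_timestamps(pos_to_babylon, babylon_to_bitcoin_ts))
```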

2) Canonical PoS chain: Whenever there is a fork on the PoS chain, the fork whose first block has an earlier Bitcoin timestamp is adopted as the canonical PoS chain (Figure 4). If both forks have the same timestamp, the tie is broken in favor of the PoS block with the earlier checkpoint on Babylon.
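The fork-choice rule amounts to a lexicographic comparison. A minimal sketch, assuming each fork has already been assigned a Bitcoin timestamp and that the order of checkpoints on Babylon can be summarized by a single integer position (both field names are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fork:
    name: str
    bitcoin_timestamp: int  # timestamp inherited by the first block of the fork
    babylon_position: int   # order of its checkpoint on Babylon (lower is earlier)


def canonical_fork(forks):
    """Pick the fork with the earliest Bitcoin timestamp, then earliest Babylon checkpoint."""
    return min(forks, key=lambda f: (f.bitcoin_timestamp, f.babylon_position))


top = Fork("top", bitcoin_timestamp=1, babylon_position=7)
bottom = Fork("bottom", bitcoin_timestamp=2, babylon_position=3)
print(canonical_fork([top, bottom]).name)  # earlier Bitcoin timestamp wins

tied = Fork("tied", bitcoin_timestamp=1, babylon_position=4)
print(canonical_fork([top, tied]).name)    # tie broken by Babylon order
```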

3) Withdrawal rule: To withdraw its stake, a validator sends a withdrawal request to the PoS chain (Figure 4). A PoS block containing the withdrawal request is checkpointed by Babylon, which is then checkpointed by Bitcoin, and assigned the timestamp t_1. Once the Bitcoin block with timestamp t_1 becomes k deep, withdrawal is granted on the PoS chain. At this point, if the validators that have withdrawn their stake engage in a long range attack, the blocks on the attack chain (bottom chain on Figure 4) can only be assigned a Bitcoin timestamp that is later than t_1. This is because once the Bitcoin block with timestamp t_1 becomes k-deep, it cannot be reverted. Then, observing the order of these checkpoints on Bitcoin, PoS clients can distinguish the canonical chain from the attack chain, which is subsequently ignored.
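From the point of view of a PoS node, the withdrawal rule reduces to a depth check on Bitcoin. The sketch below is illustrative only: the function name and status strings are invented, k=6 is the example value used earlier in this post, and depth is counted so that the current tip block has depth 1.

```python
from typing import Optional


def withdrawal_status(checkpoint_btc_height: Optional[int], btc_tip_height: int, k: int = 6) -> str:
    """Status of a withdrawal request given where its checkpoint sits on Bitcoin."""
    if checkpoint_btc_height is None or checkpoint_btc_height > btc_tip_height:
        return "awaiting checkpoint"  # not yet checkpointed on the observed Bitcoin chain
    depth = btc_tip_height - checkpoint_btc_height + 1  # assumed convention: tip has depth 1
    return "granted" if depth >= k else "awaiting confirmations"


print(withdrawal_status(None, 800_000))
print(withdrawal_status(800_000, 800_004))
print(withdrawal_status(800_000, 800_005))
```

At one Bitcoin block roughly every 10 minutes, k=6 corresponds to about an hour of waiting after the checkpoint lands, and a larger k trades a longer delay for stronger protection against Bitcoin reorganizations.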

4) Slashing rule: It is possible to slash the validators that have double-signed conflicting PoS blocks if they have not withdrawn their stake at the time the attack is detected (Figure 5). The adversarial PoS validators know that if they wait until their withdrawal requests are granted to mount a long range safety attack, they will not be able to confuse clients, which can look at Bitcoin to identify the canonical chain (Figure 4). Hence, they might fork the PoS chain as soon as the blocks on the canonical PoS chain (top chain on Figure 5) are assigned a Bitcoin timestamp (e.g., t_2). These PoS validators then collaborate with adversarial Babylon validators and Bitcoin miners to fork Babylon and Bitcoin and replace the Bitcoin block with timestamp t_2 with another block of timestamp t_3. This changes the canonical PoS chain from the top chain to the bottom one in the view of a late-coming PoS client. Although this is a successful safety attack, it results in the slashing of the adversarial PoS validators’ stake as they have double-signed conflicting blocks (circled in red on Figure 5), yet have not withdrawn their stake.
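To make the slashing condition concrete, the following sketch finds validators that signed two different blocks at the same height and then removes those whose stake has already been withdrawn. It is a simplification with hypothetical names: votes are modeled as (validator, height, block hash) tuples, whereas real consensus evidence would also carry rounds, vote types and the signatures themselves.

```python
from collections import defaultdict


def find_double_signers(votes):
    """Return validators that signed more than one block hash at the same height."""
    seen = defaultdict(set)
    for validator, height, block_hash in votes:
        seen[(validator, height)].add(block_hash)
    return {validator for (validator, _height), hashes in seen.items() if len(hashes) > 1}


def slashable(double_signers: set, withdrawn: set) -> set:
    # Only validators whose stake is still bonded can be slashed.
    return double_signers - withdrawn


votes = [
    ("alice", 5, "hash_top"),
    ("alice", 5, "hash_bottom"),  # conflicting signature at height 5
    ("bob", 5, "hash_top"),
    ("carol", 5, "hash_top"),
    ("carol", 6, "hash_next"),    # different height, so no conflict
]
offenders = find_double_signers(votes)
print(slashable(offenders, withdrawn=set()))
```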

5) Stalling rule for unavailable PoS checkpoints: PoS validators have to stall, i.e., freeze their PoS chain, upon observing an unavailable PoS checkpoint on Babylon. Here, an unavailable PoS checkpoint is a hash signed by 2/3 of the PoS validators, that supposedly corresponds to a PoS block that cannot be observed. If the PoS validators do not stall upon observing an unavailable checkpoint, then the adversary can reveal a previously unavailable attack chain (shown in shaded green on Figure 6), and change the canonical chain in the view of the late-coming clients. This is because the checkpoint of the shaded chain that is revealed later appears earlier on Babylon.

The stalling rule above exposes the reason why we require the PoS block hashes sent as checkpoints to be signed by the PoS validator set. If these checkpoints were not signed, then any adversary would be able to send an arbitrary hash, and claim that it is the hash of an unavailable PoS block checkpointed on Babylon. Then, the PoS validators would have to stall for a checkpoint that does not have any unavailable PoS chain in its pre-image! Note that creating an unavailable PoS chain is hard: it requires corrupting at least 2/3 of the PoS validators so that they finalize PoS blocks with their signatures, yet withhold the data from the honest validators. However, in the hypothetical attack above, the adversary stalls the PoS chain without corrupting even a single validator. To prevent such attacks, we require PoS checkpoints to be attested by 2/3 of the PoS validators, so that Babylon will have unavailable PoS checkpoints only if 2/3 of the PoS validators are indeed corrupted. Such an attack is highly unlikely due to the cost of corrupting the PoS validators, and does not affect the other PoS chains or Babylon itself.
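The interplay between signatures and stalling can be summarized as a two-part condition: a validator stalls only when a checkpoint is both properly attested and points to a block it cannot observe. The sketch below is a hedged illustration with hypothetical names, equal-weight validators, and signer sets standing in for verified signatures.

```python
def should_stall(checkpoint_hash: str, signers: set, validators: set, available_hashes: set) -> bool:
    """Stall only for a checkpoint attested by 2/3 of validators whose block is unavailable."""
    if not validators:
        return False
    attested = 3 * len(signers & validators) >= 2 * len(validators)
    unavailable = checkpoint_hash not in available_hashes
    return attested and unavailable


validators = {"v1", "v2", "v3"}
available = {"h1", "h2"}

# An arbitrary hash with no valid signatures is ignored rather than stalling the chain.
print(should_stall("junk", set(), validators, available))
# A hash signed by 2/3 of the validators for a block nobody can see forces a stall.
print(should_stall("h_hidden", {"v1", "v2"}, validators, available))
```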

6) Stalling rule for unavailable Babylon checkpoints: PoS and Babylon validators have to stall upon observing an unavailable Babylon checkpoint on Bitcoin. Here, an unavailable Babylon checkpoint is a hash with an aggregate BLS signature by 2/3 of the Babylon validators, that supposedly corresponds to a Babylon block that cannot be observed. If the Babylon validators do not stall, then the adversary can reveal a previously unavailable Babylon chain (shown in shaded blue on Figure 6), thus changing the canonical Babylon chain in the view of the late-coming clients. Similarly, if the PoS validators do not stall, then the adversary can reveal a previously unavailable PoS attack chain (shown in shaded green on Figure 7) along with the previously unavailable Babylon chain, thus changing the canonical PoS chain in the view of the late-coming clients. This is because the shaded Babylon chain revealed later has an earlier timestamp on Bitcoin, and contains the checkpoint of the PoS attack chain revealed later.

Just like the stalling rule for unavailable PoS checkpoints, the rule above exposes the reason why we require the Babylon block hashes sent as checkpoints to be accompanied by an aggregate BLS signature attesting to signatures by 2/3 of Babylon’s validators. If Babylon checkpoints were not signed, then any adversary would be able to send an arbitrary hash, and claim that it is the hash of an unavailable Babylon block checkpointed on Bitcoin. Then, the PoS validators as well as the Babylon validators would have to stall for a checkpoint that does not have any unavailable Babylon or PoS chain in its pre-image! Creating an unavailable Babylon chain requires corrupting at least 2/3 of the Babylon validators. However, in the hypothetical attack above, the adversary stalls all chains in the system without corrupting even a single Babylon or PoS validator. To prevent such attacks, we require Babylon checkpoints to be attested by an aggregate signature, so that there will be unavailable Babylon checkpoints only if 2/3 of its validators are indeed corrupted. Such a data availability attack is highly unlikely due to the cost of corrupting the Babylon validators. Unfortunately, once it happens, it affects all PoS chains by forcing them to stall.

Conclusion

Babylon reduces the unbonding period of PoS chains from weeks to a few hours by enabling these chains to use Bitcoin as a timestamping service, akin to the role of social consensus. As checkpointing PoS blocks on Bitcoin directly is infeasible due to the lack of space within Bitcoin blocks, Babylon, as a separate PoS chain, aggregates the checkpoints sent by the PoS chains and posts them to Bitcoin on their behalf. The Babylon architecture prevents long range attacks through its k-deep withdrawal rule for unbonding validators, and guarantees slashing of the adversarial stake in the event of short-range safety attacks.

This article is also available on Substack:  Babylon for Fast Stake Unbonding (substack.com)