Confidential Transactions

Table of ContentsClose Close

Zether - What’s It All About
- How to Prove It: Gimme that Sigma Rizz
- Come on and Twist
Withdrawals and Deposits: How Can I Get My Money Out?
Front-Running: How I Learned to Stop Worrying and Love Being DDOSed
Efficient Discrete Logarithms With Assisted Marsupials
Key Rotation For Dummies
Conclusion

Private payments may be designed for two properties - anonymity and confidentiality. Anonymity hides one or both of the sender and recipient of a transaction. Confidentiality hides the amount sent. Two cryptocurrency projects which attempt to provide both confidentiality and anonymity are ZCash and Monero. In general, anonymity is a harder problem than confidentiality. The 2019 Zether paper designed for the account-based model used by blockchains such as Ethereum and Solana provides confidentiality without anonymity. Two projects I am aware of outside of the EVM have implemented variants of Zether. They have manifested as Solana’s Confidential Transfers and Aptos’ Veiled Coins. The former is a production-ready deployment integrated with Solana’s L1, while the latter is implemented as an example smart contract in Move not ready for production. This post will explain these two systems, explaining their designs and features, while also comparing them to the original Zether design.

Zether - What’s It All About

The Zether paper is based on the simple insight that confidentiality can be provided in the account-based model using a partially-homomorphic encryption protocol. If we represent two encrypted ciphertexts $c, d$ under the same public key by the encryptions $f (c)$ and $f (d)$ , partial homomorphism means that

$f (c) + f (d) = f (c + d)$

Say a user’s hidden encrypted balance is represented by $f (c)$ . Their incoming transaction may be represented by $f (Δ)$ . To update their balance without revealing either amount, we set their new balance to be $f (c) + f (Δ) = f (c + Δ)$ . Correspondingly, we set the sender’s balance $f (e)$ to be $f (e) - f (Δ) = f (e - Δ)$ .

This is pretty simple, but there are two main problems. One is that we must know that the sender is not sending more money than they actually have. This can be fixed by using range proofs to verify that the sender’s updated balance $b \in [0, 2^{λ}]$ , where $2^{λ}$ is some hardcoded max balance limit. Bulletproofs in particular have emerged over the past decade as the range proof protocol with the ideal tradeoff between proof size and verification time for blockchain protocols.

We need a second range proof as well, for reasons that are a bit more subtle. The encryption scheme we will choose will work over a cyclic group $G$ with order of some prime modulus $p$ . A malicious sender could “send” an amount $p - 1$ , which will update their balance $b$ to $b - (p - 1)$ . Modulo $p$ this gives us $b + 1$ , so that the sender now has more funds than they started with, before they sent their transaction. We can fix this by doing a range proof which verifies that their sent amount $Δ \in [0, 2^{λ^{'}}]$ , where $2^{λ^{'}}$ is a hardcoded max transfer amount.

The recipient here has their balance encrypted under a public encryption key, let’s call it $R$ . The sender also has their own encryption key $S$ . But the sent transction $f (Δ)$ must be encrypted under a user’s encryption key in order to add or subtract its amount from their balance. This means that it is impossible to both add a single ciphertext $f (Δ)$ to the recipient’s balance while also subtracting it from the sender’s balance.

This can be fixed. Denote $f_{X} (a)$ as being the encyption of value $a$ under the public key $X$ . We can then encrypt the transferred amount $Δ$ twice, as $f_{S} (Δ)$ and $f_{R} (Δ)$ . In updating user’s balances we then set the recipient’s balance to be $f_{R} (c) + f_{R} (Δ)$ and the sender’s balance to be $f_{S} (e) - f_{S} (Δ)$ .

To encrypt, the user computes the following ciphertext, consisting of two group elements: $(b G + r Y, r G)$ . To decrypt, a user must subtract $b G + r Y$ by $x (r G)$ to recover $b G$ . To then recover $b$ , the user must then solve the discrete logarithm problem for $b G$ . Ideally the transferred amounts are small enough that this is easy to do, which is certainly the case at 32 bits. We’ll discuss ways of getting around this later.

Note that for this to work, the value $b$ encrypted must lie within the order of the group $G$ . This is a reasonable assumption, as it is unlikely anyone will wish to make payments larger than the prime modulus of any secure groups currently in use for cryptographic applications.

How to Prove It: Gimme that Sigma Rizz

One core ingredient here is missing. We need some way of proving that $f_{R} (Δ)$ and $f_{S} (Δ)$ actually encrypt the same thing. This can be done with a simple zero-knowledge Sigma protocol. We can start to approach this problem with the following NP relation: given a group with two generators $G, H$ , two public keys $Y, Y^{'}$ , three ElGamal ciphertexts $(C, D)$ , $(C^{'}, D)$ and $(C_{1}, C_{2})$ , and private witness scalars $b, r, v, s k$ we prove that:

$D = r G$

$C = v G + r Y$

$C^{'} = v G + r Y^{'}$

$Y = s k G$ .

$C_{1} - C = b G + s k (C_{2} - D)$

Let’s break this down a bit. The sender has public key $Y$ which encrypts their incoming transacted amount $v$ as $(C, D)$ under randomness $r$ . The recipient has public key $Y^{'}$ which encrypts their outgoing transaction as $(C^{'}, D)$ . Note that the two ciphertexts sharing the same $D$ is not a typo - the second group element in the ElGamal ciphertext only encodes the randomness without using the public key, which allows us to reuse it to save space. We’ll explain what $b$ , and $(C_{1}, C_{2})$ are doing in a bit.

The first four equations in the relation are fairly easy to understand. The first, $C = v G + r Y$ , simply proves that the sender’s outgoing ciphertext correctly encrypts the outgoing amount $v$ under randomness $r$ . Similarly, $C^{'} = v G + r Y^{'}$ proves the recipient’s incoming ciphertext correctly encrypts the same value under the same randomness. Finally, $D = r G$ proves that the randomness is correctly committed to in the second element of both ciphertexts, and $Y = s k G$ proves that the sender has knowledge of their secret key $s k$ .

The fifth line is where things get a bit more interesting. $(C_{1}, C_{2})$ represents the ciphertext encrypting the sender’s current balance. Therefore, $(C_{1} - C, C_{2} - D)$ represents the sender’s updated balance. The right hand side of $C_{1} - C = b G + s k (C_{2} - D)$ is roughly speaking doing the opposite of decryption, as multiplying by $s k$ in the term $s k (C_{2} - D)$ transforms the commitment to some updated randomness $r^{'}$ over the generator $H$ into a commitment to $r^{'}$ using the sender’s public key $Y$ . When we add $b G$ to this, where $b$ is the sender’s correctly updated balance amount, we get the first group element of the user’s updated balance ciphertext, which is $C_{1} - C$ . Therefore, this equation proves that the sender’s updated balance is correctly formed. Specifically, it proves that their updated balance $b$ is correct relative to the other values also verified by the same proof. This then allows us to verify a range proof over this same $b$ .

It can be seen that the above relation can be easily proven by an extended version of the Schnorr identity protocol. In particular, on the prover side we sample 4 random scalars $x_{1}, x_{2}, x_{3}, x_{4}$ and then compute the group elements

$X_{1} \leftarrow x_{2} G$

$X_{2} \leftarrow x_{1} G + x_{2} Y$

$X_{3} \leftarrow x_{1} G + x_{2} Y^{'}$

$X_{4} \leftarrow x_{3} G + x_{4} (C_{2} - D)$

$X_{5} \leftarrow x_{4} G$

and sends each $X_{i}$ to the verifier. After receiving back a random challenge $ρ$ from the verifier, the prover then computes the scalars

$α_{1} \leftarrow x_{1} + ρ * v$

$α_{2} \leftarrow x_{2} + ρ * r$

$α_{3} \leftarrow x_{3} + ρ * b$

$α_{4} \leftarrow x_{4} + ρ * s k$

before sending each $α_{i}$ to the verifier.

To verify the proof, the verifier checks that the following equations hold:

$ρ D + X 1 = ? α_{2} G$

$ρ C + X 2 = ? α_{1} G + α_{2} Y$

$ρ C^{'} + X 3 = ? α_{1} G + α_{2} Y^{'}$

$ρ (C_{1} - C) + X_{4} = ? α_{3} G + α_{4} (C_{2} - D)$

$ρ Y + X_{5} = ? α_{4} G$

If all checks pass, the verification passes.

Completeness of the above is easy to verify. Knowledge soundness and zero-knowledge may be proven using arguments analogous to those used to prove security of the Schnorr identification protocol.

After the proof is verified on-chain, the sender’s and the recipient’s balances are updated as discussed before, with the transacted amount being homomorphically subtracted from the sender’s balance, and added to the recipient’s balance.

NOTE: The fact the sigma protocol uses $b$ as an input means that a user must know their current balance to send funds. This is significant in light of the earlier discussion about needing to solve a discrete log problem in the size of the message in order to decrypt, as it suggests that the sender must decrypt their balance before sending a new transaction.

Come on and Twist

A protocol using relation above, however, is actually insecure as presented. In particular, range proofs such as Bulletproofs must be done over two fixed generators $G, H$ over which the prover has no structural knowledge. Recall, however, that the first group element of an ElGamal ciphertext is done over $G$ and a variable base $Y$ . This variable base is the user’s public key for which they know the discrete logarithm relative to $G$ . Without going into the technical details here, this allows the user to forge range proofs. Needless to say, this is quite bad.

The Zether paper gets around this problem by implementing what they describe as $Σ$ -bullets, which is a modification of bulletproofs designed to allow for simultaneous range and sigma protocol proofs. However, it is quite complicated, and its use precludes the use of existing battle-tested and efficient open-source Bulletproofs implementations.

There are two other potential solutions for this which have a lower implementation complexity. The first and inferior solution is implemented by the Aptos Veiled Coins module. This is to both encrypt $v$ and $b$ using ElGamal in the exponent as done above, and to commit to these same values as Pedersen commits - which have the form $v G + r H$ when committing to value $v$ over randomness $r$ . Because the Pedersen commits have fixed bases, we can securely do range proofs over them. To prove that they encode the correct values, we add the following lines to the NP relation:

$c = v G + r H$

$c^{'} = b G + r_{b} H$

where $r_{b}$ is a piece of freshly sampled randomness. The corresponding addition to the sigma protocol required is straightforward.

The second and superior solution is implemented by Solana’s Confidential Transfers. Instead of using standard ElGamal encryption, we can use twisted ElGamal encryption, introduced in the PGC: Decentralized Confidential Payment System with Auditability paper in 2019. In this scheme, key generation is slightly different. While the secret key as before is any random scalar $s k$ , and we also fix two generators $G, H$ of a prime-order group, the public encryption key is computed as $P = s k^{- 1} H$ .

Encryption is also slightly different, being done as $(C, D) = (v G + r H, r Y)$ , where as above $G, H$ are fixed generators, $Y$ is a public encryption key, $v$ is the encrypted value, and $r$ is a piece of randomness. Note that in this encryption scheme, the first element of the ciphertext is identical to a Pedersen commitment, while the second commits to a piece of randomness using the public key $Y$ . In particular, this means that we can securely do range proofs over the first element of the ciphertext.

To decrypt using twisted ElGamal, we compute $C - s k D$ . It can be easily verified that this is correct.

Because we do not need extra Pedersen commitments, we both save on storage space and eliminate the need for the additional two lines in our sigma protocol. The protocol does need to be modified slightly to account for the change in encryption scheme, but this is not too difficult to work out and is omitted here for brevity.

Withdrawals and Deposits: How Can I Get My Money Out?

To deposit funds into a hidden balance, no proofs are needed. The deposited amount is simply encrypted under the user’s encryption key, then homomorphically added to their hidden balance, although this unavoidably reveals the amount deposited.

To withdraw an amount into a normal unencrypted balance requires another proof similar to but shorter than the proof for a fully concealed transaction. The NP relation required to be proven for a withdrawal is as follows. Given public group elements $(C_{1}, C_{2})$ , $H, G, Y$ , and scalar $v$ , and private scalars $b, r, s k$ , we prove that:

$C_{1} - v G = b G + s k C_{2}$

$Y = s k G$

where $(C_{1}, C_{2})$ is the sender’s current balance, $b$ is their updated balance amount, $Y$ is their public encryption key, and $s k$ is their secret key.

We also require a range proof over $b$ . As before we have the option of either adapting the relation for twisted ElGamal, or of adding the equation

$c = b G + r H$

to the NP relation, and computing the range proof over the Pedersen commitment $c$ .

At this point, we are done with the core of the protocol! However, there are a few problems remaining to address. The most significant of these is the front-running problem we have just unintentionally introduced.

Front-Running: How I Learned to Stop Worrying and Love Being DDOSed

Note that when sending funds, the sender must accurately input their updated balance and sent amount as part of the sigma protocol and range proofs. This introduces a new problem. If another user sends them funds and updates the sender’s balance before the sender’s proof is verified, the proof they just spent time validating is now invalid. In an extreme case, a malicious party could send to our would-be sender a constant stream of transactions, constantly updating their balance and effectively locking them out of their funds with a DDOS attack.

This problem is unsolved in the Veiled Coins module. Solana’s Confidential Transactions solves this by implementing the solution proposed in the Zether paper, which is to split a user’s balance into two balances - available and pending. The available balance does not update when a user receives a transacted amount. This amount rather is deposited into the user’s pending balance, over which proofs are never computed. Either at the user’s prompting, or automatically at pre-defined intervals, the pending balance, before being set to 0, can be homomorphically added to the user’s available balance, making their funds now available for transfer.

Efficient Discrete Logarithms With Assisted Marsupials

The next problem to address is the potential inefficiency of requiring a user to solve a discrete logarithm problem in the size of their balance every time they wish to send a transaction. For smaller 32-bit values, this is fine. At 64-bits, it is tractable, particularly with a specialized algorithm such as the one devised in the 2012 paper Computing small discrete logarithms faster by Bernstein and Lange, which uses a parallelized version of Pollard’s Kangaroo Algorithm to give $O (n^{1 / 3})$ computational complexity with an $O (n^{2 / 3})$ size precomputed table. Any larger than 64 bits, however, and things start to become very impractical. What are our options?

The solution implemented by Aptos’ Veiled Coins is to truncate a 64-bit amount by cutting off the highest and lowest 16 bits, to produce a 32-bit value. This works for specific coins where you know in advance that the lowest 16 bits are insignificant enough not to matter much, and that transactions using the highest 16 bits are large enough that they are rarely sent. However, this does produce a strictly worse user experience. In addition, it will not work in a generalized setting which covers multiple types of coin, each of which may have its own decimal point placement and patterns of transaction.

Solana’s Confidential Transfers, interestingly enough, solves this by allowing users to have infinitely large available balance amounts. This is done by additionally requiring a user to symmetrically encrypt their balance whenever they update it. Because symmetric encryption is very efficient regardless of the message size, this dramatically reduces the time needed for a user to access their balance, which may be confirmed by using it to recompute their available balance ciphertext. Presumably the supply of any one token will be smaller than the prime order of the group being used, which additionally prevents overflows.

NOTE: This symmetric encryption hint does introduce one new problem, which is that a software error - or even malicious software - by the API a user accesses to update their balance could cause their symmetric encryption to encrypt the wrong balance value. If their balance is large enough, this will make it impossible for them to decrypt using ElGamal. There is a way to deal with this - the user can still decrypt all sent and received transactions to derive their balance amount. However, if a user has sent and received many transactions this could be fairly time-consuming.

While a user’s pending balance in theory also has no limit, this is restricted by two details: one is that transacted amounts are limited in size by the range proofs detailed above, and another is a cap they have implemented on the number of incoming transactions a user is allowed to receive until they are required to roll their balance over. For example, say that incoming transactions are limited to 32 bits, and the amount of incoming transfers allowed before rolling over is $2^{16}$ . If a user receives $2^{16}$ incoming transfers without rolling over, their pending balance is now at most $2^{32} 2^{16} = 2^{48}$ , which provides a perfectly tractable discrete logarithm problem.

Confidential Transfers also increase the size of a user’s pending balance by splitting it into one lower bit chunk and one higher bit chunk, to achieve a 48-bit pending balance before overflows occur, correspondingly limiting transferred amounts to 48 bits. In parrticular, they represent 32 higher bits as a twisted ElGamal ciphertext $Y$ , and 16 lower bits as a twisted ElGamal ciphertext $X$ . When rolling over, they add $X + 2^{16} Y$ to the user’s available balance. In this design, the lower order chunk is smaller because it is more likely to contain overflows, reducing the average decryption times. And because the higher order chunk of transactions contain at most 32 bits, this implies that with a limit of $2^{16}$ incoming transfers, a user will at most need to decrypt 48 bit values.

Key Rotation For Dummies

A user’s private decryption key may be compromised, in which case they will need to rotate it to prevent loss of funds. In addition, the may simply wish to rotate their key regularly as a matter of good security hygiene. In either case they will need to prove that their currently encrypted balance under their previos ElGamal public key encrypts the same amount as their new ElGamal public key, so that given $(r H + v G, r^{'} P)$ and $(r^{'} H + v^{'} G, r^{'} P)$ , it is the case that $r = r^{'}$ . It is not difficult to derive a protocol for this similar to the one given above for full transfers.

Conclusion

Overall, the Zether protocol and its spinoffs provide a relatively simple and efficient way to perform confidential transactions in a trustless setting. Confidential Transfers on Solana provides an excellent instantiation of the core idea of Zether while addressing some practical problems around efficient decryption and front-running. Veiled Coins on Aptos provides an interesting toy example useful for educational purposes. However, the code for the latter is is very well written and commented, which makes it excellent for learning, in addition to reducing the likelihood of bugs. I highly recommend giving it a look.

Table of Contents