Vitalik: Necessary transformation of Ethereum - L2 scaling, wallet security and privacy

As Ethereum transitions from a young experimental technology into a mature technology stack that can truly bring an open, global and permissionless experience to ordinary users, the stack needs to go through three major technical transitions, roughly simultaneously:

The L2 scaling shift - everyone migrates to rollups

The wallet security shift - everyone migrates to smart contract wallets

The privacy shift - making sure privacy-preserving fund transfers are available, and making sure all the other tools being developed (social recovery, identity, reputation) are privacy-preserving

Vitalik: Ethereum needs to complete three transformations of L2, wallet and privacy

This is the triangle of ecosystem transformation. You can only choose 3 of 3.

Without the first, Ethereum fails because each transaction costs $3.75 ($82.48 if we have another bull run), and every product aimed at the mass market inevitably forgets about the chain and adopts centralized workarounds for everything.

Without the second, Ethereum fails because users are uncomfortable storing their funds (and non-financial assets) themselves, and everyone moves to centralized exchanges.

Without the third, Ethereum fails because having all transactions (and POAPs, etc.) publicly visible to anyone is far too high a privacy sacrifice for many users, and everyone moves to centralized solutions that at least somewhat hide their data.

For the reasons above, these three transitions are critical. But they are also challenging because of the intense coordination required to carry them out properly. It is not just features of the protocol that need to improve; in some cases, the way we interact with Ethereum needs to change in fairly fundamental ways, requiring deep changes to applications and wallets.

These three shifts will revolutionize the relationship between users and addresses

In an L2 scaling world, users are going to exist on lots of L2s. Are you a member of ExampleDAO, which lives on Optimism? Then you have an account on Optimism! Do you hold a CDP in a stablecoin system on ZkSync? Then you have an account on ZkSync! Did you once try some application that happened to live on Kakarot? Then you have an account on Kakarot! The days of a user having only one address are gone.

I have ETH in four places, according to my Brave Wallet view. Yes, Arbitrum and Arbitrum Nova are different. Don't worry, this will get more complicated over time!

Smart contract wallets add more complexity by making it much more difficult to have the same address across L1 and the various L2s. Today, most users are using externally owned accounts, whose address is literally a hash of the public key used to verify signatures - so nothing changes between L1 and L2. With smart contract wallets, however, keeping one address becomes more difficult. Although a lot of work has been done to try to make addresses be hashes of code that can be equivalent across networks, most notably CREATE2 and the ERC-2470 singleton factory, it is difficult to make this work perfectly. Some L2s (such as "type 4 ZK-EVMs") are not quite EVM-equivalent, often using Solidity or an intermediate assembly instead, which prevents hash equivalence. And even when you can have hash equivalence, the possibility of wallets changing ownership through key changes creates other unintuitive consequences.

Privacy requires each user to have even more addresses, and may even change what kinds of addresses we are dealing with. If stealth address proposals become widely used, instead of each user having only a few addresses, or one address per L2, users might have one address per transaction. Other privacy schemes, even existing ones such as Tornado Cash, change how assets are stored in a different way: many users' funds are stored in the same smart contract (and hence at the same address). To send funds to a specific user, senders will need to rely on the privacy scheme's own internal addressing system.

As we have seen, each of the three transitions weakens the "one user ~= one address" mental model in different ways, and some of these effects feed back into the complexity of carrying out the transitions. Two particular points of complexity are:

  1. If you want to pay someone, how do you get the information to pay them?

  2. If a user stores many assets in different places on different chains, how do they perform key replacement and social recovery?

The three transitions and on-chain payments (and identity)

I have coins on Scroll and I want to pay for coffee (if "I" is literally referring to me as the author of this article, then "coffee" is of course a metonym for "green tea"). You are selling me the coffee, but you are only set up to receive coins on Taiko. What do I do?

Basically there are two solutions:

  1. The receiving wallet (which could be a merchant, or just a regular individual) tries hard to support every L2, and has some automated functionality for consolidating funds asynchronously.

  2. The receiver provides their L2 and their address, and the sender's wallet automatically routes the funds to the target L2 through some sort of inter-L2 bridging system.

Of course, these solutions can be combined: the receiver provides the list of L2s they are willing to accept, and the sender's wallet figures out the payment, which may involve a direct send (if they are lucky) or otherwise a bridging path across L2s.
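The combined approach can be sketched as a small routing decision in the sender's wallet. This is a minimal illustration, not a real wallet API: `PaymentRequest`, `plan_payment` and the returned dictionary are hypothetical names, bridge fees and gas are ignored, and the sketch assumes a payment comes from a single source chain.

```python
from dataclasses import dataclass


@dataclass
class PaymentRequest:
    """What the receiver hands to the sender (hypothetical structure)."""
    recipient: str            # receiver's address, format left abstract
    accepted_l2s: list[str]   # L2s the receiver accepts, most preferred first
    amount: int               # amount in the token's smallest unit


def plan_payment(balances: dict[str, int], request: PaymentRequest) -> dict:
    """Pick a direct send if possible, otherwise a cross-L2 bridging path."""
    if not request.accepted_l2s:
        raise ValueError("receiver accepts no L2")
    # Lucky case: the sender already holds enough funds on an accepted L2.
    for l2 in request.accepted_l2s:
        if balances.get(l2, 0) >= request.amount:
            return {"kind": "direct", "source": l2, "target": l2}
    # Otherwise bridge from the sender's best-funded chain to the
    # receiver's most preferred L2.
    for source, balance in sorted(balances.items(), key=lambda kv: -kv[1]):
        if balance >= request.amount:
            return {"kind": "bridge", "source": source,
                    "target": request.accepted_l2s[0]}
    raise ValueError("no single chain holds enough funds")
```

With coins only on Scroll and a receiver that accepts only Taiko, `plan_payment({"scroll": 10}, PaymentRequest("0xRecipient", ["taiko"], 5))` selects the bridged path from Scroll to Taiko.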

But that's just one example of the key challenge introduced by the three shifts: something as simple as paying someone starts to require more information than just a 20-byte address.

The switch to smart contract wallets fortunately does not add a large burden to the addressing system, but there are still some technical issues in other parts of the application stack that need to be worked through. Wallets need to be updated to make sure they do not send only 21000 gas along with a transaction, and it is even more important to ensure that the payment-receiving side of a wallet tracks not only ETH transfers from EOAs, but also ETH sent by smart contract code. Applications that rely on the assumption that address ownership is immutable (e.g. NFTs that ban smart contracts in order to enforce royalties) will have to find other ways to achieve their goals. Smart contract wallets will also make some things easier - in particular, if someone receives only a non-ETH ERC20 token, they will be able to use an ERC-4337 paymaster to pay for gas with that token.

On the other hand, privacy once again poses major challenges that we have not really dealt with yet. The original Tornado Cash did not introduce any of these issues, because it did not support internal transfers: users could only deposit into the system and withdraw out of it. Once you can make internal transfers, however, users need to use the internal addressing scheme of the privacy system. In practice, a user's "payment information" needs to contain both (i) some kind of "spending public key", a commitment to a secret that the recipient can use to spend, and (ii) some way for the sender to send encrypted information that only the recipient can decrypt, to help the recipient discover the payment.

Stealth address protocols rely on the concept of a meta-address, which works in the following way: one part of the meta-address is a blinded version of the sender's spending key, and another part is the sender's encryption key (although a minimal implementation could set these two keys to be the same).

The key lesson here is that in a privacy-focused ecosystem, users will have spending public keys and encryption public keys, and users' "payment information" will have to contain both keys. Besides paying, there are other good reasons to expand in this direction. For example, if we wanted encrypted email on Ethereum, users would need to publicly provide some form of encryption key. In an "EOA world", we could reuse account keys to achieve this, but in a world of secure smart contract wallets, we should probably have more explicit functionality to achieve this. This will also help make Ethereum-based identities more compatible with non-Ethereum decentralized privacy ecosystems, the most prominent example being PGP keys.
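As a rough illustration of "payment information" that carries both keys, here is a minimal sketch. The `PaymentInfo` class, its field names and the JSON encoding are hypothetical choices made for this example; they are not taken from any standard, and the keys are treated as opaque bytes.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentInfo:
    """Hypothetical container for the two keys a payer needs."""
    spending_pubkey: bytes     # who can spend the received funds
    encryption_pubkey: bytes   # used to encrypt data only the recipient can read

    def to_json(self) -> str:
        # Hex-encode the raw key bytes so the record can be published as text.
        return json.dumps({
            "spending_pubkey": self.spending_pubkey.hex(),
            "encryption_pubkey": self.encryption_pubkey.hex(),
        })

    @classmethod
    def from_json(cls, text: str) -> "PaymentInfo":
        data = json.loads(text)
        return cls(bytes.fromhex(data["spending_pubkey"]),
                   bytes.fromhex(data["encryption_pubkey"]))


def minimal_payment_info(pubkey: bytes) -> PaymentInfo:
    """The minimal variant mentioned in the text: both keys are the same."""
    return PaymentInfo(pubkey, pubkey)
```

Keeping the two keys as separate fields leaves room for a wallet to rotate or delegate one without touching the other.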

The three transitions and key recovery

In a world where a user may have many addresses, the default way to implement key changes and social recovery is to have the user run the recovery procedure on each address separately. This can be done in one click: the wallet can include software that executes the recovery procedure across all of a user's addresses at the same time. However, even with such UX simplifications, naive multi-address recovery has three problems:

Unrealistic gas bills: This one speaks for itself.

Counterfactual addresses: Addresses whose smart contract has not yet been published (in practice, this means an account that you have not yet sent funds from). As a user, you have a potentially unlimited number of counterfactual addresses: one or more on every L2, including L2s that do not yet exist, and a whole other unlimited set of counterfactual addresses arising from stealth address schemes.

Privacy: If a user intentionally has many addresses to avoid linking them to each other, they certainly do not want to publicly link all of them by recovering them at or around the same time!

Solving these problems is difficult. Fortunately, there is a rather elegant solution that performs quite well: an architecture that separates validation logic from asset holding.

Each user has a keystore contract that exists in one location (it could be the mainnet or a specific L2). Users then have addresses on different L2s, where the verification logic for each address is a pointer to the keystore contract. Spending from these addresses requires a proof reaching into the keystore contract that shows the current (or, more realistically, very recent) spending public key.

The proof can be implemented in several ways:

  1. Direct read-only L1 access inside the L2. L2s can be modified to give them a way to read L1 state directly. If the keystore contract is on L1, this means that contracts inside the L2 can access the keystore "for free".

  2. Merkle branches. Merkle branches can prove L1 state to an L2, or L2 state to L1, or you can combine the two to prove parts of the state of one L2 to another L2. The main weakness of Merkle proofs is high gas cost due to proof length: a proof may require 5 kB, although this will be reduced to less than 1 kB in the future thanks to Verkle trees.

  3. ZK-SNARKs. You can reduce data costs by using a ZK-SNARK of a Merkle branch instead of the branch itself. Off-chain aggregation techniques (e.g., built on top of EIP-4337) can be constructed so that a single ZK-SNARK verifies all cross-chain state proofs in a block.

  4. KZG commitments. Either L2s, or schemes built on top of them, can introduce a sequential addressing system that allows proofs of state inside this system to be only 48 bytes long. As with ZK-SNARKs, a multi-proof scheme can merge all of these proofs into a single proof per block.
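To make the Merkle branch option concrete, the following is a simplified sketch of building and checking a branch. It assumes a plain binary tree over SHA-256 with a power-of-two number of leaves; Ethereum's actual state tree is a hexary Merkle Patricia trie over Keccak-256, so this shows only the general technique, and the function names are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, hashed leaves first, root level last.
    Assumes len(leaves) is a power of two."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_branch(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hash at each level on the path from leaf to root."""
    branch = []
    for level in levels[:-1]:
        branch.append(level[index ^ 1])  # sibling differs in the lowest bit
        index //= 2
    return branch


def verify_branch(root: bytes, leaf: bytes, index: int,
                  branch: list[bytes]) -> bool:
    """Recompute the root from the leaf and its siblings, then compare."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

The branch holds one 32-byte hash per tree level, which is why proof length, and with it gas cost, grows with the depth of the state tree.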

If we want to avoid making a proof for every transaction, we can implement a lighter scheme that only requires a cross-L2 proof for recovery. Spending from an account depends on a spending key whose corresponding public key is stored within the account, while recovery requires a transaction that copies over the current spending public key from the keystore. Funds in counterfactual addresses are safe even if your old keys are not: "activating" a counterfactual address to turn it into a working contract requires making a cross-L2 proof that copies over the current spending public key. This thread on the Safe forums describes how a similar architecture might work.
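The lighter scheme can be modeled with two small classes. This is a toy sketch under heavy assumptions: `Keystore` and `Account` are hypothetical names, signatures are replaced by comparing public keys directly, and the cross-L2 state proof is replaced by a direct lookup of the keystore object.

```python
from dataclasses import dataclass


@dataclass
class Keystore:
    """Lives in one place (mainnet or one L2) and holds the current key."""
    spending_pubkey: str

    def rotate(self, new_pubkey: str) -> None:
        self.spending_pubkey = new_pubkey


@dataclass
class Account:
    """Lives on some L2; holds assets plus a local copy of the key."""
    keystore: Keystore      # stand-in for the pointer to the keystore contract
    local_pubkey: str       # key checked on every ordinary spend
    balance: int = 0

    def spend(self, amount: int, signer_pubkey: str) -> None:
        # Ordinary spends use only local state: no cross-L2 proof needed.
        if signer_pubkey != self.local_pubkey:
            raise PermissionError("signer does not match the stored key")
        if amount > self.balance:
            raise ValueError("insufficient balance")
        self.balance -= amount

    def recover(self, claimed_pubkey: str) -> None:
        # Stand-in for verifying a cross-L2 proof of the keystore's state.
        if claimed_pubkey != self.keystore.spending_pubkey:
            raise PermissionError("proof does not match the keystore")
        self.local_pubkey = claimed_pubkey
```

In this model, rotating the key in the keystore does nothing to an account until `recover` copies the new key over, which is the trade-off the lighter scheme accepts in exchange for cheaper everyday spending.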

To add privacy to such a scheme, we only need to encrypt the pointer and do all of the proving inside ZK-SNARKs.

With more work (eg, using this work as a starting point), we can also strip most of the complexity of ZK-SNARKs and make a simpler KZG-based scheme.

These schemes can get complicated. On the plus side, there are many potential synergies between them. For example, the concept of a "keystore contract" could also be a solution to the "address" challenge mentioned in the previous section: if we want users to have persistent addresses that do not change every time the user updates a key, we could put stealth meta-addresses, encryption keys, and other information into the keystore contract, and use the address of the keystore contract as the user's "address".

A lot of secondary infrastructure needs to be updated

Using ENS is expensive. Today, in June 2023, the situation is not too bad: the transaction fee is significant, but it is still comparable to the ENS domain fee. Registering zuzalu.eth cost me about $27, of which $11 was transaction fees. But if we have another bull market, fees will skyrocket. Even without an ETH price increase, gas fees returning to 200 gwei would raise the transaction fee of a domain registration to $104. So if we want people to actually use ENS, especially for use cases like decentralized social media where users demand nearly free registration (and the ENS domain fee is not an issue because these platforms offer their users subdomains), we need ENS to run on L2.

Fortunately, the ENS team has stepped up, and ENS on L2 is actually happening! ERC-3668 (also known as the "CCIP standard"), together with ENSIP-10, provides a way to have ENS subdomains on any L2 be automatically verifiable. The CCIP standard requires setting up a smart contract that describes a method for verifying proofs of data on L2, and a domain (for example, Optinames uses ecc.eth) can be put under the control of such a contract. Once the CCIP contract controls ecc.eth on L1, accessing some subdomain.ecc.eth automatically involves finding and verifying a proof (eg, a Merkle branch) of the L2 state that actually stores that particular subdomain.
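The fee figures above imply a rough gas price at the time of registration. The short calculation below assumes that the transaction fee scales linearly with the gas price, with the ETH price and the gas used held fixed; the function names are illustrative.

```python
def implied_gas_price_gwei(fee_now_usd: float, fee_later_usd: float,
                           gas_price_later_gwei: float) -> float:
    """Gas price that makes fee_now scale up to fee_later linearly."""
    return gas_price_later_gwei * fee_now_usd / fee_later_usd


def scaled_fee_usd(fee_now_usd: float, gas_price_now_gwei: float,
                   gas_price_later_gwei: float) -> float:
    """Fee after the gas price changes, everything else held constant."""
    return fee_now_usd * gas_price_later_gwei / gas_price_now_gwei


total_usd, tx_fee_usd = 27.0, 11.0
domain_fee_usd = total_usd - tx_fee_usd                        # 16.0
gas_now = implied_gas_price_gwei(tx_fee_usd, 104.0, 200.0)     # about 21 gwei
tx_fee_at_200 = scaled_fee_usd(tx_fee_usd, gas_now, 200.0)     # about 104
total_at_200 = domain_fee_usd + tx_fee_at_200                  # about 120
```

Under these assumptions the $11 fee corresponds to a gas price of roughly 21 gwei, and at 200 gwei the transaction fee would be about six and a half times the $16 domain fee.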

Actually fetching the proofs involves going to a list of URLs stored in the contract, which admittedly feels like centralization, although I would argue that it really is not: it is a 1-of-N trust model (invalid proofs are caught by the verification logic in the CCIP contract's callback function, and as long as even one of the URLs returns a valid proof, you are fine). The list of URLs could contain dozens of them.

The ENS CCIP effort is a success story, and it should be seen as a sign that the kind of radical reform we need is actually possible. But there is a lot more application-layer reform that needs to be done. Some examples:
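The 1-of-N property can be sketched as a loop that tries each source until one proof passes verification. This is only an illustration of the trust model, not the ERC-3668 client flow: gateways are modeled as plain callables rather than HTTP requests, and `fetch_first_valid` and `verify` are hypothetical names.

```python
from typing import Callable, Iterable, Optional


def fetch_first_valid(gateways: Iterable[Callable[[], Optional[bytes]]],
                      verify: Callable[[bytes], bool]) -> bytes:
    """Return the first proof that verifies; bad or offline sources are skipped."""
    for fetch in gateways:
        try:
            proof = fetch()
        except Exception:
            continue  # an unreachable gateway is not fatal
        if proof is not None and verify(proof):
            return proof  # one honest, reachable gateway is enough
    raise LookupError("no gateway returned a valid proof")
```

A dishonest gateway can at worst cause a delay, because its proof fails verification and the loop moves on to the next source.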

Many dapps rely on users to provide off-chain signatures. For Externally Owned Accounts (EOAs), it's easy. ERC-1271 provides a standardized way for smart contract wallets to do this. However, many dapps still don't support ERC-1271; they need to support it.

Dapps that use "Is this an EOA?" to distinguish between users and contracts (eg, to prevent transfers or enforce royalties) will break. In general, I would advise against trying to find a purely technical solution here; figuring out whether or not a particular transfer of cryptographic control is a transfer of beneficial ownership is a difficult problem, and it is probably not solvable without resorting to some off-chain community-driven mechanisms. Most likely, applications will have to rely less on preventing transfers and more on techniques like Harberger taxes.

How wallets interact with spending and encryption keys will have to improve. Currently, wallets often use deterministic signatures to generate application-specific keys: a standard nonce (e.g., the hash of the application's name) is signed with the EOA's private key, producing a deterministic value that cannot be generated without the private key, so the approach is secure in a purely technical sense. However, these techniques are "opaque" to the wallet, which prevents the wallet from implementing user-interface-level security checks. In a more mature ecosystem, signing, encryption, and related functions will have to be handled by wallets more explicitly.

Light clients (eg Helios) will have to verify L2s, not just L1. Today, light clients focus on checking the validity of L1 headers (using the light client sync protocol), and on verifying Merkle branches of L1 state and transactions rooted in the L1 header. Tomorrow, they will also need to verify proofs of L2 state rooted in the state root stored on L1 (a more advanced version of this would actually look at L2 pre-confirmations).
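The deterministic-signature technique from the wallet example above can be sketched as follows. Loud assumption: real wallets would use a deterministic ECDSA signature from the EOA key, which the Python standard library does not provide, so HMAC-SHA256 is swapped in here purely as a stand-in for "a deterministic value that cannot be produced without the private key". All function names are illustrative.

```python
import hashlib
import hmac


def standard_nonce(app_name: str) -> bytes:
    """The fixed message for an application, e.g. a hash of its name."""
    return hashlib.sha256(app_name.encode("utf-8")).digest()


def deterministic_sign(private_key: bytes, message: bytes) -> bytes:
    """STAND-IN for a deterministic signature (HMAC, not ECDSA)."""
    return hmac.new(private_key, message, hashlib.sha256).digest()


def app_specific_key(private_key: bytes, app_name: str) -> bytes:
    """Derive an application key by hashing the signature over the nonce."""
    signature = deterministic_sign(private_key, standard_nonce(app_name))
    return hashlib.sha256(signature).digest()
```

From the wallet's point of view this is just another signing request, which is the opacity problem described above: nothing in the request tells the wallet that the result will be used as a key.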

Wallets need to protect assets and data

Today, wallets are in the business of securing assets. Everything lives on-chain, and the only thing a wallet needs to protect is the private key that currently guards those assets. If you change the key, you can safely publish your previous private key on the Internet the next day. In a world of zero-knowledge proofs, however, this is no longer true: the wallet is not just protecting authentication credentials, it is also holding your data.

We saw the first signs of such a world with Zupass, the ZK-SNARK-based identity system used at Zuzalu. Users have a private key that they use to authenticate to the system, and it can be used to make basic proofs like "prove that I am a Zuzalu resident, without revealing which one". However, the Zupass system also began to have other applications built on top of it, most notably stamps (Zupass's version of POAPs).

One of my many Zupass stamps proving I am a proud member of Team Cat.

The key feature that stamps offer over POAPs is that stamps are private: you hold the data locally, and you only prove a stamp (or some computation over your stamps) to someone if you want them to have that information about you. But this adds risk: if you lose that information, you lose your stamps.

Of course, the problem of holding data can be reduced to the problem of holding a single encryption key: a third party (or even the chain) can hold an encrypted copy of the data. This has the convenient advantage that the actions you take do not change the encryption key, so they do not require any interaction with the system that keeps your encryption key safe. But even then, if you lose your encryption key, you lose everything. Conversely, if someone sees your encryption key, they can see everything that was encrypted with that key.

Zupass's de-facto solution is to encourage people to store their key on multiple devices (e.g., laptop and phone), since the chance of losing all of them at the same time is slim. We could go a step further and use secret sharing to store the key, splitting it among multiple guardians.

This kind of social recovery via MPC is not a sufficient solution for wallets, because it means that not only current guardians but also previous guardians could collude to steal your assets, which is an unacceptably high risk. However, a privacy leak is generally a lower risk than a total loss of assets, and someone with a use case that demands high privacy could always accept a higher risk of loss by not backing up the key associated with those privacy-demanding actions.
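The secret-sharing idea mentioned above can be illustrated with Shamir's scheme, one common way to split a key so that any `threshold` of the guardians can rebuild it. The article does not say which scheme would be used, so this choice is an assumption; the sketch also assumes the secret is an integer smaller than the prime `2**127 - 1`, and the function names are illustrative.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; all arithmetic is done modulo this


def split_secret(secret: int, threshold: int,
                 num_shares: int) -> list[tuple[int, int]]:
    """Split `secret` into points on a random polynomial of degree threshold-1."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # The constant term is the secret; the other coefficients are random.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for c in reversed(coeffs):  # Horner's rule
            result = (result * x + c) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `split_secret(key, 2, 3)` gives three guardians one share each, and any two of them can call `recover_secret` to rebuild the key, which is also why a colluding pair of current or former guardians is a risk.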

To avoid overwhelming users with a complex system of multiple recovery paths, wallets that support social recovery may need to manage both asset recovery and encryption key recovery.

Back to the identity question

A common theme of these changes is that the concept of an "address", a cryptographic identifier that you use to represent "you" on-chain, will have to change radically. "Instructions for how to interact with me" will no longer be just an ETH address; they will have to be, in some form, a combination of multiple addresses on multiple L2s, stealth meta-addresses, encryption keys, and other data.

One way to do this is to make ENS your identity: your ENS record could contain all of this information, and if you send someone bob.eth (or bob.ecc.eth, or...), they could look it up and see everything about how to pay and interact with you, including in the more complicated cross-domain and privacy-preserving ways.

However, this ENS-centric approach has two weaknesses:

It binds too many things to your name. Your name is not you, your name is just one of your many attributes. You should be able to change your name without moving your entire identity profile and updating a bunch of records in many apps.

You can't have trustless counterfactual names. A key UX feature of any blockchain is the ability to send coins to people who have not yet interacted with the chain. Without such functionality, there is a chicken-and-egg problem: interacting with the chain requires paying transaction fees, and paying fees requires... already owning coins. ETH addresses, including smart contract addresses created with CREATE2, have this feature. ENS names do not, because if two Bobs both decide off-chain that they are bob.ecc.eth, there is no way to choose which one of them gets the name.

One possible solution is to put more things into the keystore contract mentioned in the architecture earlier in this post. The keystore contract could contain all of the various information about you and how to interact with you (and with CCIP, some of that information could be off-chain), and users would use their keystore contract as their primary identifier. The actual assets they receive, however, would be stored in all kinds of different places. Keystore contracts are not tied to a name, and they are counterfactual-friendly: you can generate an address that can provably only be initialized by a keystore contract with certain fixed initial parameters.

Another category of solutions involves abandoning the concept of user-facing addresses altogether, in a spirit similar to the Bitcoin payment protocol. One idea is to rely more heavily on direct communication channels between the sender and the recipient; for example, the sender could send a claim link (either as an explicit URL or as a QR code) which the recipient could use to accept the payment however they wish.

Regardless of whether the sender or the recipient acts first, relying more on wallets to directly generate up-to-date payment information in real time could reduce friction. That said, persistent identifiers are convenient (especially with ENS), and in practice the assumption of direct communication between sender and recipient is a very tricky one, so we may end up seeing a combination of different techniques.

In all of these designs, it is critical to keep things both decentralized and understandable to users. We need to make sure that users can easily access an up-to-date view of their current assets and of the messages that have been published for them. These views should rely on open tools rather than proprietary solutions. It will take hard work to keep the greater complexity of payment infrastructure from turning into an opaque "tower of abstraction" in which developers struggle to understand what is going on and to adapt things to new contexts. Despite the challenges, achieving scalability, wallet security, and privacy for ordinary Ethereum users is crucial. It is not just about technical feasibility, it is about actual accessibility for ordinary users. We need to rise to meet this challenge.

Special thanks to Dan Finlay, Karl Floersch, David Hoffman, and the Scroll and SoulWallet teams for their feedback, reviews, and suggestions.
