Overview of Blockchain Research Progress
Blockchain is an important cornerstone solely for the transition from the Information Internet to the Value Internet and is one of the optional technologies for the modern digital currency system. Based on cryptographic technology, through a distributed multi-node "consensus" mechanism, it can record the entire process of value transfer (transaction) in a "complete and tamper-proof" manner. The specific technologies adopted by blockchain include cryptography, consensus protocols, game theory, data storage, P2P communication, etc., which is a fusion innovation of multiple existing technologies. This article provides a summary review of blockchain research progress from four aspects: consensus protocols, security and privacy protection mechanisms, scalability and efficiency, and system/protocol security analysis and evaluation.
I. Consensus Protocols
Consensus protocols are used to achieve availability and consistency in distributed systems. It is a key technology of blockchain. Its core indicators include the robustness of the consensus protocol (ability to tolerate faults and malicious nodes), efficiency (convergence speed, meaning the speed at which the system reaches consistency or "steady state"), and security (security bound of the abstract theoretical model of the protocol). Representative protocols include BFT-type consensus represented by PBFT, Nakamoto Consensus represented by PoW/PoS, and new types of hybrid consensus.
(I) BFT-type Consensus
BFT (Byzantine Fault-Tolerant) algorithms began to be studied in the 1980s, aiming to solve the so-called Byzantine Generals Problem. The most famous BFT-type algorithm is PBFT, which is a consistency algorithm based on message passing. In a weakly synchronous network, the algorithm can reach consistency after three phases, with a complexity of O(n^2). When consistency cannot be reached, these phases will be repeated until timeout. The advantages of PBFT are fast convergence, resource saving, and a theoretical security bound (theoretically allowing no more than 1/3 of malicious nodes, i.e., the total number of nodes is 3k + 1, and the algorithm can work normally when normal nodes exceed 2k + 1).
HoneyBadgerBFT, proposed by Andrew Miller in 2016, improved PBFT. Its process consists of two parts: Atomic Broadcast and Asynchronous Common Subset protocol. It uses N binary consensus protocol instances and decides a common subset based on their results. HoneyBadgerBFT can reach consensus under asynchronous networks and does not rely on any time assumptions about the network environment.
With the increase of nodes participating in consensus, the communication overhead of BFT-type consensus will rise sharply, and the speed of reaching consensus will decrease rapidly, making it difficult to support distributed systems with tens of thousands of nodes. In addition, nodes participating in consensus must first obtain voting rights, so additional mechanisms need to be designed for the joining and exiting process of nodes, increasing protocol complexity and implementation difficulty.
(II) Nakamoto Consensus
Bitcoin transformed the consensus voting process by introducing economic incentives, changing each ledger data change from a scheduled round of voting to a rolling indefinite voting: anyone can generate a block containing transactions (adding ledger data) and broadcast it. If others agree to include the block in the ledger, they will use the hash of the block as part of their own constructed block data to "confirm" the block; "confirmation" of a block also includes "confirmation" of all previous blocks of that block; voting weight is determined by the size of the workload, and the block with the large workload attached wins. The security of this type of consensus mechanism relies on specially designed economic incentives, such as Proof of Work (PoW) or Proof of Stake (PoS).
Bitcoin's Proof of Work is to find a block header hash that meets a specific difficulty value. Virtual currency projects after Bitcoin, in order to avoid the appearance of dedicated ASIC mining machines, began to design ASIC-resistant PoW algorithms. One idea is to increase the difficulty of ASIC chip design by connecting different hash functions in series, but it does not have the ability to resist ASICs.
Another idea is to design memory-consuming algorithms, such as Ethash based on Dagger-Hashimoto for Ethereum, Equihash based on the Generalized Birthday Paradox problem for Zcash, and Cuckoo Cycle based on bipartite graph cycle detection for æternity. These algorithms require a large amount of memory during calculation. As a mature product, memory has little room for optimization, so the cost advantage of designing dedicated ASIC chips is not significant.
To overcome the problems of high resource consumption and high operating costs of PoW, PeerCoin first proposed and implemented Proof of Stake (PoS) consensus protocols. Under the PoS protocol, the probability of a node obtaining the right to create a block depends on the size of the stake the node holds in the system.
PoS generally requires users to be online at all times, which poses a great challenge to applications. To solve this problem, DPoS (Delegated Proof of Stake) consensus was derived. Its core idea is to first select some nodes from the whole network nodes, ensure the validity of these nodes, and then conduct PoS consensus within this subset of nodes.
PoS consensus mechanisms have also aroused great interest in academia. Elaine Shi and others from Cornell University proposed a PoS consensus based on the Sleepy Model in 2017, and formalized it and analyzed its security, proving that the consensus system has good robustness in a distributed environment. Aggelos Kiayias and others from the University of Edinburgh also designed a PoS scheme named Ouroboros in 2017, and the results were published at the top international cryptography conference Crypto 2017.
(III) Hybrid Consensus
Elaine Shi et al. proposed a hybrid consensus scheme in 2017 that organically combines Nakamoto Consensus and BFT-type consensus. This scheme uses the PoW mechanism to select a Committee (responsible for transaction verification confirmation and block creation), and the Committee uses PBFT for consensus confirmation of transactions and blocks.
The Algorand protocol based on Verifiable Random Function (VRF) proposed by Silvio Micali et al. in 2017 starts from another angle. It uses the "cryptographic lottery" method to randomly determine the block creator, and then uses a weighted Byzantine protocol to reach network-wide consensus, which can be seen as a hybrid scheme of multi-level dynamic validation group BFT consensus and PoS. The situation of Algorand reaching consensus reduces to 3 types, guaranteeing with high probability that there is only a unique output. Compared with Sleepy and Ouroboros consensus models, it has better determinism and is less prone to forks.
II. Security and Privacy Protection Mechanisms
Security mechanisms are the most core and critical components of blockchains, while cryptographic primitives and cryptographic schemes are the supporting technologies for security mechanisms. In public chains, security mechanisms mainly include: privacy protection, consensus protocol security, smart contract security, digital account security (wallet private key protection), off-chain transaction security mechanisms, implementation security and upgrade mechanisms of cryptographic algorithms, etc.
(I) Privacy Protection
In public chains, sensitive information such as transaction data, addresses, and identities needs to be protected, while allowing bookkeeping nodes to verify the legitimacy of transactions; for consortium chains, while building privacy protection schemes, regulatory/authorization tracking needs to be considered. Privacy protection of transaction identity and content can be achieved by adopting efficient cryptographic primitives and schemes such as zero-knowledge proofs, commitments, and witness indistinguishability; privacy protection mechanisms based on cryptographic schemes such as ring signatures and group signatures, and privacy protection mechanisms based on hierarchical certificate mechanisms are also optional schemes; efficient homomorphic encryption schemes or secure multi-party computation schemes can also be adopted to achieve privacy protection of transaction content; coin mixing mechanisms can also be used to achieve simple privacy protection.
1. Coin Mixing Technology
Coin mixing technology refers to mixing multiple unrelated inputs and then outputting them, making it impossible for the outside world to associate transaction inputs with outputs, thereby indistinguishing the flow of digital currency. This is the simplest anonymity technique.
CoinJoin is a protocol-agnostic anonymous coin mixing technology. Users need to entrust a third party to construct a transaction that mixes multiple inputs. CoinJoin technology is not completely anonymous, that is, the third party providing the service can know the flow of the mixed currency transaction. The TumbleBit protocol is another currency mixing technology. This protocol is an off-chain channel mixing protocol that also requires third-party participation, but the third party cannot know the transaction details and merely provides services. TumbleBit is divided into two sub-protocols: Puzzle-Promise Protocol and RSA-Puzzle-Solver Protocol, requiring multiple interactions between the sender, receiver, and third party.
2. Ring Signature
Monero uses One-time Ring Signature technology to ensure transaction privacy, which has two major characteristics: Unlinkability and Untraceability.
In Monero, users have two pairs of main public and private keys, used to generate a series of one-time key pairs. These one-time keys are used during transactions and cannot be linked to the main public and private key pairs. During a transaction, the sender needs to use the one-time public and private keys to calculate a unique key image, and then select a public key set to perform a ring signature. Validators can verify the legitimacy of the signature, but cannot know the signer's public key. In the network, nodes need to maintain a table to record each used key image, otherwise double-spending problems will occur. Ring signatures can effectively improve anonymity without the need for any third-party collaboration. However, compared with elliptic curve signatures, the signature length of ring signatures is significantly increased, and the complexity of generating and verifying signatures is also greatly increased, which brings extra burden to the network.
3. Zero-Knowledge Proof
ZCash uses a zero-knowledge proof technology called zk-SNARK to ensure the confidentiality of the sender, receiver, and transaction amount. In ZCash, the sender transfers money by broadcasting a commitment and a nullifier to the entire network. zk-SNARK is used to prove the legitimacy of the commitment and nullifier to the network without revealing the identity of the sender.
zk-SNARK has two characteristics: Succinct, meaning the validator only needs a small amount of calculation to complete the verification; Noninteractive, meaning the prover and the validator only need to exchange a small amount of information. zk-SNARK can prove all polynomial verification problems. It provides a systematic method to convert any verification program into a polynomial verification problem called Quadratic Span Program (QSP). Therefore, arbitrarily complex verification problems can be proved by zk-SNARK. The disadvantage of zk-SNARK is that it requires a certain amount of calculation when calculating verification data. In addition, zk-SNARK has an initial parameter setting phase to generate an "absolutely confidential" random message. Using these initialized random messages can deceive validators, so the absolute confidentiality and security of this process must be ensured.
(II) Digital Account Security
Wallet private keys are directly related to account security and need to be properly protected. Keyless cryptographic algorithms (white-box schemes of standard algorithms or designing new white-box cryptographic algorithms) and code obfuscation technologies can be adopted to prevent adversaries from extracting core cryptographic algorithm and key information; or encryption algorithms based on authentication factors such as passwords, identities, and biometric features can be used to encrypt and store keys; technical solutions based on TEE (Trusted Execution Environment) and auxiliary hardware are also one of the optional solutions for ensuring digital account security.
(III) Implementation Security and Upgrade Mechanism of Cryptographic Algorithms
It is necessary to ensure the implementation security of cryptographic algorithms in the blockchain and build security mechanisms for when cryptographic algorithms are updated/upgraded. The implementation security of cryptographic algorithms includes software implementation security and hardware implementation security, avoiding cryptographic misuse, and effectively resisting side-channel attacks.
III. Scalability and Efficiency
Scalability aims to improve the overall performance efficiency, expand capacity, or expand functionality on the basis of distributed ledger protocols. Optional methods include: shortening the block generation interval, increasing block size, adopting a two-layer chain structure, introducing the Lightning Network, pruning data in blocks without affecting security, etc.
(I) Lightning Network
The Lightning Network is an off-chain expansion solution for Bitcoin, aiming to expand the scale and speed of Bitcoin transactions. The foundation of the Lightning Network is the establishment of bidirectional micropayment channels between trading parties. HTLC (Hashed Timelock Contract) defines the basic working mode of this bidirectional micropayment channel. During transfer, the transferor freezes a sum of money and provides a hash value. Within a certain period of time, if someone can provide the hash preimage, they can use the money. On this basis, pairwise off-chain micropayment channels are established, eventually expanding into an off-chain payment network. Once the off-chain network reaches a certain scale, after a user finds a node with many channels, they can connect to other users to complete off-chain transactions. Since there is no need to go on-chain, transactions in the Lightning Network are completed instantly, and only the final settlement needs to go on-chain.
Currently, both Bitcoin and Litecoin support the Lightning Network. However, in the Lightning Network scheme, there are still significant deficiencies in the establishment of off-chain networks and routing protocols. Andrew Miller et al. from the University of Illinois at Urbana-Champaign proposed a new Lightning Network protocol in 2017, further optimizing and improving the performance efficiency of the Lightning Network in off-chain network establishment and routing.
(II) Adopting Two-Layer Chain Structure
Bitcoin-NG adopts a two-layer chain structure. Its main idea is: miners solve hash puzzles and the blocks created thereby are called keyBlocks. Miners who create keyBlocks can publish a microBlock every short period of time before the next keyBlock appears. The security and robustness of the system are based on the PoW mechanism of keyBlocks, while the transaction throughput of the system is significantly improved by the frequent release of microBlocks.
However, there are two security risks in Bitcoin-NG: one is that it cannot effectively prevent selfish mining, and the second is that after a miner creates a keyBlock, he can publish a large number of microBlocks in a short period of time, thereby triggering a large number of forks in the system and ultimately having a great impact on the convergence of the consensus mechanism, while also greatly increasing the communication load of the system.
(III) MimbleWimble
MimbleWimble technology deletes all spent outputs in transactions, which can effectively compress the size of block data. MimbleWimble uses One-Way Aggregate Signature (OWAS) to hide amounts. The hiding formula is C = rG + vH, where C is the Pedersen commitment, G and H are unrelated fixed values generated with the elliptic curve encryption function (ECDSA), v is the amount, and r is a secret random blinding key. Thereafter, a range proof is needed to prove that the output is within the normal range of values. Users do not need to traverse the entire blockchain, only verify that the sum of inputs and the sum of unspent outputs of the entire blockchain are equal, proving that the entire blockchain is correct. Therefore, users can delete all spent outputs to effectively compress the size of block data. In addition, MimbleWimble also provides certain privacy and scalability, but this scheme cannot support complex Bitcoin scripts.
IV. Security Analysis and Evaluation of Blockchain/Protocol
In the security analysis and evaluation of blockchain (protocols), on one hand, it is necessary to establish abstract theoretical models for existing consensus protocols and continuously study the security of consensus protocols based on this model; on the other hand, it is necessary to study the security of blockchain under different attack methods (or scenarios), for example: under high synchrony and high asynchrony network conditions respectively, based on reasonable hardness assumptions, attacker's computational power, attack types and methods, etc., establish corresponding statistical analysis models, and give security bounds for consensus protocols to effectively resist corresponding attacks; analyze the security of the system when incentive mechanisms fail; analyze the security of software and hardware implementations of cryptographic schemes in the system, etc.
(I) Selfish Mining
Traditional views believe that Bitcoin has an incentive-compatible mechanism, that is, no one can maximize their own interests by harming collective interests. However, the proposal of selfish mining proves that this view is not completely correct. Under the condition that miners are rational people pursuing maximized interests, as long as a mining pool controls more than 1/3 of the network's computing power, it can launch a selfish mining attack, obtain greater benefits, and pose a threat to network security.
The analysis of selfish mining is based on the assumption of rational people, but in reality, people are often not completely rational, and there are multi-party games, so there are still certain difficulties in implementing selfish mining attacks.
(II) Partition Attack
In a P2P network, as long as a certain number of nodes are controlled, an Eclipse Attack can be carried out, thereby launching a 51% attack and controlling the entire network. This is a partition attack. Assuming there are only 3 nodes mining in the network, two of which have 30% of the network's computing power respectively, and the remaining one has 40% of the network's computing power. If the attacker can control the node with 40% computing power, he can isolate the other two nodes, making them unable to reach consensus. The final result is that the chain generated by the attacker will become the final blockchain through consensus. Therefore, under partition attacks, the attacker does not need to possess more than half of the computing power to launch a 51% attack. The prerequisite for launching this attack is that all nodes linked to the isolated nodes are controlled by the attacker. When the network scale is not large, this is easier to achieve.
(III) Big Data Analysis
Block data is public across the entire network, so it is easy to analyze them. Kumar Amrit et al. analyzed Monero's historical block data in 2017 and obtained the following results: more than 65% of inputs will produce cascading effects, affecting 22% of transactions being traced; outputs from the same transaction are usually aggregated together in the next transaction; the anonymity set of recently occurred outputs is likely to be the real spent outputs.
Therefore, despite Monero's good anonymity features, through data analysis, more than half of the transactions could still be traced and analyzed. This shows that parameter selection of the system and user usage habits can also lead to privacy exposure.
(IV) Security Analysis of Abstract Theoretical Models of Consensus Protocols
Aggelos Kiayias et al. constructed an abstract theoretical model of Bitcoin's PoW protocol in 2017, and proved the security of the abstract theoretical model by drawing on the idea of provable security in cryptography; Elaine Shi et al. proposed a PoS consensus model based on the Sleepy Model in 2017, and formalized it and analyzed its security, proving that the consensus system has good robustness in a distributed environment.
(This article was published in the 3rd issue of "China Information Security" magazine in 2018)