Hedera's Hashgraph Consensus Service
21 March 2022Having recently read a paper by the University College London’s Centre for Blockchain Technologies that compares the energy footprint of different consensus mechanisms, in which the Hedera network was concluded to have the lowest energy footprint, my curiosity piqued around Hedera, the only non-blockchain distributed ledger technology (DLT) amongst blockchain DLTs in the study (Ethereum 2.0, Algorand, Cardano, Polkadot, and Tezos) and more importantly, how it is allegedly more energy efficient than the other DLTs.
Self-proclaimed to be “the most used, sustainable, enterprise-grade public network for the decentralized economy,” Hedera certainly possesses attributes that set it apart from more well-known examples of DLTs like Bitcoin, Ethereum, and Hyperledger. Chief among them is the fact that it is a hashgraph and not a blockchain. Invented in 2016 by Leemon Baird, Hashgraph is his answer to achieving distributed consensus at scale, with the primary use being targeted at the private corporate sector. Put explicitly, the technology behind blockchain DLTs is blockchain; the technology behind hashgraph DLTs is the direct acyclic graph (DAG). The distinction lies in the way that information is disseminated and the consensus mechanism of the underlying technology.
Asynchronous Byzantine Fault Tolerant (ABFT)
The ability to reach a consensus is the crux of a concept called Byzantine Fault Tolerant, which describes a situation in which a distributed network of computing systems must be able to achieve an agreement on a broadcasted unit of information or state of the network, under the assumption that there exists faulty (due to miscommunication) or malicious nodes that will try to deceive the system with missing or incorrect information. Therefore, a system (a blockchain, a hashgraph, a DLT, etc.) is deemed to be Byzantine fault tolerant if it is able to operate without issue despite there being nodes that are unreliable. Where a BFT model presumes there to be a threshold of message latency, after which point the system may no longer reliably reach consensus in the distributed manner, the asynchronous property allows for messages to be delayed or lost indefinitely while still maintaining consensus functionality, which is a more verisimilar reflection of network reliability. Hence ABFT is deemed the highest degree of security.
Consensus is paramount in distributed technology because when there is no central authority, there needs to be a way for participants (nodes) in the system to come to an agreement on the state of the transactions in the network (i.e., verify transactions). Various consensus mechanisms exist, which at the time of writing are proof of work (PoW), economy-based, leader-based, voting-based, and virtual-voting, each with its own strengths and pitfalls when considering scalability, security and decentralization – the DLT trilemma. PoW and proof of stake (PoS) are two of the more well-known algorithms, partly due to the discourse regarding the computational effort required to achieve that consensus and the vulnerabilities that may belie an economic model which lacks the foundational mathematical proof behind cryptography. Briefly, PoW involves the use of miners (nodes) and mining rigs (sophisticated hardware) to solve complex mathematical problems, the difficulty of which can be adjusted in order to regulate the speed at which blocks are generated. Economy-based PoS revolves around the concept that nodes vote on which block to add next to the chain, with financial stakes tied to their decision and the amount of stake being directly correlated to the weight of the vote. If the block that the validator (node) chose isn’t part of the majority, they lose their stake, which incentivizes nodes to vote with the majority to achieve consensus.
Hashgraph Consensus - Gossip about Gossip Protocol with Virtual Voting
Hashgraph is the consensus mechanism of Hedera that aims to combat issues that presently plague consensus mechanisms, such as inefficiency (high latency == high costs), security vulnerabilities that arise from partitioning (also known as sharding, when a network splits its computational and storage workload across other networks in an attempt to increase scalability), questions around fairness (of access, of ordering), and the threat of malicious nodes and DDoS attacks primarily in leader-based models. To contextualize Hashgraph’s voting system, it is relevant to note that it is an iteration of pure voting-based systems, which have decades of history originating from distributed and fault-tolerant systems, reinforced by robust mathematical proofs, but due to their inefficiencies are impractical in the absence of a leader given the dependency on individual votes being sent out (and received). Existing hybrid voting-based algorithms coalesce the desirable democratic element of pure voting-based procedures with other mechanisms (leader-based, economy-based, etc.) to realize a sufficiently scalable system.
Hashgraph implements a voting system without the votes, as the votes are the most inefficient component of voting-based systems. The goal is to spread the transactions as fast and as far as possible, which is articulated in the form of a gossip algorithm, in which the awareness of transactions are given to random nodes, each of which continue to spread to other nodes exponentially quickly. Transactions are encased in an event (left graphic in Fig. 1); where blockchains have blocks, Hedera has events, and as such, the history of who communicated with whom and in what order is appended to the event in the form of a DAG (this broadcast chain is illustrated in Fig. 1). There needs no confirmation of when each node received each event as the receipt is the transmission of the broadcast chain itself, and all nodes know what all other nodes know without having to send additional, explicit messages. Given the network’s cognizance of all of this activity, the messages’ details included, the voting algorithm can run without having to communicate directly with the nodes themselves – virtual voting, the speed of which matching that of the internet bandwidth. When nodes (virtually) vote, the amount of HBAR (Hedera’s cryptocurrency) they hold affect the weight of their vote, thus exhibiting some semblance of an economy-based consensus protocol.
Event Analysis
In a distributed network, there is great value in knowing the timing and arrival orders of transactions as this is how its validity is assessed and its priority is established relative to other transactions. The timestamp is determined by noting when each node received each event and then sorting all of those times. The middle of that sorted list is a fair indication of when each event reached most (2/3) of the network.
As proof of authentication, the creator of an event signs it with a cryptographic digital signature, which corresponds to a pair of public and private keys, the former allowing public identification of the account and the latter being proof of connection to the public key. To create the signature, the private key is used in the signing algorithm to link the signature to the event and public key. With this information, from looking at the activity on the network, the signature cannot be forged, but the event can be verified. Key structure customizations exist in Hedera, for instance, requiring signatures from a defined minimum threshold of private keys.
Cryptography exemplifies a tenet of the intractability of mathematical problems: a problem can be solved in theory, given nearly unlimited time and resources, but is highly improbable in reality due to the unrealistic parameters required. Digital signature algorithms (DSAs) operate within the framework of public-key cryptography, generally based on the algebraic properties of discrete logarithms and modular exponentiation. Essentially, modular exponentiation is efficient to compute even with large integers, whereas discrete logarithms are computational intractable; this one-way function behaviour makes it suitable for cryptography, particularly when conceptualizing public and private keys as such one-way functions. Similarly, elliptic curve cryptography (EEC) is based on the algebraic properties of elliptic curves. Hedera supports two systems of DSAs: secp256k1 and ed25519, belonging to the digital signature schemes ECDSA (Elliptic Curve Digital Signature Algorithm) and EdDSA (Edwards-curve Digital Signature Algorithm) respectively.
Hedera also allows for signature verification using smart contracts, for which the signature verification functionality already exists in the Solidity language: ecrecover(bytes32 hash, uint8 v, bytes32 r, bytes32 s) returns (address)
The process is essentially a matter of comparing the resulting address with the expected one by taking parameters (v, r, s) that can be parsed from the signature and associating it with the data from the public key and event hash.
The history of communication is connected via two hashes in each event – a hashgraph – one hash for the last event the node created (self-parents) and the last event the node received (other-parent). These pairs of hashes point to earlier gossip structures, which when broadcasted to other nodes, is a representation of gossip about gossip. Hashing, which is a computational method of inputing an indeterminate data size into a one-way mathematical function to receive a fixed data size, exists for various reasons such as security, data retrieval, or in Hedera’s case, as a means of linking events together without occupying a significant amount of overhead (~500 bytes per event).