Peer-to-peer (P2P) networking
Cardano nodes and the interactions between them are combined together within a networking layer, which distributes information about transactions and block creation among all active nodes. In this way, the system validates and adds blocks to the chain and also verifies transactions. A distributed network of nodes must keep communication delays to a minimum, and be resilient enough to cope with failures or capacity constraints.
In the Byron federated system, nodes were connected by a static configuration that was defined in a topology file. Since the introduction of Shelley, the system has been functioning in a hybrid mode. Moving from the federated state of network connectivity to the hybrid one, the teams delivered a manually constructed P2P network of SPO relay nodes. This means that SPO block-producing nodes can connect to both trusted relay nodes and other SPO-run relay nodes. Hybrid connectivity is not automated, however, it enables the exchange of block and transaction information without relying on trusted relays only.
A trusted relay is the one that the SPO, wallet, or exchange accessing the network ‘trusts’. Though this role has mostly been performed by IOG, other entities, such as the Cardano Foundation, a wallet, or exchanges can also run trusted relays. Block-producing nodes can connect to any relays they deem trustable.
- Federated: introduced in the Byron development phase in 2017, trusted core and relay nodes maintained the network and connected users, wallets, and exchanges.
- Hybrid: since the Shelley development phase in 2020, block-producing nodes send and receive communications through trusted relays and/or a manual community-developed and managed tool called the topology updater.
- Dynamic P2P: provides automation and resilience to optimize network performance. SPO relays can automatically connect to each other through self-discovery and optimization.
To ensure efficient communication between nodes, it is desirable to enable automated connections of SPO relays to each other, so that less static configuration is needed. Dynamic P2P has been gradually enabled since the node v.1.35.6 release. For more details, see this blog post.
With the deployment of Dynamic P2P, networking keeps evolving with additions such as Ouroboros Genesis and peer sharing:
- Ouroboros Genesis: in development. Anyone running their own node or Daedalus wallet will connect to a fully decentralized and self-organized network. Genesis enables secure node bootstrapping and the removal of trusted relays. Read this blog post to learn more about the transition to Genesis.
- Peer sharing: in development. Peer sharing will facilitate the discovery of potential peers that are not registered on the chain within the overall Cardano node network. This phase will also allow anyone to contribute to running the network, rather than just using resources from SPOs.
Dynamic P2P capabilities
The network stack undergoes a number of improvements to achieve the desired network resilience. These improvements do not require a protocol change, but rather enable automated peer selection and communication between peers and nodes.
The P2P networking is enabled due to the use of the following components:
- Outbound governor (formerly known as P2P governor) ‒ manages the automatic initiation of outbound connections between peers and classifies them as cold, warm, or hot (known, ready, or in use). The governor establishes a connectivity map of the local network. One part of the governor is the churn governor, which optimizes the network graph by dropping the 20% lowest performing peers.
- Server ‒ accepts connections. There is a soft and a hard limit on the number of connections that influence the speed of connection acceptance.
- Inbound protocol governor ‒ runs and governs inbound network ‘mini-protocols’ on an accepted connection.
- Connection manager ‒ tracks the state of each network connection: each one might be used as a warm or hot outbound connection or used by the inbound side. Warm connections only keep connectivity while hot ones take an active part in the consensus protocol.
Next, we take a closer look at how the outbound governor works to ensure automated connectivity between peer nodes on the network.
Outbound governor functionality
The Cardano network consists of multiple peer nodes. Some peer nodes are more active than others, some have established connections, and some should be promoted to ensure the best system performance. Each block-producing node maintains a set of peers mapped into three categories:
- cold peers ‒ existing peers without an established network connection
- warm peers ‒ peers with an established bearer connection, which is only used for network measurements without implementing any of the node-to-node mini-protocols
- hot peers ‒ peers that have a connection, which is being used by all three node-to-node mini-protocols.
Newly discovered peers are initially added to the cold peer set. The outbound governor is then responsible for peer connection management: it defines which peers are beneficial for connection purposes, and which should be promoted or demoted between cold, warm, or hot sets.
The primary goal of the outbound governor is to maintain the target number of cold, warm, and hot peers. This builds and maintains a connectivity map of the local part of the network, and simplifies the process of connection definitions by handling them automatically.
To establish connectivity between nodes, the outbound governor engages in the following activities:
- promotion of cold peers to warm peers
- demotion of warm peers to cold peers
- promotion of warm peers to hot peers
- demotion of hot peers to warm peers.
The outbound governor needs to establish and maintain:
- a target number of cold peers (for example, 100)
- a target number of hot peers (for example, between 2–20)
- a target number of warm peers (for example, between 10–50)
- a set of warm peers that are sufficiently diverse in terms of hop distance and geographic locations
- a target churn frequency for hot or warm changes
- a target churn frequency for warm or cold changes
- a target churn frequency for cold or unknown changes.
Using 2–20 hot peers is cost-efficient, as peers exchange only their block headers. The block body, in turn, is typically requested only once, and it tends to follow the shortest path through the connectivity graph.
The purpose of warm peers is to provide access to those peers that can be quickly promoted to hot ones (in case any of the hot peers fail), and also provide candidates for the churn of hot peers.
The policy for selecting which warm peers to promote to hot relies on the upstream measurements. The purpose of a degree of churn between cold and warm peers is, in part, to discover the network distance between more peers and to allow potentially better warm peers to take over from existing hot peers. This enables further optimization and adjustments. Maintaining diversity in hop distances contributes to better block distribution times across the globally distributed network.
Overall, this approach follows a common pattern for probabilistic search or optimization that uses a balance of local optimization with some elements of higher-order disruption to avoid becoming trapped in a poor local optimum.
Peers maintain a limited set of information, which is based on their previous direct interactions. Cold peers, for instance, may not maintain any data, as they have not established interactions yet. Such data can be compared to a ‘reputation’ property, however, these details are purely local and are not shared among other nodes. Local peer reputation information is also updated when peer connections fail. The network does not maintain negative peer information for extended periods of time: to bound resources and because of the simplicity of Sybil attacks.
The implementation classifies exceptions that cause connection failures into three classes:
- internal node exceptions (for example, local disk corruption)
- network failures (for example, dropped TCP connections)
- adversarial behavior (for example, a protocol violation detected by the typed-protocols layer or by the consensus layer).
In the case of adversarial behavior, the peer can be immediately demoted from the hot, warm, or cold sets.