A peer-to-peer (P2P) computer network is an architecture in which nodes frequently act as both server and client. The primary objective of a P2P system is to eliminate the need for separate servers to manage the system. The configuration of a P2P network changes dynamically as nodes join and leave in an unpredictable manner. The nodes may differ in factors such as processing speed, bandwidth, and storage capacity. The term peer implies a level of equality between the nodes.
There are various definitions and interpretations of a P2P network. It can be characterized as a decentralized, constantly changing, and self-regulating architecture. Servers tend to provide services, while clients request them; a P2P node usually does both. A pure P2P network has no nodes designated as a client or a server. In reality, such networks are rare. Most P2P networks rely on a central facility, such as a DNS server, for support.
Certain networks are a hybrid between the client/server architecture and a purer P2P architecture in which no node ever acts as a “master” server. For example, a file-sharing P2P network may use its nodes to download files, while a server provides additional supporting information.
P2P networks can be classified in several ways. We will use a couple of common classifications that are useful in understanding the nature of P2P networks. One is based on how indexing, the process of finding the node that holds a piece of data, is performed:
• Centralized: This is when a central server keeps track of where the data is located among peers
• Local: This is when each peer keeps track of its own data
• Distributed: This is when the data references are maintained by multiple peers
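To make the centralized case concrete, the following is a minimal sketch of an index that a central server might keep. The class name `CentralIndex`, its methods, and the peer address strings are hypothetical and chosen only for illustration; a real system would also need networking and concurrency control.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical central index: maps a resource name to the peers that hold it.
public class CentralIndex {
    private final Map<String, Set<String>> locations = new HashMap<>();

    // A peer announces that it holds a resource.
    public void register(String resource, String peerAddress) {
        locations.computeIfAbsent(resource, k -> new HashSet<>()).add(peerAddress);
    }

    // A departing peer is removed from every entry; empty entries are dropped.
    public void unregister(String peerAddress) {
        locations.values().forEach(peers -> peers.remove(peerAddress));
        locations.values().removeIf(Set::isEmpty);
    }

    // Returns the peers known to hold the resource (empty if none).
    public Set<String> lookup(String resource) {
        return Collections.unmodifiableSet(
                locations.getOrDefault(resource, Collections.emptySet()));
    }

    public static void main(String[] args) {
        CentralIndex index = new CentralIndex();
        index.register("song.mp3", "10.0.0.5:6000");
        index.register("song.mp3", "10.0.0.9:6000");
        index.unregister("10.0.0.5:6000");
        System.out.println(index.lookup("song.mp3")); // prints [10.0.0.9:6000]
    }
}
```

In the local and distributed schemes, the same kind of map would instead live on each peer or be partitioned across several peers.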
One way of understanding a P2P network is to examine its characteristics. These include the following:
• Nodes contribute resources to the system, including:
° Data storage
° Computational resources
• Nodes provide support for a set of services
• The networks are very scalable and fault tolerant
• They support load balancing of resources
• They may support limited anonymity
The nature of P2P systems is such that a user may not be able to access a specific node to use a service or resource. Because nodes join and leave a system at random, a specific node may not be available. The algorithm used by the system determines how it responds to requests.
The basic functions of a P2P system include:
• Enrollment of peers in a network
• Peer discovery—the process of determining which peer has the information
• Sending messages between peers
Not all peers perform all of these functions.
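The three functions listed above can be sketched with an in-memory simulation. This is only an illustration under simplifying assumptions: the class `SimplePeer` and its methods are hypothetical, peers are plain objects rather than networked processes, and discovery only asks direct neighbors.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical in-memory peer; no real networking is performed.
public class SimplePeer {
    private final String name;
    private final Set<String> localData = new HashSet<>();
    private final List<SimplePeer> neighbors = new ArrayList<>();
    private final List<String> inbox = new ArrayList<>();

    public SimplePeer(String name) { this.name = name; }

    public String getName() { return name; }
    public void store(String item) { localData.add(item); }
    public List<String> getInbox() { return inbox; }

    // Enrollment: join the network through a peer that is already a member.
    public void enroll(SimplePeer bootstrap) {
        neighbors.add(bootstrap);
        bootstrap.neighbors.add(this);
    }

    // Peer discovery: find the first direct neighbor that holds the item.
    public Optional<SimplePeer> discover(String item) {
        return neighbors.stream()
                .filter(p -> p.localData.contains(item))
                .findFirst();
    }

    // Messaging: deliver text to another peer's inbox.
    public void send(SimplePeer target, String text) {
        target.inbox.add(name + ": " + text);
    }

    public static void main(String[] args) {
        SimplePeer alice = new SimplePeer("alice");
        SimplePeer bob = new SimplePeer("bob");
        bob.store("report.pdf");
        alice.enroll(bob);
        alice.discover("report.pdf")
             .ifPresent(p -> alice.send(p, "please send report.pdf"));
        System.out.println(bob.getInbox()); // prints [alice: please send report.pdf]
    }
}
```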
The resources of a P2P system are identified using a Globally Unique Identifier (GUID), which is usually generated using a secure hashing function; we will examine this in DHT components. The GUID is not intended to be human readable. Because it is the output of a hash, it behaves like a randomly generated value, leaving little opportunity for conflicts.
The nodes of a P2P network are organized using routing overlays. A routing overlay is a type of middleware that routes requests to the appropriate node. The overlay is a network that sits on top of the physical network, whose resources are identified by IP addresses. We can envision the physical network as a series of IP-based nodes; an overlay is a subset of these nodes, usually focused on a single task. The routing overlay takes into consideration factors such as the number of nodes between a user and a resource and the bandwidth of the connection to determine which node should fulfill a request. Frequently, a resource may be duplicated or even split across multiple nodes, and the routing overlay attempts to provide the optimal path to it. As nodes join and leave a system, the routing overlay needs to account for these changes. When a node joins a system, it may be asked to take on some responsibilities. When a node leaves, other parts of the system may need to pick up some of the departing node's responsibilities.
In this chapter, we will explain various concepts that are often embedded as part of a system. We will give a brief overview of different P2P applications, followed by a discussion of Java support for this architecture. The use of distributed hash tables is demonstrated, and an in-depth examination of FreePastry is presented, which will provide insight into how many P2P frameworks work. When applicable, we will illustrate how some of these concepts can be implemented manually. While these implementations are not needed to use the system, they provide a more in-depth understanding of the underlying concepts.
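As a first taste of implementing a concept by hand, the following sketch derives a GUID by hashing a name and then picks the node whose GUID is closest to a resource's GUID. The choice of SHA-1, the class name `GuidSketch`, and the use of plain numeric distance are assumptions made for illustration; real routing overlays typically use more elaborate metrics and also weigh factors such as hop count and bandwidth.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of any P2P framework.
public class GuidSketch {

    // Hash a name into a 160-bit non-negative identifier using SHA-1.
    public static BigInteger guid(String name) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(name.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Pick the node whose GUID is numerically closest to the resource's GUID.
    // Assumption: absolute numeric difference stands in for overlay distance.
    public static String closestNode(List<String> nodes, String resource) {
        BigInteger target = guid(resource);
        String best = null;
        BigInteger bestDistance = null;
        for (String node : nodes) {
            BigInteger distance = guid(node).subtract(target).abs();
            if (bestDistance == null || distance.compareTo(bestDistance) < 0) {
                best = node;
                bestDistance = distance;
            }
        }
        return best; // null when the node list is empty
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-a", "node-b", "node-c");
        System.out.println(guid("song.mp3").toString(16));
        System.out.println(closestNode(nodes, "song.mp3"));
    }
}
```

Because the identifier depends only on the hashed name, any node can compute the same GUID independently, which is what lets requests be routed without a central directory.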