Tuesday, November 3, 2009

Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

Summary

This paper describes, Chord, a lookup protocol for peer-to-peer applications. Chord uses DHT with consistent hashing, which provides load balancing and low overhead when a node joins or leaves the network. Chord is also scalable because in a N-node network, every node is only required to know about O(log N) nodes and requires O(log^2 N) messages when doing a lookup.

In the Chord protocol, every node and key is given an identifier using a base hash function. These identifiers determine which keys should reside on which node. After arranging all the nodes in order by their identifier in a clock-wise circle, a key is assigned to the first node whose identifier is equal to or follows the key's identifier. This is called the key's successor. Every node is also aware of its successor node.

When a node joins or leaves the network, only O(1/N) fraction of the keys need to moved to a new location. When a node n joins the network, it will be assigned some of the keys previously assigned to n's successor. And, when a node n leaves the network, its keys will be reassigned to its successor.

The basic idea of the algorithm, given that each node is only aware of its successor, is that when a node cannot resolve a lookup query itself, it will pass the query to its successor, which will then pass it to its successor until a node can resolve the lookup query. This will ensure correctness but will be inefficient. To improve its performance, each node can maintain additional routing information, allowing the node to pass each lookup query further than its successor and closer to the destination.

The authors simulated and implemented the Chord protocol, both of which confirmed that the latency grows with the number of nodes in the network.

Criticism & Questions

I enjoyed reading this paper. I think they did a good job of first explaining the overview of the algorithm and then going into details.

I did have one question about the algorithm. They explain that successor(k) is responsible for key k and what happens to keys when a node enters or leaves the network. However, I'm unsure of how the key-value entries are initially entered into each node's lookup table. Maybe the keys are assigned manually to whatever nodes exist when the application is set up?

No comments:

Post a Comment