Thursday, October 6, 2016

Learning notes on "Self-Organization in Peer-to-Peer Systems"

Self-Organization in Peer-to-Peer Systems
By: Jonathan Ledlie, Jacob M. Taylor, Laura Serban, Margo Seltzer (Harvard University)

Paper Abstract:

This paper addresses the problem of forming groups in peer-to-peer (P2P) systems and examines what dependability means in decentralized distributed systems. Much of the literature in this field assumes that the participants form a local picture of global state, yet little research has been done discussing how this state remains stable as nodes enter and leave the system. We assume that nodes remain in the system long enough to benefit from retaining state, but not sufficiently long that the dynamic nature of the problem can be ignored. We look at the components that describe a system’s dependability and argue that next-generation decentralized systems must explicitly delineate the information dispersal mechanisms (e.g., probe, event-driven, broadcast), the capabilities assumed about constituent nodes (bandwidth, uptime, re-entry distributions), and distribution of information demands (needles in a haystack vs. hay in a haystack [13]). We evaluate two systems based on these criteria: Chord [22] and a heterogeneous-node hierarchical grouping scheme [11]. The former gives a failed request rate under normal P2P conditions and a prototype of the latter a similar rate under more strenuous conditions with an order of magnitude more organizational messages. This analysis suggests several methods to greatly improve the prototype.


Notes and synthesis:

A dragon boat is a good example of organized human cooperation: a single rower can only move the boat slowly, but a larger, well-organized team achieves much greater speed. In the same way, peer-to-peer systems and the algorithms they use are increasingly advantageous in a variety of situations.

Peer-to-peer systems can be used for a variety of services, e.g. voice communication such as Skype. They also help deliver information at large scale: a single video server that distributes content to thousands or millions of users will suffer performance degradation and is more likely to fail. The reasons for using a distributed system are to gain redundancy and to deliver the service with low latency.

This paper makes the following contributions:
1. It shows that the implicit goals and assumptions of a particular decentralized system affect how its reliability is measured.
2. It introduces a self-organizing, hierarchically-based P2P system.
3. It takes the assumptions implicit in current P2P file-sharing systems and evaluates the reliability of Chord and the hierarchical grouping system.


In a system as complex as peer-to-peer file sharing, some form of management or central coordination is usually needed, such as a tracker or super-peer. The aim of this paper is a system that is self-managing and self-maintaining, so that the peer-to-peer network can adapt as nodes enter or exit the system. One component that contributes to the overall dependability of a decentralized system is the type of information exchanged across it. Information exchange is divided into the following five categories.
1. probe
2. event-driven point-to-point
3. event-driven broadcast
4. continuous stream point-to-point
5. continuous stream broadcast

This paper suggests how a self-managing peer-to-peer network works and how the system configures itself in the following steps:

1. Each peer is assigned a GUID (globally unique ID) as its identifier.

2. Using the peers' GUIDs, a virtual (logical) ring is built according to the design chosen by the developer.

3. At this stage the ring has no actual connections; it is only a virtual, logical ordering of peers by GUID.
4. Each peer's routing table is populated with its neighbors, instead of storing the thousands or millions of routes in the peer network.
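
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scheme from the paper: the GUID is assumed to be a SHA-1 hash of the peer's address reduced to a small ID space, and the routing table is assumed to hold just the predecessor and successor on the ring. The names `guid`, `build_ring`, `neighbors`, and `RING_BITS` are hypothetical.

```python
import hashlib

RING_BITS = 16  # assumed small ID space, chosen only for readability


def guid(address: str) -> int:
    """Step 1: derive a GUID by hashing the peer's address (assumed scheme)."""
    digest = hashlib.sha1(address.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


def build_ring(addresses):
    """Steps 2-3: order the GUIDs into a logical ring (no connections yet)."""
    return sorted({guid(a) for a in addresses})


def neighbors(ring, peer_id):
    """Step 4: a routing table holding only the two ring neighbors."""
    i = ring.index(peer_id)
    return {
        "predecessor": ring[i - 1],              # index -1 wraps to the last peer
        "successor": ring[(i + 1) % len(ring)],  # wrap around to the first peer
    }


if __name__ == "__main__":
    ring = build_ring(f"10.0.0.{i}:4000" for i in range(1, 6))
    for peer in ring:
        print(peer, neighbors(ring, peer))
```

Each peer stores a constant number of entries regardless of network size, which is the point of step 4; real systems such as Chord keep additional entries to shorten lookups.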

The peer network is built using a root node, which is responsible for calculating a summary of all objects in the group, maintaining summaries for each of its immediate children (which in turn maintain summaries for their children), and directing searches of the group.

Peer communication is essential to a self-managing network. The messages used to control node entry, exit, and failure are join, leave, copy routing table, and, in the case of a peer crash or node failure, find closest.
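
A rough sketch of how these messages might act on a ring of GUIDs is shown here. It is an assumption-laden illustration rather than the paper's protocol: the hypothetical `Ring` class keeps the live GUIDs in one sorted list, `join` returns the existing peer from which a newcomer would copy its routing table, and `find_closest` returns the next live peer at or after a given GUID.

```python
from bisect import bisect_left, insort


class Ring:
    """Hypothetical in-memory view of the live peers, ordered by GUID."""

    def __init__(self):
        self.ids = []  # sorted GUIDs of peers currently in the ring

    def find_closest(self, guid):
        """Return the first live peer at or after guid, wrapping around."""
        if not self.ids:
            return None
        i = bisect_left(self.ids, guid)
        return self.ids[i % len(self.ids)]

    def join(self, guid):
        """Handle a join message.

        Returns the existing peer whose routing table the newcomer
        would copy, or None if the ring was empty.
        """
        contact = self.find_closest(guid)
        if guid not in self.ids:
            insort(self.ids, guid)
        return contact

    def leave(self, guid):
        """Handle a leave message, or remove a peer believed to have crashed."""
        if guid in self.ids:
            self.ids.remove(guid)


if __name__ == "__main__":
    ring = Ring()
    for g in (10, 40, 70):
        ring.join(g)
    ring.leave(40)                # peer 40 exits (or is declared crashed)
    print(ring.find_closest(35))  # next live peer after the gap
```

A graceful leave and a detected crash end in the same state here; the hard part in practice is deciding that a silent peer has actually crashed.
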
Crashes are the most difficult events to handle in a peer-to-peer network because the network is dynamic by nature and we cannot really know whether a node is down or not.

Each group is shaped as a tree, and every group has a root node.


The root node is responsible for:
* Calculating a summary of all objects in the group
* Maintaining summaries for each of its immediate children (which in turn maintain summaries for their children)
* Directing searches of the group
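
The three responsibilities listed above can be illustrated with a small tree. This is a minimal sketch with assumed details: summaries are plain Python sets recomputed on demand (a real system would cache and compress them), and the `GroupNode` class and its method names are hypothetical.

```python
class GroupNode:
    """Hypothetical node in a tree-shaped group."""

    def __init__(self, name, objects=()):
        self.name = name
        self.objects = set(objects)  # objects stored at this node
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child

    def summary(self):
        """Summary of every object at this node and below it."""
        result = set(self.objects)
        for child in self.children:
            result |= child.summary()
        return result

    def search(self, obj):
        """Follow child summaries down to a node holding obj, else None."""
        if obj in self.objects:
            return self.name
        for child in self.children:
            if obj in child.summary():  # only descend where the summary matches
                return child.search(obj)
        return None


if __name__ == "__main__":
    root = GroupNode("root", {"a"})
    left = root.add_child(GroupNode("left", {"b"}))
    root.add_child(GroupNode("right", {"c"}))
    left.add_child(GroupNode("leaf", {"d"}))
    print(sorted(root.summary()))
    print(root.search("d"))
```

Because each node consults its children's summaries before descending, a search visits only one branch per level instead of flooding the whole group.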

Bloom Filters
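
A Bloom filter is a compact, probabilistic set: it can report false positives but never false negatives, which makes it a plausible way to store the object summaries kept by each node. The sketch below is a generic textbook-style Bloom filter, not code from the paper; the bit-array size, the number of hashes, and the use of seeded SHA-256 are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter backed by a Python int used as a bit array."""

    def __init__(self, size_bits=256, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions by seeding SHA-256 (assumed scheme).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def union(self, other):
        """Merge two filters of equal shape, e.g. a parent combining children."""
        merged = BloomFilter(self.size, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


if __name__ == "__main__":
    child_a, child_b = BloomFilter(), BloomFilter()
    child_a.add("song.mp3")
    child_b.add("paper.pdf")
    parent = child_a.union(child_b)
    print("song.mp3" in parent, "paper.pdf" in parent)
```

The `union` method shows why the structure suits a hierarchy: a parent's summary is just the bitwise OR of its children's filters, at a fixed size no matter how many objects sit below it.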

When thousands of computers on the Internet want to communicate with each other, they need some mechanism to organize themselves:

* Assignment of a GUID (globally unique ID) to each peer as its identifier
* A virtual (logical) ring built from the peer GUIDs
* The ring has no actual connections yet; it is only organized virtually
* A peer routing table, since it is not reasonable to store millions of routes or peers in the network





Reference:

https://www.eecs.harvard.edu/margo/papers/sigops02/
