shaghab: p2p Major Assignment

Third paper that we studied is "Novel Architectures for P2P Applications: the Continuous Discrete Approach". In this paper authors have talked about two architectures; one for constructing DHT and one for dynamic expander network. However for our presentation we emphasized solely on the first architecture which has one-dimensional universe i.e. a ring line. However, the first architecture also provides the base for the second architecture which has higher-dimension space i.e. plane.

Both architectures use the same approach which is to dynamically decompose a continuous space into discrete unit i.e. cell. Using the approach authors have first outlined a conceptual model for constructing dynamic data structures e.g. DHT. This conceptual model has a framework which they have titled "think continuously, act discretely". The idea is to visualize everything in continuous space but while do operations (e.g. join, lookup etc) in discrete way. First we will tell how network is constructed and then we will give example of lookup and join to elaborate how Distance Halving network would work

Construction is done, as mentioned as an approach earlier; i.e. by dynamically decomposing a continuous space into cells. Each cell corresponds to a processor. All the points in space between the current node and next node are handled by the current node. For example, let say there are two adjacent cells with value i and value j, where j > i and i, j are in [0, 1). Now all the values greater then i and less then j will be handled by i. This is same technique as used by some other structure overlay networks such as Chord, DKS etc. Note that the range is continuous hence each cell corresponds to a continuous sub-interval.

In continuous Distance Halving graph, each point x has two out-bounded pointers one pointing to point x/2 and second pointing to (x + 1)/2. For example, point 0.6 will be pointing to 0.3 and 0.8 similarly point 0.8 will be pointing to 0.4 and 0.9. In continuous graph there is only one in-bounded pointer that is from 2x (or 2x-1 is 2x >= 1). Hence 0.8 is pointed by 0.6 (1.6 -1) and 0.6 is pointed by 0.2. Hence, in continuous graph degree of each point is 3 i.e. 2 out + 1 in.

In discrete Distance Halving graph there are intervals, in general case an interval of length l is points to two intervals on left and right each of size l/2. Let say an interval starts at 0.6 and ends at 0.8; its left out-bounded interval would start from 0.3 (0.6/2) and will end at 0.4 (0.8/2) right interval will start from 0.8 and end at 0.9 – this is done by adding a 0.5 to 0.3 and 0.4 each. Important thing to note is that these left and right intervals may be covered by more then one cell each. For example, if there are adjacent cells at 0.3, 0.33, 0.37, and 0.41 then left interval is covered by three cells i.e. 0.3, 0.33 and 0.37. On average in discrete graph maximum degree is 6. Maximum degree is bounded by a smoothness which will be defined shortly.

When a new node wants to join it simply takes from its predecessor all the values that are between itself and its successor. Building upon the previous example, assume a new node k wants to join the network such that k is less then j and greater then i, where i and j are two adjacent nodes. Now, after joining, k will take responsibility for the interval that is between itself and j hence it will take from i all the nodes that are in k-interval[i].

In an ideal case each cell will be of same size however this case is too expansive[ii] to maintain so for practical purposes hence cells are allowed to grow within a variable range. Smoothness (ρ) is defined as the ratio between largest and smallest segment i.e. cell. For a discrete Distance Halving graph the maximal out-degree is at most ρ+2, and the maximal in-degree without the ring edges is at most 2ρ + 1. Smoothness of 1 means: that all cells are of same size. In this case, there will be 3 out-degrees and 3 in-degrees i.e. total degree is 6.

Discrete Distance Halving graph is isomorphic[iii] to De Bruijn graph or in other words Distance halving network has Deb Bruijn topology. Authors have mentioned that their construction method is inspired from De Bruijn. The one cosmetic resemblance one could note is that extreme ends have self-pointers. Extreme end in distance halving graph is the left most and right most node on the line while extreme end in De Bruijn node is the node with all zero and the node with all one.

Lookups are done in two ways and they are based on an interesting claim. Before telling the claim we will describe with example an interesting property. As mentioned earlier left and right interval is of half size. It’s intuitive to note that left of left will be one-fourth of original. For example, interval 0.6 to 0.8 will be halved to 0.3 to 0.4 when first left interval is taken and then it will be halved-again to 0.15 to 0.2 when left of left is taken. Same applies for right of right, left of right and right of left i.e. they all are one-fourth of the original interval. Generally speaking by keep on taking left and right in any possible random sequence interval will converge.

Now coming to claim, by interpolating the above mentioned property: lets assume that there are two cells x and y and the distance between them is d. If l(x) and l(y) are left of x and y, then based on the above property, distance between them will be half of the distance between x and y. Hence claim is that by taking any random sequence of left and right i.e. σt(x) and σt(y), interval will converge i.e. to a cell z where t is step and σt is prefix string of length t congaing left and right. e.g. lrll(x) is of t equal to 4.

For example we have cells at {0, 0.05, 0.1, 0.15… 0.90, 095} in space (0, 1]. Now take points 0.6 and 0.7. First left will be 0.3 and 0.35, and second left will be 0.15 and 0.175. In this example we converged to interval of cell at 0.15 by taking two left. Now, for the same example by taking two rights as well rr(0.6) = r(0.8) = 0.9 and rr(0.7) = r(0.85) = 0.925 one can reach convergence again, however, possibly to different cell i.e. in this case this time convergence took place at 0.9. We have given two convergences to show that any random sequence will work.

The above mentioned claim is used to look up a node y from x i.e. you first take a random sequence from x to reach a convergent point z then you back track on the path to z from y to reach y. For example, using the example in the paragraph before, 0.6 can reach 0.7 by taking this path 0.6 à 0.3 à 0.15 ~ 0.175 à 0.35 à 0.7. This kind of lookup is called Distance Halving Lookup. Greedy Lookup is the alternate option which is simply that: you take a sequence toward y from your current point until you converge at y. Example of greedy lookup for lookup of 0.7 from 0.6 in the same space as previous example is: rllr(0.7)[iv] is in 0.6. Greedy lookup results in same path while Distance Halving lookup results in, possibility of, multiple paths between cells hence Distance Halving Lookup is better for load balancing[v].

While preparing our presentation we also studied and discussed joining/leaving algorithms (to achieve smoothness) and dynamic caching for load balancing mentioned in section 3 and 4 however we will not add any more thing in this regard to keep this report short.

[i] K-interval mean the interval starting form k and ending up at k’s successor.
[ii] It expansive as when nth node will join we will have to change all n-1 previous nodes i.e. if we want to have all cells of same size.
[iii] The Distance Halving Network applies (continuous-discrete) approach to an infinite de Bruijn graph, where every point x in [0, 1) has edges to x/2, x/2 + 1/2 and 2x. [Ratajczak and Hellerstein, Deconstructing DHTs, 2003]
[iv] In greedy lookup rllr(0.7) = 0.85 -> 0.425 -> 0.2125 -> 0.60625 in coninuous space and rllr(0.7) = 0.85 -> 0.4 -> 0.2 -> 0.6 in discrete space
[v] Load is divided on different paths.

shaghab

Thursday, March 10, 2005

p2p Major Assignment

2 comments: