# Consistent Hash

## Monotonicity

The monotonicity means when we add a bucket, we need to remap all the keys. For s specify key is can only located in the origin bucket or the new bucket. It cannot reassign to another bucket.

## Incorrect

 Bucket1 Bucket2 Bucket3 Key1 Key2 Key3 Key4 Key5 Key6 Key7 Key8 Key9 Key10 Key11 Key12

 Bucket1 Bucket2 Bucket3 Bucket4 Key2 Key3 Key1 Key5 Key6 Key9 Key10 Key4 Key7 Key8 Key11

The above example does not meet the consistency. Because when we added a new bucket Bucket4 the Key1 is moved to Bucket2, this is incorrect.

## Correct

### Example1

 Bucket1 Bucket2 Bucket3 Key1 Key2 Key3 Key4 Key5 Key6 Key7 Key8 Key9 Key10 Key11 Key12

 Bucket1 Bucket2 Bucket3 Bucket4 Key1 Key2 Key3 Key5 Key6 Key7 Key9 Key10 Key4 Key8 Key11 Key12

The above example meets the monotonicity. Because when we added a new bucket Bucket4, all the keys are either in the old bucket or the new bucket. But they did not meet the Balance.

## Balance

The balance means the data can be distribute among all node with same probability

### Incorrect

 Bucket1 Bucket2 Bucket3 Key1 Key2 Key3 Key4 Key5 Key6 Key7 Key8 Key9 Key10 Key11 Key12

 Bucket1 Bucket2 Bucket3 Bucket4 Key1 Key2 Key5 Key6 Key9 Key10 Key3 Key4 Key7 Key8 Key11 Key12

The above example meets the monotonicity but does not meet the balance.

### Correct

 Bucket1 Bucket2 Bucket3 Key1 Key2 Key3 Key4 Key5 Key6 Key7 Key8 Key9 Key10 Key11 Key12

 Bucket1 Bucket2 Bucket3 Bucket4 Key1 Key2 Key3 Key7 Key5 Key6 Key11 Key10 Key9 Key4 Key8 Key12

This is example meets the balance and also the monotonicity

# Hash Ring

## Initialize Solution

 Pros 1 Easy to understand 2 We can remove any bucket if want at any position Cons 1 The distribution of data is not uniform in most case 2 If we remove a bucket all the data of current bucket will move to next bucket, In the worst it may led system avalanche. 3 The time complexity of find a bucket is O(n) 4 We need to store all the bucket info in our memory, so space complexity is also o(n)

## Optimization

 Pros 1 The distribution of data is more uniform compared to first one. 2 We can delete any bucket at any position. 3 We can give different weight to different bucket. Cons 1 The time complexity of find a bucket is O(k*n) 2 We need to store all the bucket info in our memory, so space complexity is also o(k*n) Notice : k is the virtual bucket count n the bucket size

# Jump Consistent Hash

## Initialize Solution

### Meets Monotonicity

This algorithm meets the monotonicity

 int ch(int key, int num_buckets) {     random.seed(key);     int b = 0;     for (int i = 1; i < num_buckets; i++)     {           if (random.nextInt() %2 == 1) b = i;     }     return b; }

#### Code analysis

The key is that we use the key as random seed , if you give the same key the random value list should always be the same.

Example:

 Random list 3 buckets 4 buckets 3 4 1 6 Final index : 2 Final index : 2 3 4 1 5 Final index : 2 Final index : 3

From above analysis we can notice that when we added a bucket. For a specific key the index is either located in the old bucket or new bucket it can not move to emailother buckets that already exists in the buckets list. So this algorithm meets the monotonicity

#### Probability Analysis

Suppose we have 3 buckets in total the probability of save to each node is.

 Probability Assign to 0 1 (odd or even) ½ (even) ½  (even) 1 *  ½ * ½   = ¼ Assign to 1 1 (odd or even) ½ (odd) ½  (even) 1 *  ½  * ½  = ¼ Assign to 2 1 (odd or even) 1 (odd or even) ½  (odd) 1 * 1 * ½   = ½

Form the chart we can see that the probability of save to each node is not same for a specificity key.

#### Complexity

Time: O(n)  Space: O(1)

### Meets both Monotonicity and Balance

In the above solution the probability of moving data from one bucket to next bucket is ½, so it does not meet the balance, because the probability of data distribution is not 1/n for n buckets. Here is an solution to optimized last algorithm

 int ch(int key, int n) {     random.seed(key);     int b = 0;     for (int i = 1; i < n ; i++)     {           if (random.nextDouble() < 1.0 / (i + 1)) b = i;     }     return b; }

#### Probability Analysis

Suppose we have 4 buckets in total, The probability of save to each bucket are:

 Probability Assign to 0 1(save to 0) ½ (not to 1) (not to 2) ¾ (not to 3) 1 *  ½ *  * ¾  = ¼ Assign to 1 1(save or not) ½ (save to 1) (not to 2) ¾ (not to 3) 1 *  ½ *  * ¾  = ¼ Assign to 2 1(save or not) 1 (save or not 1) (save to 2) ¾ (not to 3) 1 * 1 *  * ¾  = ¼ Assign to 3 1(save or not) 1(save or not) 1(save or not) ¼ (save to 3) 1 *  1 * 1 * ¼  = ¼

The above method has already met the monotonicity and balance

#### Complexity

Time: O(n), Space O(1)

#### Pros and Cons

 Pros 1 The space complexity is O(1) 2 The time complexity is O(n) Cons 1 The bucket list must have order 2 We can only delete bucket from the end of bucket list 3 All the bucket should be equal they cannot have different weight

## Optimization

### Meets Monotonicity

The last algorithm can both meet the monotonicity and balance but the time complexity of find an index is o(n), here we are going to optimized last algorithm from o(n) to o(logn). Because in most case ch(int key, int n) is keep unchanged( ), We can random generate a sequence to jump such as we generate a list:3,5,2 we will reassign the data in these indexes, so we can jump to these indexes directly.

#### Anatomy

We put all the data to bucket 0, and generate a random num. When we add bucket if bucket number is greater than current num keep the key to origin bucket otherwise assign the key to new bucket.

 int ch(int key, int n) { random.seed(key);    int id = 0, fence = 0;     while (fence < n)     {         id = fence;         fence = fence + rand();     }     return id; }

#### Code analysis

Since we use key as random seed the random sequence will always be same for a same key. Suppose the random sequence list is : 3, 5, 2

 Number of buckets Final Index Analysis 1 0 id=0, n=1, fence=0 id=0, n=1, fence=3 Return 0 2 0 ditto 3 0 ditto 4 3 id=0, n=4, fence=0 id=0, n=4, fence=3 Id=3, n=4, fence=8 Return 3 5 3 ditto 6 3 ditto 7 3 ditto 8 3 ditto 9 8 id=0, n=9, fence=0 id=0, n=9, fence=3 id=3, n=9, fence=8 id=8, n=9, fence=10 Return 8

#### Complexity

Time: O(logn), Space O(1)

#### Probability Analysis

The fence is randomly chosen so the probability of save a specific key to one bucket is not 1/n. Thus, this method meets the Monotonicity but does not meet the balance.

#### Pros and Cons

 Pros 1 The space complexity is O(1) 2 The time complexity is O(logn) Cons 1 The bucket list must have order 2 We can only delete bucket from the end of bucket list 3 All the bucket should be equal they cannot have different weight

### Meets both Monotonicity and Balance

Suppose current jump index is: b, then we can get

ch(key, b) ch(key, b+1)

ch(key, b+1)=b

Next jump index is: j , the we can get

ch(key, j)  ch(key,j+1)

ch(key, j+1)  j

ch(key, j+1)  ch(key, b+1)

it mens:

ch(key, j) = ch(key, b+1)

Since the probability of move from n bucket to n+1 bucket is  ,

For example,  P(ch(key, 10)= ch(key,11))=  ,  P(ch(key, 11)= ch(key,12))=  P(ch(key, 10)= ch(key,12))=

More general ch(key, n) = ch(key, m) is

For ib  P(ch(key,i)=ch(key,b+1)) =

It means:  r               ==>          i

 Int ch(int key, int n) {     random.seed(key);     int id = -1;     int fence = 0;     while (fence < n)     {         id = fence;         fence = floor( (id+1) / randDouble())     }     return id; }

#### Code analysis

Since we use key as random seed the random sequence will always be same for a same key. Suppose the random sequence list is : 0.5, 0.3, 0.8, 0.1

 Number of buckets Final Index Analysis 1 0 id=-1, n=1, fence=0, id=0, n=1, fence=floor(1/0.5)=5 Return 0 2 0 ditto 3 0 ditto 4 3 ditto 5 3 ditto 6 3 id=-1, n=6, fence=0, id=0, n=6, fence=floor(1/0.5)=5 id=5, n=6, fence=floor(6/0.3)=20 Return 5 7 3 ditto 8 3 ditto 9 8 ditto

#### Probability Analysis

Thus this algorithm both meet the Monotonicity and Balance.

#### Pros and Cons

 Pros 1 The space complexity is O(1) 2 The time complexity is O(logn) Cons 1 The bucket list must have order 2 We can only delete bucket from the end of bucket list 3 All the bucket should be equal they cannot have different weight