P
US11080204B2ActiveUtilityPatentIndex 61

Latchless, non-blocking dynamically resizable segmented hash index

Assignee: ORACLE INT CORPPriority: May 26, 2017Filed: Jun 7, 2019Granted: Aug 3, 2021
Est. expiryMay 26, 2037(~10.9 yrs left)· nominal 20-yr term from priority
Inventors:TEOTIA SIDDHARTHKUNCHITHAPADAM KRISHNALAHIRI TIRTHANKARKAMP JESSEGLEESON MICHAEL JLOAIZA JUAN RSWART GARRET FMACNAUGHTON NEIL J SSHERGILL KAM
G06F 12/126G06F 12/0864G06F 2212/621G06F 2212/1041G06F 16/9014G06F 12/128G06F 16/2255G06F 2212/313G06F 16/2453G06F 12/1018G06F 12/0868
61
PatentIndex Score
1
Cited by
69
References
20
Claims

Abstract

A hashing scheme includes a cache-friendly, latchless, non-blocking dynamically resizable hash index with constant-time lookup operations that is also amenable to fast lookups via remote memory access. Specifically, the hashing scheme provides each of the following features: latchless reads, fine grained lightweight locks for writers, non-blocking dynamic resizability, cache-friendly access, constant-time lookup operations, amenable to remote memory access via RDMA protocol through one sided read operations, as well as non-RDMA access.

Claims

exact text as granted — not AI-modified
We claim: 
     
       1. A method comprising:
 organizing a hash table based upon a particular hash function into a plurality of segments, wherein each segment of the plurality of segments includes a set of buckets; 
 inserting hash records into the hash table based on a first number of bucket-identifying bits; 
 dynamically resizing the hash table without blocking access to the hash table, wherein dynamically resizing the hash table includes:
 selecting a particular segment of the plurality of segments for resizing; 
 wherein the particular segment includes a first plurality of buckets; 
 adding to the hash table one or more companion segments of the particular segment; 
 wherein the one or more companion segments include a companion segment that includes a second plurality of buckets; and 
 redistributing hash records from one or more buckets in the first plurality of buckets to one or more buckets in the second plurality of buckets based on the particular hash function and a second number of bucket-identifying bits that is different than the first number of bucket-identifying bits; 
 
 while dynamically resizing the hash table, allowing one or more requesting entities to continue to perform lookups using the hash table and the particular hash function; 
 wherein the method is performed by one or more computing devices. 
 
     
     
       2. The method of  claim 1  wherein:
 the one or more companion segments include companion buckets for the buckets in the particular segment; 
 redistributing hash records includes, for each particular bucket in the first plurality of buckets in the particular segment, splitting the bucket by performing the steps of:
 identifying a set of candidate hash records that qualify as redistribution candidates for the particular bucket, wherein the set of candidate hash records that qualify as redistribution candidates for the particular bucket are hash records that correspond to key values that hash to the particular bucket using the particular hash function and the first number of bucket-identifying bits; 
 for each candidate hash record, determining a target bucket based on the second number of bucket-identifying bits; and 
 for those candidate hash records whose target bucket is not the particular bucket, moving the hash records to a bucket in one of the one or more companion segments. 
 
 
     
     
       3. The method of  claim 2  wherein identifying the set of candidate hash records that qualify as redistribution candidates for the particular bucket includes:
 inspecting overflow-indicating bits maintained for the particular bucket to identify hash records for which the particular bucket is a primary bucket; and 
 inspecting overflow-indicating bits maintained for a next bucket that immediately follows the particular bucket to identify hash records for which the next bucket is a secondary bucket. 
 
     
     
       4. The method of  claim 1  further comprising:
 using a resize index to track progress of the dynamic resizing of the hash table; 
 during dynamic resizing of the hash table, determining that a new hash record for a particular key is to be inserted into the hash table; 
 determining, based on the resize index and first number of bucket-identifying bits produced by hashing the particular key with the particular hash function, whether to:
 select a target bucket based on the first number of bucket-identifying bits produced by hashing the particular key; or 
 select the target bucket based on the second number of bucket-identifying bits produced by hashing the particular key; 
 
 storing the new hash record in the hash table based on the particular key mapping to the target bucket using the particular hash function. 
 
     
     
       5. The method of  claim 1  further comprising:
 using a resize index to track progress of the dynamic resizing of the hash table; 
 during dynamic resizing of the hash table, a requesting entity performing a lookup based on a particular key with the particular hash function by:
 determining, based on the resize index and the first number of bucket-identifying bits produced by hashing the particular key with the particular hash function, whether to:
 select a target bucket based on the first number of bucket-identifying bits produced by hashing the particular key; or 
 select the target bucket based on the second number of bucket-identifying bits produced by hashing the particular key; 
 
 
 performing the lookup based on the particular key mapping to the target bucket and the particular hash function. 
 
     
     
       6. The method of  claim 1  further comprising:
 after some but not all of the plurality of segments have been resized, halting the resizing of the hash table in response to determining that a load factor of the hash table satisfies certain conditions; and 
 resuming the resizing of the hash table responsive to detecting that that the load factor of the hash table ceases to satisfy certain conditions. 
 
     
     
       7. The method of  claim 1  further comprising, when a hash record is to be inserted into the hash table, performing the steps of:
 determining a target bucket based on a hash value produced by hashing a key value that corresponds to the hash record with the particular hash function; 
 determining that the target bucket does not have room for the hash record; 
 responsive to determining that the target bucket does not have room for the hash record, inspecting N buckets that immediately follow the target bucket to find room for the hash record; and 
 responsive to determining that the N buckets that immediately follow the target bucket do not have room for the hash record, not storing the hash record in the hash table. 
 
     
     
       8. The method of  claim 7  where N is 1. 
     
     
       9. The method of  claim 7  further comprising storing, for each bucket in the hash table, a set of overflow-indicating bits that indicate, for each hash record in the bucket, whether the bucket is a primary bucket for the hash record. 
     
     
       10. The method of  claim 1  further comprising determining the second number of bucket-identifying bits for a particular hash record by prepending, to the first number of bucket-identifying bits for the particular hash record, one or more bits from a tag value stored in the particular hash record. 
     
     
       11. A method comprising:
 organizing a hash table based upon a hash function into a plurality of segments, wherein each segment of the plurality of segments includes a set of buckets; 
 inserting hash records into the hash table based on a first number of bucket-identifying bits; 
 without changing the hash function used for the hash table:
 dynamically resizing the hash table without blocking access to the hash table, wherein dynamically resizing the hash table includes: 
 selecting a particular segment of the plurality of segments for resizing; 
 wherein the particular segment includes a plurality of first buckets; 
 adding to the hash table one or more companion segments of the particular segment; 
 wherein the one or more companion segments include a companion segment that includes a second plurality of buckets; 
 redistributing hash records from one or more buckets in the first plurality of buckets to one or more buckets in the second plurality of buckets based on a second number of bucket-identifying bits that is different than the first number of bucket-identifying bits; 
 
 while dynamically resizing the hash table, allowing one or more requesting entities to continue to perform lookups using the hash table and the hash function; 
 wherein the method is performed by one or more computing devices. 
 
     
     
       12. The method of  claim 11  wherein:
 the one or more companion segments include companion buckets for the buckets in the particular segment; 
 redistributing hash records includes, for each particular bucket in the first plurality of buckets in the particular segment, splitting the bucket by performing the steps of:
 identifying a set of candidate hash records that qualify as redistribution candidates for the particular bucket, wherein the set of candidate hash records that qualify as redistribution candidates for the particular bucket are hash records that correspond to key values that hash to the particular bucket using the hash function and the first number of bucket-identifying bits; 
 for each candidate hash record, determining a target bucket based on the second number of bucket-identifying bits; and 
 for those candidate hash records whose target bucket is not the particular bucket, moving the hash records to a bucket in one of the one or more companion segments. 
 
 
     
     
       13. The method of  claim 12  wherein identifying the set of candidate hash records that qualify as redistribution candidates for the particular bucket includes:
 inspecting overflow-indicating bits maintained for the particular bucket to identify hash records for which the particular bucket is a primary bucket; and 
 inspecting overflow-indicating bits maintained for a next bucket that immediately follows the particular bucket to identify hash records for which the next bucket is a secondary bucket. 
 
     
     
       14. The method of  claim 11  further comprising determining the second number of bucket-identifying bits for a particular hash record by prepending, to the first number of bucket-identifying bits for the particular hash record, one or more bits from a tag value stored in the particular hash record. 
     
     
       15. The method of  claim 11  further comprising:
 using a resize index to track progress of the dynamic resizing of the hash table; 
 during dynamic resizing of the hash table, determining that a new hash record for a particular key is to be inserted into the hash table; 
 determining, based on the resize index and first number of bucket-identifying bits produced by hashing the particular key with the hash function, whether to:
 select a target bucket based on the first number of bucket-identifying bits produced by hashing the particular key; or 
 select the target bucket based on the second number of bucket-identifying bits produced by hashing the particular key; 
 
 storing the new hash record in the hash table based on the particular key mapping to the target bucket using the hash function. 
 
     
     
       16. The method of  claim 11  further comprising:
 using a resize index to track progress of the dynamic resizing of the hash table; 
 during dynamic resizing of the hash table, a requesting entity performing a lookup based on a particular key with the hash function by:
 determining, based on the resize index and the first number of bucket-identifying bits produced by hashing the particular key with the hash function, whether to:
 select a target bucket based on the first number of bucket-identifying bits produced by hashing the particular key; or 
 select the target bucket based on the second number of bucket-identifying bits produced by hashing the particular key; 
 
 
 performing the lookup based on the particular key mapping to the target bucket and the hash function. 
 
     
     
       17. One or more non-transitory media storing instructions which, when executed by one or more computing devices, cause:
 organizing a hash table based upon a particular hash function into a plurality of segments, wherein each segment of the plurality of segments includes a set of buckets; 
 inserting hash records into the hash table based on a first number of bucket-identifying bits; 
 dynamically resizing the hash table without blocking access to the hash table, wherein dynamically resizing the hash table includes:
 selecting a particular segment of the plurality of segments for resizing; 
 wherein the particular segment includes a first plurality of buckets; 
 adding to the hash table one or more companion segments of the particular segment; 
 wherein the one or more companion segments include a companion segment that includes a second plurality of buckets; 
 redistributing hash records from one or more buckets in the first plurality of buckets to one or more buckets in the second plurality of buckets based on the particular hash function and a second number of bucket-identifying bits that is different than the first number of bucket-identifying bits; 
 
 while dynamically resizing the hash table, allowing one or more requesting entities to continue to perform lookups using the hash table and the particular hash function. 
 
     
     
       18. The one or more non-transitory media of  claim 17  wherein:
 the one or more companion segments include companion buckets for the buckets in the particular segment; 
 redistributing hash records includes, for each particular bucket in the first plurality of buckets in the particular segment, splitting the bucket by performing the steps of:
 identifying a set of candidate hash records that qualify as redistribution candidates for the particular bucket, wherein the set of candidate hash records that qualify as redistribution candidates for the particular bucket are hash records that correspond to key values that hash to the particular bucket using the particular hash function and the first number of bucket-identifying bits; 
 for each candidate hash record, determining a target bucket based on the second number of bucket-identifying bits; and 
 for those candidate hash records whose target bucket is not the particular bucket, moving the hash records to a bucket in one of the one or more companion segments. 
 
 
     
     
       19. The one or more non-transitory media of  claim 18  wherein identifying the set of candidate hash records that qualify as redistribution candidates for the particular bucket includes:
 inspecting overflow-indicating bits maintained for the particular bucket to identify hash records for which the particular bucket is a primary bucket; and 
 inspecting overflow-indicating bits maintained for a next bucket that immediately follows the particular bucket to identify hash records for which the next bucket is a secondary bucket. 
 
     
     
       20. The one or more non-transitory media of  claim 17  wherein the instructions further comprise instruction for:
 determining the second number of bucket-identifying bits for a particular hash record by prepending, to the first number of bucket-identifying bits for the particular hash record, one or more bits from a tag value stored in the particular hash record.

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.