Storage of data with composite hashes in backup systems
Abstract
In one example, a method may include performance of a hash function on a digital sequence so as to generate a hash value that corresponds to the digital sequence. Next, the digital sequence may be broken into data pieces, and each data piece hashed to produce a corresponding hash value for each data piece. Then, a recipe may be produced that includes instructions which, when executed, may generate the digital sequence from the data pieces referenced by their corresponding hash values included in the recipe. Among other things, the hash values may enable reutilization of redundant data sequences by serving as pointers to the data pieces that the hash values respectively represent.
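The flow described above can be illustrated with a minimal sketch. It is not the patented implementation: the fixed `PIECE_SIZE`, the choice of SHA-256, the in-memory `archive` dictionary, and the function names `hash_of`, `store`, and `restore` are all assumptions made for illustration, since the abstract does not specify how the sequence is broken up, which hash function is used, or how the data pieces are stored.

```python
import hashlib

# Hypothetical fixed piece size; the source does not say how pieces are chosen.
PIECE_SIZE = 4


def hash_of(data: bytes) -> str:
    """Hash a digital sequence (SHA-256 is an assumed choice of hash function)."""
    return hashlib.sha256(data).hexdigest()


def store(sequence: bytes, archive: dict) -> str:
    """Archive a sequence and return the hash value that identifies it."""
    root = hash_of(sequence)
    if root in archive:
        # Already archived: the hash acts as a pointer to the existing data.
        return root
    if len(sequence) <= PIECE_SIZE:
        # Small enough to store directly as a single (atomic) piece.
        archive[root] = ("atomic", sequence)
        return root
    # Break the sequence into pieces and hash (and archive) each one.
    pieces = [sequence[i:i + PIECE_SIZE]
              for i in range(0, len(sequence), PIECE_SIZE)]
    # The recipe here is the ordered list of piece hashes; concatenating
    # the referenced pieces regenerates the sequence.
    recipe = [store(piece, archive) for piece in pieces]
    archive[root] = ("composite", recipe)
    return root


def restore(hash_value: str, archive: dict) -> bytes:
    """Regenerate a sequence by following the recipe stored under its hash."""
    kind, payload = archive[hash_value]
    if kind == "atomic":
        return payload
    return b"".join(restore(child, archive) for child in payload)


archive = {}
data = b"abcdabcdabcdxyz"
root = store(data, archive)
restored = restore(root, archive)
```

In this sketch the repeated piece `abcd` is archived only once, and the recipe refers to it three times by its hash value, which is the reuse of redundant data that the abstract describes.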
Claims
What is claimed is:
1. A method, comprising:
performing a hash function on a digital sequence so as to generate a hash value that corresponds to the digital sequence;
breaking the digital sequence into data pieces, and hashing each data piece to produce a corresponding hash value for each data piece; and
producing a recipe that includes instructions which, when executed, generate the digital sequence from the data pieces referenced by their corresponding hash values included in the recipe,
wherein the hash values enable reutilization of redundant data sequences by serving as pointers to the data pieces that the hash values respectively represent.
2. The method as recited in claim 1 , wherein each hash value is a deterministic and probabilistically unique identifier of its associated data piece.
3. The method as recited in claim 1 , wherein the digital sequence is a data structure.
4. The method as recited in claim 1 , wherein one of the data pieces is one of a composite, and an atomic.
5. The method as recited in claim 1 , wherein the recipe comprises a concatenation of data represented by the hash values.
6. The method as recited in claim 1 , wherein the digital sequence is broken into data pieces if the hash value corresponding to the digital sequence has not already been archived.
7. The method as recited in claim 1 , further comprising deriving a hash value from the recipe.
8. The method as recited in claim 1 , further comprising transmitting the recipe, two or more of the data pieces, and the hash values respectively associated with those data pieces, to a backup server.
9. The method of claim 1 , wherein the recipe comprises a composite or a directory element.
10. A method, comprising:
performing a hash function on a digital sequence so as to generate a hash value that corresponds to the digital sequence;
breaking the digital sequence into data pieces, and hashing each data piece to produce a corresponding hash value for each data piece;
producing a recipe that includes instructions which, when executed, generate the digital sequence from the data pieces referenced by their corresponding hash values included in the recipe, wherein the hash values enable reutilization of redundant data sequences by serving as pointers to the data pieces that the hash values respectively represent; and
transmitting a restore request to a backup server.
11. The method as recited in claim 10 , wherein the hash value that corresponds to the digital sequence obviates the need for the backup server to perform more than a single seek to restore the data identified in the request for backup.
12. The method as recited in claim 10 , wherein the restore request includes a hash value corresponding to data for which restoration has been requested.
13. The method as recited in claim 12 , wherein the hash value included in the restore request comprises a top level root hash.
14. The method as recited in claim 12 , wherein the hash value included in the restore request comprises a hash value that corresponds to a directory element.
15. The method as recited in claim 10 , wherein the digital sequence is broken into data pieces if the hash value corresponding to the digital sequence has not already been archived.
16. The method as recited in claim 10 , wherein the restore request comprises a request for backup from a particular date or time.
17. The method as recited in claim 10 , wherein the hash value that corresponds to the digital sequence obviates the need for the backup server to perform more than a single seek of a content addressed storage system (CAS) to restore the data identified in the request for backup.
18. A remote client, comprising:
client data; and
a high efficiency storage application that is operable to generate a backup of the client data, and to control the size of the backup of client data, and wherein the high efficiency storage application is configured to:
perform a hash function on a digital sequence of the client data so as to generate a hash value that corresponds to the digital sequence;
break the digital sequence into data pieces, and hash each data piece to produce a corresponding hash value for each data piece; and
produce a recipe that includes instructions which, when executed, generate the digital sequence from the data pieces referenced by their corresponding hash values included in the recipe,
wherein the hash values enable reutilization of redundant data sequences by serving as pointers to the data pieces that the hash values respectively represent.
19. The remote client as recited in claim 18 , wherein the hash value that corresponds to the digital sequence obviates the need for a backup server to perform more than a single seek to restore the data identified in the request for backup, the remote client being configured for communication with the backup server.
20. A client system that comprises:
the remote client of claim 18 , wherein the remote client is in the form of a server; and
a client node configured to communicate with the server, and the client node further configured to access the client data.