US10901993B2ActiveUtilityPatentIndex 36

Performing cache update adaptation

Assignee: AMADEUS SASPriority: Apr 3, 2018Filed: Apr 3, 2018Granted: Jan 26, 2021

Est. expiryApr 3, 2038(~11.7 yrs left)· nominal 20-yr term from priority

Inventors:CANIS LAURE BERTRAND JEROME HERER MAREK RONDEPIERRE THOMAS MURTADAK DIVENDAR UMESH PASQUIER-MEUNIER NICOLAS MORETTI REMI SAUCH FRANCIS

G06N 5/01G06N 7/01G06F 16/24539G06F 16/2379G06F 16/2455G06N 20/00G06F 16/951G06F 16/24552G06N 7/005

PatentIndex Score

Cited by

References

Claims

Abstract

A database system includes an original data source storing pieces of original data and a cache source storing pieces of cached data, each associated with an accuracy value. A method of processing queries in the system includes: for each randomly selected client query, retrieving a first piece of cached data matching the query, and retrieving a first piece of original data matching the query; for non-selected client queries, retrieving a second piece of cached data matching the query; evaluating the accuracy value of the second piece of cached data; if the accuracy value is below a given threshold, retrieving a second piece of original data matching the query, and updating the second piece of cached data by the second piece of original data; and adapting a probabilistic model based on the first piece of cached data and the first piece of original data using a machine learning algorithm.

Claims

exact text as granted — not AI-modified

The invention claimed is: 
     
       1. A method of processing queries in a database system, the database system comprising an original data source storing a plurality of pieces of original data and a cache source storing a plurality of pieces of cached data, each piece of cached data being associated with an accuracy value, the method comprising:
 randomly selecting queries from among a plurality of queries handled by the database system, at the time the respective query is received from a client; 
 for each of the queries randomly selected,
 retrieving a first piece of the cached data matching the randomly selected query from the cache source, and 
 retrieving a first piece of the original data matching the randomly selected query from the original data source; 
 
 for queries not being randomly selected,
 retrieving a second piece of the cached data matching the query from the cache source; 
 evaluating the accuracy value of the second piece of the cached data; 
 if the accuracy value is below a given threshold,
 retrieving a second piece of the original data matching the query from the original data source, and 
 updating the second piece of the cached data by the second piece of the original data; and 
 
 
 adapting a probabilistic model based on the first piece of the cached data and the first piece of the original data using a machine learning algorithm, 
 wherein the accuracy value is derived from the probabilistic model and indicates a probability that a piece of the cached data coincides with a corresponding piece of the original data. 
 
     
     
       2. The method of  claim 1 , further comprising:
 for the queries randomly selected,
 comparing the first piece of the cached data and the first piece of the original data; 
 updating the first piece of the cached data by the first piece of the original data, and returning the first piece of the original data to the client. 
 
 
     
     
       3. The method of  claim 1 , further comprising:
 for queries not being randomly selected,
 if the accuracy value is below the given threshold, returning the second piece of the original data as a result to the client; and 
 otherwise, returning the second piece of the cached data as the result to the client. 
 
 
     
     
       4. The method of  claim 1 , further comprising:
 for each of the queries randomly selected,
 sending information about the query, the first piece of the cached data and the first piece of the original data to a history database of the database system. 
 
 
     
     
       5. The method of  claim 1 , further comprising:
 comparing the first piece of the original data with the first piece of the cached data; 
 storing a first indicator value indicating whether the first piece of the original data coincides with the first piece of the cached data; 
 comparing the accuracy value of the first piece of the cached data and the given threshold; 
 storing a second indicator value indicating whether the accuracy value is below the given threshold; and 
 updating the given threshold based on pairs of the first and the second indicator values. 
 
     
     
       6. The method of  claim 5 , wherein updating the given threshold comprises:
 determining a polling rate and a polling efficiency based on the pairs of the first and second indicator values; and 
 updating the given threshold based on the polling rate and the polling efficiency. 
 
     
     
       7. The method of  claim 6 , wherein
 the polling rate is determined based on equation: (TP+FP)/(TP+FP+TN+FN), and 
 the polling efficiency is determined based on equation: TP/(TP+FP), 
 where:
 TP is a number of said pairs wherein the first indicator value indicates that the piece of the original data does not coincide with the piece of the cached data, and the second indicator value indicates that the accuracy value is below the given threshold, 
 TN is a number of said pairs wherein the first indicator value indicates that the piece of the original data coincides with the piece of the cached data, and the second indicator value indicates that the accuracy value is not below the given threshold, 
 FP is a number of said pairs wherein the first indicator value indicates that the piece of the original data coincides with the piece of the cached data, and the second indicator value indicates that the accuracy value is below the given threshold, and 
 FN is a number of said pairs wherein the first indicator value indicates that the piece of the original data does not coincide with the piece of the cached data, and the second indicator value indicates that the accuracy value is not below the given threshold. 
 
 
     
     
       8. A database system for processing queries, comprising:
 a memory containing an original data source storing a plurality of pieces of original data and a cache source storing a plurality of pieces of cached data, each piece of cached data being associated with an accuracy value 
 a processor coupled to the memory, the processor configured to:
 randomly select queries from among a plurality of queries handled by the database system, at the time the respective query is received from a client; 
 for each of the queries randomly selected,
 retrieve a first piece of the cached data matching the randomly selected query from the cache source, and 
 retrieve a first piece of the original data matching the randomly selected query from the original data source; 
 
 for queries not being randomly selected,
 retrieve a second piece of the cached data matching the query from the cache source; 
 evaluate the accuracy value of the second piece of the cached data; 
 if the accuracy value is below a given threshold,
 retrieve a second piece of the original data matching the query from the original data source, and 
 update the second piece of the cached data by the second piece of the original data; and 
 
 
 adapt a probabilistic model based on the first piece of the cached data and the first piece of the original data using a machine learning algorithm, 
 wherein the accuracy value is derived from the probabilistic model and indicates a probability that a piece of the cached data coincides with a corresponding piece of the original data. 
 
 
     
     
       9. The database system of  claim 8 , wherein the processor is further configured to:
 for the queries randomly selected,
 compare the first piece of the cached data and the first piece of the original data; 
 update the first piece of the cached data by the first piece of the original data, and return the first piece of the original data to the client. 
 
 
     
     
       10. The database system of  claim 8 , wherein the processor is further configured to:
 for queries not being randomly selected,
 if the accuracy value is below the given threshold, return the second piece of the original data as a result to the client; and 
 otherwise, return the second piece of the cached data as the result to the client. 
 
 
     
     
       11. The database system of  claim 8 , wherein the processor is further configured to:
 for each of the queries randomly selected,
 send information about the query, the first piece of the cached data and the first piece of the original data to a history database of the database system. 
 
 
     
     
       12. The database system of  claim 8 , wherein the processor is further configured to:
 compare the first piece of the original data with the first piece of the cached data; 
 store a first indicator value indicating whether the first piece of the original data coincides with the first piece of the cached data; 
 compare the accuracy value of the first piece of the cached data and the given threshold; 
 store a second indicator value indicating whether the accuracy value is below the given threshold; and 
 update the given threshold based on pairs of the first and the second indicator values. 
 
     
     
       13. The database system of  claim 12 , wherein the processor is configured to update the given threshold by:
 determining a polling rate and a polling efficiency based on the pairs of the first and second indicator values; and 
 updating the given threshold based on the polling rate and the polling efficiency. 
 
     
     
       14. The database system of  claim 13 , wherein the polling rate is determined based on equation: (TP+FP)/(TP+FP+TN+FN), and
 wherein the polling efficiency is determined based on equation: TP/(TP+FP), 
 where:
 TP is a number of said pairs wherein the first indicator value indicates that the piece of the original data does not coincide with the piece of the cached data, and the second indicator value indicates that the accuracy value is below the given threshold, 
 TN is a number of said pairs wherein the first indicator value indicates that the piece of the original data coincides with the piece of the cached data, and the second indicator value indicates that the accuracy value is not below the given threshold, 
 FP is a number of said pairs wherein the first indicator value indicates that the piece of the original data coincides with the piece of the cached data, and the second indicator value indicates that the accuracy value is below the given threshold, and 
 FN is a number of said pairs wherein the first indicator value indicates that the piece of the original data does not coincide with the piece of the cached data, and the second indicator value indicates that the accuracy value is not below the given threshold. 
 
 
     
     
       15. A computer program product comprising instructions which, when executed by a computer, cause the computer to perform the method according to  claim 1 .

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.