P
US7512591B2ExpiredUtilityPatentIndex 80

System and method to improve processing time of databases by cache optimization

Assignee: IBMPriority: Dec 9, 2005Filed: Nov 9, 2006Granted: Mar 31, 2009
Est. expiryDec 9, 2025(expired)· nominal 20-yr term from priority
Inventors:BILDHAEUSER HANS-JUERGENKARN HOLGER
Y10S707/99932G06F 16/24539
80
PatentIndex Score
20
Cited by
23
References
14
Claims

Abstract

A system and method are disclosed to improve processing time of a database system by continuous automatic background optimization of a cache memory that is fragmented into a plurality of cache fragments. The system and method include collecting indicators about efficiency of individual cache fragments by at least one of measuring a cache hit ratio of each cache fragment, measuring a processing time that a CPU of the database system needs to prepare data in the individual cache fragments, and measuring execution time the CPU needs to process the data in accordance with a SQL query. The system and method include calculating and revising approximation curves for measured values of each cache fragment to find a combination of cache fragment sizes with a highest system throughput. The system and method include changing the sizes of the cache fragments to receive highest system throughput.

Claims

exact text as granted — not AI-modified
1. A computer implemented method for improved processing time of a database system by continuous automatic background optimization of a cache memory that is fragmented into a plurality of cache fragments, the computer implemented method comprising:
 collecting indicators about efficiency of individual cache fragments by at least one of measuring a cache hit ratio of each cache fragment, measuring a processing time that a central processor unit (“CPU”) of the database system needs to prepare data in the individual cache fragments, and measuring execution time the CPU needs to process the data in accordance with a Structured Query Language (“SQL”) query, wherein the hit ratio, HRCURRENT, of a cache fragment, CF i , is measured as an indicator for the efficiency of the cache memory of a buffer pool according to
   HRCURRENT=(GETPAGES−SYNCRPAGES)/GETPAGES*100 
 where GETPAGES is a total number of pages read from the buffer pool, SYNCRPAGES is a total number of pages read synchronously from the buffer pool and an approximation function HR(CF i , CFSIZE) for the cache fragment hit ratio is computed according to
   HR(CF i ,CFSIZE)=100*(1− e   b * CFSIZE/PI ), 
 
 where CFSIZE is a cache fragment size, PI is Ludolph's Constant, wherein for predicting the efficiency of the cache memory using the approximation function, the value for variable b for the exponential curve is determined, which describes a best buffer pool cache fragment behavior according to the measured hit ratio HRCURRENT, and by then knowing the variable b, the approximation function HR(CF i , CFSIZE) is used to determine at least one of a cache size and a cache fragmentation that provides a minimum of the processing times for the query processed; 
 
 calculating approximation curves for measured values of each cache fragment based on the efficiency indicators; 
 determining a combination of cache fragment sizes from the approximation curves that yields a highest system throughput; and 
 changing the sizes of the cache fragments to match the combination of cache fragment sizes that yields a highest system throughput. 
 
   
   
     2. The computer implemented method of  claim 1 , wherein the computer implemented method repeats automatically for each query. 
   
   
     3. The computer implemented method of  claim 1 , wherein the value for variable b for the exponential curve is determined by applying a Gauss error distribution curve on the measured hit ratio values HRCURRENT. 
   
   
     4. The computer implemented method of  claim 1 , wherein determining the value for variable b for the exponential curve by applying a Gauss error distribution curve, the values of the variables of the equation for the cache fragment hit ratio
   HR(CF i ,CFSIZE)=100*(1− e   b * CFSIZE/PI ) 
 
     are searched having a minimum square deviation from the measured hit ratios HRCURRENT according to
   SUM( j= 1 to  m )[( y   j −HR(CF i ,CFSIZE)) 2 ], 
 
     wherein y i  is a previously measured hit ratio HRCURRENT at a given buffer pool cache fragment size CFSIZE, m is the number of previously measured hit ratios and HR(CF i ,CFSIZE) is the predicted hit ratio for the given buffer pool cache fragment size at y i . 
   
   
     5. A computer program product comprising a computer readable storage medium having computer usable program code programmed for improved processing time of a database system by continuous automatic background optimization of a cache memory that is fragmented into a plurality of cache fragments, the operations of the computer program product comprising:
 measuring processing times that a central processor unit (“CPU”) requires to prepare data in individual cache fragments and to process the data in accordance with a Structured Query Language (“SQL”) query; 
 measuring indicators of efficiency of the cache memory; 
 computing an approximation function for the measured efficiency indicators for each cache fragment, wherein the approximation function describes a relation between hit ratio and a current size of a specific cache fragment, wherein the hit ratio, HRCURRENT, of a cache fragment, CF i , is measured as an indicator for the efficiency of the cache memory of a buffer pool according to
   HRCURRENT=(GETPAGES−SYNCRPAGES)/GETPAGES*100 
 where GETPAGES is a total number of pages read from the buffer pool, SYNCRPAGES is a total number of pages read synchronously from the buffer pool and an approximation function HR(CF i , CFSIZE) for the cache fragment hit ratio is computed according to
   HR(CF i ,CFSIZE)=100*(1− e   b * CFSIZE/PI ), 
 
 where CFSIZE is a cache fragment size, PI is Ludolph's Constant, wherein for predicting the efficiency of the cache memory using the approximation function, the value for variable b for the exponential curve is determined which describes a best buffer pool cache fragment behavior according to the measured hit ratio HRCURRENT, and by then knowing the variable b, the approximation function HR(CF i , CFSIZE) is used to determine at least one of a cache size and a cache fragmentation that provides a minimum of the processing times for the query processed; 
 
 making an efficiency prediction for the cache memory by using the approximation function, wherein the efficiency prediction predicts a total prepare time for all cache fragments as a sum of all approximation functions for a given combination of cache fragment sizes; 
 using the efficiency prediction to determine one of a cache memory size and a cache memory fragmentation, either of which provides a minimum of processing times for the query processed by searching for those combinations of cache fragments sizes which have a lowest total prepare time over all cache fragments; and 
 changing a size of one of the cache memory and the cache memory fragmentation dynamically according to the minimum determined. 
 
   
   
     6. The computer program product of  claim 5 , wherein changing one of the size of the cache memory and the cache memory fragmentation dynamically further comprises changing the cache memory fragmentation dynamically wherein a total available cache memory size for caching data is distributed among the cache fragments and a total size of the cache memory remains constant. 
   
   
     7. The computer program product of  claim 5 , wherein at least a first time the database is used, characteristics of the cache are set in accordance with experience values. 
   
   
     8. The computer program product of  claim 5 , wherein measuring processing times that the CPU requires to prepare data in individual cache fragments and to process the data in accordance with a SQL query and measuring values that are indicators for an efficiency of a cache memory occur continuously for each query. 
   
   
     9. The computer program product of  claim 5 , wherein using the efficiency prediction to determine one of a cache memory size and a cache memory fragmentation for a query further comprises using the efficiency prediction to determine one of a cache memory size and a cache memory fragmentation for a predetermined number of queries. 
   
   
     10. The computer program product of  claim 5 , wherein one of SQL queries and objects assigned to a particular query is grouped with at least one working set. 
   
   
     11. The computer program product of  claim 5 , wherein different approximation functions are used for different workloads of the database system. 
   
   
     12. A system for improved processing time of a database system by continuous automatic background optimization of a cache memory that is fragmented into a plurality of cache fragments, the system comprising:
 a central processor unit (“CPU”); 
 at least one disk storage device; 
 a database manager that receives a Structured Query Language (“SQL”) query and to access the cache memory in response to the query and to access the at least one disk storage device in response to the query and data required by the query not residing in the cache memory; and 
 a cache optimizer that
 measures processing times that the CPU requires to prepare data in the individual cache fragments and to process the data in accordance with the query; 
 measures indicators of efficiency of the cache memory; 
 computes an approximation function for the measured efficiency indicators for each cache fragment, wherein the approximation function describes a relation between hit ratio and a current size of a specific cache fragment wherein the hit ratio, HRCURRENT, of a cache fragment, CF i , is measured as an indicator for the efficiency of the cache memory of a buffer pool according to
   HRCURRENT=(GETPAGES−SYNCRPAGES)/GETPAGES*100 
 where GETPAGES is a total number of pages read from the buffer pool, SYNCRPAGES is a total number of pages read synchronously from the buffer pool and an approximation function HR(CF i , CFSIZE) for the cache fragment hit ratio is computed according to
   HR(CF i ,CFSIZE)=100*(1− e   b * CFSIZE/PI ), 
 
 where CFSIZE is a cache fragment size, PI is Ludolph's Constant, wherein for predicting the efficiency of the cache memory using the approximation function, the value for variable b for the exponential curve is determined, which describes a best buffer pool cache fragment behavior according to the measured hit ratio HRCURRENT, and by then knowing the variable b, the approximation function HR(CF i , CFSIZE) is used to determine at least one of a cache size and a cache fragmentation that provides a minimum of the processing times for the query processed; 
 
 makes an efficiency prediction for the cache memory by using the approximation function, wherein the efficiency prediction predicts a total prepare time for all cache fragments as a sum of all approximation functions for a given combination of cache fragment sizes; 
 uses the efficiency prediction to determine one of a cache memory size and a cache memory fragmentation, either of which provides a minimum of processing times for the query processed by searching for those combinations of cache fragments sizes which have a lowest total prepare time over all cache fragments; and 
 changes a size of one of the cache memory and the cache memory fragmentation dynamically according to the minimum determined. 
 
 
   
   
     13. The system of  claim 12 , further comprising a history containing indicators of efficiency of the cache memory. 
   
   
     14. The system of  claim 13 , wherein the history contains cache memory hit rates for individual cache fragments and the cache optimizer uses the cache memory hit rates to compute the approximation function.

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.