US12379982B2 · Active · Utility · PatentIndex 59

Methods and apparatus for runtime recovery of processor links

Assignee: INTEL CORP · Priority: Sep 24, 2021 · Filed: Sep 24, 2021 · Granted: Aug 5, 2025
Est. expiry: Sep 24, 2041 (~15.2 yrs left) · nominal 20-yr term from priority
Inventors: LIU SHIJIE, XU TAO, ZHU LEI, LI KEVIN YUFU
G06F 11/076, G06F 11/0745, G06F 11/0793, G06F 2201/88, G06F 11/3409, G06F 11/3024, G06F 11/3041, G06F 11/0721
PatentIndex Score: 59 · Cited by: 0 · References: 11 · Claims: 20

Abstract

Methods, apparatus, systems, and articles of manufacture are disclosed that perform runtime recovery of processor links. An example non-transitory computer readable medium comprises instructions that, when executed, cause a machine to at least determine an onset of an error based on health of a central processor unit (CPU) port, calculate a figure of merit (FOM) yield for each of a plurality of adaptation tasks performed on a lane of the CPU port using a first preset coefficient of a plurality of preset coefficients, select a preset coefficient based on the calculated FOM, and trigger a link recovery mechanism, using the selected preset coefficient to initiate a link recovery process on the CPU port.
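The selection step in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the task signature, the `retrain` callback, and the choice to sum the per-task FOM values for each preset are all assumptions made for illustration. The "greatest FOM wins" rule follows claims 8 and 16.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical task signature: run one adaptation task on a lane with a given
# preset coefficient and return its figure of merit (FOM).
AdaptationTask = Callable[[int, int], float]


def fom_by_preset(lane: int, presets: Iterable[int],
                  tasks: List[AdaptationTask]) -> Dict[int, float]:
    """Score each preset; summing task FOMs is an assumption of this sketch."""
    return {p: sum(task(lane, p) for task in tasks) for p in presets}


def select_preset(foms: Dict[int, float]) -> int:
    """Return the preset with the greatest FOM (ties go to the first seen)."""
    if not foms:
        raise ValueError("no presets were evaluated")
    return max(foms, key=lambda p: foms[p])


def recover_link(lane: int, presets: Iterable[int],
                 tasks: List[AdaptationTask],
                 retrain: Callable[[int], None]) -> int:
    """Evaluate presets, pick the best one, and trigger retraining with it."""
    best = select_preset(fom_by_preset(lane, presets, tasks))
    retrain(best)  # stand-in for the platform's link recovery trigger
    return best


if __name__ == "__main__":
    # Toy task: FOM peaks at preset 5 (made-up behaviour for illustration).
    toy_task = lambda lane, preset: -abs(preset - 5)
    applied = []
    recover_link(lane=0, presets=range(11), tasks=[toy_task],
                 retrain=applied.append)
    print(applied)
```

In a real system the `retrain` callback would stand in for whatever mechanism starts link training on the port. The sketch only shows the ordering of the steps: score every preset, choose one, then trigger recovery with it.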

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A non-transitory computer readable medium comprising instructions that, when executed, cause a machine to at least:
 determine an onset of an error based on health of a central processor unit (CPU) port; 
 calculate a figure of merit (FOM) yield for a plurality of adaptation tasks performed on a lane of the CPU port using a first preset coefficient of a plurality of preset coefficients; 
 select a preset coefficient based on the calculated FOM yield; and 
 trigger a link recovery mechanism, using the selected preset coefficient to initiate a link recovery process on the CPU port. 
 
     
     
       2. The non-transitory computer readable medium of claim 1, wherein the health of a CPU port is monitored using a status register and an error counter register and the error is an uncorrectable error (UCE).

       3. The non-transitory computer readable medium of claim 2, wherein the status register is to measure any one or more of a link speed, link width, and transaction retry count of the CPU port.

       4. The non-transitory computer readable medium of claim 2, wherein the error counter register is to increment a counter when any one or more increasement rate of a link speed degradation, link width degradation, or transaction retry count is greater than a threshold value.

       5. The non-transitory computer readable medium of claim 4, wherein the increasement rate is calculated by measuring a difference between a first link speed and a second link speed, divided by a time interval.

       6. The non-transitory computer readable medium of claim 1, wherein the FOM yield is equal to a runtime of an adaptation task of the plurality of adaptation tasks.

       7. The non-transitory computer readable medium of claim 1, wherein the plurality of adaptation tasks is performed on the lane of the CPU port using a second preset coefficient of a plurality of preset coefficients.

       8. The non-transitory computer readable medium of claim 1, wherein the selected preset coefficient is determined by the greatest calculated FOM.

       9. The non-transitory computer readable medium of claim 1, wherein the link recovery mechanism performed on the CPU port is a Peripheral Component Interconnect Express link training (PCIe) based link training mechanism.
     
     
       10. A method to perform runtime recovery of processor links comprising:
 determining an onset of an error based on health of a central processor unit (CPU) port; 
 calculating a figure of merit (FOM) yield for a plurality of adaptation tasks performed on a lane of the CPU port using a first preset coefficient of a plurality of preset coefficients; 
 selecting a preset coefficient based on the calculated FOM yield; and 
 triggering a link recovery mechanism, using the selected preset coefficient to initiate a link recovery process on the CPU port. 
 
     
     
       11. The method of claim 10, wherein the health of a CPU port is monitored using a status register and an error counter register and the error is an uncorrectable error (UCE).

       12. The method of claim 11, wherein the status register is to measure any one or more of a link speed, link width, and transaction retry count of the CPU port.

       13. The method of claim 11, wherein the error counter register is to increment a counter when any one or more increasement rate of a link speed degradation, link width degradation, or transaction retry count is greater than a threshold value.

       14. The method of claim 10, wherein the FOM yield is equal to a runtime of an adaptation task of the plurality of adaptation tasks.

       15. The method of claim 10, wherein the plurality of adaptation tasks is performed on the lane of the CPU port using a second preset coefficient of a plurality of preset coefficients.

       16. The method of claim 10, wherein the selected preset coefficient is determined by the greatest calculated FOM.

       17. The method of claim 10, wherein the link recovery mechanism performed on the CPU port is a Peripheral Component Interconnect Express link training (PCIe) based link training mechanism.
     
     
       18. An apparatus to perform runtime recovery of processor links comprising:
 interface circuitry; 
 machine readable instructions; and 
 programmable circuitry to at least one of instantiate or execute the machine readable instructions to: 
 surveil a central processing unit (CPU) port to determine whether any uncorrectable errors (UCE) are impending; 
 determine a figure of merit (FOM) yield of an adaptation task that is run on a lane of a failing CPU port using a preset coefficient; 
 establish a selected preset coefficient that yields the best performance for the CPU, as indicated by the FOM yield; and 
 initiate a link recovery process on the failing CPU port. 
 
     
     
       19. The apparatus of claim 19 is not restated here; see below.
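The health monitoring recited in claims 4, 5, 13, and 20 (a rate computed as a difference divided by a time interval, and a counter incremented when any rate exceeds a threshold) might look roughly like the sketch below. The field names, units, sign conventions, and threshold keys are assumptions for illustration and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LinkSample:
    """One reading of the link status fields (field names are hypothetical)."""
    time_s: float
    speed_gts: float
    width_lanes: int
    retry_count: int


def change_rate(first: float, second: float, interval_s: float) -> float:
    """Difference between two readings divided by the time between them."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return (first - second) / interval_s


def update_error_counter(counter: int, prev: LinkSample, curr: LinkSample,
                         thresholds: Dict[str, float]) -> int:
    """Increment the counter once if any rate exceeds its threshold."""
    dt = curr.time_s - prev.time_s
    rates = {
        # Degradation: positive when speed or width drops over the interval.
        "speed": change_rate(prev.speed_gts, curr.speed_gts, dt),
        "width": change_rate(prev.width_lanes, curr.width_lanes, dt),
        # Retries: positive when the retry count grows over the interval.
        "retry": change_rate(curr.retry_count, prev.retry_count, dt),
    }
    exceeded = any(rates[name] > thresholds[name] for name in rates)
    return counter + 1 if exceeded else counter
```

A caller would poll the port periodically, pass consecutive samples to `update_error_counter`, and treat a rising counter as a sign that an error may be approaching. How the counter value maps to the onset of an uncorrectable error is not specified in the claims and is left open here.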
