Methods and apparatus for runtime recovery of processor links
Abstract
Methods, apparatus, systems, and articles of manufacture are disclosed that perform runtime recovery of processor links. An example non-transitory computer readable medium comprises instructions that, when executed, cause a machine to at least determine an onset of an error based on health of a central processor unit (CPU) port, calculate a figure of merit (FOM) yield for each of a plurality of adaptation tasks performed on a lane of the CPU port using a first preset coefficient of a plurality of preset coefficients, select a preset coefficient based on the calculated FOM, and trigger a link recovery mechanism, using the selected preset coefficient to initiate a link recovery process on the CPU port.
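By way of illustration only, the sequence summarized in the abstract (detect an error onset, calculate an FOM yield per preset coefficient, select a preset coefficient, trigger link recovery) may be sketched as follows. The sketch is non-limiting and rests on stated assumptions: the names `select_preset`, `recover_link`, `run_adaptation_tasks`, and `trigger_link_recovery` are hypothetical, the per-task FOM yields are combined by summation (the abstract does not say how they are combined), and the greatest total is selected in the manner of claims 8 and 16.

```python
from typing import Callable, Dict, Optional, Sequence


def select_preset(
    presets: Sequence[int],
    run_adaptation_tasks: Callable[[int], Sequence[float]],
) -> int:
    """Return the preset coefficient with the greatest aggregate FOM yield.

    run_adaptation_tasks is a hypothetical callback that runs the adaptation
    tasks on one lane with the given preset and returns one FOM yield per task.
    """
    if not presets:
        raise ValueError("at least one preset coefficient is required")
    # Assumption: per-task FOM yields are combined by summation.
    fom_by_preset: Dict[int, float] = {
        preset: sum(run_adaptation_tasks(preset)) for preset in presets
    }
    # max() keeps the first preset encountered when two totals tie.
    return max(fom_by_preset, key=lambda preset: fom_by_preset[preset])


def recover_link(
    error_onset: bool,
    presets: Sequence[int],
    run_adaptation_tasks: Callable[[int], Sequence[float]],
    trigger_link_recovery: Callable[[int], None],
) -> Optional[int]:
    """Trigger link recovery with the selected preset when an error onset is seen.

    Returns the selected preset, or None when no error onset was detected.
    """
    if not error_onset:
        return None
    selected = select_preset(presets, run_adaptation_tasks)
    # Hypothetical hook standing in for the link recovery mechanism.
    trigger_link_recovery(selected)
    return selected
```

In this sketch the hardware access is passed in as callbacks, so the selection logic can be exercised without a real CPU port; an actual implementation would read the FOM yields from the port itself.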
Claims
What is claimed is:
1. A non-transitory computer readable medium comprising instructions that, when executed, cause a machine to at least:
determine an onset of an error based on health of a central processor unit (CPU) port;
calculate a figure of merit (FOM) yield for a plurality of adaptation tasks performed on a lane of the CPU port using a first preset coefficient of a plurality of preset coefficients;
select a preset coefficient based on the calculated FOM yield; and
trigger a link recovery mechanism, using the selected preset coefficient to initiate a link recovery process on the CPU port.
2. The non-transitory computer readable medium of claim 1, wherein the health of a CPU port is monitored using a status register and an error counter register and the error is an uncorrectable error (UCE).
3. The non-transitory computer readable medium of claim 2, wherein the status register is to measure any one or more of a link speed, link width, and transaction retry count of the CPU port.
4. The non-transitory computer readable medium of claim 2, wherein the error counter register is to increment a counter when any one or more increasement rate of a link speed degradation, link width degradation, or transaction retry count is greater than a threshold value.
5. The non-transitory computer readable medium of claim 4, wherein the increasement rate is calculated by measuring a difference between a first link speed and a second link speed, divided by a time interval.
6. The non-transitory computer readable medium of claim 1, wherein the FOM yield is equal to a runtime of an adaptation task of the plurality of adaptation tasks.
7. The non-transitory computer readable medium of claim 1, wherein the plurality of adaptation tasks is performed on the lane of the CPU port using a second preset coefficient of a plurality of preset coefficients.
8. The non-transitory computer readable medium of claim 1, wherein the selected preset coefficient is determined by the greatest calculated FOM.
9. The non-transitory computer readable medium of claim 1, wherein the link recovery mechanism performed on the CPU port is a Peripheral Component Interconnect Express link training (PCIe) based link training mechanism.
10. A method to perform runtime recovery of processor links comprising:
determining an onset of an error based on health of a central processor unit (CPU) port;
calculating a figure of merit (FOM) yield for a plurality of adaptation tasks performed on a lane of the CPU port using a first preset coefficient of a plurality of preset coefficients;
selecting a preset coefficient based on the calculated FOM yield; and
triggering a link recovery mechanism, using the selected preset coefficient to initiate a link recovery process on the CPU port.
11. The method of claim 10, wherein the health of a CPU port is monitored using a status register and an error counter register and the error is an uncorrectable error (UCE).
12. The method of claim 11, wherein the status register is to measure any one or more of a link speed, link width, and transaction retry count of the CPU port.
13. The method of claim 11, wherein the error counter register is to increment a counter when any one or more increasement rate of a link speed degradation, link width degradation, or transaction retry count is greater than a threshold value.
14. The method of claim 10, wherein the FOM yield is equal to a runtime of an adaptation task of the plurality of adaptation tasks.
15. The method of claim 10, wherein the plurality of adaptation tasks is performed on the lane of the CPU port using a second preset coefficient of a plurality of preset coefficients.
16. The method of claim 10, wherein the selected preset coefficient is determined by the greatest calculated FOM.
17. The method of claim 10, wherein the link recovery mechanism performed on the CPU port is a Peripheral Component Interconnect Express link training (PCIe) based link training mechanism.
18. An apparatus to perform runtime recovery of processor links comprising:
interface circuitry;
machine readable instructions; and
programmable circuitry to at least one of instantiate or execute the machine readable instructions to:
surveil a central processing unit (CPU) port to determine whether any uncorrectable errors (UCE) are impending;
determine a figure of merit (FOM) yield of an adaptation task that is run on a lane of a failing CPU port using a preset coefficient;
establish a selected preset coefficient that yields the best performance for the CPU, as indicated by the FOM yield; and
initiate a link recovery process on the failing CPU port.
19. The apparatus of claim 18, wherein the programmable circuitry is to monitor health of a CPU port using a status register and an error counter register.
20. The apparatus of claim 19, wherein the error counter register is to increment a counter when any one or more increasement rate of a link speed degradation, link width degradation, or transaction retry count is greater than a threshold value.
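By way of illustration only, the error counter behavior recited in claims 4, 5, 13, and 20 may be sketched as follows. The sketch is non-limiting and rests on stated assumptions: `PortSample`, `change_rate`, and `ErrorCounter` are hypothetical names, the sampled fields stand in for values read from the status register, the units (GT/s, lanes, retries) are chosen for the example, and a separate threshold per metric is assumed although the claims recite "a threshold value". The rate follows claim 5: a difference between a first value and a second value, divided by a time interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortSample:
    """Hypothetical snapshot of status register fields for one CPU port."""
    time_s: float
    link_speed_gts: float
    link_width: int
    retry_count: int


def change_rate(first: float, second: float, interval_s: float) -> float:
    """Difference between a first and a second value, divided by a time interval."""
    if interval_s <= 0:
        raise ValueError("time interval must be positive")
    return (first - second) / interval_s


@dataclass
class ErrorCounter:
    """Counts sampling intervals in which any rate exceeds its threshold."""
    speed_threshold: float   # assumed unit: GT/s lost per second
    width_threshold: float   # assumed unit: lanes lost per second
    retry_threshold: float   # assumed unit: retries added per second
    count: int = 0

    def update(self, earlier: PortSample, later: PortSample) -> int:
        interval_s = later.time_s - earlier.time_s
        rates_and_thresholds = [
            # Degradation is positive when the later value is lower.
            (change_rate(earlier.link_speed_gts, later.link_speed_gts, interval_s),
             self.speed_threshold),
            (change_rate(earlier.link_width, later.link_width, interval_s),
             self.width_threshold),
            # Retry growth is positive when the later count is higher.
            (change_rate(later.retry_count, earlier.retry_count, interval_s),
             self.retry_threshold),
        ]
        if any(rate > threshold for rate, threshold in rates_and_thresholds):
            self.count += 1
        return self.count
```

The counter in this sketch increments at most once per sampling interval, even when several rates exceed their thresholds at once; whether the error counter register increments once per interval or once per exceeded metric is not stated in the claims and is assumed here.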