US11249765B2 · Active · Utility · PatentIndex 58

Performance for GPU exceptions

Assignee: ADVANCED MICRO DEVICES INC · Priority: Aug 22, 2018 · Filed: Aug 22, 2018 · Granted: Feb 15, 2022
Est. expiry: Aug 22, 2038 (~12.1 yrs left) · nominal 20-yr term from priority
Inventors: GUTIERREZ ANTHONY T
G06F 9/30036 · G06F 9/3865 · G06F 9/30087 · G06F 9/30189 · G06T 1/20 · G06F 9/3824 · G06F 9/3842 · G06F 9/30043 · G06F 9/3861 · G06F 9/30018
PatentIndex Score: 58 · Cited by: 0 · References: 7 · Claims: 20

Abstract

Techniques for improving performance of accelerated processing devices (“APDs”) when exceptions occur are provided. In APDs, the very large number of parallel processing execution units, and the complexity of the hardware used to execute a large number of work-items in parallel, mean that APDs typically stall when an exception occurs (unlike central processing units (“CPUs”), which are able to execute speculatively and out-of-order). However, the techniques provided herein allow at least some execution to occur past exceptions. Execution past an exception-generating instruction occurs by executing instructions that would not lead to corruption while skipping those that would. After the exception has been satisfied, execution occurs in a replay mode in which the potentially exception-generating instruction is executed and in which instructions that did not execute in the exception-wait mode are executed. A mask and counter are used to control execution in replay mode.
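
Illustrative sketch (not part of the patent text): the two modes described in the abstract can be modeled in a few lines of Python. This is a toy software model under stated assumptions, not the patented hardware. The register names, the `MEMORY` contents, the helper `op`, and the hand-written `mask` are all hypothetical; the sketch also assumes exception-wait mode lasts for every listed instruction and that the first instruction only takes effect in replay mode. The counter is incremented for each instruction executed or skipped in exception-wait mode and decremented in replay mode, which ends when the counter reaches zero.

```python
import operator
from typing import Callable, Dict, List, Tuple

Regs = Dict[str, int]
Effect = Callable[[Regs], None]
Instr = Tuple[str, Effect]  # (label, effect on the register file)

MEMORY = {100: 40}  # hypothetical memory contents for the demo


def op(dst: str, fn: Callable[..., int], *srcs: str) -> Effect:
    """Build an effect: regs[dst] = fn(regs[src0], regs[src1], ...)."""
    def effect(regs: Regs) -> None:
        regs[dst] = fn(*(regs[s] for s in srcs))
    return effect


def run_in_order(program: List[Instr], regs: Regs) -> Regs:
    """Reference behaviour: stall at the exception, then run in program order."""
    for _, effect in program:
        effect(regs)
    return regs


def run_with_replay(first: Instr, following: List[Instr],
                    mask: List[bool], regs: Regs) -> Tuple[Regs, List[str]]:
    """mask[i] is True when following[i] must be skipped in exception-wait mode."""
    trace: List[str] = []
    counter = 0
    # Exception-wait mode: run safe instructions, skip the masked ones.
    for (label, effect), skip in zip(following, mask):
        if not skip:
            effect(regs)
            trace.append(f"wait:{label}")
        counter += 1  # counted whether executed or skipped
    # Exception satisfied: replay mode starts with the first instruction.
    label, effect = first
    effect(regs)
    trace.append(f"replay:{label}")
    idx = 0
    while counter > 0:  # replay mode ends when the counter is zero
        label, effect = following[idx]
        if mask[idx]:  # only previously skipped instructions execute now
            effect(regs)
            trace.append(f"replay:{label}")
        idx += 1
        counter -= 1
    return regs, trace


def demo() -> Tuple[Regs, Regs, List[str]]:
    start = {"v0": 0, "v1": 100, "v2": 0, "v3": 7,
             "v4": 0, "v5": 2, "v6": 3, "v7": 0}
    first = ("load v0", op("v0", lambda addr: MEMORY[addr], "v1"))
    following = [
        ("add v2", op("v2", operator.add, "v0", "v3")),  # needs the loaded v0
        ("mul v4", op("v4", operator.mul, "v5", "v6")),  # independent
        ("mov v3", op("v3", lambda x: x, "v4")),         # overwrites v3 read by "add v2"
        ("sub v7", op("v7", operator.sub, "v5", "v6")),  # independent
    ]
    mask = [True, False, True, False]  # assumed given, e.g. produced offline
    expected = run_in_order([first] + following, dict(start))
    actual, trace = run_with_replay(first, following, mask, dict(start))
    return expected, actual, trace


if __name__ == "__main__":
    expected, actual, trace = demo()
    print(trace)
    print(actual == expected)
```

In this toy run the two independent instructions complete during exception-wait mode, and the final register state matches plain in-order execution, which is the property the mask is meant to preserve.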

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A method for executing instructions, comprising:
 identifying a first instruction capable of triggering an exception; 
 detecting execution of the first instruction; 
 detecting one or more instructions following, in program order, the first instruction; 
 after the detection of the execution of the first instruction, executing a first subset of the one or more instruction in an exception-wait mode, and skipping a second subset of the one or more instructions, the second subset of the one or more instructions comprising a set of skipped instructions that would corrupt execution; 
 detecting an end to the exception-wait mode; 
 executing the set of skipped instructions in a replay mode; and 
 resuming normal execution, 
 wherein the first instruction does not trigger an exception. 
 
     
     
       2. The method of  claim 1 , wherein executing the first subset of the one or more instructions in the exception-wait mode includes incrementing a counter for each instruction executed or skipped. 
     
     
       3. The method of  claim 2 , wherein executing in the replay mode includes decrementing the counter for each instruction executed or skipped in the replay mode, and the replay mode ends when the counter is zero. 
     
     
       4. The method of  claim 1 , wherein exception-wait mode ends upon executing an instruction that causes execution to stall. 
     
     
       5. The method of  claim 4 , wherein the instruction that causes the execution to stall comprises a dependency guard instruction. 
     
     
       6. The method of  claim 1 , wherein exception-wait mode ends upon detecting that either no exception is generated by the first instruction or that an exception generated by the instruction has been satisfied. 
     
     
       7. The method of  claim 1 , wherein execution in the replay mode occurs according to a mask. 
     
     
       8. The method of  claim 7 , further comprising generating the mask in the exception-wait mode. 
     
     
       9. The method of  claim 7 , wherein the mask is generated offline by a compiler. 
     
     
       10. The method of  claim 1 , further comprising:
 during exception-wait mode, upon detecting an instruction that would corrupt execution, stalling execution at the instruction that would corrupt execution. 
 
     
     
       11. The method of  claim 10 , wherein:
 detecting an end to the exception-wait mode comprises detecting that the instruction does not trigger an exception; and 
 resuming normal execution comprises responsive to the detecting, resuming execution with the stalled instruction. 
 
     
     
       12. A computing device for executing instructions, the computing device comprising:
 an execution unit; and 
 replay logic, 
 wherein the execution unit is configured to:
 identify a first instruction capable of triggering an exception, 
 detect execution of the first instruction, 
 detect one or more instructions following, in program order, the first instruction, 
 after the detection of the execution of the first instruction, execute a first subset of the one or more instructions in an exception-wait mode, and skip a second subset of the one or more instructions, the second subset of the one or more instructions comprising a set of skipped instructions that would corrupt execution; 
 detect an end to the exception-wait mode, 
 execute the set of skipped instructions in a replay mode; and 
 resume normal execution, 
 wherein the first instruction does not trigger an exception. 
 
 
     
     
       13. The computing device of  claim 12 , wherein executing the first subset of the one or more instructions in the exception-wait mode includes incrementing a counter for each instruction executed or skipped. 
     
     
       14. The computing device of  claim 13 , wherein executing in the replay mode includes decrementing the counter for each instruction executed or skipped in the replay mode, and the replay mode ends when the counter is zero. 
     
     
       15. The computing device of  claim 12 , wherein the execution unit is further configured to:
 during exception-wait mode, upon detecting an instruction that would corrupt execution, stall execution at the instruction that would corrupt execution. 
 
     
     
       16. The computing device of  claim 15 , wherein:
 detecting an end to the exception-wait mode comprises detecting that the first instruction does not trigger an exception; and 
 resuming normal execution comprises responsive to the detecting, resuming execution with the stalled instruction. 
 
     
     
       17. The computing device of  claim 12 , wherein execution in the replay mode occurs according to a mask. 
     
     
       18. The computing device of  claim 17 , wherein the execution unit is further configured to generate the mask in the exception-wait mode. 
     
     
       19. The method of  claim 1 , wherein:
 during the replay mode, instructions of a first type are executed, instructions of a second type are executed, and instructions of a third type are executed; 
 an instruction of the first type comprises an instruction that writes to a register read from by an instruction prior to the instruction of the first type; 
 an instruction of the second type comprises an instruction that writes to a register that is also written to by an instruction prior to the instruction of the second type, wherein the instruction prior to the instruction of the second type executes in the replay mode and executes prior to the instruction of the second type; and 
 an instruction of the third type comprises an instruction that has a data dependency on an instruction that is prior to the instruction of the third type, wherein the instruction that is prior to the instruction of the third type executes in the replay mode. 
 
     
     
       20. The computing device of  claim 12 , wherein:
 during the replay mode, instructions of a first type are executed, instructions of a second type are executed, and instructions of a third type are executed; 
 an instruction of the first type comprises an instruction that writes to a register read from by an instruction prior to the instruction of the first type; 
 an instruction of the second type comprises an instruction that writes to a register that is also written to by an instruction prior to the instruction of the second type, wherein the instruction prior to the instruction of the second type executes in the replay mode and executes prior to the instruction of the second type; and 
 an instruction of the third type comprises an instruction that has a data dependency on an instruction that is prior to the instruction of the third type, wherein the instruction that is prior to the instruction of the third type executes in the replay mode.
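
Illustrative note (not part of the granted claims): claims 19 and 20 describe three types of instructions that execute in replay mode. A minimal Python sketch of how a mask might be derived from those three types follows. The `Instr` record, the function `classify`, and the tags `"WAR"`, `"WAW"`, and `"RAW"` are hypothetical names chosen for illustration. The sketch assumes that only hazards against deferred instructions matter, and that the first instruction's source registers must be preserved because it re-executes in replay mode; the patent may define these conditions differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    label: str
    dst: Optional[str]            # register written, if any
    srcs: Tuple[str, ...] = ()    # registers read


def classify(first: Instr, following: List[Instr]) -> List[Optional[str]]:
    """Tag each following instruction with the reason it is deferred, or None.

    "WAR": writes a register read by an earlier deferred instruction (first type).
    "WAW": writes a register written by an earlier deferred instruction (second type).
    "RAW": reads a register written by an earlier deferred instruction (third type).
    """
    deferred_reads: Set[str] = set(first.srcs)
    deferred_writes: Set[str] = {first.dst} if first.dst else set()
    tags: List[Optional[str]] = []
    for ins in following:
        if any(s in deferred_writes for s in ins.srcs):
            tag: Optional[str] = "RAW"
        elif ins.dst is not None and ins.dst in deferred_writes:
            tag = "WAW"
        elif ins.dst is not None and ins.dst in deferred_reads:
            tag = "WAR"
        else:
            tag = None  # safe to execute in exception-wait mode
        if tag is not None:
            # A deferred instruction creates hazards for later instructions.
            deferred_reads.update(ins.srcs)
            if ins.dst is not None:
                deferred_writes.add(ins.dst)
        tags.append(tag)
    return tags


def example() -> Tuple[List[Optional[str]], List[bool]]:
    first = Instr("load", "v0", ("v1",))
    following = [
        Instr("a", "v2", ("v0", "v3")),  # reads the pending load result
        Instr("b", "v4", ("v5", "v6")),  # independent
        Instr("c", "v3", ("v4",)),       # overwrites v3, still needed by "a"
        Instr("d", "v2", ("v5",)),       # writes v2, also written by "a"
        Instr("e", "v7", ("v5", "v6")),  # independent
    ]
    tags = classify(first, following)
    mask = [t is not None for t in tags]  # True means: skip now, replay later
    return tags, mask


if __name__ == "__main__":
    print(example())
```

The resulting boolean list plays the role of the mask mentioned in claims 7 and 17: a set bit marks an instruction that is skipped in exception-wait mode and executed in replay mode.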

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.