US10636112B2 · Active · Utility · PatentIndex 41
Graphics processor register data re-use mechanism
Est. expiry: Mar 28, 2038 (~11.7 yrs left) · nominal 20-yr term from priority
G06T 1/60, G06T 15/005, G06F 9/30123, G06F 9/384, G06F 9/4806, G06F 9/462, G06T 1/20, G06F 8/441, G06F 8/41, G06F 8/45, G06F 9/3851
PatentIndex Score: 41 · Cited by: 0 · References: 2 · Claims: 15
Abstract
A processing apparatus is described. The apparatus includes a graphics processing unit (GPU), including a plurality of execution units to process graphics context data and a register file having a plurality of registers to store the graphics context data; and register renaming logic to facilitate re-use of register data by partitioning a first part and a second part, the first part to include thread-independent code and the second part to include thread-dependent code.
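As a rough illustration of the partitioning idea in the abstract, the Python sketch below models a shader as a short list of toy instructions, splits it into a thread-independent part and a thread-dependent part, and lets later invocations enter at the second part while re-using register values left behind by the first. Everything in it is an assumption made for illustration and is not taken from the patent: the names `Instr`, `partition`, `run`, and `ExecutionUnit`, the three toy opcodes, the straight-line single-assignment program form, and the choice to let the first thread fall through into the second part.

```python
from collections import ChainMap
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical toy instruction: dst = op(srcs)."""
    dst: str
    op: str      # one of "load_const", "mul", "add"
    srcs: tuple


def partition(program, per_thread_inputs):
    """Split straight-line, single-assignment code into two parts.

    An instruction is treated as thread-dependent if any source is a
    per-thread input or was produced by a thread-dependent instruction.
    """
    tainted = set(per_thread_inputs)
    first, second = [], []
    for ins in program:
        if any(src in tainted for src in ins.srcs):
            tainted.add(ins.dst)
            second.append(ins)
        else:
            first.append(ins)
    return first, second


def run(part, regs, consts):
    """Interpret one part of the program against a register mapping."""
    for ins in part:
        if ins.op == "load_const":
            regs[ins.dst] = consts[ins.srcs[0]]
        elif ins.op == "mul":
            regs[ins.dst] = regs[ins.srcs[0]] * regs[ins.srcs[1]]
        elif ins.op == "add":
            regs[ins.dst] = regs[ins.srcs[0]] + regs[ins.srcs[1]]
        else:
            raise ValueError(f"unknown op: {ins.op}")


class ExecutionUnit:
    """Toy model: registers written by the first part persist across threads."""

    def __init__(self, first, second, consts):
        self.first, self.second, self.consts = first, second, consts
        self.preserved = {}        # invocation-independent register data
        self.initialized = False   # has the initial invocation happened?
        self.first_part_runs = 0

    def invoke(self, tid):
        if not self.initialized:
            # First entry point: populate the preserved registers once.
            run(self.first, self.preserved, self.consts)
            self.initialized = True
            self.first_part_runs += 1
        # Second entry point: per-thread registers overlay the preserved ones.
        regs = ChainMap({"tid": tid}, self.preserved)
        run(self.second, regs, self.consts)
        return regs["out"]


# out = tid * (scale * scale) + bias; only the last two instructions use tid.
program = [
    Instr("c0", "load_const", ("scale",)),
    Instr("c1", "load_const", ("bias",)),
    Instr("c2", "mul", ("c0", "c0")),
    Instr("t0", "mul", ("tid", "c2")),
    Instr("out", "add", ("t0", "c1")),
]
first_part, second_part = partition(program, {"tid"})
eu = ExecutionUnit(first_part, second_part, consts={"scale": 3, "bias": 1})
results = [eu.invoke(tid) for tid in range(4)]
```

In this toy run the three constant-only instructions land in the first part and execute once, while each of the four invocations executes only the two `tid`-dependent instructions. A real GPU would do this with hardware registers and compiler-generated entry points rather than Python dictionaries.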
Claims
Exact text as granted; not AI-modified.
What is claimed is:
1. A processing apparatus, comprising:
a graphics processing unit (GPU), including:
a plurality of execution units to process graphics context data; and
a register file having a plurality of registers to store the graphics context data; and
register re-use logic to facilitate re-use of register data by partitioning a shader program into a first part and a second part, the first part to include thread-independent code and the second part to include thread-dependent code, wherein
the partitioning is to include creating a first entry point in the shader program for the first part and a second entry point in the shader program for the second part,
a first thread is to invoke the shader program at the first entry point to execute the first part to perform invocation-independent operations including storing invocation-independent data in at least one of the plurality of registers, and
a second thread is to invoke the shader program at the second entry point to skip the first part and to execute the second part to perform invocation-dependent operations.
2. The apparatus of claim 1, wherein the invocation-independent operations include reading the invocation-independent data from a memory.
3. The apparatus of claim 1, wherein the invocation-independent operations include calculating the invocation-independent data by one of the plurality of execution units.
4. The apparatus of claim 1, wherein the invocation-dependent operations include re-using the invocation-independent data.
5. The apparatus of claim 4, wherein the register re-use logic is also to facilitate re-use of register data by preserving the invocation-independent data in the at least one of the plurality of registers between execution of the first thread and execution of the second thread.
6. The apparatus of claim 4, further comprising register re-use tracking hardware to track register use to facilitate re-use of register data.
7. The apparatus of claim 4, further comprising register re-use tracking hardware to track initial invocations of the shader program.
8. A method comprising:
partitioning, by a graphics program compiler, a shader program into a first part and a second part;
executing, by a first graphic processing unit (GPU) execution unit thread, the first part to populate at least one of a plurality of GPU registers with invocation-independent data; and
executing, by a second GPU execution unit thread, the second part to re-use the invocation-independent data from the at least one of a plurality of GPU registers.
9. The method of claim 8, wherein the partitioning comprises:
creating a first entry point for the first part; and
creating a second entry point for the second part.
10. The method of claim 8, further comprising determining whether an invocation of the shader program is an initial invocation or a subsequent invocation.
11. The method of claim 8, further comprising preserving the independent-invocation data in the at least one of a plurality of GPU registers between executing the first part and executing the second part.
12. The method of claim 8, wherein executing the first part further comprises at least one of:
reading invocation-independent data from a memory; and
calculating, by one of a plurality of GPU execution units, invocation-independent data.
13. The method of claim 9, further comprising invoking, by the second thread, the shader program at the second entry point to skip invocation-independent operations in the first part and to perform invocation-dependent operations in the second part.
14. A system, comprising:
an application processing unit;
a graphics processing unit (GPU), including:
a plurality of execution units to process graphics context data, and
a GPU register file having a plurality of registers to store the graphics context data; and
register re-use logic to facilitate re-use of GPU register data by partitioning a shader program into a first part and a second part, the first part to include thread-independent code and the second part to include thread-dependent code, wherein
the partitioning is to include creating a first entry point in the shader program for the first part and a second entry point in the shader program for the second part,
a first thread is to invoke the shader program at the first entry point to execute the first part to perform invocation-independent operations including storing invocation-independent data in at least one of the plurality of registers, and
a second thread is to invoke the shader program at the second entry point to skip the first part and to execute the second part to perform invocation-dependent operations.
15. The system of claim 14, further comprising a system memory in which to store the shader program.
Cited by (0)
No later patents cite this yet.
References (0)
No backward citations on record.