USRE44494E · Expired · Utility · PatentIndex 82

Processor having execution core sections operating at different clock rates

Assignee: SAGER DAVID J · Priority: Nov 13, 1996 · Filed: Nov 24, 2004 · Granted: Sep 10, 2013
Est. expiry: Nov 13, 2016 (expired) · nominal 20-yr term from priority
Inventors: SAGER DAVID J · FLETCHER THOMAS D · HINTON GLENN J · UPTON MICHAEL D
CPC: G06F 9/3869 · G06F 15/7832 · G06F 1/08 · G06F 9/384 · G06F 1/02 · G06F 9/30145 · G06F 9/3838 · G06F 9/3836 · G06F 9/383 · G06F 9/3863 · G06F 1/06 · G06F 9/3856 · G06F 9/3858
PatentIndex Score: 82 · Cited by: 5 · References: 25 · Claims: 55

Abstract

A processor including a first execution core section clocked to perform execution operations at a first clock frequency, and a second execution core section clocked to perform execution operations at a second clock frequency which is different than the first clock frequency. The second execution core section runs faster and includes a data cache and critical ALU functions, while the first execution core section includes latency-tolerant functions such as instruction fetch and decode units and non-critical ALU functions. The processor may further include an I/O ring which may be still slower than the first execution core section. Optionally, the first execution core section may include a third execution core section whose clock rate is between that of the first and second execution core sections. Clock multipliers/dividers may be used between the various sections to derive their clocks from a single source, such as the I/O clock.
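The clock derivation described in the abstract can be illustrated with a minimal sketch. It assumes a cascaded arrangement in which the first execution core section's clock is derived from the I/O clock and the second section's clock is derived from the first. The function name `derive_clocks`, the 100 MHz I/O clock, and the ratios 3 and 2 are hypothetical values chosen for illustration; the patent text does not specify them.

```python
from fractions import Fraction


def derive_clocks(io_mhz, core_ratio, fast_ratio):
    """Derive section clocks from a single source (here the I/O clock).

    Ratios greater than 1 act as multipliers, ratios below 1 as dividers.
    All numeric values passed in are illustrative assumptions.
    """
    io = Fraction(io_mhz)
    first_core = io * Fraction(core_ratio)           # latency-tolerant section
    second_core = first_core * Fraction(fast_ratio)  # fast, latency-critical section
    return {"io_ring": io, "first_core": first_core, "second_core": second_core}


# Hypothetical example: 100 MHz I/O clock, x3 for the first section, x2 on top of that.
clocks = derive_clocks(100, 3, 2)
for section, mhz in clocks.items():
    print(f"{section}: {float(mhz)} MHz")
```

Using exact fractions keeps divider ratios such as 1/2 free of rounding error; a real design would express these ratios in PLL or divider hardware rather than in software.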

Claims

exact text as granted — not AI-modified
We claim: 
     
       1. A microprocessor comprising:
 a first execution core section [operating] adapted to operate at a first clock frequency, the first execution core section including a first multiplier unit adapted to multiply a clock signal to obtain the first clock frequency; 
 a second execution core section [operating] adapted to operate at a second clock frequency, the second execution core section including a second multiplier unit adapted to multiply the clock signal to obtain the second clock frequency, which is different than the first clock frequency; and 
 an I/O ring clocked to perform input/output operations at an I/O frequency, which is the same frequency as the clock signal. 
 
     
     
       2. The microprocessor of  claim 1 , wherein the second execution core section operates at least in part concurrently with the first execution core section. 
     
     
       3. The microprocessor of  claim 1 , wherein:
 the second execution core section includes a data cache and critical arithmetic logic unit (ALU) functions; and 
 the first execution core section includes one or more of an instruction fetch, a decode unit, and non-critical ALU functions. 
 
     
     
       4. The microprocessor of  claim 3 , wherein the critical ALU functions comprise one or more of:
 an adder; or 
 a logic unit to perform AND and OR operations. 
 
     
     
       5. The microprocessor of  claim 4 , wherein the critical ALU functions further comprise:
 an address generation index register shifter. 
 
     
     
       6. The microprocessor of  claim 3 , wherein the second execution core section further includes a register file, and wherein the first execution core section further includes another register file. 
     
     
       7. The microprocessor of claim [3] 1, [wherein the first execution core section further includes a register file] wherein the first clock frequency is at substantially 0 MHz when the first execution core section is powered down. 
     
     
       8. The microprocessor of  claim 7 , wherein:
 the I/O frequency is different than the first and second clock frequencies. 
 
     
     
       9. The microprocessor of  claim 8 , further comprising:
 a first clock divider/multiplier coupled to the I/O ring and the first execution core section to divide or multiply the I/O clock frequency to generate the first clock frequency; and 
 a second clock divider/multiplier coupled to the first and second execution core sections to divide or multiply the first clock frequency to generate the second clock frequency. 
 
     
     
       10. The microprocessor of  claim 1 , wherein the microprocessor comprises a single, monolithic chip. 
     
     
       11. The microprocessor of  claim 1 , wherein the second execution core section is disposed within the first execution core section. 
     
     
       12. The microprocessor of  claim 11 , wherein the first execution core section is disposed within the I/O ring. 
     
     
       13. The microprocessor of  claim 1 , wherein the first execution core section and the second execution core section are located on the same semiconductor die. 
     
     
       14. The microprocessor of  claim 1 , wherein the second clock frequency is a multiple N of the first clock frequency. 
     
     
       15. The microprocessor of  claim 1 , wherein the second clock frequency is faster than the first clock frequency. 
     
     
       16. The microprocessor of  claim 1 , wherein the first execution core section is more tolerant of instruction latency than the second execution core section. 
     
     
       17. The microprocessor of  claim 1 , further comprising:
 a replay architecture, the replay architecture causing an instruction to be re-executed. 
 
     
     
       18. The microprocessor of  claim 17 , wherein the instruction is re-executed if the instruction was incorrectly processed because of erroneous data speculation. 
     
     
       19. The microprocessor of claim [17] 18, wherein an instruction depending on the instruction that was incorrectly processed because of erroneous data speculation is also re-executed. 
     
     
       20. The microprocessor of  claim 17 , wherein the instruction is re-executed if:
 the instruction was not correctly processed for any reason; or 
 input data used by the instruction is not known to be correct. 
 
     
     
       21. The microprocessor of  claim 17 , wherein the replay architecture includes:
 hit/miss logic to determine whether data speculation for [an] the instruction is correct; 
 a checker unit to receive the output of the hit/miss logic and to direct re-execution of the instruction; and 
 a delay unit, the delay unit to provide a copy of an instruction to the checker unit at substantially the same time as the checker unit receives the output of the hit/miss logic. 
 
     
     
       22. The microprocessor of  claim 21 , wherein the delay unit is incorporated as part of the checker. 
     
     
       23. The microprocessor of  claim 21 , wherein the checker is located within the second execution core section. 
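The replay elements recited in claims 17 to 23 (hit/miss logic, a checker unit, and a delay unit) can be pictured with a small cycle-stepped model. This is a sketch under stated assumptions and not the patented implementation: the names `Instr` and `run_with_replay`, the one-instruction-per-cycle issue rate, the fixed two-cycle delay, and the rule that a mis-speculated instruction succeeds on its second attempt are all hypothetical simplifications.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instr:
    name: str
    src: Optional[str] = None  # producer this instruction depends on, if any


def run_with_replay(program, first_try_misses, latency=2):
    """Return the order in which instructions pass through execution."""
    issue = deque(program)  # instructions waiting to execute
    delay = deque()         # delay unit: (cycle_due, instr, executed_ok)
    missed = set()          # instructions that already mis-speculated once
    bad = set()             # instructions whose latest result is incorrect
    trace, cycle = [], 0
    while issue or delay:
        if issue:
            instr = issue.popleft()
            trace.append(instr.name)
            # Hit/miss logic: speculation fails on the first try of a listed
            # instruction, or whenever the producer's result is still bad.
            spec_miss = instr.name in first_try_misses and instr.name not in missed
            if spec_miss:
                missed.add(instr.name)
            ok = not spec_miss and instr.src not in bad
            (bad.discard if ok else bad.add)(instr.name)
            delay.append((cycle + latency, instr, ok))
        # Checker: the delayed copy and its hit/miss result arrive together.
        while delay and delay[0][0] <= cycle:
            _, checked, ok = delay.popleft()
            if not ok:
                issue.append(checked)  # direct re-execution
        cycle += 1
    return trace


# Hypothetical program: "add" consumes the result of "load"; "xor" is independent.
prog = [Instr("load"), Instr("add", src="load"), Instr("xor")]
trace = run_with_replay(prog, first_try_misses={"load"})
print(trace)  # ['load', 'add', 'xor', 'load', 'add']
```

In this run the load mis-speculates once, so both it and its dependent add are re-executed, while the independent xor executes only once.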
     
     
       24. A method comprising:
 performing an I/O operation in an I/O ring of a microprocessor at a first clock frequency to access a data item from outside the microprocessor; 
 responsive to the I/O operation, performing a first execution operation upon the data item in a first execution sub-core of the microprocessor at a second clock frequency, wherein a clock is multiplied by a first multiplier unit associated with the first execution sub-core to obtain the second clock frequency; and 
 responsive to the first execution operation, performing a second execution operation in a second execution sub-core of the microprocessor at a third clock frequency, [the third clock frequency being different] wherein a clock is multiplied by a second multiplier unit associated with the second execution sub-core to obtain the third clock frequency, which is higher than the second clock frequency. 
 
     
     
       25. The method of  claim 24 , wherein an execution operation performed at the third clock frequency is performed at least in part concurrently with an execution operation performed at the second clock frequency. 
     
     
       26. The method of  claim 24 , further comprising:
 multiplying the first clock frequency to generate the second clock frequency; and 
 multiplying the second clock frequency to generate the third clock frequency. 
 
     
     
       27. The method of  claim 24 , wherein:
 execution operations performed at the second clock frequency include one or more of fetch, decode, and non-critical arithmetic logic unit (ALU) functions; and 
 execution operation performed at the third clock frequency include critical ALU functions. 
 
     
     
       28. The method of  claim 24 , further comprising re-executing an instruction if the instruction was incorrectly processed because of erroneous data speculation. 
     
     
       29. The method of  claim 28 , further comprising re-executing an instruction that depends on the instruction that was incorrectly processed. 
     
     
       30. The method of  claim 24 , further comprising [re-executing an instruction if:
 the instruction was not correctly processed for any reason; or 
 input data used by the instruction is not known to be correct] performing the second execution operation in the second execution sub-core while the first execution sub-core is powered down. 
 
     
     
       31. A method comprising:
 inputting an instruction through operation of a first portion of a microprocessor at a first periodic clock frequency; 
 multiplying with a first multiplication unit the first periodic clock frequency to obtain a second periodic clock frequency; 
 performing one or more fetch functions or decode functions associated with the instruction through operation of a second portion of the microprocessor at [a] the second periodic clock frequency; [and] 
 multiplying with a second multiplication unit the second periodic clock frequency to obtain a third periodic clock frequency; and 
 performing one or more critical arithmetic logic unit (ALU) functions associated with the instruction through operation of a third portion of the microprocessor at [a] the third periodic clock frequency, the second clock frequency being different than the third clock frequency. 
 
     
     
       32. The method of claim [21] 31, wherein a function performed through operation of the second portion of the microprocessor at the second periodic clock frequency occurs at least in part concurrently with a function performed through operation of the third portion of the microprocessor at the third periodic clock frequency. 
     
     
       33. The method of  claim 31 , wherein the second portion of the microprocessor comprises a first execution core, and wherein the third portion of the microprocessor comprises a second execution core. 
     
     
       34. The method of claim [33] 31, [wherein the third portion of the microprocessor comprises a second execution core] further comprising performing the one or more fetch functions or decode functions associated with the instruction through operation of a second portion of the microprocessor while the third portion of the microprocessor is powered down. 
     
     
       35. The method of  claim 34 , wherein the first portion of the microprocessor comprises an I/O section of the microprocessor. 
     
     
       36. A microprocessor comprising:
 a plurality of execution core sections, each execution core section [operating] being adapted to operate at a different clock frequency, the plurality of execution core sections operating at least in part concurrently with each other, wherein each plurality of execution core sections are to be associated with an independent clock multiplier to generate the different clock frequency; and 
 an I/O ring clocked to perform input/output operations at an I/O frequency. 
 
     
     
       37. The microprocessor of  claim 36 , wherein:
 a first execution core section of the plurality of execution core sections includes one or more of instruction fetch units, instruction decode units, and non-critical ALU functions; and 
 a second execution core section of the plurality of execution core sections includes a data cache and one or more critical arithmetic logic unit (ALU) functions. 
 
     
     
       38. The microprocessor of  claim 37 , wherein the critical ALU functions comprise one or more of:
 an adder; or 
 a logic unit for performing AND and OR operations. 
 
     
     
       39. The microprocessor of  claim 37 , wherein the critical ALU functions further comprise:
 an address generation index register shifter. 
 
     
     
       40. The microprocessor of  claim 37 , wherein the second execution core section further includes a register file. 
     
     
       41. The microprocessor of  claim 37 , wherein the first execution core section further includes a register file. 
     
     
       42. The microprocessor of  claim 36 , further comprising a plurality of clock divider/multipliers, each clock divider/multiplier to divide or [multiple] multiply a first clock frequency to provide a second clock frequency to an execution core section. 
     
     
       43. The microprocessor of  claim 36 , wherein the microprocessor comprises a single, monolithic chip. 
     
     
       44. The microprocessor of  claim 36 , wherein a first execution core section of the plurality of execution core sections is disposed within the I/O ring. 
     
     
       45. The microprocessor of  claim 44 , wherein each remaining execution core section of the plurality of execution core sections is disposed to be wholly within another execution core section. 
     
     
       46. The microprocessor of  claim 44 , wherein each of the execution core sections is [more tolerant of instruction latency than any execution core sections disposed within it] located on the same semiconductor die. 
     
     
       47. The microprocessor of  claim 36 , wherein each of the plurality of execution core sections is located on the same semiconductor die. 
     
     
       48. The microprocessor of  claim 47  further comprising a replay architecture, the replay architecture to cause an instruction to be re-executed, wherein the replay architecture includes:
 hit/miss logic to determine whether data speculation for an instruction is correct; 
 a checker unit to receive the output of the hit/miss logic and to direct re-execution of the instruction; and 
 a delay unit, the delay unit to provide a copy of an instruction to the checker unit at substantially the same time as the checker unit receives the output of the hit/miss logic. 
 
     
     
       49. The microprocessor of  claim 36 , further comprising:
 a replay architecture causing an instruction to be re-executed. 
 
     
     
       50. The microprocessor of  claim 49 , wherein the instruction is re-executed if the instruction was incorrectly processed because of erroneous data speculation. 
     
     
       51. The microprocessor of  claim 50 , wherein an instruction depending on the instruction that was incorrectly processed because of erroneous data speculation is also re-executed. 
     
     
       52. The microprocessor of claim [51] 48, wherein the delay unit is incorporated as part of the checker unit. 
     
     
       53. The microprocessor of claim [46] 48, wherein the instruction is re-executed if:
 the instruction was not correctly processed for any reason; or 
 input data used by the instruction is not known to be correct. 
 
     
     
       54. An integrated circuit comprising:
 a processor including,
 first multiplier logic adapted to multiply a common clock to generate a first frequency; 
 logic to perform input/output (I/O) operations at the first frequency; 
 second multiplier logic adapted to multiply the common clock to generate a second frequency; 
 a first core to operate at the second frequency; 
 third multiplier logic adapted to multiply the common clock to generate a third frequency; and 
 a second core to operate at the third frequency, wherein the first, the second, and the third frequencies are different frequencies.  
   
     
     
       55. The integrated circuit of claim 54, wherein the second core is nested within the first core.
