US8972472B2 (Active, Utility)

Apparatus and methods for hardware-efficient unbiased rounding

Assignee: KANTER OFIR AVRAHAM
Priority: Mar 25, 2008
Filed: Sep 17, 2008
Granted: Mar 3, 2015
Est. expiry: Mar 25, 2028 (~1.7 yrs left) · nominal 20-yr term from priority
Inventors: KANTER OFIR AVRAHAM, BAR ILAN
Classification: G06F 7/49963
PatentIndex Score: 65 · Cited by: 4 · References: 366 · Claims: 14

Abstract

A system and method for unbiased rounding away from, or toward, zero by truncating N bits from an M-bit input number to provide an (M−N)-bit number, and adding the equivalent value of ‘½’ to the (M−N)-bit number unless the N truncated bits represent exactly ½ and the input number is negative (when rounding away from zero) or positive (when rounding toward zero). The method for rounding away from zero may include outputting the (M−N)-bit truncated number if the M-bit input number is negative and the sequence of N truncated bits comprises a most significant bit of 1 followed by zeros; and otherwise computing and outputting a sum of (a) a number that has an equivalent value of one followed by (N−1) replicas of zero, the one provided by applying a logical operation on the most significant bit of the sequence of truncated bits, and (b) the (M−N)-bit truncated number.
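The arithmetic described in the abstract can be illustrated with a short software model. This is only a sketch of the rounding rule, not the claimed gate-level circuit; the function name `round_ties` and its parameters are hypothetical, and it assumes Python integers stand in for two's complement values (Python's `>>` and `&` behave as arithmetic shift and two's complement masking on negative integers).

```python
def round_ties(x: int, n: int, away_from_zero: bool = True) -> int:
    """Drop the n low bits of signed integer x, rounding to nearest.

    Exact halves go away from zero (default) or toward zero.
    Hypothetical helper for illustration only.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    truncated = x >> n                # arithmetic shift: floor(x / 2**n)
    dropped = x & ((1 << n) - 1)      # the n truncated bits, read as unsigned
    half = 1 << (n - 1)               # bit pattern: 1 followed by n-1 zeros
    msb = dropped >> (n - 1)          # most significant truncated bit (0 or 1)
    is_tie = dropped == half          # truncated bits represent exactly 1/2
    negative = x < 0
    # Away from zero: skip the increment for negative ties.
    # Toward zero: skip the increment for positive ties.
    skip = is_tie and (negative if away_from_zero else not negative)
    return truncated if skip else truncated + msb


# With n = 2 the input is in quarter units, so 6 means 1.5 and -6 means -1.5.
print(round_ties(6, 2), round_ties(-6, 2))                  # 2 -2
print(round_ties(6, 2, False), round_ties(-6, 2, False))    # 1 -1
```

Because both modes treat +x and −x symmetrically, the rounding error on exact halves cancels over a sign-symmetric input distribution, which is the sense in which the abstract calls the rounding unbiased.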

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A method for rounding two's complement represented signed numbers away from zero, the method comprising: providing an M-bit two's complement represented signed number to be rounded to an (M−N) bit two's-complement represented signed number; truncating N bits from the right of the M-bit two's complement represented number, thereby to generate an (M−N) bit truncated number and thereby to define a sequence of N truncated bits; if the M-bit two's complement represented signed number is negative and the sequence of N truncated bits comprises a most significant bit of 1, followed by zeros, outputting the (M−N) bit truncated number; and otherwise, computing and outputting a sum of (a) a number that has an equivalent value of one followed by (N−1) replicas of zero, the one provided by applying a logical operation on the most significant bit of the sequence of truncated bits and (b) the (M−N) bit truncated number; rounding two's complement represented signed numbers away from zero by a circuit that essentially consists of a (N−1) input NOR gate, a first NAND gate, a first AND gate, an adder, a (M−N) input NAND gate, an inverter; wherein the inverter is arranged to receive the most significant bit of the M-bit two's complement represented signed number and to invert it to provide an inverted signal; wherein the a (M−N) input NAND gate is arranged to receive the inverted signal and the second till (M−N−1)'th most significant bits of the M-bit two's complement represented signed number; wherein the (N−1) input NOR gate is arranged to receive (N−2) least significant bits of the M-bit two's complement represented signed number and having an output that is coupled to a first input of a first NAND gate; wherein the first NAND gate has a second input of the first NAND gate arranged to receive the most significant bit of the M-bit two's complement represented signed number; wherein the first AND gate is arranged to receive an output signal of the first NOR gate, a most significant bit of the sequence of truncated bits and an output signal of the (M−N) input NAND gate; wherein the adder is arranged to add an output signal of the first OR gate to the (M−N) bit truncated number.
     
     
       2. The method according to claim 1 and also comprising providing special treatment for a largest positive number, represented by a ‘0’, followed by M−1 replicas of ‘1’, to prevent said largest positive number from wrapping around zero and rounding toward a lowest negative number.
     
     
       3. The method according to claim 2 comprising checking if the M-bit two's compliment represented signed number is the largest positive number by the (M−N) input NAND gate.
     
     
       4. A method for rounding two's complement represented signed numbers toward zero, the method comprising: providing an M-bit two's complement represented signed number to be rounded to an (M−N) bit two's-complement represented signed number; truncating N bits from the right of the M-bit two's complement represented number, thereby to generate an (M−N) bit truncated number and thereby to define a sequence of N truncated bits; if the M-bit two's complement represented signed number is positive and the sequence of N truncated bits comprises a most significant bit of 1, followed by zeros, outputting the (M−N) bit truncated number; and otherwise, computing and outputting a sum of (a) a number that has an equivalent value of one followed by (N−1) replicas of zero, the one provided by applying a logical function on the most significant bit of said sequence of truncated bits and (b) the (M−N) bit truncated number; rounding two's complement represented signed numbers towards zero by a circuit that essentially consists of a (N−1) input NOR gate, a first NAND gate, a first AND gate, an adder, a (M−N) input NAND gate, an inverter; wherein the inverter is arranged to receive the most significant bit of the M-bit two's complement represented signed number and to invert it to provide an inverted signal; wherein the a (M−N) input NAND gate is arranged to receive the inverted signal and the second till (M−N−1)'th most significant bits of the M-bit two's complement represented signed number; wherein the (N−1) input NOR gate is arranged to receive (N−2) least significant bits of the M-bit two's complement represented signed number and having an output that is coupled to a first input of a first NAND gate; wherein the first NAND gate has a second input of the first NAND gate arranged to receive the most significant bit of the M-bit two's complement represented signed number; wherein the first AND gate is arranged to receive an output signal of the first NOR gate, a most significant bit of the sequence of truncated bits and an output signal of the (M−N) input NAND gate; wherein the adder is arranged to add an output signal of the first OR gate to the (M−N) bit truncated number.
     
     
       5. A method according to claim 4 and also comprising providing special treatment for a largest positive number, represented by a ‘0’, followed by M−1 replicas of ‘1’, to prevent said largest positive number from wrapping around zero and rounding toward a lowest negative number.
     
     
       6. A system for rounding two's complement represented signed numbers away from zero, the system comprising: a receiver operative to receive an M-bit two's complement represented signed number to be rounded to an (M−N) bit two's-complement represented signed number; a truncator operative to truncate N bits from the right of the M-bit two's complement represented number, thereby to generate an (M−N) bit truncated number and thereby to define a sequence of N truncated bits; a clipped and a selector operative, if the M-bit two's complement represented signed number is negative and the sequence of N truncated bits comprises a most significant bit of 1, followed by zeros, to output said (M−N) bit truncated number; and otherwise, to compute and to output a sum of (a) a number that has an equivalent value of one followed by (N−1) replicas of zero, the one provided by applying a logical function on the most significant bit of said sequence of truncated bits and (b) the (M−N) bit truncated number; wherein the clipper and the selector essentially consist of a (N−1) input NOR gate, a first NAND gate, a first AND gate, an adder, a (M−N) input NAND gate, an inverter; wherein the inverter is arranged to receive the most significant bit of the M-bit two's complement represented signed number and to invert it to provide an inverted signal; wherein the a (M−N) input NAND gate is arranged to receive the inverted signal and the second till (M−N−1)'th most significant bits of the M-bit two's complement represented signed number; wherein the (N−1) input NOR gate is arranged to receive (N−2) least significant bits of the M-bit two's complement represented signed number and having an output that is coupled to a first input of a first NAND gate; wherein the first NAND gate has a second input of the first NAND gate arranged to receive the most significant bit of the M-bit two's complement represented signed number; wherein the first AND gate is arranged to receive an output signal of the first NOR gate, a most significant bit of the sequence of truncated bits and an output signal of the (M−N) input NAND gate; wherein the adder is arranged to add an output signal of the first OR gate to the (M−N) bit truncated number.
     
     
       7. The system according to claim 6 wherein selector is arranged to check if the M-bit two's compliment represented signed number is a largest positive number by a (M−N) input NAND gate arranged to receive an inverted most significant bit of the M-bit two's compliment represented signed number and to receive non-inverted second till (M−N−1)'th significant bits of the M-bit two's compliment represented signed number.
     
     
       8. A system according to claim 6 wherein the clipper is arranged to providing special treatment for a largest positive number, represented by a ‘0’, followed by M−1 replicas of ‘1’, to prevent said largest positive number from wrapping around zero and rounding toward a lowest negative number.
     
     
       9. A system for rounding two's complement represented signed numbers toward zero, the system comprising: a receiver operative to receive an M-bit two's complement represented signed number to be rounded to an (M−N) bit two's-complement represented signed number; a truncator operative to truncate N bits from the right of the M-bit two's complement represented number, thereby to generate an (M−N) bit truncated number and thereby to define a sequence of N truncated bits; a clipper and a selector operative, if the M-bit two's complement represented signed number is positive and the sequence of N truncated bits comprises a most significant bit of 1, followed by zeros, to output said (M−N) bit truncated number; and otherwise, to compute and to output a (a) a number that has an equivalent value of one followed by (N−1) replicas of zero, the one provided by applying a logical function on the most significant bit of said sequence of truncated bits and (b) the (M−N) bit truncated number; wherein the clipper and the selector essentially consist of a (N−1) input NOR gate, a first NAND gate, a first AND gate, an adder, a (M−N) input NAND gate an inverter wherein the inverter is arranged to receive the most significant bit of the M-bit two's complement represented signed number and to invert it to provide an inverted signal; wherein the a (M−N) input NAND gate is arranged to receive the inverted signal and the second till (M−N−1)'th most significant bits of the M-bit two's complement represented signed number; wherein the (N−1) input NOR gate is arranged to receive (N−2) least significant bits of the M-bit two's complement represented signed number and having an output that is coupled to a first input of a first NAND gate; wherein the first NAND gate has a second input of the first NAND gate arranged to receive the most significant bit of the M-bit two's complement represented signed number; wherein the first AND gate is arranged to receive an output signal of the first NOR gate, a most significant bit of the sequence of truncated bits and an output signal of the (M−N) input NAND gate; wherein the adder is arranged to add an output signal of the first OR gate to the (M−N) bit truncated number.
     
     
       10. A system according to claim 9 wherein the clipper provides special treatment for a largest positive number, represented by a ‘0’, followed by M−1 replicas of ‘1’, to prevent said largest positive number from wrapping around zero and rounding toward a lowest negative number.
     
     
       11. A 2's complement arithmetic based hardware device including a system for rounding, wherein the system for rounding comprises a receiver operative to receive an M-bit two's complement represented signed number to be rounded to an (M−N) bit two's-complement represented signed number; a truncator operative to truncate N bits from the right of the M-bit two's complement represented number, thereby to generate an (M−N) bit truncated number and thereby to define a sequence of N truncated bits; a clipped and a selector operative, if the M-bit two's complement represented signed number is negative and the sequence of N truncated bits comprises a most significant bit of 1 followed by zeros, to output said (M−N) bit truncated number; and otherwise, to compute and to output a sum of (a) a number that has an equivalent value of one followed by (N−1) replicas of zero, the one provided by applying a logical function on the most significant bit of said sequence of truncated bits and (b) the (M−N) bit truncated number; wherein the clipper and the selector essentially consist of a (N−1) input NOR gate, a first NAND gate, a first AND gate, an adder, a (M−N) input NAND gate, an inverter; wherein the inverter is arranged to receive the most significant bit of the M-bit two's complement represented signed number and to invert it to provide an inverted signal; wherein the a (M−N) input NAND gate is arranged to receive the inverted signal and the second till (M−N−1)'th most significant bits of the M-bit two's complement represented signed number; wherein the (N−1) input NOR gate is arranged to receive (N−2) least significant bits of the M-bit two's complement represented signed number and having an output that is coupled to a first input of a first NAND gate wherein the first NAND gate has a second input of the first NAND gate arranged to receive the most significant bit of the M-bit two's complement represented signed number; wherein the first AND gate is arranged to receive an output signal of the first NOR gate, a most significant bit of the sequence of truncated bits and an output signal of the (M−N) input NAND gate; wherein the adder is arranged to add an output signal of the first OR gate to the (M−N) bit truncated number.
     
     
       12. A 2's complement arithmetic based hardware device including a system for rounding that comprises a receiver operative to receive an M-bit two's complement represented signed number to be rounded to an (M−N) bit two's-complement represented signed number; a truncator operative to truncate N bits from the right of the M-bit two's complement represented number, thereby to generate an (M−N) bit truncated number and thereby to define a sequence of N truncated bits; a clipper and a selector operative, if the M-bit two's complement represented signed number is positive and the sequence of N truncated bits comprises a most significant bit of 1, followed by zeros, to output said (M−N) bit truncated number; and otherwise, to compute and to output a (a) a number that has an equivalent value of one followed by (N−1) replicas of zero, the one provided by applying a logical function on the most significant bit of said sequence of truncated bits and (b) the (M−N) bit truncated number; wherein the clipper and the selector essentially consist of a (N−1) input NOR gate, a first NAND gate, a first AND gate, an adder, a (M−N) input NAND gate, an inverter; wherein the inverter is arranged to receive the most significant bit of the M-bit two's complement represented signed number and to invert it to provide an inverted signal; wherein the a (M−N) input NAND gate is arranged to receive the inverted signal and the second till (M−N−1)'th most significant bits of the M-bit two's complement represented signed number; wherein the (N−1) input NOR gate is arranged to receive (N−2) least significant bits of the M-bit two's complement represented signed number and having an output that is coupled to a first input of a first NAND gate: wherein the first NAND gate has a second input of the first NAND gate arranged to receive the most significant bit of the M-bit two's complement represented signed number; wherein the first AND gate is arranged to receive an output signal of the first NOR gate, a most significant bit of the sequence of truncated bits and an output signal of the (M−N) input NAND gate; wherein the adder is arranged to add an output signal of the first OR gate to the (M−N) bit truncated number.
     
     
       13. A digital signal processing system including a 2's complement arithmetic based hardware device according to claim 11.
     
     
       14. A digital signal processing system including a 2's complement arithmetic based hardware device according to claim 12.
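
The "special treatment" for the largest positive number recited in claims 2, 5, 8 and 10 can be pictured with a small software model. This is a hedged sketch, not the claimed circuit: the function name `round_away_clipped` is hypothetical, and it assumes that the clip simply suppresses the increment whenever the (M−N) retained bits already read ‘0’ followed by ones, so that the sum cannot wrap to the most negative value.

```python
def round_away_clipped(x: int, m: int, n: int) -> int:
    """Round an m-bit signed value to (m - n) bits, ties away from zero,
    holding the result at the largest positive output instead of wrapping.

    Hypothetical model for illustration; the clipping rule is an assumption.
    """
    if not 0 < n < m:
        raise ValueError("require 0 < n < m")
    if not -(1 << (m - 1)) <= x < (1 << (m - 1)):
        raise ValueError("x does not fit in m bits")
    truncated = x >> n                    # the (m - n) retained bits, signed
    dropped = x & ((1 << n) - 1)          # the n truncated bits, unsigned
    msb = dropped >> (n - 1)              # most significant truncated bit
    negative_tie = x < 0 and dropped == (1 << (n - 1))
    out_max = (1 << (m - n - 1)) - 1      # retained bits: 0 followed by ones
    at_max = truncated == out_max         # adding 1 here would overflow
    increment = 0 if (negative_tie or at_max) else msb
    return truncated + increment


# 8-bit input, 4-bit output: 127 would round up to 8, which does not fit in
# 4 signed bits, so the model holds it at 7.
print(round_away_clipped(127, 8, 4))   # 7
print(round_away_clipped(104, 8, 4))   # 7  (6.5 rounds away from zero)
print(round_away_clipped(-120, 8, 4))  # -8 (-7.5 rounds away from zero)
```

Without the `at_max` check, an input such as 127 would produce 8, whose 4-bit two's complement pattern is the same as −8; this is the wrap-around the dependent claims describe.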
