Apparatus and method for floating-point multiplication
Abstract
An apparatus and method for floating-point multiplication are provided. Two partial products are generated from two operand significands. An unbiased result exponent is determined from the operand exponent values and leading zero counts, and a shift amount and shift direction are determined for a product significand, as needed for a predetermined minimum exponent value of a predetermined canonical format. First and second rounding values for injection into the addition of the partial products are generated: the first by shifting a predetermined rounding pattern by the shift amount in the direction opposite to the shift direction, and the second by left shifting the first rounding value by one bit. The first and second partial products are added together with the first rounding value to give a first product significand, and are added together with the second rounding value to give a second product significand. These product significands are shifted by the shift amount in the shift direction and one is then selected in order to generate a formatted significand in the predetermined canonical format. The early injection rounding provides a faster floating-point multiplier.
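The early injection rounding summarised above can be illustrated with a small software model. The following is only a sketch under simplifying assumptions, not the claimed circuitry: both operands are normal (so the shift amount is zero and no rounding-pattern shift is needed), the rounding mode is round-to-nearest ties-to-even (RNE), and the significand width `p` is a free parameter. The function names, the way the multiplier is split into two partial products, and the rule of testing the top bit of the first sum to choose between the two sums are all assumptions made for this illustration. The claims instead extract an overflow value from the second product significand using an overflow mask.

```python
def rne_reference(product: int, p: int) -> tuple[int, int]:
    """Reference model: round an exact product to p bits with RNE.

    Returns (significand, exponent_increment), where the increment is 1
    when the rounded result needs the larger of the two possible exponents.
    """
    n = product.bit_length()          # 2p-1 or 2p for normalised inputs
    shift = n - p
    q, r = divmod(product, 1 << shift)
    half = 1 << (shift - 1)
    if r > half or (r == half and (q & 1)):
        q += 1
    if q == (1 << p):                 # rounding carried out of the top bit
        q >>= 1
        n += 1
    return q, n - (2 * p - 1)


def multiply_with_injection(a: int, b: int, p: int) -> tuple[int, int]:
    """Hypothetical model of early injection rounding (normal operands, RNE).

    a and b are p-bit significands with the top bit set; p must be >= 2.
    """
    # Two partial products whose sum is a * b (illustrative split of b).
    half_width = p // 2
    pp1 = a * (b & ((1 << half_width) - 1))
    pp2 = (a * (b >> half_width)) << half_width

    # RNE pattern: a single set bit at the guard position.
    round1 = 1 << (p - 2)             # guard position if the product has 2p-1 bits
    round2 = round1 << 1              # guard position if the product has 2p bits

    # The rounding values are injected into the additions themselves.
    sum1 = pp1 + pp2 + round1
    sum2 = pp1 + pp2 + round2

    # Assumed selection rule for this sketch: top bit of the first sum.
    overflow = (sum1 >> (2 * p - 1)) & 1
    chosen, guard_pos = (sum2, p - 1) if overflow else (sum1, p - 2)

    guard = (chosen >> guard_pos) & 1
    sticky = (chosen & ((1 << guard_pos) - 1)) != 0
    significand = chosen >> (guard_pos + 1)

    # Tie correction: after injection, guard == 0 and sticky == 0 marks a
    # tie, so the last bit is cleared (last AND (guard OR sticky)).
    if not (guard or sticky):
        significand &= ~1
    return significand, overflow
```

For a small width such as `p = 5`, the model can be compared exhaustively against `rne_reference` over all pairs of normalised significands. Because both sums are formed before it is known which one is needed, the rounding increment does not require a separate addition after normalisation, which is the motivation the abstract gives for injecting the rounding values early.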
Claims
exact text as granted — not AI-modified
I claim:
1. Apparatus for floating-point multiplication comprising:
partial product generation circuitry to multiply significands of a first floating-point operand and a second floating-point operand to generate first and second partial products;
exponent calculation circuitry to calculate a value of an unbiased exponent of a result of the multiplication in dependence on exponent values and leading zero counts of the first and second floating-point operands and to determine a shift amount and a shift direction for a product significand generated by an addition operation on the first and second partial products, in dependence on a predetermined minimum exponent value of a predetermined canonical format;
rounding injection circuitry to generate first and second rounding values for injection into the addition operation, wherein the rounding injection circuitry comprises rounding shift circuitry to generate the first rounding value by shifting a predetermined rounding pattern by the shift amount in an opposite direction to the shift direction and to generate the second rounding value by left shifting by one bit the first rounding value;
first adder circuitry to add the first and second partial products together with the first rounding value for the addition operation to generate a first product significand;
second adder circuitry to add the first and second partial products together with the second rounding value for the addition operation to generate a second product significand;
significand shift circuitry to shift at least one of the first and second product significands by the shift amount in the shift direction; and
selection circuitry to select one of the first and second product significands in order to generate a formatted significand in the predetermined canonical format.
2. The apparatus as claimed in claim 1 , wherein the first adder circuitry has a configuration to generate the first product significand with one less bit than the second product significand generated by the second adder circuitry.
3. The apparatus as claimed in claim 1 , wherein the predetermined rounding pattern has a length matching the formatted significand, and comprises a set bit followed by unset bits when a rounding mode of the apparatus is round-to-nearest ties-to-even (RNE), and comprises all set bits when the rounding mode is round-up (RU), and
when the opposite direction is left and the rounding mode is round-up, a number of less-significant bit positions are set given by the shift amount, and
when the opposite direction is right, a number of most-significant bit positions are unset given by the shift amount.
4. The apparatus as claimed in claim 1 , comprising mask generation circuitry to generate an overflow mask identifying an overflow bit position of the second product significand, wherein the mask generation circuitry is arranged to generate the overflow mask by right shifting a predetermined mask pattern by the shift amount; and
comparison circuitry to apply the overflow mask to the second product significand to extract an overflow value at the overflow bit position, wherein the comparison circuitry is arranged to extract the overflow value before the significand shift circuitry shifts at least one of the first and second product significands.
5. The apparatus as claimed in claim 4 , wherein the predetermined mask pattern comprises a set bit at an unshifted overflow bit position of the second product significand when the second product significand is unshifted.
6. The apparatus as claimed in claim 1 , wherein the exponent calculation circuitry comprises right shift overflow determination circuitry to identify a right shift overflow condition when the shift direction is right, and either:
a most significant bit of the second product significand is set and the shift amount is two; or
a most-significant-but-one bit of the second product significand is set and the shift amount is one; and
the exponent calculation circuitry is responsive to the right shift overflow condition to set a value of a biased exponent of the result of the multiplication to one.
7. The apparatus as claimed in claim 4 , wherein the significand shift circuitry comprises:
left shift circuitry to left shift the first and second product significands by the shift amount to give first and second left-shifted product significands;
right shift circuitry to right shift the second product significand by the shift amount to give a right-shifted product significand, wherein the left shift circuitry and the right shift circuitry are arranged to perform their respective shifting in parallel with one another; and
the selection circuitry is responsive to the shift direction and the overflow value to select as the formatted significand one of the first left-shifted product significand, the second left-shifted product significand and the right-shifted product significand, and to select a predetermined number of most significant bits in the formatted significand to output.
8. The apparatus as claimed in claim 7 , wherein the first and second rounding values are set to zero and the apparatus has a configuration to forward an unrounded result of the multiplication to an adder as part of a fused multiply-add.
9. The apparatus as claimed in claim 4 , wherein the mask generation circuitry is arranged to generate a last bit mask identifying a last bit position of a last bit of the formatted significand within the first product significand and wherein the mask generation circuitry is arranged to generate the last bit mask comprising shifting a predetermined last bit mask pattern by the shift amount in an opposite direction to the shift direction, and further comprising:
comparison circuitry to apply the last bit mask to the first product significand to extract a last bit value at the last bit position.
10. The apparatus as claimed in claim 9 , wherein the mask generation circuitry is arranged to generate a left-shift last bit mask and a right-shift last bit mask, wherein the comparison circuitry is responsive to the shift direction being a left direction to extract the last bit value using the left-shift last bit mask and is responsive to the shift direction being a right direction to extract the last bit value using the right-shift last bit mask.
11. The apparatus as claimed in claim 10 , wherein the mask generation circuitry is arranged to generate two left-shift last bit masks, wherein the mask generation circuitry is arranged to generate a second left-shift last bit mask by right shifting by one bit a first left-shift last bit mask, and
the comparison circuitry is arranged to apply the first left-shift last bit mask to the first product significand and to apply the second left-shift last bit mask to the second product significand, and
the comparison circuitry is responsive to the overflow value indicating an overflow of the second product significand to select the last bit value extracted using the first left-shift last bit mask and is responsive to the overflow value indicating no overflow of the second product significand to select the last bit value extracted using the second left-shift last bit mask.
12. The apparatus as claimed in claim 11 , wherein the predetermined last bit mask pattern comprises, when the shift direction is the left direction, an unset bit followed by a predetermined number of set bits, wherein the predetermined number of set bits is one more than a number of bits of the formatted significand in the predetermined canonical format, and
the mask generation circuitry is arranged to generate the first left-shift last bit mask from a base shifted mask generated by right shifting the predetermined last bit mask pattern by the shift amount and prepending the right shifted predetermined last bit mask pattern with a number of unset bits given by the shift amount.
13. The apparatus as claimed in claim 10 , wherein the predetermined last bit mask pattern comprises, when the shift direction is the right direction, a sequence of unset bits followed by a predetermined number of set bits, wherein the predetermined number of set bits is the number of bits of the formatted significand in the predetermined canonical format, and
the mask generation circuitry is arranged to generate the right-shift last bit mask from a base shifted mask generated by left shifting the predetermined last bit mask pattern by the shift amount and appending the left shifted predetermined last bit mask pattern with a number of set bits given by the shift amount.
14. The apparatus as claimed in claim 9 , wherein the mask generation circuitry is arranged to generate a guard bit mask, wherein the guard bit mask has a bit set at a guard bit position which is one position below the position of the last bit of the formatted significand within the first product significand,
and the comparison circuitry is arranged to apply the guard bit mask to the first product significand to extract a guard bit value at the guard bit position.
15. The apparatus as claimed in claim 14 , wherein the mask generation circuitry is arranged to generate a sticky bit mask, wherein the sticky bit mask has bits set at all bit positions below the guard bit position,
and the comparison circuitry is arranged to apply the sticky bit mask to the first product significand to extract a set of sticky bit values and to calculate an overall sticky bit value as a logical OR of the set of sticky bit values.
16. The apparatus as claimed in claim 14 , further comprising correction circuitry responsive to the rounding mode of the apparatus being round-to-nearest ties-to-even (RNE) to calculate a corrected guard bit value as an inverse of the guard bit value and to calculate a corrected last bit value as the last bit value logical ANDed with a logical OR of the guard bit value and the overall sticky bit value.
17. The apparatus as claimed in claim 16 , further comprising inexact detection circuitry which is responsive to a rounding mode of the apparatus not being round-up (RU) to generate an inexact flag when the corrected guard bit value or the overall sticky value are non-zero.
18. The apparatus as claimed in claim 4 , further comprising inexact detection circuitry which is responsive to a rounding mode of the apparatus being round-up (RU) to set an inexact flag, when the overflow value is set, when the first rounding value is not bit-identical with a corresponding lower portion of the first product significand,
and which is responsive to the rounding mode of the apparatus being round-up (RU) to set the inexact flag, when the overflow value is not set, when the second rounding value is not bit-identical with a corresponding lower portion of the second product significand.
19. The apparatus as claimed in claim 1 , further comprising underflow detection circuitry to set an underflow flag when the formatted significand is zero and a biased exponent of the result of the multiplication is one and an inexact flag is set.
20. The apparatus as claimed in claim 1 , further comprising flush-to-zero flag generation circuitry to set a flush-to-zero flag when a biased exponent of the result of the multiplication is zero and either:
the exponent does not overflow,
or the exponent overflows due to rounding.
21. A method of operating a data processing apparatus to perform floating-point multiplication comprising:
multiplying, by partial product generation circuitry, significands of a first floating-point operand and a second floating-point operand to generate first and second partial products;
calculating, by exponent calculation circuitry, a value of an unbiased exponent of a result of the multiplication in dependence on exponent values and leading zero counts of the first and second floating-point operands and determining a shift amount and a shift direction for a product significand generated by an addition operation on the first and second partial products, in dependence on a predetermined minimum exponent value of a predetermined canonical format;
generating, rounding injection circuitry, first and second rounding values for injection into the addition operation, wherein generating the first and second rounding values comprises generating the first rounding value by shifting a predetermined rounding pattern by the shift amount in an opposite direction to the shift direction and generating the second rounding value by left shifting by one bit the first rounding value;
adding, by adder circuitry, the first and second partial products together with the first rounding value for the addition operation to generate a first product significand;
adding, by adder circuitry, the first and second partial products together with the second rounding value for the addition operation to generate a second product significand;
shifting, by significand shift circuitry, at least one of the first and second product significands by the shift amount in the shift direction; and
selecting, selection circuitry, one of the first and second product significands in order to generate a formatted significand in the predetermined canonical format.