US6157906A · Expired · Utility · PatentIndex 68

Method for detecting speech in a vocoded signal

Assignee: MOTOROLA INC · Priority: Jul 31, 1998 · Filed: Jul 31, 1998 · Granted: Dec 5, 2000

Est. expiry: Jul 31, 2018 (expired) · nominal 20-yr term from priority

Inventors: NICHOLLS RICHARD BRENT; WONG CHIN PAN; KARANJA MARTIN THUO; DORAN PATRICK JOSEPH; GRAHAM DAVID JAMES

G10L 25/78

Abstract

A digital signal processor (100) receives a digitally vocoded signal (102), and calculates a staggered average value (404) from the frame energy of each received frame, or the product of the frame energy and a voicing value. While the staggered average value is above a threshold voice indicator value, speech is declared present.
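The detection rule summarized in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the initial average of zero, and the default scaling factor `a=0.9` are assumptions made for the example, and the decay step uses the form y(n) = a·y(n-1) + (1-a)·x(n) given in the dependent claims.

```python
def staggered_average(prev_avg, x, a=0.9):
    """One update step of the staggered average.

    If the new input exceeds the present average, the average jumps to it
    immediately; otherwise the average decays toward the input using
    y(n) = a*y(n-1) + (1-a)*x(n), with a between zero and one.
    """
    if x > prev_avg:
        return x
    return a * prev_avg + (1.0 - a) * x


def detect_speech(frame_energies, threshold, a=0.9, voicings=None):
    """Return one boolean per frame: True while speech is declared present.

    frame_energies: per-frame energy values taken from the vocoded signal.
    voicings: optional per-frame voicing values; when given, the input to
              the average is the product energy * voicing.
    The starting average of 0.0 is an assumption of this sketch.
    """
    avg = 0.0
    flags = []
    for i, energy in enumerate(frame_energies):
        x = energy if voicings is None else energy * voicings[i]
        avg = staggered_average(avg, x, a)
        flags.append(avg > threshold)
    return flags
```

With `a=0.5` and a threshold of 2.0, a single energetic frame of 10.0 followed by silence keeps the speech flag raised for two further frames while the average decays through 5.0 and 2.5. That fast-rise, slow-fall behavior is the point of the staggered average.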

Claims

exact text as granted — not AI-modified

What is claimed is:

1. A method for detecting speech in a vocoded signal, comprising the steps of: receiving a vocoded signal having a succession of frames, each frame containing audio information and a corresponding frame energy value; calculating a staggered average value derived from the frame energy value by: comparing a current frame energy value with a present staggered average value; if the current frame energy value is greater than the present staggered average value, setting the staggered average value equal to the current frame energy value; and if the current frame energy value is less than the present staggered average value, calculating a current staggered average value by reducing the present staggered average value by an averaging factor; providing a threshold voice indicator value; and declaring speech present when the staggered average value is greater than the threshold voice indicator value.

2. A method for detecting speech as defined in claim 1, wherein in the step of calculating, the averaging factor has a form of y(n)=a·y(n-1)+(1-a)·x(n), where: y(n) is the current staggered average value; a is a scaling factor having a value from zero to one; y(n-1) is the present staggered average value; and x(n) is the current frame energy value.

3. A method for detecting speech as defined in claim 2, wherein in the step of calculating, the scaling factor has a value dependent on the current frame energy value.

4. A method for detecting speech as defined in claim 3, wherein in the step of calculating, the value of the scaling factor is dependent on a range of the current frame energy value.

5. A method for detecting speech as defined in claim 1, wherein the vocoded signal comprises a voicing value with each frame, in the step of calculating the staggered average value, the staggered average value is the product of the frame energy value and the voicing value.

6. A method for detecting speech as defined in claim 5, wherein the step of calculating a staggered average comprises: comparing a product of a current frame energy value and a current voicing value with a present staggered average value; if the product is greater than the present staggered average value, setting the staggered average value equal to the product; and if the product is less than the present staggered average value, calculating a current staggered average value by reducing the present staggered average value by an averaging factor.

7. A method for detecting speech as defined in claim 6, wherein in the step of calculating, the averaging factor has the form of y[n]=a·y(n-1)+(1-a)·x(n), where: y(n) is the current staggered average value; a is a scaling factor having a value from zero to one; y(n-1) is the present staggered average value; and x(n) is the product of the current frame energy value and the current voicing value.

8. A method for detecting speech as defined in claim 6, wherein in the step of calculating, the scaling factor has a value dependent on the current frame energy value.

9. A method for detecting speech as defined in claim 8, wherein in the step of calculating, the value of the scaling factor is dependent on a range of the current frame energy value.

10. A method for detecting speech as defined in claim 1, wherein in the step of declaring speech, the threshold voice indicator value is a constant value.

11. A method for detecting speech as defined in claim 1, wherein the step of providing a threshold voice indicator value comprises calculating a running average of the frame energy when the staggered average value is below a previous threshold voice indicator value and a voicing value corresponding to the frame energy value indicates an unvoiced frame.

12. A method for detecting speech in a vocoded signal, comprising the steps of: receiving a vocoded signal having a succession of frames, each frame containing audio information and a corresponding frame energy value and a voicing value; calculating a staggered average value derived from a product of the frame energy value and the voicing value by: comparing a current frame energy value with a present staggered average value; if the current frame energy value is greater than the present staggered average value, setting the staggered average value equal to the current frame energy value; and if the current frame energy value is less than the present staggered average value, calculating a current staggered average value by reducing the present staggered average value by an averaging factor; providing a threshold voice indicator value; and declaring speech present when the staggered average value is greater than the threshold voice indicator value.

13. A method for detecting speech as defined in claim 12, wherein in the step of calculating, the averaging factor has the form of y[n]=a·y(n-1)+(1-a)·x(n), where: y(n) is the current staggered average value; a is a scaling factor having a value from zero to one; y(n-1) is the present staggered average value; and x(n) is the product of the current frame energy value and the current voicing value.

14. A method for detecting speech as defined in claim 13, wherein in the step of calculating, the scaling factor has a value dependent on the current frame energy value.

15. A method for detecting speech as defined in claim 14, wherein in the step of calculating, the value of the scaling factor is dependent on a range of the current frame energy value.

16. A method for detecting speech as defined in claim 14, wherein in the step of declaring speech, the threshold voice indicator value is a constant value.

17. A method for detecting speech as defined in claim 14, wherein the step of providing a threshold voice indicator value comprises calculating a running average of the frame energy when the staggered average value is below a previous threshold voice indicator value and a voicing value corresponding to the frame energy value indicates an unvoiced frame.
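Two refinements recited in the dependent claims, a scaling factor that depends on the range of the current frame energy and a threshold derived from a running average of frame energy, might be sketched as below. The claims do not give numeric ranges, the form of the running average, or how an unvoiced frame is encoded, so the breakpoints, the exponential average with factor `b`, and the convention that a voicing value of 0 means unvoiced are all assumptions of this example.

```python
def scaling_factor(frame_energy):
    """Pick the scaling factor a from the range the frame energy falls in.

    The breakpoints (1.0, 10.0) and returned values are hypothetical;
    the claims only say that a depends on a range of the frame energy.
    """
    if frame_energy < 1.0:
        return 0.5
    if frame_energy < 10.0:
        return 0.8
    return 0.95


def update_threshold(threshold, staggered_avg, frame_energy, voicing, b=0.9):
    """Adapt the threshold voice indicator value for one frame.

    The threshold is updated only when the staggered average is below the
    previous threshold AND the frame is unvoiced (assumed here to mean
    voicing == 0). The update is an exponential running average of the
    frame energy, which is an assumed form; otherwise the previous
    threshold is returned unchanged.
    """
    if staggered_avg < threshold and voicing == 0:
        return b * threshold + (1.0 - b) * frame_energy
    return threshold
```

Under these assumptions the threshold tracks the background level only during frames that look like non-speech, so it is not pulled upward while speech is being declared.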
