US7962340B2 · Expired · Utility · PatentIndex 82

Methods and apparatus for buffering data for use in accordance with a speech recognition system

Assignee: NUANCE COMMUNICATIONS INC · Priority: Aug 22, 2005 · Filed: Aug 22, 2005 · Granted: Jun 14, 2011

Est. expiry: Aug 22, 2025 (expired) · nominal 20-yr term from priority

Inventors: COMERFORD LIAM D; FRANK DAVID CARL; LEWIS BURN L; RACHEVKSY LEONID; VISWANATHAN MAHESH

G10L 15/28


Abstract

Techniques are disclosed for overcoming errors in speech recognition systems. For example, a technique for processing acoustic data in accordance with a speech recognition system comprises the following steps/operations. Acoustic data is obtained in association with the speech recognition system. The acoustic data is recorded using a combination of a first buffer area and a second buffer area, such that the recording of the acoustic data using the combination of the two buffer areas at least substantially minimizes one or more truncation errors associated with operation of the speech recognition system.
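The claims below describe the two buffer areas as a linear buffer and a circular buffer: audio is written to the linear buffer, and when that buffer fills, its content is appended to the circular buffer and the write position returns to the start of the linear buffer. A minimal sketch of that recording loop follows. The class name `DualBufferRecorder`, the method names, the buffer sizes, and the use of `collections.deque` as the circular buffer are all assumptions made for illustration and do not come from the patent.

```python
from collections import deque


class DualBufferRecorder:
    """Hypothetical sketch: a linear staging buffer that spills into a
    circular history buffer whenever it fills."""

    def __init__(self, linear_size, circular_size):
        self.linear = [0] * linear_size   # fixed-size linear buffer
        self.write_pos = 0                # current write position in the linear buffer
        # A deque with maxlen drops its oldest items when full, which
        # stands in for a circular buffer here (an assumption).
        self.circular = deque(maxlen=circular_size)

    def record(self, samples):
        """Write samples into the linear buffer, spilling when it is full."""
        for sample in samples:
            self.linear[self.write_pos] = sample
            self.write_pos += 1
            if self.write_pos == len(self.linear):   # linear buffer is full
                self.circular.extend(self.linear)    # append its content to the circular buffer
                self.write_pos = 0                   # reset to the start of the linear buffer

    def snapshot(self):
        """Return the retained audio in recording order: circular history
        followed by the not-yet-spilled part of the linear buffer."""
        return list(self.circular) + self.linear[:self.write_pos]


recorder = DualBufferRecorder(linear_size=4, circular_size=8)
recorder.record(range(1, 11))
print(recorder.snapshot())
```

In this sketch the circular buffer keeps a bounded window of older audio, so data recorded before a start indication is still available when it is needed.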

Claims

exact text as granted — not AI-modified

1. A method for processing acoustic data in accordance with a speech recognition system, the method comprising acts of:
 recording acoustic data in at least one recording medium; 
 detecting, at a first time, an indication to start speech recognition processing, the first time corresponding to a first location of the recorded acoustic data recorded in the at least one recording medium; 
 determining whether the acoustic data recorded at the first time at which the indication to start speech recognition processing was detected is likely to be in a silence region or a speech region; 
 if it is determined that the acoustic data recorded at the first time is likely to be in a speech region, analyzing at least some of the recorded acoustic data before the first location to determine a starting location for speech recognition processing; and 
 if it is determined that the acoustic data recorded at the first time is likely to be in a silence region, analyzing the recorded acoustic data only after the first location to determine a starting location for speech recognition processing. 
 
     
     
2. The method of claim 1, further comprising:
 starting speech recognition processing of the recorded acoustic data at the first location without accessing the recorded acoustic data before the first location to determine a starting location for speech recognition processing, when it is determined that the acoustic data recorded at the first time is likely to be in a silence region. 
 
     
     
3. The method of claim 1, further comprising:
 identifying at least one silence-to-speech transition and at least one speech-to-silence transition in the recorded acoustic data, wherein the act of determining whether the acoustic data recorded at the first time is likely to be in a silence region or a speech region comprises determining whether a last transition before the first location is a silence-to-speech transition or a speech-to-silence transition. 
 
     
     
4. The method of claim 1, further comprising:
 stopping speech recognition processing of the recorded acoustic data at a first speech-to-silence transition after the first location. 
 
     
     
5. The method of claim 1, wherein the act of recording acoustic data in the at least one recording medium comprises:
 analyzing a portion of acoustic data recorded in the at least one recording medium to detect at least one silence-to-speech transition or at least one speech-to-silence transition in the portion of acoustic data recorded in the at least one recording medium; and 
 storing an indication of a location of the detected at least one silence-to-speech transition or at least one speech-to-silence transition. 
 
     
     
6. The method of claim 1, wherein the at least one recoding medium comprises a circular buffer and a linear buffer, and wherein the act of recording acoustic data in the at least one recording medium comprises:
 recoding acoustic data in the linear buffer; 
 detecting whether the linear buffer is full; and 
 when it is detected that the linear buffer is full, appending at least some content of the linear buffer to the circular buffer and setting a current write position to a start of the linear buffer. 
 
     
     
7. At least one computer readable memory encoded with instructions that, when executed, perform a method for processing acoustic data in accordance with a speech recognition system, the method comprising acts of:
 recording acoustic data in at least one recording medium; 
 detecting, at a first time, an indication to start speech recognition processing, the first time corresponding to a first location of the recorded acoustic data recorded in the at least one recording medium; 
 determining whether the acoustic data recorded at the first time at which the indication to start speech recognition processing was detected is likely to be in a silence region or a speech region; 
 if it is determined that the acoustic data recorded at the first time is likely to be in a speech region, analyzing at least some of the recorded acoustic data before the first location to determine a starting location for speech recognition processing; and 
 if it is determined that the acoustic data recorded at the first time is likely to be in a silence region, analyzing the recorded acoustic data only after the first location to determine a starting location for speech recognition processing. 
 
     
     
8. The at least one computer readable memory of claim 7, wherein the method further comprises:
 starting speech recognition processing of the recorded acoustic data at the first location without accessing the recorded acoustic data before the first location to determine a starting location for speech recognition processing, when it is determined that the acoustic data recorded at the first time is likely to be in a silence region. 
 
     
     
9. The at least one computer readable memory of claim 7, wherein the act of recording acoustic data in the at least one recording medium comprises:
 analyzing a portion of acoustic data recorded in the at least one recording medium to detect at least one silence-to-speech transition or at least one speech-to-silence transition in the portion of acoustic data recorded in the at least one recording medium; and 
 storing an indication of a location of the detected at least one silence-to-speech transition or at least one speech-to-silence transition. 
 
     
     
10. The at least one computer readable memory of claim 7, wherein the at least one recoding medium comprises a circular buffer and a linear buffer, and wherein the act of recording acoustic data in the at least one recording medium comprises:
 recoding acoustic data in the linear buffer; 
 detecting whether the linear buffer is full; and 
 when it is detected that the linear buffer is full, appending at least some content of the linear buffer to the circular buffer and setting a current write position to a start of the linear buffer. 
 
     
     
11. The at least one computer readable memory of claim 7, wherein the method further comprises:
 identifying at least one silence-to-speech transition and at least one speech-to-silence transition in the recorded acoustic data, wherein the act of determining whether the acoustic data recorded at the first time is likely to be in a silence region or a speech region comprises determining whether a last transition before the first location is a silence-to-speech transition or a speech-to-silence transition. 
 
     
     
12. The at least one computer readable memory of claim 7, wherein the method further comprises:
 stopping speech recognition processing of the recorded acoustic data at a first speech-to-silence transition after the first location. 
 
     
     
13. A system for processing acoustic data in accordance with a speech recognition system, the system comprising:
 at least one memory for storing executable instructions; 
 at least one processor programmed by the executable instructions to:
 record acoustic data in at least one recording medium; 
 detect, at a first time, an indication to start speech recognition processing, the first time corresponding to a first location of the recorded acoustic data recorded in the at least one recording medium; 
 determine whether the acoustic data recorded at the first time at which the indication to start speech recognition processing was detected is likely to be in a silence region or a speech region; 
 if it is determined that the acoustic data recorded at the first time is likely to be in a speech region, analyze at least some of the recorded acoustic data before the first location to determine a starting location for speech recognition processing; and 
 if it is determined that the acoustic data recorded at the first time is likely to be in a silence region, analyze the recorded acoustic data only after the first location to determine a starting location for speech recognition processing. 
 
 
     
     
14. The system of claim 13, wherein the at least one processor is further programmed to:
 start speech recognition processing of the recorded acoustic data at the first location without accessing the recorded acoustic data before the first location to determine a starting location for speech recognition processing, when it is determined that the acoustic data recorded at the first time is likely to be in a silence region. 
 
     
     
15. The system of claim 13, wherein the at least one processor is further programmed to:
 identify at least one silence-to-speech transition and at least one speech-to-silence transition in the recorded acoustic data; and 
 determine whether the acoustic data recorded at the first time is likely to be in a silence region or a speech region at least in part by determining whether a last transition before the first location is a silence-to-speech transition or a speech-to-silence transition. 
 
     
     
16. The system of claim 13, wherein the at least one processor is further programmed to:
 stop speech recognition processing of the recorded acoustic data at a first speech-to-silence transition after the first location. 
 
     
     
17. The system of claim 13, wherein the at least one processor is further programmed to record acoustic data in the at least one recording medium at least in part by:
 analyzing a portion of acoustic data recorded in the at least one recording medium to detect at least one silence-to-speech transition or at least one speech-to-silence transition in the portion of acoustic data recorded in the at least one recording medium; and 
 storing an indication of a location of the detected at least one silence-to-speech transition or at least one speech-to-silence transition. 
 
     
     
18. The system of claim 13, wherein the at least one recoding medium comprises a circular buffer and a linear buffer, and wherein the at least one processor is further programmed to record acoustic data in the at least one recording medium at least in part by:
 recoding acoustic data in the linear buffer; 
 detecting whether the linear buffer is full; and 
 when it is detected that the linear buffer is full, appending at least some content of the linear buffer to the circular buffer and setting a current write position to a start of the linear buffer.
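
The start and stop logic recited in claims 1, 3, and 4 can be illustrated with a short sketch. It assumes that transitions have already been detected and stored as a sorted list of `(location, kind)` pairs, which is one possible reading of the stored transition indications in claim 5. The function name `find_segment`, the constants, the tuple layout, and the fallback of starting at the trigger location when no later onset is known are all hypothetical choices, not details taken from the patent.

```python
from bisect import bisect_right

SILENCE_TO_SPEECH = "silence_to_speech"
SPEECH_TO_SILENCE = "speech_to_silence"


def find_segment(transitions, trigger_loc):
    """Pick (start, end) locations for recognition.

    transitions: list of (location, kind) pairs sorted by location (assumed format).
    trigger_loc: location in the recording where the start indication was detected.
    end is None when no speech-to-silence transition follows the trigger yet.
    """
    locations = [loc for loc, _ in transitions]
    # transitions[:idx] lie at or before the trigger, transitions[idx:] after it.
    idx = bisect_right(locations, trigger_loc)

    # The trigger is treated as inside speech if the last transition
    # at or before it was silence-to-speech.
    in_speech = idx > 0 and transitions[idx - 1][1] == SILENCE_TO_SPEECH

    if in_speech:
        # Look backward: start at the onset of the utterance already under way.
        start = transitions[idx - 1][0]
    else:
        # Look forward only: start at the next onset, or at the trigger
        # itself if no later onset is known (an assumption of this sketch).
        start = next(
            (loc for loc, kind in transitions[idx:] if kind == SILENCE_TO_SPEECH),
            trigger_loc,
        )

    # Stop at the first speech-to-silence transition after the trigger.
    end = next(
        (loc for loc, kind in transitions[idx:] if kind == SPEECH_TO_SILENCE),
        None,
    )
    return start, end


marks = [
    (100, SILENCE_TO_SPEECH),
    (250, SPEECH_TO_SILENCE),
    (400, SILENCE_TO_SPEECH),
    (620, SPEECH_TO_SILENCE),
]
print(find_segment(marks, 450))  # trigger arrives after the user started speaking
print(find_segment(marks, 300))  # trigger arrives during silence
```

With these hypothetical marks, a late trigger at location 450 reaches back to the onset at 400, while an early trigger at 300 never inspects anything before the trigger when choosing the start.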
