US5500920A · Expired · Utility

Semantic co-occurrence filtering for speech recognition and signal transcription applications

Assignee: XEROX CORP
Priority: Sep 23, 1993
Filed: Sep 30, 1994
Granted: Mar 19, 1996
Est. expiry: Sep 23, 2013 (expired) · nominal 20-yr term from priority
Inventors: KUPIEC JULIAN M
Classifications: G10L 15/1815, G10L 15/22
PatentIndex Score: 98 · Cited by: 466 · References: 39 · Claims: 22

Abstract

A system and method for automatically transcribing an input question from a form convenient for user input into a form suitable for use by a computer. The question is a sequence of words represented in a form convenient for the user, such as a spoken utterance or a handwritten phrase. The question is transduced into a signal that is converted into a sequence of symbols. A set of hypotheses is generated from the sequence of symbols. The hypotheses are sequences of words represented in a form suitable for use by the computer, such as text. One or more information retrieval queries are constructed and executed to retrieve documents from a corpus (database). Retrieved documents are analyzed to produce an evaluation of the hypotheses of the set and to select one or more preferred hypotheses from the set. The preferred hypotheses are output to a display, speech synthesizer, or applications program. Additionally, retrieved documents relevant to the preferred hypotheses can be selected and output.
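
The evaluation step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the whitespace tokenizer, the window size, and the additive combination of initial score and evidence count are all assumptions made for illustration.

```python
def cooccurrence_count(words, documents, window=10):
    """Count documents in which all `words` appear within `window` consecutive tokens.

    Assumption: documents are plain strings tokenized on whitespace.
    """
    target = {w.lower() for w in words}
    hits = 0
    for doc in documents:
        tokens = doc.lower().split()
        # Slide a fixed-size window over the tokens and look for all target words.
        if any(target <= set(tokens[i:i + window]) for i in range(len(tokens))):
            hits += 1
    return hits


def select_preferred(hypotheses, documents, window=10, evidence_weight=1.0):
    """Pick the hypothesis with the best revised score.

    `hypotheses` is a list of (word_tuple, initial_score) pairs.
    Assumption: revised score = initial score + weight * co-occurrence count.
    """
    best = None
    for words, initial in hypotheses:
        revised = initial + evidence_weight * cooccurrence_count(words, documents, window)
        if best is None or revised > best[1]:
            best = (words, revised)
    return best


corpus = [
    "president kennedy was shot in dallas",
    "the present candy shop opened",
]
candidates = [
    (("present", "kennedy"), 0.6),    # higher initial score, no corpus support
    (("president", "kennedy"), 0.5),  # lower initial score, one supporting document
]
preferred = select_preferred(candidates, corpus)
```

In this toy run the hypothesis with the lower initial score is preferred because the corpus contains a co-occurrence that confirms it. A real system would likely need a more careful tokenizer and a tuned scoring function.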

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. An automated transcription disambiguation method comprising the steps of: providing an input question having first and second words to a processor in a form subject to misinterpretation by the processor;   generating a plurality of hypotheses with the processor, the hypotheses including alternative interpretations of at least one of the first and second words due to possible misinterpretations of the input question by the processor;   producing with the processor an initial evaluation of the hypotheses;   gathering confirming evidence for the hypotheses by searching with the processor in a text corpus for co-occurrences of hypothesized first and second words for the hypotheses;   automatically and explicitly selecting with the processor from among the plurality of hypotheses a preferred hypothesis as to both of the first and second words based at least in part on the initial evaluation and at least in part on the gathered confirming evidence; and   outputting a transcription result from the processor, the transcription result representing the selected preferred hypothesis.   
     
     
       2. In the operation of a system comprising a processor, an input transducer, an output facility, and a corpus comprising at least one document comprising words represented in a first form, a method for transcribing an input question by transforming the input question from a sequence of words represented in a second form, subject to misinterpretation by the processor, into a sequence of words represented in the first form, the method comprising the steps of: accepting the input question into the system, the question comprising a sequence of words represented in the second form;   converting the input question into a signal with the input transducer;   converting the signal into a sequence of symbols with the processor;   generating a set of hypotheses from the sequence of symbols with the processor, the hypotheses of the set comprising sequences of words represented in the first form, the set of hypotheses including alternative interpretations of at least one of the words to account for possible misinterpretation of the input question;   producing with the processor an initial evaluation of the hypotheses;   automatically constructing a query from hypotheses of the set with the processor;   executing the constructed query by searching with the processor in the corpus for co-occurrences of hypothesized words for the hypotheses;   analyzing the co-occurrences and the initial evaluation with the processor to produce a revised evaluation of the hypotheses of the set;   automatically and explicitly selecting a preferred hypothesis from the set with the processor responsively to the revised evaluation, the preferred hypothesis comprising a preferred sequence of words in the first form and thus a preferred transcription of the sequence of words of the input question; and   outputting the preferred hypothesis with the output facility.   
     
     
       3. The method of claim 2 wherein: the corpus includes a plurality of documents;   the step of executing the constructed query includes retrieving documents containing the co-occurrences;   the step of automatically and explicitly selecting the preferred hypothesis further comprises selecting with the processor a preferred set of documents, the preferred set of documents comprising a subset of the retrieved documents that are relevant to the preferred hypothesis, and   the step of outputting the preferred hypothesis further comprises outputting with the output facility at least a portion of a document belonging to the preferred set of documents.   
     
     
       4. The method of claim 3 further comprising the steps, performed after the step of outputting at least a portion of a document belonging to the preferred set of documents, of: accepting a relevance feedback input into the system, the relevance feedback input comprising a sequence of words represented in the second form, the sequence of words including a relevance feedback keyword and a word that occurs in the outputted document;   converting the relevance feedback input into an additional query with the processor; and   executing the additional query with the processor to retrieve an additional document from the corpus.   
     
     
       5. The method of claim 2 wherein: the step of automatically and explicitly selecting the preferred hypothesis further comprises selecting a plurality of preferred hypotheses with the processor; and   the step of outputting the preferred hypothesis further comprises outputting the selected plurality of preferred hypotheses with the output facility.   
     
     
       6. The method of claim 2 wherein: the step of accepting an input question further comprises accepting information into the system, the information concerning the locations of word boundaries between words of the question; and   the step of converting the signal into a sequence of symbols further comprises specifying subsequences of the sequence of symbols with the processor according to the locations of word boundaries thus accepted.   
     
     
       7. The method of claim 2 wherein the step of generating a set of hypotheses from the sequence of symbols further comprises generating hypothesized locations of word boundaries with the processor. 
     
     
       8. The method of claim 2 wherein the step of converting the input question into a signal comprises converting spoken input into an audio signal with an audio transducer. 
     
     
       9. The method of claim 2 wherein the step of constructing a query from hypotheses of the set comprises constructing a Boolean query with a proximity constraint. 
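
Illustrative note, not part of the claim language: one way to picture a Boolean query with a proximity constraint is as an AND over OR-groups of alternative words, all of which must fall inside a bounded token window. The Python sketch below assumes a hypothetical dictionary representation and a whitespace tokenizer; the patent does not prescribe either.

```python
def build_proximity_query(alternatives_per_slot, max_distance):
    """Build an AND-of-ORs query with a proximity constraint.

    `alternatives_per_slot` holds one list of candidate words per word position.
    The dict layout is a hypothetical representation chosen for this sketch.
    """
    return {
        "op": "AND",
        "within": max_distance,
        "terms": [
            {"op": "OR", "terms": sorted({w.lower() for w in alts})}
            for alts in alternatives_per_slot
        ],
    }


def matches(query, text):
    """Return True if some window of `within` tokens contains a word from every OR-group."""
    tokens = text.lower().split()
    groups = [set(group["terms"]) for group in query["terms"]]
    width = query["within"]
    for i in range(len(tokens)):
        span = set(tokens[i:i + width])
        if all(span & group for group in groups):
            return True
    return False


query = build_proximity_query([["president", "present"], ["kennedy", "canada"]], 5)
near = matches(query, "President Kennedy visited Berlin")
far = matches(query, "present " + "x " * 10 + "kennedy")
```

Here `near` is true because both slots are satisfied within five tokens, while `far` is false because the two words are more than five tokens apart.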
     
     
       10. The method of claim 2 wherein the step of generating a set of hypotheses from the sequence of symbols comprises detecting a keyword with the processor to prevent inclusion of the keyword in hypotheses of the set. 
     
     
       11. The method of claim 10 wherein the step of constructing a query from hypotheses of the set comprises constructing a query from hypotheses of the set with the processor, the query being responsive to the detected keyword. 
     
     
       12. The method of claim 2 wherein the step of constructing a query from hypotheses of the set comprises constructing an initial query with the processor and prior to the outputting step automatically constructing a reformulated query with the processor, the reformulated query comprising a reformulation of the initial query. 
     
     
       13. The method of claim 2 wherein the step of outputting the preferred hypothesis comprises visually displaying the preferred hypothesis. 
     
     
       14. The method of claim 2 wherein the step of outputting the preferred hypothesis comprises synthesizing a spoken form of the preferred hypothesis. 
     
     
       15. The method of claim 2 wherein the step of outputting the preferred hypothesis comprises providing the preferred hypothesis to an applications program. 
     
     
       16. The method of claim 15 further comprising the step of accepting the preferred hypothesis into the applications program as textual input to the applications program. 
     
     
       17. The method of claim 2 wherein the step of producing an initial evaluation comprises determining an initial evaluation measurement for each hypothesis. 
     
     
       18. In a system comprising a processor, a method for processing an input utterance comprising speech, the method comprising the steps of: accepting the input utterance into the system;   producing a phonetic transcription of the input utterance with the processor;   responsively to the phonetic transcription, generating with the processor a set of hypotheses, the hypotheses of the set being hypotheses as to a first word contained in the input utterance and further as to a second word contained in the input utterance, the set of hypotheses including alternative interpretations of at least one of the words to account for the error-prone nature of speech analysis;   determining with the processor an initial evaluation measurement for each hypothesis;   automatically constructing an information retrieval query with the processor, the query comprising the set of hypotheses and a proximity constraint;   executing the constructed query in conjunction with an information retrieval subsystem comprising a text corpus; and   responsively to the results of the executed query with respect to each hypothesis of the set of hypotheses, and taking into consideration the initial evaluation measurements of the hypotheses, automatically and explicitly selecting with the processor from among the hypotheses of the set a preferred hypothesis, the preferred hypothesis including the first and second words.   
     
     
       19. The method of claim 18 wherein the step of generating a set of hypotheses comprises matching portions of the phonetic transcription against a phonetic index with the processor. 
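
Illustrative note, not part of the claim language: matching portions of a phonetic transcription against a phonetic index can be sketched as a lookup of every contiguous phone subsequence in a table that maps phone sequences to words. The Python example below assumes exact matching and made-up phone symbols; a practical system would presumably also tolerate recognition errors, which this sketch does not attempt.

```python
def match_phonetic_index(phones, index):
    """Return (start, end, word) for each index entry equal to phones[start:end].

    `index` maps tuples of phone symbols to lists of words (hypothetical layout).
    Only exact matches are found; error-tolerant matching is out of scope here.
    """
    max_len = max((len(key) for key in index), default=0)
    found = []
    for start in range(len(phones)):
        last_end = min(len(phones), start + max_len)
        for end in range(start + 1, last_end + 1):
            for word in index.get(tuple(phones[start:end]), ()):
                found.append((start, end, word))
    return found


phonetic_index = {
    ("r", "eh", "d"): ["red", "read"],  # homophones share one entry
    ("k", "ae", "t"): ["cat"],
}
transcription = ["r", "eh", "d", "k", "ae", "t"]
word_candidates = match_phonetic_index(transcription, phonetic_index)
```

Each returned triple records where in the transcription a word could begin and end, so overlapping or homophonous candidates can later be combined into competing hypotheses.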
     
     
       20. In a system comprising a processor, an error-prone input facility, and an information retrieval subsystem, said information retrieval subsystem comprising a natural-language text corpus, a method for accessing documents of the corpus, the method comprising the steps of: transcribing a question with the error-prone input facility and the processor, the question comprising a sequence of words;   selecting a subset of words of the sequence with the processor;   forming with the processor a plurality of hypotheses about the selected subset of words, the hypotheses of the plurality representing possible alternative transcriptions of the question to account for the error-prone nature of the input facility;   producing with the processor an initial evaluation of the hypotheses;   automatically constructing a co-occurrence query with the processor, the co-occurrence query being based on hypotheses of the plurality;   executing the co-occurrence query in conjunction with the information retrieval subsystem to retrieve a set of documents;   analyzing the initial evaluation and documents of the retrieved set with the processor to produce a revised evaluation of the hypotheses;   responsively to the revised evaluation, automatically and explicitly selecting with the processor a preferred hypothesis representing a preferred transcription of the sequence of words of the question;   evaluating documents of the retrieved set with the processor with respect to the selected hypothesis to determine a relevant document; and   outputting from the system the relevant document thus determined.   
     
     
       21. An automated system for producing a preferred transcription of a question presented in a form prone to erroneous transcription, comprising: a processor;   an input transducer, coupled to the processor, for accepting an input question and producing a signal therefrom;   converter means, coupled to the input transducer, for converting the signal to a string comprising a sequence of symbols;   hypothesis generation means, coupled to the converter means, for developing a set of hypotheses from the string, each hypothesis of the set comprising a sequence of word representations, the set of hypotheses representing a set of possible alternative transcriptions of the input question to account for the likelihood of erroneous transcription;   initial scoring means, coupled to the hypothesis generation means, for determining an initial score for each hypothesis;   query construction means, coupled to the hypothesis generation means, for automatically constructing at least one information retrieval query using hypotheses of the set;   a corpus comprising documents, each document comprising word representations;   query execution means, coupled to the query construction means and to the corpus, for retrieving from the corpus documents responsive to said at least one query;   analysis means, coupled to the query execution means, for generating an analysis of the retrieved documents and evaluating the hypotheses of the set based on the initial scores and the analysis to determine a preferred hypothesis from among the hypotheses of the set, the preferred hypothesis representing a preferred transcription of the sequence of words of the input question; and   output means, coupled to the analysis means, for outputting the preferred hypothesis.   
     
     
       22. A speech processing apparatus comprising: input means for transducing a spoken utterance into an audio signal;   means for converting the audio signal into a sequence of phones;   means for analyzing the sequence of phones to generate a plurality of hypotheses comprising sequences of words, the hypotheses representing possible alternative transcriptions of the spoken utterance to account for the error-prone nature of speech analysis;   means for determining an initial evaluation measurement for each hypothesis;   means for automatically constructing a query using the hypotheses of the plurality;   information retrieval means, coupled to a corpus of documents and to the constructing means, for retrieving documents of the corpus relevant to the constructed query;   means for automatically and explicitly ranking the hypotheses of the plurality according to confirming evidence found in the retrieved documents and further according to the initial evaluation measurements previously determined; and   means for outputting a subset of the hypotheses thus ranked, each hypothesis of the subset comprising a sequence of words representing a possible transcription of the spoken utterance.
