USRE37684E · Expired · Utility · PatentIndex 91

Computerized system for teaching speech

Assignee: DIGISPEECH ISRAEL LTD · Priority: Jan 21, 1993 · Filed: May 9, 1997 · Granted: Apr 30, 2002

Est. expiry: Jan 21, 2013 (expired) · nominal 20-yr term from priority

Inventors: SHPIRO ZEEV; GRONER GABRIEL F; ORDENTLICH ERIK

G09B 19/06 · G09B 19/04 · G09B 7/04 · G06F 9/00

Abstract

Apparatus for interactive speech training having an audio specimen generator for playing a pre-recorded reference audio specimen to a user for attempted repetition by the user and a reference audio specimen library in which reference audio specimens are stored and to which the audio specimen generator has access. The audio specimen library contains a multiplicity of recordings of speaker dependent audio specimens produced by a plurality of speech models. A speaker independent parameter database stores a plurality of speaker independent references which are different from the reference audio specimens stored in the reference audio specimen library. The speaker independent references are classified according to at least one of age, gender or dialect, but are independent of other speaker characteristics within each category. An audio specimen scorer scores a user's repetition audio specimen by comparison of at least one parameter of the user's repetition audio specimen with a speaker independent reference.
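As an illustration only, the arrangement described in the abstract can be sketched in Python. This is not the patented implementation: the names `SpeakerIndependentReference`, `ParameterDatabase` and `score_repetition`, the (age group, gender, dialect) key, and the 0 to 100 deviation-based score are all assumptions made for the example. The sketch only shows that the scorer consults category-keyed parameter statistics rather than the recording that was played to the user.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical category key: (age group, gender, dialect).
Category = Tuple[str, str, str]


@dataclass(frozen=True)
class SpeakerIndependentReference:
    """Parameter statistics assumed to be pooled over many speakers in one category."""
    mean: Tuple[float, ...]
    spread: Tuple[float, ...]  # per-parameter tolerance, assumed to be > 0


class ParameterDatabase:
    """Stores speaker independent references, kept apart from the played recordings."""

    def __init__(self) -> None:
        self._refs: Dict[Tuple[str, Category], SpeakerIndependentReference] = {}

    def add(self, prompt_id: str, category: Category,
            ref: SpeakerIndependentReference) -> None:
        self._refs[(prompt_id, category)] = ref

    def lookup(self, prompt_id: str, category: Category) -> SpeakerIndependentReference:
        return self._refs[(prompt_id, category)]


def score_repetition(user_params: Tuple[float, ...],
                     ref: SpeakerIndependentReference) -> float:
    """Map the mean normalized deviation from the reference to a 0..100 score."""
    if len(user_params) != len(ref.mean):
        raise ValueError("parameter vectors must have the same length")
    deviation = sum(
        abs(u - m) / s for u, m, s in zip(user_params, ref.mean, ref.spread)
    ) / len(ref.mean)
    return max(0.0, 100.0 * (1.0 - deviation))
```

In use, the database would be queried with the prompt and the category chosen for the user, and the returned reference passed to `score_repetition` together with parameters extracted from the user's recording.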

Claims

exact text as granted — not AI-modified

We claim:

1. Apparatus for interactive speech training comprising:
an audio specimen generator for playing a pre-recorded reference audio specimen to a user for attempted repetition thereby;
a reference audio specimen library in which reference audio specimens are stored and to which the audio specimen generator has access, wherein said audio specimen library comprises a multiplicity of recordings of speaker dependent audio specimens produced by a plurality of speech models;
a speaker independent parameter database storing a plurality of speaker independent references which are different from the reference audio specimens stored in said reference audio specimen library; and
an audio specimen scorer for scoring a user's repetition audio specimen by comparison of at least one parameter of the user's repetition audio specimen with a speaker independent reference, said speaker independent reference being characterized in that it is classified in a category according to at least one of age, gender and dialect categories.

2. Apparatus according to claim 1 wherein the audio specimen scorer comprises:
a reference-to-response comparing unit for comparing at least one feature of a user's repetition audio specimen to at least one feature of the reference audio specimen; and
a similarity indicator for providing an output indication of the degree of similarity between at least one repetition audio specimen feature and at least one reference audio specimen feature.

3. Apparatus according to claim 2 and also comprising a user response memory to which the reference-to-response comparing unit has access, for storing a user's repetition of a reference audio specimen.

4. Apparatus according to claim 2 wherein said reference-to-response comparing unit comprises a volume/duration normalizer for normalizing the volume and duration of the reference and repetition audio specimens.
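For illustration, a volume/duration normalizer of the kind named in claim 4 might look like the following Python sketch. The claim does not say how normalization is performed; peak scaling and linear-interpolation resampling are assumptions chosen for brevity, and the function names are hypothetical. Plain resampling also shifts pitch, so a practical system would more likely use a time-alignment method.

```python
from typing import List, Sequence


def normalize_volume(samples: Sequence[float], peak: float = 1.0) -> List[float]:
    """Scale the signal so that its largest absolute sample equals `peak`."""
    largest = max((abs(s) for s in samples), default=0.0)
    if largest == 0.0:
        return list(samples)  # silence or empty input: nothing to scale
    gain = peak / largest
    return [s * gain for s in samples]


def normalize_duration(samples: Sequence[float], target_len: int) -> List[float]:
    """Resample to exactly `target_len` samples using linear interpolation."""
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    n = len(samples)
    if n == 0:
        raise ValueError("cannot resample an empty signal")
    if n == 1 or target_len == 1:
        return [float(samples[0])] * target_len
    step = (n - 1) / (target_len - 1)
    out: List[float] = []
    for i in range(target_len):
        pos = i * step
        lo = min(int(pos), n - 2)  # keep lo + 1 inside the signal
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out
```

Applying both functions to the reference and to the repetition gives two signals of equal peak level and equal length, which can then be compared sample by sample or frame by frame.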

5. Apparatus according to claim 2 wherein said reference-to-response comparing unit comprises a parameterization unit for extracting audio signal parameters from the reference and repetition audio specimens.

6. Apparatus according to claim 5 and wherein said reference-to-response comparing unit also comprises means for comparing the reference audio specimen parameters to the repetition audio specimen parameters.

7. Apparatus according to claim 6 wherein said means for comparing comprises a parameter score generator for providing a score representing the degree of similarity between the audio signal parameters of the reference and repetition audio specimens.

8. Apparatus according to claim 7 wherein said output indication comprises a display of said score.

9. Apparatus according to claim 2 wherein said output indication comprises a display of at least one audio waveform.

10. Apparatus according to claim 1 and also comprising a prompt sequencer operative to generate a sequence of prompts to a user.

11. Apparatus according to claim 1 wherein the plurality of speech models differ from one another in at least one of the following characteristics:
sex;
age; and
dialect.

12. Apparatus according to claim 1 and also comprising a conventional personal computer.

13. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored;
a prompt sequencer operative to generate a sequence of prompts including said reference audio specimens to a user, prompting the user to produce a corresponding sequence of audio specimens, wherein said sequence of prompts branches in response to user performance; and
a reference-to-response comparing unit for comparing at least one feature of each of the sequence of audio specimens generated by the user, to a speaker independent reference from said speaker independent parameter database, said speaker independent reference being characterized in that it is classified in a category according to at least one of age, gender and dialect categories.
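A prompt sequence that "branches in response to user performance", as recited in claim 13, can be sketched as a small graph walk. The `Prompt` record, the `on_pass`/`on_fail` links, the pass mark of 70 and the `max_steps` guard are assumptions made for this example and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Prompt:
    prompt_id: str
    on_pass: Optional[str]  # next prompt when the score reaches the pass mark
    on_fail: Optional[str]  # next prompt (for example, a remedial one) otherwise


def run_sequence(prompts: Dict[str, Prompt], start: str,
                 get_score: Callable[[str], float],
                 pass_mark: float = 70.0, max_steps: int = 20) -> List[str]:
    """Walk the prompt graph, choosing each branch from the user's score."""
    visited: List[str] = []
    current: Optional[str] = start
    while current is not None and len(visited) < max_steps:
        visited.append(current)
        prompt = prompts[current]
        passed = get_score(current) >= pass_mark
        current = prompt.on_pass if passed else prompt.on_fail
    return visited
```

Here `get_score` stands in for the whole play, record and compare cycle; in a real system it would return the result of scoring the user's repetition of the given prompt.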

14. Apparatus according to claim 13 wherein the reference to which an individual user-generated audio specimen is compared comprises a corresponding stored reference audio specimen.

15. Apparatus according to claim 13 wherein the sequence of prompts is at least partly determined by a user's designation of his native language.

16. Apparatus according to claim 13 wherein the prompt sequencer comprises a multi-language prompt sequence library in which a plurality of prompt sequences in a plurality of languages is stored and wherein the prompt sequencer is operative to generate a sequence of prompts in an individual one of the plurality of languages in response to a user's designation of the individual language as his native language.

17. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio Specimens are stored;
a speaker independent parameter database in which speaker independent references are stored;
a prompt sequencer operative to generate a sequence of prompts, including said non-speaker independent references, to a user, prompting the user to produce a corresponding sequence of audio specimens, wherein the sequence of prompts is at least partly determined by a user's designation of his native language; and
a speaker independent reference-to-response comparing unit for comparing at least one feature of each of the sequence of audio specimens generated by the user, to a speaker independent reference from said speaker independent parameter database, said speaker independent reference being characterized in that it is classified in a category according to at least one of age, gender and dialect categories.

18. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored:
apparatus for receiving and storing audio specimens generated by a user in response to reference audio specimens from said reference audio specimen database; and
a speaker independent reference-to-response comparing unit for comparing at least one feature of the audio specimen generated by the user, to a speaker independent reference from said speaker independent parameter database, said speaker independent reference being characterized in that it is classified in a category according to at least one of age, gender and dialect categories, the comparing unit comprising:
an audio specimen segmenter for segmenting a user-generated audio specimen into a plurality of segments; and
a segment comparing unit for comparing at least one feature of at least one of the plurality of segments to a speaker independent reference from said speaker independent parameter database.
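The segmenter and segment comparing unit of claim 18 can be illustrated with a deliberately simple stand-in. The claim does not specify how segments are found; the sketch below splits on runs of frame energy above a floor, which is an assumption and not a phonetic segmentation. The per-segment feature (mean energy) and the `(low, high)` reference ranges are likewise hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def segment_by_energy(frame_energy: Sequence[float],
                      floor: float) -> List[Tuple[int, int]]:
    """Return [start, end) frame index pairs for runs with energy above `floor`."""
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, energy in enumerate(frame_energy):
        if energy > floor and start is None:
            start = i  # a new segment begins
        elif energy <= floor and start is not None:
            segments.append((start, i))  # the current segment ends
            start = None
    if start is not None:
        segments.append((start, len(frame_energy)))
    return segments


def compare_segments(frame_energy: Sequence[float],
                     segments: Sequence[Tuple[int, int]],
                     reference_ranges: Sequence[Tuple[float, float]]) -> List[bool]:
    """Check each segment's mean energy against its (low, high) reference range."""
    results: List[bool] = []
    for (start, end), (low, high) in zip(segments, reference_ranges):
        mean = sum(frame_energy[start:end]) / (end - start)
        results.append(low <= mean <= high)
    return results
```

Reporting a result per segment, rather than one score for the whole utterance, is what lets feedback point at the part of the word that needs work.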

19. Apparatus according to claim 18 wherein said audio specimen segmenter comprises a phonetic segmenter for segmenting a user-generated audio specimen into a plurality of phonetic segments.

20. Apparatus according to claim 19 wherein at least one of the phonetic segments comprises a phoneme.

21. Apparatus according to claim 19 wherein at least one of the phonetic segments comprises a syllable.

22. Apparatus according to claim 20 wherein the phoneme comprises a vowel.

23. Apparatus according to claim 20 wherein the phoneme comprises a consonant.

24. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored:
an audio specimen recorder for recording audio specimens generated by a user in response to reference audio specimens from said reference audio specimen database; and
a speaker-independent audio specimen scorer for scoring a user-generated audio specimen based on at least one speaker-independent reference parameter from said speaker independent parameter database, said speaker independent reference parameter being characterized in that it is classified in a category according to at least one of age, gender and dialect categories.

25. Apparatus according to claim 24 wherein at least one speaker-independent parameter comprises a threshold value for the amount of energy at a predetermined frequency.
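Claim 25 names one concrete kind of speaker independent parameter: a threshold on the amount of energy at a predetermined frequency. One way such a check could be written, offered as a sketch under assumptions, is to correlate the signal with a sine and cosine at that frequency (a single-bin discrete Fourier transform). The function names, the amplitude-squared scaling and the example values are assumptions, not details from the patent.

```python
import math
from typing import Sequence


def energy_at_frequency(samples: Sequence[float], sample_rate: float,
                        freq_hz: float) -> float:
    """Estimate the squared amplitude of the component at `freq_hz`."""
    n = len(samples)
    if n == 0:
        raise ValueError("cannot analyze an empty signal")
    omega = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(omega * i) for i, x in enumerate(samples))
    im = sum(x * math.sin(omega * i) for i, x in enumerate(samples))
    # Scaled so that a pure sine of amplitude A spanning a whole number
    # of cycles yields approximately A squared.
    return (2.0 / n) ** 2 * (re * re + im * im)


def meets_threshold(samples: Sequence[float], sample_rate: float,
                    freq_hz: float, threshold: float) -> bool:
    """Compare the measured energy with a stored threshold value."""
    return energy_at_frequency(samples, sample_rate, freq_hz) >= threshold
```

In this reading, the threshold would be the value held in the speaker independent parameter database for the user's category, and the samples would come from the user's recording.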

26. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored;
apparatus for receiving and storing audio specimens generated by a user in response to reference audio specimens from said reference audio specimen database; and
a speaker independent reference-to-response comparing unit for comparing at least one feature of the audio specimens generated by the user, to a speaker independent reference from said speaker independent parameter database, said speaker independent reference being characterized in that it is classified in a category according to at least one of age, gender and dialect categories.

27. Apparatus for interactive speech training comprising:
an audio specimen generator for playing a pre-recorded reference audio specimen to a user for attempted repetition thereby;
a reference audio specimen library in which reference audio specimens are stored and to which the audio specimen generator has access, wherein said audio specimen library comprises a multiplicity of recordings of speaker dependent audio specimens produced by at least one speech model;
a speaker independent parameter database storing a plurality of speaker independent references which are different from the reference audio specimens stored in said reference audio specimen library; and
an audio specimen scorer for scoring a user's repetition audio specimen by comparison of at least one parameter of the user's repetition audio specimen with a speaker independent reference.

28. Apparatus according claim 27 wherein the audio specimen scorer comprises:
a reference-to-response comparing unit for comparing at least one feature of a user's repetition audio specimen to at least one feature of the reference audio specimen; and
a similarity indicator for providing an output indication of the degree of similarity between at least one repetition audio specimen feature and at least one reference audio specimen feature.

29. Apparatus according to claim 28 and also comprising a user response memory to which the reference-to-response comparing unit has access, for storing a user's repetition of a reference audio specimen.

30. Apparatus according to claim 28 wherein said reference-to-response comparing unit comprises a volume/duration normalizer for normalizing the volume and duration of the reference and repetition audio specimens.

31. Apparatus according to claim 28 wherein said reference-to-response comparing unit comprises a parameterization unit for extracting audio signal parameters from the reference and repetition audio specimen.

32. Apparatus according to claim 31 and wherein said reference-to-response comparing unit also comprises means for comparing the reference audio specimen parameters to the repetition audio specimen parameters.

33. Apparatus according to claim 32 wherein said means for comparing comprises a parameter score generator for providing a score representing degree of similarity between the audio signal parameters of the reference and repetition audio specimens.

34. Apparatus according to claim 33 wherein said output indication comprises a display of said score.

35. Apparatus according to claim 28 wherein said output indication comprises a display of at least one audio waveform.

36. Apparatus according to claim 27 and also comprising a prompt sequencer operative to generate a sequence of prompts to a user.

37. Apparatus according to claim 27 wherein the plurality of speech models differ from one another in at least one of the following characteristics:
sex;
age; and
dialect.

38. Apparatus according to claim 27 and also comprising a conventional personal computer.

39. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored;
a prompt sequencer operative to generate a sequence of prompts including said reference audio specimens to a user, prompting the user to produce a corresponding sequence of audio specimens, wherein said sequence of prompts branches in response to user performance; and
a reference-to-response comparing unit for comparing at least one feature of each of the sequence of audio specimens generated by the user, to a speaker independent reference from said speaker independent parameter database.

40. Apparatus according to claim 39 wherein the reference to which an individual user-generated audio specimen is compared comprises a corresponding stored reference audio specimen.

41. Apparatus according to claim 39 wherein the sequence of prompts is at least partly determined by the user's designation of his native language.

42. Apparatus according to claim 39 wherein the prompt sequencer comprises a multi-language prompt sequencer library in which a plurality of prompt sequences in a plurality of languages is stored and wherein the prompt sequencer is operative to generate a sequence of prompts in an individual one of the plurality of languages in response to a user's designation of the individual language as his native language.

43. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored;
a prompt sequencer operative to generate a sequence of prompts, including said non-speaker independent references, to a user, prompting the user to produce a corresponding sequence of audio specimens, wherein the sequence of prompts is at least partly determined by a user's designation of his native language; and
a speaker independent reference-to-response comparing unit for comparing at least one feature of each of the sequence of audio specimens generated by the user, to a speaker independent reference from said speaker independent parameter database.

44. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored;
apparatus for receiving and storing audio specimens generated by a user in response to reference audio specimens from said reference audio specimen database; and
a speaker independent reference-to-response comparing unit for comparing at least one feature of the audio specimen generated by the user, to a speaker independent reference from said speaker independent parameter database, the comparing unit comprising:
an audio specimen segmenter for segmenting a user-generated audio specimen into a plurality of segments; and
a segment comparing unit for comparing at least one feature of at least one of the plurality of segments to a speaker independent reference from said speaker independent parameter database.

45. Apparatus according to claim 44 wherein said audio specimen segmenter comprises a phonetic segmenter for segmenting a user-generated audio specimen into a plurality of phonetic segments.

46. Apparatus according to claim 45 wherein at least one of the phonetic segments comprises a phoneme.

47. Apparatus according to claim 45 wherein at least one of the phonetic segments comprises a syllable.

48. Apparatus according to claim 46 wherein the phoneme comprises a vowel.

49. Apparatus according to claim 46 wherein the phoneme comprises a consonant.

50. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored;
an audio specimen recorder for recording audio specimens generated by a user in response to reference audio specimens from said reference audio specimen database; and
a speaker-independent audio specimen scorer for scoring a user-generated audio specimen based on at least one speaker-independent reference parameter from said speaker independent parameter database.

51. Apparatus according to claim 50 wherein at least on speaker-independent parameter comprises a threshold value for the amount of energy at a predetermined frequency.

52. Apparatus for interactive speech training comprising:
a reference audio specimen database in which non-speaker independent reference audio specimens are stored;
a speaker independent parameter database in which speaker independent references are stored;
apparatus for receiving and storing audio specimens generated by a user in response to reference audio specimens from said reference audio specimen database; and
a speaker independent reference-to-response comparing unit for comparing at least one feature of the audio specimens generated by the user, to a speaker independent reference from said speaker independent parameter database.
