US4924518A (expired utility patent)

Phoneme similarity calculating apparatus

Assignee: TOSHIBA KK
Priority: Dec 23, 1986
Filed: Dec 16, 1987
Granted: May 8, 1990
Est. expiry: Dec 23, 2006 (expired); nominal 20-yr term from priority
Inventors: UKITA TERUHIKO
G10L 15/00

Abstract

Continuous speech is input to an acoustic analyzer comprising a filter bank which filters (acoustically analyzes) the input speech and outputs from each filter a characteristic parameter vector of the input speech. The acoustic analyzer also calculates a steadiness parameter which is proportional to a reciprocal of a change in the spectrum of the input speech and represents steadiness of the input speech. The feature parameter vector is input to an initial similarity calculator, and a similarity (initial similarity) to each reference phoneme pattern stored in a reference phoneme pattern memory is calculated using a multiple similarity method. The initial similarity is input to a similarity normalizing circuit, and is normalized (weighted) based on the steadiness parameter in order to reflect information representing steadiness or unsteadiness of a phoneme into the similarity. For a steady reference phoneme, weighting is increased when the steadiness parameter increases and is decreased when the steadiness parameter decreases. For an unsteady reference phoneme, weighting is increased when the steadiness parameter decreases and is decreased when the steadiness parameter increases.
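The weighting rule in the abstract can be illustrated with a minimal Python sketch. It assumes the simplest weighting functions consistent with the text: a weight proportional to the steadiness parameter for a steady reference phoneme, and a weight inversely proportional to it for an unsteady one. The function name `normalize_similarity` and the `gain` parameter are hypothetical and do not come from the patent.

```python
def normalize_similarity(similarity, steadiness, is_steady, gain=1.0):
    """Weight an initial phoneme similarity by the steadiness parameter.

    Assumption: the weight is gain * v for a steady reference phoneme and
    gain / v for an unsteady one, where v is the steadiness parameter.
    """
    if steadiness <= 0:
        raise ValueError("steadiness parameter must be positive")
    # Steady phonemes gain weight as steadiness rises; unsteady ones lose it.
    weight = gain * steadiness if is_steady else gain / steadiness
    return similarity * weight


# The same initial similarity in a highly steady frame (v = 2.0):
steady_score = normalize_similarity(0.8, 2.0, is_steady=True)
unsteady_score = normalize_similarity(0.8, 2.0, is_steady=False)
```

With these assumed weights, a steady frame raises the score for a steady reference phoneme and lowers it for an unsteady one, which is the direction the abstract describes.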

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A phoneme similarity calculating apparatus comprising: means for calculating a feature parameter of input speech for every frame of a predetermined time;   means for calculating a steadiness parameter indicating steadiness of the input speech for each frame in accordance with a change in the feature parameter;   means for calculating a phoneme similarity of the feature parameter for each phoneme based on matching between the feature parameter and a reference phoneme pattern for each phoneme; and   normalizing means for correcting said phoneme similarity calculated by said phoneme similarity calculating means in accordance with the steadiness parameter such that said phoneme similarity is increased if the steadiness parameter increases when the reference phoneme is a steady one, and is decreased if the steadiness parameter increases when the reference phoneme is an unsteady one.   
     
     
       2. An apparatus according to claim 1, in which said normalizing means comprises: means for multiplying a first weighting function which is proportional to the steadiness parameter with the phoneme similarity when the reference phoneme is a steady one, and   means for multiplying a second weighting function which is inversely proportional to the steadiness parameter with the phoneme similarity when the reference phoneme is an unsteady one.   
     
     
       3. An apparatus according to claim 1, in which said phoneme similarity calculating means calculates a distance with respect to each reference phoneme pattern as the phoneme similarity, and said normalizing means multiplies a weighting function which is decreased upon an increase in steadiness parameter with the distance for a steady phoneme, and multiplies a weighting function which is increased upon an increase in steadiness parameter with the distance for an unsteady phoneme.   
     
     
       4. An apparatus according to claim 3, in which said similarity calculating means calculates the following Mahalanobis distance D.sup.(c) [x(t)] as the similarity: ##EQU6## where μ(c) is an average vector of training samples belonging to a (c)th category, and Σ(c) is a covariance matrix of the training samples belonging to a phoneme c. 
     
     
       5. An apparatus according to claim 1, in which said feature parameter calculating means comprises a filter bank having a plurality of bandpass filters, having different center frequencies, for acoustically analyzing the input speech. 
     
     
       6. An apparatus according to claim 1, in which said steadiness parameter calculating means calculates steadiness parameter v(t) of a (t)th frame as follow: ##EQU7## where ∥x∥ is the square root of the sum of the squares of the vector components for a vector x at the (t)th frame, x(t) is a feature parameter vector of the (t)th frame, and T is an arbitrary period falling within a range of about 10 to 50 msec. 
     
     
       7. An apparatus according to claim 1, in which said similarity calculating means comprises a reference phoneme pattern memory for storing an eigen value and an eigen vector of a covariance matrix of a sample pattern vector associated with each phoneme as a reference pattern of each phoneme, and a circuit for calculating similarity S.sup.(c) [x(t)] for each phoneme c in accordance with feature parameter vector x(t) as follows; ##EQU8## where λm(c) is an (m)th eigen value of the covariance matrix for phoneme c, Σ is a summation in connection with m, x[d](t) is a feature parameter vector of the (t)th frame, and φm(c) is an eigen vector when the square root of the sum of the squares of the vector components for a vector x at the (t)th frame is normalized to 1. 
     
     
       8. An apparatus according to claim 1, in which said normalizing means corrects the phoneme similarity using different normalizing functions associated with the steadiness parameter in accordance with categories of phonemes. 
     
     
       9. A phoneme similarity calculating apparatus comprising: means for inputting continuous speech;   means for dividing the input continuous speech at every predetermined time into divided segments;   means for calculating a steadiness parameter in accordance with a rate of a change in power spectrum of the divided segments over time;   means for calculating a feature parameter vector by frequency-analyzing the divided segments;   means for calculating phoneme information indicating a degree of similarity between the feature parameter vector and each reference phoneme pattern; and   means for correcting each phoneme information in accordance with the steadiness parameter as follows: at a portion with high steadiness of input speech, a similarity for a steady reference phoneme being increased and a similarity for an unsteady reference phoneme being decreased, and at a portion with low steadiness of input speech, a similarity for a steady reference phoneme being decreased and a similarity for an unsteady reference phoneme being increased.   
     
     
       10. An apparatus according to claim 9, in which said correcting means comprises: means for multiplying a first weight function increasing with an increase in the steadiness parameter and the phoneme information when the phoneme is a steady one, and for multiplying a second weight function decreasing with an increase in the steadiness parameter and the phoneme information when the phoneme is not a steady one.   
     
     
       11. An apparatus according to claim 9, in which said feature parameter vector calculating means comprises a filter bank having a plurality of bandpass filters, having different center frequencies, for acoustically analyzing the input speech. 
     
     
       12. An apparatus according to claim 9, in which said steadiness parameter calculating means calculates steadiness parameter v(t) of a (t)th frame as follows: ##EQU9## where ∥x∥ is the square root of the sum of the squares of the vector components for a vector x at the (t)th frame of vector x, x(t) is a feature parameter vector of the (t)th frame, and T is an arbitrary period falling within a range of about 10 to 50 msec. 
     
     
       13. An apparatus according to claim 9, in which said similarity calculating means comprises a reference phoneme pattern memory for storing an eigen value and an eigen vector of a covariance matrix of a sample pattern vector associated with each phoneme as a reference pattern of each phoneme, and a circuit for calculating similarity S.sup.(c) [x(t)] for each phoneme c in accordance with feature parameter vector x(t) as follows; ##EQU10## where λm(c) is an (m)th eigen value of the covariance matrix for phoneme c, Σ is a summation in connection with m, x(t) is a feature parameter vector of the (t)th frame, and φm(c) is an eigen vector when the square root of the sum of the squares of the vector components for a vector x at the (t)th frame is normalized to 1. 
     
     
       14. An apparatus according to claim 9, in which said phoneme information calculating means calculates the following Mahalanobis distance D.sup.(c) [x(t)] as the similarity: ##EQU11## where μ(c) is an average vector of training sample belonging to a (c)th category, and Σ(c) is a covariance matrix of the training samples belonging to a phoneme c. 
     
     
       15. An apparatus according to claim 9, in which said correcting means corrects the phoneme similarity using different normalizing functions associated with the steadiness parameter in accordance with categories of phonemes.
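
Claims 4, 6, 7, 12, 13 and 14 refer to formulas that appear in this text only as placeholders (##EQU6## to ##EQU11##), so the exact expressions are not reproduced here. The Python sketch below illustrates the three quantities using assumed forms that match the surrounding definitions: a steadiness value that is the reciprocal of the normalized spectral change, a multiple similarity built from eigenvalues and eigenvectors applied to a unit-norm feature vector, and a Mahalanobis distance restricted to a diagonal covariance matrix for brevity. All function names, the `lag` and `eps` parameters, and the specific formulas are assumptions, not the granted equations.

```python
import math


def norm(x):
    """Square root of the sum of the squares of the vector components."""
    return math.sqrt(sum(v * v for v in x))


def steadiness(frames, t, lag, eps=1e-9):
    """Assumed form: v(t) = ||x(t)|| / ||x(t) - x(t - lag)||.

    `lag` is the period T expressed in frames; `eps` avoids division by zero.
    """
    if t < lag:
        raise ValueError("need at least `lag` frames before frame t")
    diff = [a - b for a, b in zip(frames[t], frames[t - lag])]
    return norm(frames[t]) / (norm(diff) + eps)


def multiple_similarity(x, eigenvalues, eigenvectors):
    """Assumed form: S = sum over m of lambda_m * (x_hat . phi_m)^2.

    x_hat is x scaled to unit norm; phi_m are eigenvectors of the
    covariance matrix for one phoneme, lambda_m the matching eigenvalues.
    """
    n = norm(x)
    x_hat = [v / n for v in x]
    return sum(
        lam * sum(a * b for a, b in zip(x_hat, phi)) ** 2
        for lam, phi in zip(eigenvalues, eigenvectors)
    )


def mahalanobis_diag(x, mean, variances):
    """Squared Mahalanobis distance, assuming a diagonal covariance matrix."""
    return sum((a - m) ** 2 / s for a, m, s in zip(x, mean, variances))


# Toy two-channel "spectra": frames 0 and 1 are identical, frame 2 differs.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
v_steady = steadiness(frames, 1, lag=1)    # very large: no spectral change
v_changing = steadiness(frames, 2, lag=1)  # about 0.707: abrupt change
```

A full covariance matrix would need a matrix inverse in place of the per-component division in `mahalanobis_diag`; the diagonal case is used here only to keep the sketch dependency-free.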
