US8972244B2ActiveUtilityPatentIndex 38

Sampling and optimization in phrase-based machine translation using an enriched language model representation

Assignee: XEROX CORPPriority: Jan 25, 2013Filed: Jan 25, 2013Granted: Mar 3, 2015

Est. expiryJan 25, 2033(~6.6 yrs left)· nominal 20-yr term from priority

Inventors:DYMETMAN MARC AZIZ WILKER FERREIRA VENKATAPATHY SRIRAM

G06F 40/40G06F 40/44G06F 17/2818G06F 17/28

PatentIndex Score

Cited by

References

Claims

Abstract

Rejection sampling is performed to acquire at least one target language translation for a source language string s in accordance with a phrase-based statistical translation model p(x)=p(t, a|s) where t is a candidate translation, a is a candidate alignment comprising a biphrase sequence generating the candidate translation t, and x is a sequence representing the candidate alignment a. The rejection sampling uses a proposal distribution comprising a weighted finite state automaton (WFSA) q (n) that is refined responsive to rejection of a sample x* obtained in a current iteration of the rejection sampling to generate a refined WFSA q (n+1) for use in a next iteration of the rejection sampling. The refined WFSA q (n+1) is selected to satisfy the criteria p(x)≦q (n+1) (x)≦q (n) (x) for all x∈X and q (n+1) (x*)<q (n) (x*) where the space X is the set of sequences x corresponding to candidate alignments a that generate candidate translations t for the source language string s.

Claims

exact text as granted — not AI-modified

The invention claimed is: 
     
       1. A non-transitory storage medium storing instructions executable by an electronic data processing device to perform rejection sampling to acquire at least one accepted target language translation for a source language string s in accordance with a phrase-based statistical translation model
     p ( x )= p ( t,a|s ) 
 where t is a candidate translation, a is a candidate alignment comprising a source language-target language biphrase sequence generating the candidate translation t, and x is a sequence representing the candidate alignment a, the rejection sampling using a proposal distribution comprising a weighted finite state automaton (WFSA) q (n)  that is refined responsive to rejection of a sample x* obtained in a current iteration of the rejection sampling to generate a refined WFSA q (n+1)  for use in a next iteration of the rejection sampling wherein the refined WFSA q (n+1)  is selected to satisfy the criteria:
     p ( x )≦ q   (n+1) ( x )≦ q   (n) ( x ) for all  x∈X ; and
 
     q   (n+1) ( x *)&lt; q   (n) ( x *); 
 
 where the space X is the set of sequences x corresponding to candidate alignments a that generate candidate translations t for the source language string s. 
 
     
     
       2. The non-transitory storage medium of  claim 1 , wherein:
 the sample x* is rejected if it does not satisfy a no overlap constraint requiring that the alignment a represented by the sample x* consumes each word of the source language string s exactly once, and 
 the refinement performed when the no overlap constraint is not satisfied comprises (i) identifying a sub-sequence a k  . . . a k+l  of the sample x* that violates the no overlap constraint and (ii) refining WFSA q (n)  to WFSA q (n+1)  such that WFSA q (n+1)  does not accept any sequence x that contains the sub-sequence a k  . . . a k+l . 
 
     
     
       3. The non-transitory storage medium of  claim 2 , wherein:
 the rejection sampling sets p(x*)=0 if the sample x* does not satisfy the no overlap constraint, and 
 the refinement comprising operations (i) and (ii) is performed responsive to the condition p(x*)=0. 
 
     
     
       4. The non-transitory storage medium of  claim 1 , wherein the sequence x represents the alignment a as a sequence of triples each including: a field E denoting exactly one word of the translation t, a field B denoting the biphrase generating E, and a field N denoting the count of words of the source language string s consumed by the biphrases of the alignment a up to and including the biphrase generating E. 
     
     
       5. The non-transitory storage medium of  claim 4 , wherein:
 the sample x* is rejected if it does not satisfy a no overlap constraint requiring that the alignment a represented by the sample x* consumes each word of the source language string s exactly once as determined based in part on whether the value of N for the last triple of the sample x* equals the number of words in s. 
 
     
     
       6. The non-transitory storage medium of  claim 1 , wherein:
 the rejection sampling obtains the sample x* by random sampling of the space X; 
 the rejection sampling accepts or rejects x* based on comparison of a ratio p(x*)/q(x*) with a random draw; and 
 the refined proposal distribution q (n+1)  is selected to satisfy the criteria:
     p ( x )≦ q   (n+1) ( x )≦ q   (n) ( x ) for all  x∈X;  
 
     q   (n+1) ( x *)&lt; q   (n) ( x *); and 
   a norm ∥ q   (n+1) ∥ α  is minimized where α&lt;∞.
 
 
 
     
     
       7. The non-transitory storage medium of  claim 6 , wherein the rejection sampling acquires a plurality of accepted samples and terminates when a cumulative acceptance rate of the rejection sampling exceeds a threshold. 
     
     
       8. The non-transitory storage medium of  claim 6 , where α=1. 
     
     
       9. The non-transitory storage medium of  claim 1 , wherein:
 the rejection sampling obtains the sample x* such that q*=q (n) (x*) maximizes q (n)  over the space X; 
 the rejection sampling accepts or rejects x* based on a difference between or ratio of q* and p(x*); and 
 the refined proposal distribution q (n+1)  is selected to satisfy the criteria:
     p ( x )≦ q   (n+1) ( x )≦ q   (n) ( x ) for all  x∈X;  
 
     q   (n+1) ( x *)&lt; q   (n) ( x *); and 
   a norm ∥ q   (n+1) ∥ α =max{ q   (n+1) ( x )} is minimized.
 
 
 
     
     
       10. The non-transitory storage medium of  claim 1 , wherein the non-transitory storage medium stores instructions executable by an electronic data processing device to perform sampling of the phrase-based statistical translation model p(t, a|s) by accumulating a set of accepted samples x* over a plurality of iterations of the rejection sampling. 
     
     
       11. The non-transitory storage medium of  claim 10 , wherein each iteration obtains the sample x* by random sampling of the space X and accepts or rejects x* based on comparison of a ratio p(x*)/q(x*) with a random draw. 
     
     
       12. The non-transitory storage medium of  claim 10 , wherein the rejection sampling terminates when a cumulative acceptance rate of the rejection sampling exceeds a threshold. 
     
     
       13. An apparatus comprising:
 a non-transitory storage medium as set forth in  claim 1 ; and 
 an electronic data processing device configured to execute instructions stored on the non-transitory storage medium. 
 
     
     
       14. A method of generating at least one target language translation for a source language string s in accordance with a phrase-based statistical translation model p(x)=p(t, a|s) where t is a candidate translation, a is a candidate alignment comprising a source language-target language biphrase sequence generating the candidate translation t, and x is a sequence representing the candidate alignment a, the method comprising:
 performing iterative rejection sampling of p(x) by generating a sample x* and accepting or rejecting the sample x* based on a comparison of p(x*) and q(x*) where q is a proposal distribution comprising a weighted finite state automaton, the at least one target language translation corresponding to an accepted sample x*; 
 during the iterative rejection sampling, identifying a sample x* that includes a sub-sequence a k  . . . a k+l  violating a no overlap constraint requiring that the alignment a consumes each word of the source language string s exactly once; and 
 refining q so that it does not accept any sequence x that contains the sub-sequence a k  . . . a k+l , the refined q being used in subsequent iterations of the iterative rejection sampling; 
 wherein the iterative rejection sampling including the identifying and refining operations is performed by an electronic data processing device. 
 
     
     
       15. The method of  claim 14 , further comprising setting p(x*)=0 responsive to identifying a sample x* that includes a sub-sequence a k  . . . a k+l  violating the no overlap constraint, the refining being performed responsive to rejection of a sample x* for which p(x*)=0. 
     
     
       16. The method of  claim 14 , wherein x represents the alignment a as a sequence of triples including: a field E denoting exactly one word of the translation t, a field B denoting the biphrase generating E, and a field N denoting the count of words of the source language string s consumed by the biphrases of the alignment a up to and including the biphrase generating E, and the identifying includes:
 identifying a sample x* as including a sub-sequence a k  . . . a k+l  violating the no overlap constraint based on the value of N for the last triple in the sample x* being different from the number of words in the source language string s. 
 
     
     
       17. The method of  claim 14 , wherein the performing of iterative rejection sampling comprises performing the OS* algorithm. 
     
     
       18. An apparatus comprising:
 an electronic data processing device; and 
 a non-transitory storage medium storing instructions executable by the electronic data processing device to perform a method including:
 performing rejection sampling of a phrase-based statistical translation model p(x)=p(t, a|s) where t is a candidate translation, a is a candidate alignment comprising a source language-target language biphrase sequence generating the candidate translation t, and x represents the candidate alignment a as a sequence of triples each including: a field E denoting exactly one word of the translation t, a field B denoting the biphrase generating E, and a field N denoting the count of words of the source language string s consumed by the biphrases of the alignment a up to and including the biphrase generating E; and 
 during the rejection sampling, identifying a sample x* as including a sub-sequence a k  . . . a k+l  violating a no overlap constraint requiring that the alignment a consumes each word of the source language string s exactly once based on the value of N for the last triple in the sample x* being different from the number of words in the source language string s. 
 
 
     
     
       19. The apparatus of  claim 18 , wherein the method further includes, responsive to identifying a sample x* that includes a sub-sequence a k  . . . a k+l  violating the no overlap constraint:
 refining a proposal distribution used in the rejection sampling such that the proposal distribution no longer includes sequences x containing the sub-sequence a k  . . . a k+l . 
 
     
     
       20. The apparatus of  claim 18 , wherein the method further includes, responsive to identifying a sample x* that includes a sub-sequence a k  . . . a k+l  violating the no overlap constraint:
 refining a proposal distribution comprising a weighted finite state automaton q used in the rejection sampling such that q no longer accepts any sequence x that contains the sub-sequence a k  . . . a k+l .

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.