P
US9026427B2ActiveUtilityPatentIndex 42

Method and apparatus for pruning side information including directed edges not possessing shortest expanded length for grammar-based compression

Assignee: NGUYEN NGUYENPriority: Oct 30, 2009Filed: Oct 30, 2009Granted: May 5, 2015
Est. expiryOct 30, 2029(~3.3 yrs left)· nominal 20-yr term from priority
Inventors:NGUYEN NGUYENYANG EN-HUI
G06F 40/30G06F 16/951H03M 7/30G06F 17/30864G06F 17/2785
42
PatentIndex Score
0
Cited by
11
References
20
Claims

Abstract

A computer-implemented method for generating side information for grammar-based data compression systems, such as YK compression systems, is described. An admissible grammar (G) for an input sequence (A(S 0 )) having a finite set of terminal symbols is obtained. A graph representation of the admissible grammar (G) is then constructed. An edge having a lowest weight (expansion frequency), or one not possessing the shortest distance and or shortest expanded sequence length, is then pruned from the graph representation to generate a pruned graph representation. A pruned grammar (G′) is then derived by removing the occurrence corresponding to the pruned edge from the grammar G and the starting variable (S 0,i ) of the pruned grammar (Gi) is then expanded to generate the side information.

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A computer-implemented method for generating side information for grammar-based data compression systems, the method comprising:
 obtaining an admissible grammar (G) for an input sequence (A(S 0 )) having a finite set of terminal symbols, the admissible grammar having a finite set of variables (S(j)), including a starting variable (S 0 ) representing the input sequence (A(S 0 )), and a production rule for each variable in the finite set of variables (S(j))); 
 constructing a graph representation of the admissible grammar (G), the graph representation including nodes for each variable in the finite set of variables and each terminal symbol in the finite set of terminal symbols, including a root node representing the starting variable (S 0 ) and directed edges linking the nodes, each directed edge being directed to a node and representing an instance of a variable or a terminal symbol corresponding to the node to which it is directed in the admissible grammar (G) as defined by the production rules, and each directed edge having an expansion frequency, wherein the expansion frequency of directed edges originating from the root node has a value of ‘1’, and expansion frequencies of directed edges emanating from each non-root node are determined in accordance with summing of expansion frequencies of directed edges input to the non-root node; 
 pruning, from the graph representation, a directed edge having a lowest expansion frequency to generate a pruned graph representation; 
 deriving a pruned grammar (Gi); 
 expanding a starting variable (S 0,i ) of the pruned grammar (Gi) to generate the side information; and 
 storing the side information in a computer memory device; 
 wherein constructing the graph representation of the admissible grammar further comprises determining, for each non-terminal node, a shortest path from the root node to the non-terminal node, and assigning a shortest distance (SD) value to each non-terminal node in accordance with its respective shortest path; 
 wherein, when two or more directed edges are identified as having the same lowest expansion frequency, pruning the directed edge having the lowest expansion frequency comprises applying tie-breaking rules to select the directed edge to prune; 
 wherein the tie-breaking rules comprise at least one of selecting the one of the two or more directed edges with the greatest shortest distance (SD) value, selecting the one of the two or more directed edges leading to a node with the shortest expanded sequence length. 
 
     
     
       2. The method of  claim 1  further comprising updating expansion frequencies of each directed edge in accordance with the pruned representation. 
     
     
       3. The method of  claim 2  wherein the steps of pruning the directed edge and deriving the pruned grammar are repeated iteratively until a stopping condition is met. 
     
     
       4. The method of  claim 3  wherein the stopping condition is the length of the expanded starting variable (s 0,i ) of the pruned grammar (G i ) being less than or equal to a predetermined length. 
     
     
       5. The method of  claim 1  wherein obtaining the admissible grammar (G) comprises obtaining an irreducible grammar. 
     
     
       6. The method of  claim 5  wherein obtaining the irreducible grammar comprises applying a Yang-Kieffer (YK) grammar transformation to the input sequence (A(s 0 )). 
     
     
       7. The method of  claim 1  wherein, when two or more directed edges are identified as having the same lowest expansion frequency, pruning the directed edge representing an instance of a variable having the least significant position in the input sequence (A(s 0,i-1 )), wherein s 0,0  is s 0 . 
     
     
       8. The method of  claim 1  wherein pruning the directed edge further comprises pruning orphan nodes and orphan edges to generate the pruned graph representation. 
     
     
       9. The method of  claim 1  further comprising updating the shortest distance (SD) values in accordance with the pruned graph representation. 
     
     
       10. The method of  claim 1  wherein deriving the pruned grammar (G) comprises deriving new production rules by removing pruned occurrences from current production rules. 
     
     
       11. A non-transitory computer-readable medium embodying code which, when processed by one or more processors, causes the one or more processors to implement a method of generating side information for grammar-based data compression systems, the method comprising:
 obtaining an admissible grammar (G) for an input sequence (A(S 0 )) having a finite set of terminal symbols, the admissible grammar having a finite set of variables (S(j)), including a starting variable (S 0 ) representing the input sequence (A(S 0 )), and a production rule for each variable in the finite set of variables (S(j)); 
 constructing a graph representation of the admissible grammar (G), the graph representation including nodes for each variable in the finite set of variables and each terminal symbol in the finite set of terminal symbols, including a root node representing the starting variable (S 0 ) and directed edges linking the nodes, each directed edge being directed to a node and representing an instance of a variable or a terminal symbol corresponding to the node to which it is directed in the admissible grammar (G) as defined by the production rules, and each directed edge having an expansion frequency, wherein the expansion frequency of directed edges originating from the root node has a value of ‘1’, and expansion frequencies of directed edges emanating from each non-root node are determined in accordance with summing of expansion frequencies of directed edges input to the non-root node; 
 pruning, from the graph representation, a directed edge having a lowest expansion frequency to generate a pruned graph representation; 
 deriving a pruned grammar (Gi); and 
 expanding a starting variable (S 0,i ) of the pruned grammar (Gi) to generate the side information; 
 wherein constructing the graph representation of the admissible grammar further comprises determining, for each non-terminal node, a shortest path from the root node to the non-terminal node, and assigning a shortest distance (SD) value to each non-terminal node in accordance with its respective shortest path; 
 wherein, when two or more directed edges are identified as having the same lowest expansion frequency, pruning the directed edge having the lowest expansion frequency comprises applying tie-breaking rules to select the directed edge to prune; 
 wherein the tie-breaking rules comprise at least one of selecting the one of the two or more directed edges with the greatest shortest distance (SD) value, selecting the one of the two or more directed edges leading to a node with the shortest expanded sequence length. 
 
     
     
       12. The medium of  claim 11  further comprising updating expansion frequencies of each directed edge in accordance with the pruned representation. 
     
     
       13. The medium of  claim 12  wherein the steps of pruning the directed edge and deriving a pruned grammar are repeated iteratively until a stopping condition is met. 
     
     
       14. The medium of  claim 13  wherein the stopping condition is the length of the expanded starting variable (s 0,i ) of the pruned grammar (G i ) being less than or equal to a predetermined length. 
     
     
       15. The medium of  claim 11  wherein obtaining the admissible grammar (G) comprises obtaining an irreducible grammar. 
     
     
       16. The medium of  claim 15  wherein obtaining the irreducible grammar comprises applying a Yang-Kieffer (YK) grammar transformation to the input sequence. 
     
     
       17. The medium of  claim 11  wherein, when two or more directed edges are identified as having the same lowest expansion frequency, pruning the directed edge representing an instance of a variable having the least significant position in the input sequence (A(s 0,i-1 )), wherein s 0,0  is s 0 . 
     
     
       18. The medium of  claim 14  wherein pruning the directed edge further comprises pruning orphan nodes and orphan edges to generate the pruned graph representation. 
     
     
       19. The medium of  claim 11  further comprising updating the shortest distance (SD) values in accordance with the pruned graph representation. 
     
     
       20. The medium of  claim 11  wherein deriving the pruned grammar (G) comprises deriving new production rules by removing pruned occurrences from current production rules.

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.