USRE47889E · Active · Utility · PatentIndex 73
System and method for segmenting text lines in documents

Assignee: III HOLDINGS 6 LLC
Priority: Jul 10, 2009
Filed: Jul 1, 2016
Granted: Mar 3, 2020
Est. expiry: Jul 10, 2029 (~3 yrs left) · nominal 20-yr term from priority
Inventors: SAUND ERIC
G06V 30/155 · G06K 9/00456 · G06K 9/00449 · G06K 9/00442 · G06K 9/346 · G06V 30/10 · G06V 30/413 · G06V 30/412
Abstract

Methods and systems of the present embodiment provide segmenting of connected components of markings found in document images. Segmenting includes detecting aligned text. From this detected material an aligned text mask is generated and used in processing of the images. The processing includes breaking connected components in the document images into smaller pieces or fragments by detecting and segregating the connected components and fragments thereof likely to belong to aligned text.
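
The role of the aligned text mask can be illustrated with a small sketch. It is not the patented implementation. It assumes the mask is a list of text-line boxes given as (top, left, bottom, right) and that foreground pixels are (row, col) pairs. The function names `connected_components` and `fragment_with_text_mask` are hypothetical.

```python
from collections import deque


def connected_components(pixels):
    """Group (row, col) foreground pixels into 8-connected components."""
    remaining = set(pixels)
    components = []
    while remaining:
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbor = (r + dr, c + dc)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        component.add(neighbor)
                        queue.append(neighbor)
        components.append(component)
    return components


def fragment_with_text_mask(foreground, text_boxes):
    """Split foreground pixels by the text mask, then re-label each part.

    text_boxes holds (top, left, bottom, right) boxes, inclusive (assumed format).
    Returns (fragments inside the mask, fragments outside the mask).
    """
    def in_mask(pixel):
        r, c = pixel
        return any(top <= r <= bottom and left <= c <= right
                   for top, left, bottom, right in text_boxes)

    inside = {p for p in foreground if in_mask(p)}
    outside = set(foreground) - inside
    return connected_components(inside), connected_components(outside)


# A horizontal "text" stroke on row 2, crossed by a vertical extraneous stroke.
foreground = {(2, c) for c in range(6)} | {(r, 3) for r in range(2, 7)}
text_fragments, other_fragments = fragment_with_text_mask(foreground, [(1, 0, 3, 5)])
```

In this example the two strokes touch, so they start as a single connected component. Applying the mask yields one fragment inside the text-line box and one fragment for the part of the vertical stroke that extends below it.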
Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A method of classifying marking types on images of a document, the method comprising:
 supplying the document containing the images to a segmenter; 
 segmenting the images received by the segmenter into fragments, the segmenting including identifying neatly written or printed text by grouping selected feature points along predetermined orientations, the feature points including local extrema of bounding contours of connected components, and subtracting enclosing boundary boxes of text lines from remaining document material to fragment connected components that are part of the text lines and part of extraneous markings; 
 supplying the fragments to a two-stage classifier, the two-stage classifier providing a plurality of first classifiers generating a first category score and a second classifier generating a second category score to each fragment, wherein the two-stage classifier is trained from groundtruth images whose pixels are labeled according to known marking types, wherein the first category score is generated as an array of scores for the each fragment, and the second category score is provided by reclassifying the each fragment by considering a neighborhood fragment of the each fragment, and wherein the first and second category scores to each of the fragments are generated by:
 determining feature measurements for the each fragment, the feature measurements including measuring a segmenter feature, a size feature, a location feature, a regularity feature, an edge curvature feature, a contour feature, a run length feature, and an edge-turn histogram feature, and 
 determining the first and second category scores for the each fragment by applying the two-stage classifier to the determined feature measurements; and 
 
 assigning a same label to all pixels in a fragment when the fragment is classified by the two-stage classifier. 
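
The two-stage scoring recited in claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the claimed classifier. The first-stage classifiers are modeled as fixed linear scorers, one per category, rather than classifiers trained from groundtruth images. The second stage is modeled as a blend of a fragment's own score array with the mean of its neighbors' arrays. The weights, the `blend` parameter, the two-element feature vectors, and all function names are hypothetical.

```python
def first_stage_scores(features, classifiers):
    """Return an array of scores, one per category (one linear scorer each)."""
    return [sum(w * x for w, x in zip(weights, features)) for weights in classifiers]


def second_stage_scores(index, all_scores, neighbors, blend=0.5):
    """Re-score a fragment by blending in its neighbors' first-stage scores."""
    own = all_scores[index]
    if not neighbors[index]:
        return list(own)
    count = len(neighbors[index])
    mean = [sum(all_scores[j][k] for j in neighbors[index]) / count
            for k in range(len(own))]
    return [(1 - blend) * o + blend * m for o, m in zip(own, mean)]


def classify_fragments(fragments, classifiers, categories, neighbors):
    """Assign one label per fragment and copy it to every pixel of the fragment."""
    stage1 = [first_stage_scores(f["features"], classifiers) for f in fragments]
    pixel_labels = {}
    for i, fragment in enumerate(fragments):
        stage2 = second_stage_scores(i, stage1, neighbors)
        best = max(range(len(stage2)), key=stage2.__getitem__)
        for pixel in fragment["pixels"]:
            pixel_labels[pixel] = categories[best]
    return pixel_labels


categories = ["printed", "handwritten"]
classifiers = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weights, not trained
fragments = [
    {"features": [0.9, 0.1], "pixels": [(0, 0), (0, 1)]},
    {"features": [0.8, 0.2], "pixels": [(0, 3), (0, 4)]},
    {"features": [0.45, 0.55], "pixels": [(0, 6)]},  # ambiguous on its own
]
neighbors = [[1], [0, 2], [1]]  # indices of adjacent fragments
pixel_labels = classify_fragments(fragments, classifiers, categories, neighbors)
```

On its own, the third fragment scores slightly higher as "handwritten". Once its neighbor's scores are blended in, it is relabeled "printed", which is the kind of correction a neighborhood-aware second stage is meant to provide.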
 
     
     
       2. The method according to  claim 1  wherein the segmenting of the image lines images is directed to segregation of the text lines believed to be aligned text, from other aspects of the images. 
     
     
       3. The method according to  claim 1 , wherein the document images are electronic images stored in an electronic memory and are segmented by a processor associated with the electronic memory. 
     
     
       4. A method of classifying marking types on images of a document, the method comprising:
 supplying the document containing the images to a segmenter; 
 segmenting the images received by the segmenter into fragments, the segmenting including identifying neatly written or printed text by grouping selected feature points along predetermined orientations, the feature points including local extrema of bounding contours of connected components, and subtracting enclosing boundary boxes of text lines from remaining document material to fragment connected components that are part of the text lines and part of extraneous markings; 
 supplying the fragments to a two-stage classifier, the two-stage classifier providing a plurality of first classifiers generating a first category score and a second classifier generating a second category score to each fragment, wherein the two-stage classifier is trained from groundtruth images whose pixels are labeled according to known marking types, wherein the first category score is generated as an array of scores for the each fragment, and the second category score is provided by reclassifying the each fragment by considering a neighborhood fragment of the each fragment, and wherein the first and second category scores to the each fragment are generated by:
 determining feature measurements for the each fragment, the feature measurements including measuring a segmenter feature, a size feature, a location feature, a regularity feature, an edge curvature feature, a contour feature, a run length feature, and an edge-turn histogram feature, and 
 determining the first and second category scores for the each fragment by applying the two-stage classifier to the determined feature measurements; and 
 
 assigning a same label to all pixels in a fragment when the fragment is classified by the two-stage classifier; 
 wherein the segmenting includes processing the images to find text lines, the processing comprising:
 detecting upper and lower extrema of the connected components; 
 identifying upper and lower contour extrema of the detected upper and lower extrema of the connected components; 
 grouping the identified upper and lower contour extrema; 
 identifying upper contour point groups and lower contour point groups; 
 fitting the grouped upper and lower point groups; 
 filtering out the fitted and grouped upper and lower point groups that are outside a predetermined alignment threshold; 
 forming upper and lower alignment segments for the upper and lower point groups that remain after the filtering operation; 
 matching as pairs the upper and lower segments that remain after the filtering operation; and 
 forming text line bounding boxes based on the pairs of matched upper and lower segments that remain after the filtering operation, the bounding boxes identifying connected components believed to be aligned text. 
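
The fitting, filtering, matching, and box-forming steps of claim 4 can be sketched as follows. The sketch assumes the upper and lower contour extrema have already been grouped, that points are (x, y) pairs with y increasing downward, and that text is roughly horizontal. The least-squares fit, the residual-based alignment test, the thresholds `max_residual` and `max_height`, and the function names are assumptions made for illustration. The claim does not specify them.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept to (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def alignment_segment(points, max_residual):
    """Return (x_min, x_max, slope, intercept), or None if poorly aligned."""
    slope, intercept = fit_line(points)
    worst = max(abs(y - (slope * x + intercept)) for x, y in points)
    if worst > max_residual:
        return None  # outside the alignment threshold: filtered out
    xs = [x for x, _ in points]
    return min(xs), max(xs), slope, intercept


def surviving_segments(groups, max_residual):
    """Fit every point group and keep only the groups that pass the filter."""
    segments = (alignment_segment(g, max_residual) for g in groups)
    return [s for s in segments if s is not None]


def text_line_boxes(upper_groups, lower_groups, max_residual=1.5, max_height=40.0):
    """Pair upper and lower segments and return (left, top, right, bottom) boxes."""
    uppers = surviving_segments(upper_groups, max_residual)
    lowers = surviving_segments(lower_groups, max_residual)
    boxes = []
    for ux0, ux1, us, ui in uppers:
        best = None
        for lx0, lx1, ls, li in lowers:
            left, right = max(ux0, lx0), min(ux1, lx1)
            if left >= right:
                continue  # no horizontal overlap
            mid = (left + right) / 2
            height = (ls * mid + li) - (us * mid + ui)
            if 0 < height <= max_height and (best is None or height < best[0]):
                top = min(us * left + ui, us * right + ui)
                bottom = max(ls * left + li, ls * right + li)
                best = (height, (left, top, right, bottom))
        if best is not None:
            boxes.append(best[1])
    return boxes


upper_groups = [
    [(0, 10), (10, 10), (20, 10), (30, 10)],  # well aligned
    [(0, 50), (10, 60), (20, 45), (30, 58)],  # scattered, should be filtered out
]
lower_groups = [[(0, 20), (10, 20), (20, 20), (30, 20)]]
boxes = text_line_boxes(upper_groups, lower_groups)
```

Each upper segment is paired with the closest overlapping lower segment beneath it. Other matching rules would fit the claim language equally well.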
 
 
     
     
       5. A system of classifying marking types on images of a document, the system comprising:
 a segmenter operated on a processor and configured to receive the document containing the images, the segmenter segmenting the images into fragments of foreground pixel structures that are identified as being likely to be of the same marking type by finding connected components, and dividing at least some of the connected components to obtain image fragments, the connected components being isolated, continuous regions of foreground pixels, the segmenter segmenting the images by:
 identifying neatly written or printed text by grouping selected feature points along predetermined orientations, the feature points being local extrema of bounding contours of the connected components; and 
 subtracting enclosing boundary boxes of text lines from remaining document material to fragment connected components that are partly part of the text lines and partly part of extraneous markings; and 
 
 a two-stage classifier operated on a processor and configured to receive the fragments, the two-stage classifier providing a plurality of first classifiers generating a first category score and a second classifier generating a second category score to each received fragment, wherein the two-stage classifier is trained from ground truth images whose pixels are labeled according to known marking types, wherein the first category score is generated as an array of scores for the each fragment, and the second category score is provided by reclassifying the each fragment by considering a neighborhood fragment of the each fragment, and wherein the first and second category scores to the each fragment are generated by:
 determining feature measurements for the each fragment, the feature measurements including measuring a segmenter feature, a size feature, a location feature, a regularity feature, an edge curvature feature, a contour feature, a run length feature, and an edge-turn histogram feature, and 
 determining the first and second category scores for the each fragment by applying the two-stage classifier to the determined feature measurements, 
 
 the two-stage classifier assigning a same label to all pixels in a fragment when the fragment is classified by the two-stage classifier. 
 
     
     
       6. The system according to  claim 5  wherein the segmenter is further configured to find text lines including:
 detecting upper and lower extrema of the connected components; 
 identifying the upper and lower contour extrema of the detected upper and lower extrema of the connected components; 
 grouping the identified upper and lower contour extrema; 
 identifying upper contour point groups and lower contour point groups; 
 fitting the grouped upper and lower point groups; 
 filtering out the fitted and grouped upper and lower point groups that are outside a predetermined alignment threshold; 
 forming upper and lower alignment segments for the upper and lower point groups that remain after the filtering operation; 
 matching as pairs the upper and lower segments that remain after the filtering operation; and 
 forming text line bounding boxes based on the pairs of matched upper and lower segments that remain after the filtering operation, the bounding boxes identifying connected components believed to be aligned text. 
 
     
     
       7. The method according to  claim 5 , wherein the document images are electronic images stored in an electronic memory and are segmented by a processor associated with the electronic memory. 
     
     
       8. The system according to  claim 5  further including a scanner to receive a hardcopy document containing images, the scanner converting the hardcopy document into an electronic document, the electronic document being the document supplied to the segmenter. 
     
     
       9. The method according to  claim 1 , wherein the connected components are each an isolated, continuous region of foreground pixels, and wherein at least one of the fragments is a subset of one of the connected components. 
     
     
       10. The method according to  claim 1 , wherein the segmenting includes:
 grouping the selected feature points into strips; 
 fitting lines to the strips; and 
 forming the enclosing bounding boxes from pairs of fitted lines. 
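
The strip-grouping step of claim 10 can be sketched as follows, assuming feature points are (x, y) pairs and that a strip is a run of points whose offsets perpendicular to the chosen orientation are separated by small gaps. The gap rule, the `strip_height` and `angle_degrees` parameters, and the function name are hypothetical. The claim does not say how strips are formed.

```python
import math


def group_into_strips(points, strip_height, angle_degrees=0.0):
    """Group (x, y) points into strips along a predetermined orientation.

    Points are sorted by their offset perpendicular to the orientation.
    A new strip starts when the gap to the previous point exceeds strip_height.
    """
    theta = math.radians(angle_degrees)

    def offset(point):
        x, y = point
        return y * math.cos(theta) - x * math.sin(theta)

    strips = []
    last = None
    for point in sorted(points, key=offset):
        current = offset(point)
        if last is not None and current - last <= strip_height:
            strips[-1].append(point)
        else:
            strips.append([point])
        last = current
    return [sorted(strip) for strip in strips]


points = [(0, 10), (5, 11), (10, 10), (0, 30), (5, 31), (10, 29)]
strips = group_into_strips(points, strip_height=3)
```

Each resulting strip could then be passed to a line-fitting routine, with pairs of fitted lines used to form the enclosing bounding boxes.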
 
     
     
       11. The method according to  claim 1 , further including:
 training the classifier from the groundtruth images using a machine learning algorithm. 
 
     
     
       12. The method according to  claim 1 , further including:
 providing the category score to each of the fragments by:
 determining feature measurements for the fragment; 
 determining the category score for the fragment by applying the classifier to the determined feature measurements. 
   
     
     
       13. The method according to  claim 1 , wherein the segmenting includes:
 determining the enclosing boundary boxes for individual text lines based on the grouping. 
 
     
     
       14. The system according to  claim 5 , wherein the segmenting includes:
 grouping the selected feature points into strips; 
 fitting lines to the strips; and 
 forming the enclosing bounding boxes from pairs of fitted lines. 
 
     
     
       15. The system according to  claim 5 , wherein the segmenter segments the images by further:
 determining the enclosing boundary boxes for individual text lines based on the grouping. 
 
     
     
       16. The system according to  claim 5 , wherein the classifier provides the category score to each received fragment by:
 determining feature measurements for the fragment; and   determining the category score for the fragment by applying the classifier to the determined feature measurements.   
     
     
       17. A method of classifying marking types of one or more images of an electronic document, comprising:
 receiving an electronic document containing one or more images;   segmenting the one or more images by grouping selected feature points of the one or more images along predetermined orientations into a plurality of segments;   supplying the plurality of segments to a two-stage classifier, the two-stage classifier providing a plurality of first classifiers generating a first category score and a second classifier generating a second category score to each segment, wherein the two-stage classifier is trained from groundtruth images whose pixels are labeled of known marking types, wherein the first category score is generated as an array of scores for the each fragment, and the second category score is provided by reclassifying the each fragment by considering a neighborhood segment of the each segment, and wherein the first and second category scores to the each fragment are generated by:
 determining feature measurements for the each fragment, the feature measurements including measuring a segmenter feature, a size feature, a location feature, a regularity feature, an edge curvature feature, a contour feature, a run length feature, and an edge-turn histogram feature, and 
 determining the first and second category scores for the each fragment by applying the two-stage classifier to the determined feature measurements; and 
   assigning a same label to all pixels in a segment when the segment is classified by the two-stage classifier.   
     
     
       18. The method of claim 17, further comprising training the two-stage classifier from the groundtruth images using a machine learning algorithm. 
     
     
       19. The method of claim 17, wherein the segmenting includes: determining enclosing boundary boxes for individual text lines based on the grouping. 
     
     
       20. A method of classifying marking types on images of an electronic document, comprising:
 receiving an electronic document containing one or more images;   segmenting the one or more images received by grouping selected feature points along predetermined orientations into a plurality of segments;   supplying the segments to a two-stage classifier, the two-stage classifier providing a plurality of first classifiers generating a first category score and a second classifier generating a second category score to each segment, wherein the two-stage classifier is trained from groundtruth images whose pixels are labeled of known marking types, wherein the first category score is generated as an array of scores for the each fragment, and the second category score is provided by reclassifying the each fragment by considering a neighborhood segment of the each segment, and wherein the first and second category scores to the each fragment are generated by:
 determining feature measurements for the each fragment, the feature measurements including measuring a segmenter feature, a size feature, a location feature, a regularity feature, an edge curvature feature, a contour feature, a run length feature, and an edge-turn histogram feature, and 
 determining the first and second category scores for the each fragment by applying the two-stage classifier to the determined feature measurements; and 
   assigning a same label to all pixels in a segment when the segment is classified by the two-stage classifier;   wherein the segmenting includes processing the images to find text lines, the processing comprising:
 detecting upper and lower extrema of the connected components; 
 identifying upper and lower contour extrema of the detected upper and lower extrema of the connected components; 
 grouping the identified upper and lower contour extrema; 
 identifying upper contour point groups and lower contour point groups; 
 fitting the grouped upper and lower point groups; 
 filtering out the fitted and grouped upper and lower point groups that are outside a predetermined alignment threshold; 
 forming upper and lower alignment segments for the upper and lower point groups that remain after the filtering operation; 
 matching as pairs the upper and lower segments that remain after the filtering operation; and 
 forming text line bounding boxes based on the pairs of matched upper and lower segments that remain after the filtering operation, the bounding boxes identifying connected components believed to be aligned text. 
   
     
     
       21. A system of classifying marking types of one or more images of an electronic document, comprising:
 a segmenter operated on a processor that receives an electronic document containing one or more images, the segmenter segmenting the one or more images by grouping selected feature points along predetermined orientations; and   a two-stage classifier operated on the processor that:   receives the segments, the two-stage classifier providing a plurality of first classifiers generating a first category score and a second classifier generating a second category score to each segment, wherein the two-stage classifier is trained from groundtruth images whose pixels are labeled of known marking types, wherein the first category score is generated as an array of scores for the each fragment, and the second category score is provided by reclassifying the each fragment by considering a neighborhood segment of the each segment, and wherein the first and second category scores to the each fragment are generated by:
 determining feature measurements for the each fragment, the feature measurements including measuring a segmenter feature, a size feature, a location feature, a regularity feature, an edge curvature feature, a contour feature, a run length feature, and an edge-turn histogram feature, and 
 determining the first and second category scores for the each fragment by applying the two-stage classifier to the determined feature measurements, and 
   assigns a same label to all pixels in a segment when the segment is classified by the two-stage classifier.   
     
     
       22. The system of claim 21, wherein the one or more images are electronic images stored in an electronic memory and are segmented by a processor associated with the electronic memory. 
     
     
       23. The system of claim 21, further comprising a scanner to:
 receive a hardcopy document containing images; and   convert the hardcopy document into an electronic document, wherein the electronic document is supplied to the segmenter.   
     
     
       24. The system of claim 21, wherein the segmenter segments the one or more images by further: determining enclosing boundary boxes for individual text lines based on the grouping. 
     
     
       25. An apparatus to classify marking types of one or more images of an electronic document, comprising:
 a two-stage segmenter-classifier that:
 receives an electronic document containing one or more images; 
 segments the one or more images by grouping selected feature points of the one or more images along predetermined orientations into a plurality of segments; 
   provides a plurality of first classifiers generating a first category score and a second classifier generating a second category score to each segment, wherein the two-stage segmenter-classifier is trained from groundtruth images whose pixels are labeled of known marking types, wherein the first category score is generated as an array of scores for the each fragment, and the second category score is provided by reclassifying the each fragment by considering a neighborhood segment of the each segment, and wherein the first and second category scores to the each fragment are generated by:
 determining feature measurements for the each fragment, the feature measurements including measuring a segmenter feature, a size feature, a location feature, a regularity feature, an edge curvature feature, a contour feature, a run length feature, and an edge-turn histogram feature, and 
 determining the first and second category scores for the each fragment by applying the two-stage classifier to the determined feature measurements; and 
   assigns a same label to all pixels in a segment when the segment is classified by the two-stage segmenter-classifier; and   a scanner, communicatively coupled to the two-stage segmenter-classifier, that:
 receives a hardcopy document containing one or more images, 
 converts the hardcopy document into an electronic document, 
 supplies the electronic document to the two-stage segmenter-classifier.