US9530221B2 · Active · Utility · PatentIndex 72

Context aware moving object detection

Assignee: ZHU HONGWEI · Priority: Jan 6, 2012 · Filed: Jan 6, 2012 · Granted: Dec 27, 2016
Est. expiry: Jan 6, 2032 (~5.5 yrs left) · nominal 20-yr term from priority
Inventors: ZHU HONGWEI, AGHDASI FARZIN, MILLAR GREG, WANG LEI
G06T 2207/30196, G06T 2207/30232, G06T 7/2053, G06T 2207/10016, G06T 7/254
PatentIndex Score: 72 · Cited by: 3 · References: 9 · Claims: 33

Abstract

An image capture system includes: an image capture unit configured to capture a first image frame comprising a set of pixels; and a processor coupled to the image capture unit and configured to: determine a normalized distance of a pixel characteristic between the first image frame and a second image frame for each pixel in the first image frame; compare the normalized distance for each pixel in the first image frame against a pixel sensitivity value for that pixel; determine that a particular pixel of the first image frame is a foreground or background pixel based on the normalized distance of the particular pixel relative to the pixel sensitivity value for the particular pixel; and adapt the pixel sensitivity value for each pixel over a range of allowable pixel sensitivity values.
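
The per-pixel test described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the pixel characteristic is taken to be 8-bit intensity, the normalized distance is quantized into 16 levels, a higher sensitivity value is assumed to mean a lower decision threshold, and the names `LEVELS`, `MIN_SENS`, `MAX_SENS`, `normalized_distance`, `classify`, and `adapt` are all hypothetical.

```python
from typing import List

LEVELS = 16                 # assumed number of discrete normalized-distance levels
MIN_SENS, MAX_SENS = 1, 15  # assumed range of allowable sensitivity values


def normalized_distance(current: int, background: int, max_value: int = 255) -> int:
    """Quantize the absolute intensity difference into one of LEVELS values."""
    diff = abs(current - background)
    return min(LEVELS - 1, diff * LEVELS // (max_value + 1))


def classify(frame: List[List[int]],
             background: List[List[int]],
             sensitivity: List[List[int]]) -> List[List[bool]]:
    """Return True for foreground pixels and False for background pixels."""
    mask = []
    for f_row, b_row, s_row in zip(frame, background, sensitivity):
        row = []
        for f, b, s in zip(f_row, b_row, s_row):
            # Assumption: a more sensitive pixel uses a lower threshold.
            threshold = LEVELS - s
            row.append(normalized_distance(f, b) >= threshold)
        mask.append(row)
    return mask


def adapt(sensitivity: int, delta: int) -> int:
    """Shift a sensitivity value while keeping it inside the allowable range."""
    return max(MIN_SENS, min(MAX_SENS, sensitivity + delta))
```

For example, `classify([[100, 200]], [[100, 100]], [[12, 12]])` marks only the second pixel as foreground, because its quantized distance (6) meets the assumed threshold of 4 while the first pixel's distance is 0.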

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. An image capture system comprising:
 an image capture unit configured to capture a first image frame comprising a set of pixels; and 
 a processor coupled to the image capture unit and configured to:
 determine a normalized distance of a pixel characteristic between the first image frame and a second image frame for each pixel in the first image frame, wherein the normalized distance is a non-physical distance; 
 compare the normalized distance for each pixel in the first image frame against a pixel sensitivity value for that pixel; 
 determine that a particular pixel of the first image frame is a foreground or background pixel based on the normalized distance of the particular pixel relative to the pixel sensitivity value for the particular pixel; and 
 adapt the pixel sensitivity value for each pixel over a range of allowable pixel sensitivity values. 
 
 
     
     
       2. The system of  claim 1  wherein the processor is configured to compute the pixel sensitivity value for each pixel based on a base sensitivity value. 
     
     
       3. The system of  claim 2  wherein the processor is configured to adjust the base sensitivity value based on ratios of strong motion pixels to total motion pixels in identified blobs in the first image frame. 
     
     
       4. The system of  claim 3  wherein the processor is configured to:
 determine a histogram of percentage of strong motion pixels to total motion pixels in the identified blobs; 
 determine a peak index value of the histogram with a highest count among all index values of the histogram; 
 decrease the base sensitivity value if the peak index value is undesirably low; and 
 increase the base sensitivity value if the peak index value is undesirably high. 
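
The histogram-driven adjustment recited in claims 3 and 4 might look roughly like the sketch below. The claims do not say what counts as "undesirably low" or "undesirably high", so the bin count, the two peak limits (`low_peak`, `high_peak`), the step size, and the clamping range are all assumptions chosen for illustration, and `adjust_base_sensitivity` is a hypothetical name.

```python
from typing import Sequence


def adjust_base_sensitivity(base: int,
                            blob_ratios: Sequence[float],
                            bins: int = 10,
                            low_peak: int = 3,
                            high_peak: int = 7,
                            step: int = 1,
                            lo: int = 1,
                            hi: int = 15) -> int:
    """Nudge the base sensitivity using a histogram of strong-motion ratios.

    blob_ratios holds, per blob, strong motion pixels / total motion pixels.
    """
    hist = [0] * bins
    for ratio in blob_ratios:
        index = min(bins - 1, int(ratio * bins))  # ratio 1.0 falls in the last bin
        hist[index] += 1
    if not any(hist):
        return base  # no blobs observed, so nothing to learn from

    # Peak index: the bin with the highest count (lowest index wins ties).
    peak = max(range(bins), key=hist.__getitem__)
    if peak < low_peak:      # assumed meaning of "undesirably low"
        base -= step
    elif peak > high_peak:   # assumed meaning of "undesirably high"
        base += step
    return max(lo, min(hi, base))
```

With these assumed limits, blobs whose ratios cluster near 0.1 lower the base value by one step, while blobs clustering near 0.9 raise it.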
 
     
     
       5. The system of  claim 1  wherein the processor is configured to determine the normalized distance as one of a finite plurality of normalized distance values, and wherein the second image frame is a background frame. 
     
     
       6. The system of  claim 1  wherein the processor is further configured to identify motion blobs by:
 grouping neighboring pixels from a start level to an end level of the normalized distance; and 
 monitoring changes over different levels in terms of number of pixels determined to be foreground pixels and a size of a bounding box of a region enclosing these foreground pixels. 
 
     
     
       7. The system of  claim 6  wherein the processor is further configured to generate objects by merging neighboring blobs together based on perspective information and previously tracked objects. 
     
     
       8. The system of  claim 1  wherein the processor is further configured to:
 determine whether each location of the second image frame is noisy and, if so, how noisy; 
 determine whether each location in the second image frame is part of a salient track; and 
 learn perspective information of a monitored scene. 
 
     
     
       9. The system of  claim 1  wherein the processor is further configured to:
 track objects over multiple frames; 
 compute a confidence value for each tracked object by calculating statistics of features of the objects over the multiple image frames; and 
 account for variant object features. 
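
Claim 9 does not specify which statistics feed the confidence value. As one possible reading, the sketch below scores a track by how stable a single feature (object size) stays across frames, using the coefficient of variation. The choice of feature, the formula, and the name `track_confidence` are assumptions for illustration only.

```python
from statistics import mean, pstdev
from typing import Sequence


def track_confidence(sizes: Sequence[float]) -> float:
    """Return a confidence in (0, 1] that is high when the size is stable.

    sizes: one measurement of the tracked object's size per frame.
    """
    if len(sizes) < 2:
        return 0.0  # a single observation gives no evidence of stability
    average = mean(sizes)
    if average <= 0:
        return 0.0
    variation = pstdev(sizes) / average  # coefficient of variation
    return 1.0 / (1.0 + variation)
```

A track with sizes `[100, 101, 99, 100]` scores close to 1.0, while an erratic track such as `[100, 10, 250, 40]` scores noticeably lower.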
 
     
     
       10. The system of  claim 9  wherein the processor is further configured to:
 update a scene noise map based on the confidence value of each of the tracked objects; 
 update a sensitivity map based on the confidence value of each of the tracked objects; 
 update a track salience map based on the confidence value of each of the tracked objects; and 
 update an object fitness index histogram based on the confidence value of each of the tracked objects. 
 
     
     
       11. The system of  claim 10  wherein the processor is further configured to compute the sensitivity value for each pixel based on the scene noise map and the track salience map. 
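
One way to picture claim 11 is a per-pixel sensitivity that starts from a base value, drops where the scene noise map reports noise, and rises where the track salience map reports salient tracks. The claim names the inputs but not the formula, so the linear combination, the weights, the [0, 1] scaling of both maps, and the name `pixel_sensitivity` are all assumptions.

```python
def pixel_sensitivity(base: int,
                      noise: float,
                      salience: float,
                      noise_weight: float = 4.0,
                      salience_weight: float = 2.0,
                      lo: int = 1,
                      hi: int = 15) -> int:
    """Combine a base value with map entries for one pixel location.

    noise and salience are assumed to be map values scaled to [0, 1].
    """
    value = base - noise_weight * noise + salience_weight * salience
    # Keep the result inside the range of allowable sensitivity values.
    return max(lo, min(hi, round(value)))
```

Under these assumed weights, a base of 8 becomes 4 at a fully noisy location and 10 at a fully salient one.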
     
     
       12. The system of  claim 9  wherein the processor is further configured to automatically determine a perspective map by identifying size-persistent tracked objects and by comparing sizes of the size-persistent tracked objects at different scene locations relative to one or more reference object sizes. 
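
A rough sketch of the perspective-map idea in claim 12 follows. It assumes that "size-persistent" means the object's height changes only gradually from frame to frame, that the scene is divided into horizontal bands, and that the perspective value for a band is the median observed height divided by one reference height. None of these choices come from the claim text, and `is_size_persistent` and `perspective_by_band` are hypothetical names.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Sequence, Tuple


def is_size_persistent(heights: Sequence[float], tol: float = 0.1) -> bool:
    """True if consecutive height measurements never jump by more than tol."""
    if len(heights) < 2:
        return False
    return all(abs(b - a) <= tol * a for a, b in zip(heights, heights[1:]))


def perspective_by_band(observations: Sequence[Tuple[int, float]],
                        reference_height: float,
                        band_height: int) -> Dict[int, float]:
    """Map each horizontal band to (median object height / reference height).

    observations: (row, height) pairs taken from size-persistent tracks.
    """
    bands = defaultdict(list)
    for row, height in observations:
        bands[row // band_height].append(height)
    return {band: median(values) / reference_height
            for band, values in sorted(bands.items())}
```

A caller would first filter tracks with `is_size_persistent` and then pool their `(row, height)` observations into `perspective_by_band`.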
     
     
       13. An imaging method comprising:
 capturing a first image frame comprising a set of pixels; 
 determining, using a processor, a normalized distance of a pixel characteristic between the first image frame and a second image frame for each pixel in the first image frame, wherein the normalized distance is a non-physical distance; 
 varying a value of a reference from a start value to an end value within a range of possible normalized distance values; 
 comparing the normalized distance for each unlabeled pixel in the first image frame against a present value of the reference; and 
 labeling pixels each of whose respective normalized distance is greater than the present value of the reference. 
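
The reference sweep of claim 13 can be illustrated as below. The sketch assumes integer normalized distances and a unit step between reference values, and it records the reference value at which each pixel was first labeled. The name `sweep_label` and the choice to store the reference value as the label are assumptions.

```python
from typing import List, Optional


def sweep_label(distances: List[List[int]],
                start: int,
                end: int) -> List[List[Optional[int]]]:
    """Label each pixel with the first reference value its distance exceeds.

    The reference moves from start to end (inclusive) in unit steps, in
    whichever direction is needed. Pixels never exceeded stay None.
    """
    labels: List[List[Optional[int]]] = [[None] * len(row) for row in distances]
    step = -1 if start > end else 1
    for reference in range(start, end + step, step):
        for r, row in enumerate(distances):
            for c, distance in enumerate(row):
                # Only pixels that are still unlabeled are compared.
                if labels[r][c] is None and distance > reference:
                    labels[r][c] = reference
    return labels
```

Sweeping from a high start value to a low end value labels the strongest differences first, so each pixel's label indicates how strong its change was.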
 
     
     
       14. The method of  claim 13  further comprising:
 grouping labeled neighboring pixels of the first image frame into a blob; and 
 monitoring changes over different values of the reference in terms of number of pixels in the blob and a size of a bounding box of the blob. 
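
The grouping and monitoring steps of claim 14 can be sketched as a connected-component pass run at each reference value, reporting the pixel count and bounding box of every blob. The use of 4-connectivity and a breadth-first search is an assumption, as is the name `blobs_at`; the claim itself does not name a grouping algorithm.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (min_row, min_col, max_row, max_col)


def blobs_at(distances: List[List[int]], reference: int) -> List[Tuple[int, Box]]:
    """Group 4-connected pixels with distance > reference into blobs.

    Returns (pixel_count, bounding_box) per blob, in row-major discovery order.
    """
    rows, cols = len(distances), len(distances[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or distances[r0][c0] <= reference:
                continue
            # Breadth-first search from an unvisited labeled pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            count, box = 0, [r0, c0, r0, c0]
            while queue:
                r, c = queue.popleft()
                count += 1
                box = [min(box[0], r), min(box[1], c), max(box[2], r), max(box[3], c)]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and distances[nr][nc] > reference):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            blobs.append((count, tuple(box)))
    return blobs
```

Calling `blobs_at` for each reference value in turn and comparing the returned counts and boxes gives the per-level changes that the claim says are monitored.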
 
     
     
       15. The method of  claim 14  further comprising generating objects by merging neighboring blobs together based on perspective information and previously tracked objects. 
     
     
       16. The method of  claim 13  further comprising:
 computing a pixel sensitivity value for each pixel based on a base sensitivity value; 
 using the pixel sensitivity value to determine the normalized distances and to group pixels into a blob; and 
 altering the base sensitivity value. 
 
     
     
       17. The method of  claim 16  wherein altering the base sensitivity value is based on ratios of strong motion pixels to total motion pixels in identified blobs in the first image frame. 
     
     
       18. The method of  claim 17  wherein altering the base sensitivity value comprises:
 determining a histogram of percentage of strong motion pixels to total motion pixels in the identified blobs; 
 determining a peak index value of the histogram with a highest count among all index values of the histogram; 
 decreasing the base sensitivity value if the peak index value is undesirably low; and 
 increasing the base sensitivity value if the peak index value is undesirably high. 
 
     
     
       19. The method of  claim 13  further comprising:
 determining whether each location of the second image frame is noisy and, if so, how noisy; 
 determining whether each location in the second image frame is part of a salient track; and 
 learning perspective information of a monitored scene. 
 
     
     
       20. The method of  claim 13  further comprising:
 tracking objects over multiple image frames; 
 computing a confidence value for each tracked object by calculating statistics of features of the objects over the multiple image frames; and 
 accounting for variant object features. 
 
     
     
       21. The method of  claim 20  further comprising:
 updating a scene noise map based on the confidence value of each of the tracked objects; 
 updating a sensitivity map based on the confidence value of each of the tracked objects; 
 updating a track salience map based on the confidence value of each of the tracked objects; and 
 updating an object fitness index histogram based on the confidence value of each of the tracked objects. 
 
     
     
       22. The method of  claim 21  further comprising computing a pixel sensitivity value for each pixel based on the scene noise map and the track salience map. 
     
     
       23. The method of  claim 20  further comprising automatically determining a perspective map by identifying size-persistent tracked objects and by comparing sizes of the size-persistent tracked objects at different scene locations relative to one or more reference object sizes. 
     
     
       24. A moving object detection system comprising:
 an image capture unit configured to capture image frames each comprising a set of pixels; 
 means for determining a normalized distance of a pixel characteristic between a plurality of the image frames for each pixel in the image frames, wherein the normalized distance is a non-physical distance; 
 means for identifying motion blobs comprising neighboring pixels of similar normalized distance values; and 
 means for forming objects by combining neighboring motion blobs based on perspective information associated with the blobs. 
 
     
     
       25. The system of  claim 24  further comprising means for determining the perspective information by tracking an object over multiple ones of the image frames and using one or more reference object sizes in the multiple ones of the image frames. 
     
     
       26. The system of  claim 24  further comprising means for altering pixel sensitivity information based on a base sensitivity value, a scene noise map, and a track salience map, wherein the means for determining the normalized distance uses the sensitivity information to determine the normalized distance. 
     
     
       27. The system of  claim 26  wherein the means for altering the pixel sensitivity information are configured to adjust a base sensitivity value based on ratios of strong motion pixels to total motion pixels in identified blobs in the image frames. 
     
     
       28. The system of  claim 27  further comprising:
 means for determining a histogram of percentage of strong motion pixels to total motion pixels in the identified blobs; 
 means for determining a peak index value of the histogram with a highest count among all index values of the histogram; 
 means for decreasing the base sensitivity value if the peak index value is undesirably low; and 
 means for increasing the base sensitivity value if the peak index value is undesirably high. 
 
     
     
       29. The system of  claim 24  wherein the means for identifying motion blobs comprises:
 means for grouping neighboring pixels from a start level to an end level of the normalized distance; and 
 means for monitoring changes over different levels in terms of number of pixels determined to be foreground pixels and a size of a bounding box of a region enclosing these foreground pixels. 
 
     
     
       30. The system of  claim 29  further comprising means for generating objects by merging neighboring blobs together based on perspective information and previously tracked objects. 
     
     
       31. The system of  claim 24  further comprising:
 means for tracking objects across multiple image frames; 
 means for computing a confidence value for each tracked object by calculating statistics of features of the objects over the multiple image frames; and 
 means for accounting for variant object features. 
 
     
     
       32. The system of  claim 31  further comprising:
 means for updating a scene noise map based on the confidence value of each of the tracked objects; 
 means for updating a sensitivity map based on the confidence value of each of the tracked objects; 
 means for updating a track salience map based on the confidence value of each of the tracked objects; 
 means for updating the object fitness index histogram based on the confidence value of each of the tracked objects. 
 
     
     
       33. The system of  claim 24  further comprising:
 means for determining whether each location of a first image frame is noisy and, if so, how noisy; 
 means for determining whether each location in the first image frame is part of a salient track; and 
 means for learning perspective information of a monitored scene.
