US9036693B2 · Active · Utility

Method and system for providing region-of-interest video compression

Assignee: ISNARDI MICHAEL ANTHONY · Priority: Jan 8, 2009 · Filed: Dec 22, 2009 · Granted: May 19, 2015
Est. expiry: Jan 8, 2029 (~2.5 yrs left) · nominal 20-yr term from priority
Inventors: ISNARDI MICHAEL ANTHONY, KOPANSKY ARKADY
G06T 9/007, H04N 19/117, H04N 19/61, H04N 19/17, H04N 19/80
PatentIndex Score: 69 · Cited by: 4 · References: 22 · Claims: 12

Abstract

Embodiments of the present invention provide for a region-of-interest compression methodology wherein a variety of encoders may be utilized to perform video compression on a plurality of filtered video frames without the need to generate specific instructions for each of the variety of encoders. Embodiments of the present invention receive a video frame and create a region-of-interest map based on the received video frame. The region-of-interest map is utilized to create a filtered video frame based on the received video frame. This process may be repeated for each video frame within a video stream, thereby creating a plurality of filtered video frames. The plurality of filtered video frames is transmitted to an encoder for video compression.
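
The pre-filtering flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes grayscale frames stored as 2-D NumPy arrays, a boolean region-of-interest map per frame, and a boxcar blur as the spatial filter. The function names (`box_blur`, `prefilter_frame`, `prefilter_stream`) and the radius values are hypothetical. The encoder itself is out of scope here; the output frames would simply be handed to whichever standard encoder is in use.

```python
import numpy as np


def box_blur(frame, radius):
    """Separable boxcar low-pass filter (assumed filter choice for this sketch)."""
    size = 2 * radius + 1
    kernel = np.ones(size) / size
    # Replicate edge pixels so the output keeps the input's shape.
    padded = np.pad(frame.astype(np.float64), radius, mode="edge")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def prefilter_frame(frame, roi_map, light_radius=1, heavy_radius=4):
    """Build one filtered frame from a frame and its boolean ROI map.

    Pixels inside the ROI come from the lightly filtered copy; pixels
    outside come from the heavily filtered copy, which carries less
    high spatial frequency energy.
    """
    light = box_blur(frame, light_radius)
    heavy = box_blur(frame, heavy_radius)
    return np.where(roi_map.astype(bool), light, heavy)


def prefilter_stream(frames, roi_maps, light_radius=1, heavy_radius=4):
    """Apply the per-frame step to every frame of a stream."""
    return [
        prefilter_frame(f, m, light_radius, heavy_radius)
        for f, m in zip(frames, roi_maps)
    ]
```

Because all region-of-interest handling happens before encoding, nothing in this sketch depends on a particular encoder, which is the point the abstract makes about avoiding encoder-specific instructions.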

Claims

exact text as granted — not AI-modified
The invention claimed is: 
     
       1. A method for compressing a video stream comprising:
 receiving one or more region-of-interest maps which define one or more regions of interest across corresponding video frames of the video stream; 
 applying, for each video frame, a first spatial filter to a first set of pixels of the video frame to generate a first set of filtered pixel values, and 
 applying a second spatial filter to a second set of pixels of the video frame to generate a second set of filtered pixel values, 
 wherein the first spatial filter reduces an amount of high spatial frequency energy in the pixels and the second spatial filter reduces a greater amount of high spatial frequency energy in pixels than the first spatial filter; 
 forming, for each video frame, a filtered video frame comprising a plurality of filtered pixel values, each of said filtered pixel values of the filtered video frame being derived based on: 
 (a) a value in a corresponding location of the one or more reaction-of-interest maps corresponding to the video frame, and 
 (b) a filtered pixel value in a corresponding location from at least one of the first set and second set of filtered pixel values; 
 forming a spatially filtered video stream comprising each of the filtered video frames; and 
 providing the spatially filtered video stream to a standard video encoder for encoding the spatially filtered video stream, 
 wherein the standard video encoder automatically assigns fewer bits to regions with less high spatial frequency energy and more bits to regions with greater higher spatial frequency energy. 
 
     
     
       2. The method of  claim 1 , wherein:
 one or more of the filtered pixel values of the filtered video frame is derived by selecting a filtered pixel value from the first set of filtered pixel values, for pixel locations inside the region of interest defined by the one or more region-of-interest maps; and 
 one or more of the filtered pixel values of the filtered video frame is derived by selecting a filtered pixel value from the second set of filtered pixel values, for pixel locations outside the region of interest defined by the one or more region-of-interest maps. 
 
     
     
       3. The method of  claim 2  wherein a threshold value parameter is applied to a real-valued region of interest map to determine if a pixel is located within the one or more region of interests. 
     
     
       4. The method of  claim 1  wherein blending the first set of pixels and the second set of pixels comprises selecting either a pixel from the first set of pixels or a pixel from the second set of pixels. 
     
     
       5. The method of  claim 1 , wherein the first and second filter comprise one of separable, non-separable, boxcar, triangular or Gaussian filters. 
     
     
       6. The method of  claim 1 , wherein one or more of the filtered pixel values of the filtered video frame is derived by blending:
 (a) a filtered pixel value from the first set of filtered pixel values for a corresponding pixel location and 
 (b) a filtered pixel value from the second set of filtered pixel values for a corresponding pixel location, wherein said blending is weighted based on a value of the corresponding region-of-interest map for a corresponding pixel location. 
 
     
     
       7. An apparatus for compressing a video stream prior to standard encoding, the apparatus comprising a processor that executes a filtering module that:
 receives one or more region-of-interest maps which define one or more regions of interest across corresponding video frames of the video stream; 
 applies, for each video frame, a first spatial filter to a first set of pixels of the video frame to generate a first set of filtered pixel values, and applies 
 a second spatial filter to a second set of pixels of the video frame to generate a second set of filtered pixel values, 
 wherein the first spatial filter reduces an amount of high spatial frequency energy in the pixels and the second spatial filter reduces a greater amount of high spatial frequency energy in pixels than the first spatial filter; 
 forms, for each video frame, a filtered video frame comprising a plurality of filtered pixel values, each of said filtered pixel values of the filtered video frame being derived based on: 
 (a) a value in a corresponding location of the one or more reaction-of-interest maps corresponding to the video frame, and 
 (b) a filtered pixel value in a corresponding location from at least one of the first set and second set of filtered pixel values; 
 forms a spatially filtered video stream comprising each of the filtered video frames; 
 and provides the spatially filtered video stream to a standard video encoder for encoding the spatially filtered video stream, 
 wherein the standard video encoder automatically assigns fewer bits to regions with less high spatial frequency energy and more bits to regions with greater higher spatial frequency energy. 
 
     
     
       8. The filtering module of  claim 7 , wherein the filtering module applies a blending function that creates smooth transition regions along edges of the one or more regions of interest. 
     
     
       9. The apparatus of  claim 7 , wherein one or more of the filtered pixel values of the filtered video frame is derived by selecting a filtered pixel value from the first set of filtered pixel values, for pixel locations inside the region of interest defined by the one or more region-of-interest maps; and one or more of the filtered pixel values of the filtered video frame is derived by selecting a filtered pixel value from the second set of filtered pixel values, for pixel locations outside the region of interest defined by the one or more region-of-interest maps. 
     
     
       10. The apparatus of  claim 9  wherein a threshold value parameter is applied to a real-valued region of interest map to determine if a pixel is located within the one or more region of interests. 
     
     
       11. The apparatus of  claim 7  wherein blending the first set of pixels and the second set of pixels comprises selecting either a pixel from the first set of pixels or a pixel from the second set of pixels. 
     
     
       12. The apparatus of  claim 7 , wherein the first and second filter comprise one of separable, non-separable, boxcar, triangular or Gaussian filters.
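
The following sketch is an illustration only and is not part of the granted claim text. It shows one way the filter shapes named in claims 5 and 12, the threshold test of claims 3 and 10, and the weighted blending of claim 6 might look in NumPy. Everything here is an assumption made for the example: the function names, the 1-D normalized kernels, the `>=` comparison against the threshold, and the convention that a real-valued map lies in [0, 1] with 1 meaning "fully inside the region of interest".

```python
import numpy as np


def boxcar_kernel(size):
    """Uniform 1-D weights that sum to 1."""
    return np.ones(size) / size


def triangular_kernel(size):
    """Linearly tapered 1-D weights (odd size assumed), normalized to sum to 1."""
    half = size // 2
    weights = (half + 1 - np.abs(np.arange(size) - half)).astype(np.float64)
    return weights / weights.sum()


def gaussian_kernel(size, sigma):
    """Sampled Gaussian 1-D weights, normalized to sum to 1."""
    x = np.arange(size) - size // 2
    weights = np.exp(-0.5 * (x / sigma) ** 2)
    return weights / weights.sum()


def threshold_roi(roi_map, threshold=0.5):
    """Turn a real-valued ROI map into a boolean inside/outside mask."""
    return np.asarray(roi_map) >= threshold


def select_pixels(light, heavy, roi_map, threshold=0.5):
    """Hard selection: lightly filtered pixels inside the ROI, heavily filtered outside."""
    return np.where(threshold_roi(roi_map, threshold), light, heavy)


def blend_pixels(light, heavy, roi_map):
    """Weighted blend of the two filtered values, weighted by the ROI map value."""
    w = np.clip(np.asarray(roi_map, dtype=np.float64), 0.0, 1.0)
    return w * light + (1.0 - w) * heavy
```

With a map that ramps from 0 to 1 near a region boundary, `blend_pixels` yields a gradual transition between the two filtered versions, whereas `select_pixels` switches abruptly at the threshold.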
