Method, apparatus and system for video processing
Abstract
A method for video processing is provided. The method comprises obtaining an image of a scene, obtaining a video that records an area included in the scene, determining one or more frames from a plurality of frames of the video, determining pairs of matched features, generating a plurality of composite frames by combining each of the determined one or more frames with the image of the scene based on the pairs of matched features, and generating a composite video based on the plurality of composite frames. Each of the pairs of matched features is related to an object that is in both the image and the one or more frames. Each of the pairs of matched features is associated with one or more pixels of the image of the scene and one or more pixels of a selected frame of the one or more frames.
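The steps summarized above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the patented implementation: it assumes the relationship between the image plane of a frame and the image plane of the scene image is a planar homography estimated from at least four matched point pairs, and it assumes the matched pairs are already available. The function names `estimate_homography`, `composite_frame`, and `composite_video` are hypothetical.

```python
import numpy as np

def estimate_homography(frame_pts, image_pts):
    """Direct linear transform: returns H mapping frame points to image points.

    Assumes at least four non-degenerate matched pairs (x, y) -> (u, v).
    """
    rows = []
    for (x, y), (u, v) in zip(frame_pts, image_pts):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def composite_frame(image, frame, h):
    """Project frame pixels into the image plane and overwrite the covered pixels."""
    out = image.copy()
    height, width = image.shape[:2]
    fh, fw = frame.shape[:2]
    # Inverse mapping: for every image pixel, look up its source position in the frame.
    ys, xs = np.mgrid[0:height, 0:width]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(h) @ pts
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    inside = (sx >= 0) & (sx < fw) & (sy >= 0) & (sy < fh)
    # Pixels covered by the frame come from the frame; the rest stay from the image.
    out[ys.ravel()[inside], xs.ravel()[inside]] = frame[sy[inside], sx[inside]]
    return out

def composite_video(image, frames, h):
    """Apply one shared relationship to every frame (the fixed-camera case)."""
    return [composite_frame(image, f, h) for f in frames]
```

Nearest-neighbour lookup is used for brevity; a production pipeline would more likely rely on a robust estimator and interpolated warping from an imaging library.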
Claims
What is claimed is:
1. A method for video processing, comprising:
obtaining an image of a scene;
obtaining a video that records an area included in the scene, the video comprising a plurality of frames, wherein the image of the scene is associated with a first image plane, and each frame in the video is associated with a second image plane;
determining one or more frames from the plurality of frames of the video;
determining pairs of matched features, wherein each of the pairs of matched features is related to an object that is in both the image and the one or more frames, and wherein each of the pairs of matched features is associated with one or more pixels of the image of the scene and one or more pixels of a selected frame of the one or more frames;
determining, based on the matched features, one or more relationships between the first image plane and one or more second image planes;
generating, based on the pairs of matched features, a plurality of composite frames by combining each of the selected one or more frames with the image of the scene, wherein generating the plurality of composite frames further comprises:
projecting, based on the one or more relationships, pixels of the frames in the video from the associated second image planes to the first image plane; and
combining the projected pixels with the pixels of the image of the scene in the first image plane to generate the plurality of composite frames; and
generating a composite video based on the plurality of composite frames.
2. The method according to claim 1, wherein the area recorded by the video is a target scene included in the scene, wherein pixels in the composite frames that are related to the target scene are the projected pixels from the respective second image planes, and wherein the remainder pixels in the composite frames are from the pixels in the image of the scene.
3. The method according to claim 1, wherein the second image planes for the frames in the video are the same, and one relationship is determined between the first image plane and the second image planes based on the one or more frames.
4. The method according to claim 1, wherein the second image planes for the frames in the video include different second image planes, wherein the frames in the video are divided into groups, and each group of the frames is associated with one second image plane, and wherein one relationship is determined for each group of the frames.
5. The method according to claim 4, further comprising:
processing the plurality of composite frames to improve the quality of the composite frames,
wherein the composite video is generated based on the processed composite frames.
6. The method according to claim 5, wherein processing the plurality of composite frames further comprises:
mitigating boundaries caused by combining each of the one or more frames with the image of the scene; or
adjusting colors in the composite frames.
7. The method according to claim 1, wherein determining the matched features between the image of the scene and the one or more frames further comprises:
determining a set of first features from the image of the scene;
determining a set of second features from each of the one or more frames; and
comparing the set of first features and each set of second features,
wherein the matched features include the first features and the corresponding second features that are related to the same objects in the scene based on the comparison results.
8. The method according to claim 1, wherein obtaining the image of the scene further comprises:
obtaining a plurality of images from different perspectives; and
generating the image of the scene by combining the plurality of images.
9. The method according to claim 1, wherein the video is recorded by an imaging device, and the settings of the imaging device remain the same during the recording of the video.
10. The method according to claim 1, wherein the video is recorded for motions of one or more objects in the area included in the scene.
11. The method according to claim 1, further comprising:
causing display of the composite video.
12. A device for video processing, comprising:
one or more processors; and
a non-transitory computer-readable medium, having computer-executable instructions stored thereon, the computer-executable instructions, when executed by the one or more processors, causing the one or more processors to facilitate:
obtaining an image of a scene;
obtaining a video that records an area included in the scene, the video comprising a plurality of frames, wherein the image of the scene is associated with a first image plane, and each frame in the video is associated with a second image plane;
determining one or more frames from the plurality of frames of the video;
determining pairs of matched features, wherein each of the pairs of matched features is related to an object that is in both the image and the one or more frames, and wherein each of the pairs of matched features is associated with one or more pixels of the image of the scene and one or more pixels of a selected frame of the one or more frames;
determining, based on the matched features, one or more relationships between the first image plane and one or more second image planes;
generating, based on the pairs of matched features, a plurality of composite frames by combining each of the selected one or more frames with the image of the scene, wherein generating the plurality of composite frames further comprises:
projecting, based on the one or more relationships, pixels of the frames in the video from the associated second image planes to the first image plane; and
combining the projected pixels with the pixels of the image of the scene in the first image plane to generate the plurality of composite frames; and
generating a composite video based on the plurality of composite frames.
13. The device according to claim 12, wherein the area recorded by the video is a target scene included in the scene, wherein pixels in the composite frames that are related to the target scene are the projected pixels from the respective second image planes, and wherein the remainder pixels in the composite frames are from the pixels in the image of the scene.
14. The device according to claim 12, wherein the second image planes for the frames in the video are the same, and one relationship is determined between the first image plane and the second image planes based on the one or more frames.
15. The device according to claim 12, wherein the second image planes for the frames in the video include different second image planes, wherein the frames in the video are divided into groups, and each group of the frames is associated with one second image plane, and wherein one relationship is determined for each group of the frames.
16. The device according to claim 12, wherein the instructions cause the one or more processors to further facilitate:
processing the plurality of composite frames to improve the quality of the composite frames,
wherein the composite video is generated based on the processed composite frames.
17. The device according to claim 16, wherein processing the plurality of composite frames further comprises:
mitigating boundaries caused by combining each of the one or more frames with the image of the scene; or
adjusting colors in the composite frames.
18. A non-transitory computer-readable medium, having computer-executable instructions stored thereon, the computer-executable instructions, when executed by one or more processors, causing a processor to facilitate:
obtaining an image of a scene;
obtaining a video that records an area included in the scene, the video comprising a plurality of frames, wherein the image of the scene is associated with a first image plane, and each frame in the video is associated with a second image plane;
determining one or more frames from the plurality of frames of the video;
determining pairs of matched features, wherein each of the pairs of matched features is related to an object that is in both the image and the one or more frames, and wherein each of the pairs of matched features is associated with one or more pixels of the image of the scene and one or more pixels of a selected frame of the one or more frames;
determining, based on the matched features, one or more relationships between the first image plane and one or more second image planes;
generating, based on the pairs of matched features, a plurality of composite frames by combining each of the selected one or more frames with the image of the scene, wherein generating the plurality of composite frames further comprises:
projecting, based on the one or more relationships, pixels of the frames in the video from the associated second image planes to the first image plane; and
combining the projected pixels with the pixels of the image of the scene in the first image plane to generate the plurality of composite frames; and
generating a composite video based on the plurality of composite frames.
19. The non-transitory computer-readable medium according to claim 18, wherein the area recorded by the video is a target scene included in the scene, wherein pixels in the composite frames that are related to the target scene are the projected pixels from the respective second image planes, and wherein the remainder pixels in the composite frames are from the pixels in the image of the scene.
20. The non-transitory computer-readable medium according to claim 18, wherein the second image planes for the frames in the video are the same, and one relationship is determined between the first image plane and the second image planes based on the one or more frames.
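The claims recite mitigating boundaries and adjusting colors in the composite frames without naming a particular technique. As a hedged illustration only, the sketch below assumes two simple choices that are not taken from the claims: a per-channel mean shift for color adjustment and a linear feather for boundary mitigation. The names `match_mean_color` and `feather_blend`, the `width` parameter, and the boolean `mask` marking pixels covered by the projected frame are all hypothetical.

```python
import numpy as np

def match_mean_color(projected, reference, mask):
    """Shift projected pixels so their mean inside the mask matches the reference."""
    adjusted = projected.astype(float).copy()
    if mask.any():
        offset = reference[mask].mean(axis=0) - projected[mask].mean(axis=0)
        adjusted[mask] += offset
    return adjusted

def feather_blend(image, projected, mask, width=3):
    """Blend projected pixels into the image with weights that ramp up from the mask edge."""
    dist = np.zeros(mask.shape)
    current = mask.copy()
    # Repeated 4-neighbour erosion counts how many pixels each covered pixel
    # lies inside the mask edge, capped at `width`.
    for _ in range(width):
        dist += current
        padded = np.pad(current, 1, constant_values=False)
        current = (current & padded[:-2, 1:-1] & padded[2:, 1:-1]
                   & padded[1:-1, :-2] & padded[1:-1, 2:])
    alpha = dist / width
    if image.ndim == 3:
        alpha = alpha[..., None]  # broadcast the weight over color channels
    return alpha * projected + (1.0 - alpha) * image
```

With this weighting, pixels deep inside the covered area come entirely from the projected frame, pixels outside it come entirely from the image of the scene, and a band of `width` pixels along the seam is a mix of both.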