Intra-coded video frame caching for video telephony sessions
Abstract
Video telephony (VT) call management techniques are described. The techniques enable a device to cache intra-coded frame (i-frame) data at a pre-decoder-initialization stage. An example device includes a memory configured to store video data associated with a VT call, a video decoder configured to render a portion of the stored video data, and one or more processors. The processor(s) are configured to determine whether received video frame data comprises i-frame data, to determine whether the video decoder is in a pre-initialized state or an initialized state, and, when the received video frame data comprises i-frame data and the video decoder is in the pre-initialized state, to store the i-frame data to the memory.
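The caching decision summarized in the abstract can be illustrated with a minimal sketch. The class name `VtReceiver`, its fields, and the `rendered` list standing in for decoder output are hypothetical and do not come from the patent. Dropping non-i-frame data that arrives before initialization is an assumption of this sketch, and the replay of the cached i-frame on initialization follows the behavior recited in claim 2.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VtReceiver:
    """Hypothetical receiver that caches an i-frame while the decoder is not ready."""

    decoder_initialized: bool = False
    cached_iframe: Optional[bytes] = None  # stands in for the memory / buffer
    rendered: List[bytes] = field(default_factory=list)  # stand-in for decoder output

    def on_frame(self, data: bytes, is_iframe: bool) -> None:
        if self.decoder_initialized:
            # Initialized state: frames go straight to the decoder.
            self.rendered.append(data)
        elif is_iframe:
            # Pre-initialized state: keep the i-frame for later use.
            self.cached_iframe = data
        # Assumption: other frame types received before initialization are dropped.

    def on_decoder_initialized(self) -> None:
        # Transition to the initialized state, then replay the cached i-frame.
        self.decoder_initialized = True
        if self.cached_iframe is not None:
            self.rendered.append(self.cached_iframe)
            self.cached_iframe = None
```

In this sketch the cache holds a single i-frame, so a later i-frame received before initialization simply replaces an earlier one.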
Claims
What is claimed is:
1. A method of processing video frame data of a video telephony (VT) call, the method comprising:
receiving, at a device, the video frame data of the VT call over a communications channel;
determining, by the device, that the received video frame data comprises i-frame data;
determining, by the device, that a video decoder of the device is in a pre-initialized state, the pre-initialized state representing a condition in which the video decoder cannot render the received video frame data for display;
based on the received video frame data comprising the i-frame data and the video decoder being in the pre-initialized state, storing, by the device, the i-frame data to a buffer of the device.
2. The method of claim 1 , the method further comprising:
detecting that the video decoder has transitioned from the pre-initialized state to an initialized state, the initialized state representing a condition in which the video decoder can render the received video frame data for display; and
based on the detected transition, retrieving the stored i-frame data from the buffer of the device;
decoding, by the video decoder, while in the initialized state, the retrieved i-frame data to form decoded i-frame data; and
rendering, by the video decoder, while in the initialized state, the decoded i-frame data for display.
3. The method of claim 2 , further comprising:
determining that a remote device that generated the received video frame data supports an audio video profile with feedback (AVPF) mode;
determining that a predetermined time has elapsed since the retrieved i-frame data was rendered for display; and
based on the predetermined time having elapsed since the retrieved i-frame data was rendered for display and the remote device supporting the AVPF mode:
generating an i-frame request, and
sending the generated i-frame request to the remote device.
4. The method of claim 2 , further comprising:
using, by the video decoder, while in the initialized state, the retrieved i-frame data to decode and render prediction frame data associated with the VT call.
5. The method of claim 1 , wherein a storage capacity of the buffer is equal to or greater than a maximum i-frame size that the device is configured to support.
6. The method of claim 1 , wherein the stored i-frame data is first i-frame data associated with a first i-frame, the method further comprising:
receiving, at the device, additional video frame data comprising second i-frame data associated with a second i-frame;
determining that the second i-frame was generated at a time more recent than a generation time of the first i-frame; and
based on the second i-frame having been generated at a time more recent than the generation time of the first i-frame,
overwriting, in the buffer, the first i-frame data with the second i-frame data.
7. The method of claim 1 , further comprising:
determining that a predetermined time has elapsed since the i-frame data was stored to the buffer; and
based on the predetermined time having elapsed since the i-frame data was stored to the buffer, determining that the stored i-frame data is no longer usable by the video decoder to render the video frame data of the VT call.
8. The method of claim 1 , further comprising:
in response to receiving the video frame data of the VT call, initiating a decoder initialization process with respect to the video decoder of the device, wherein the decoder initialization process is configured to transition the video decoder from the pre-initialized state to an initialized state, the initialized state representing a condition in which the video decoder can render the received video frame data for display.
9. The method of claim 1 , wherein determining that the received video frame data comprises the i-frame data comprises:
decapsulating a packet that includes the video frame data to obtain a video payload of the packet;
inspecting a first byte of the video payload; and
determining that the received video frame data comprises i-frame data based on the inspected first byte.
10. The method of claim 9 , wherein determining that the received video frame data comprises the i-frame data comprises:
determining, based on the inspected first byte indicating a decimal value of one hundred twenty eight (128), that the received video frame data comprises the i-frame data.
11. A device comprising:
a communication interface configured to receive, over a communications channel, video frame data of a video telephony (VT) call;
a memory;
a video decoder in communication with the memory; and
one or more processors in communication with the memory and the video decoder, the one or more processors being configured to:
determine that the video frame data received by the communication interface comprises i-frame data;
determine that the video decoder is in a pre-initialized state, the pre-initialized state representing a condition in which the video decoder cannot render the received video frame data for display; and
store, based on the video frame data received by the communication interface comprising the i-frame data and the video decoder being in the pre-initialized state, the i-frame data to the memory.
12. The device of claim 11 , wherein the memory implements a pre-initialization buffer, and wherein to store the i-frame data to the memory, the one or more processors are configured to store the i-frame data to the pre-initialization buffer.
13. The device of claim 12 , wherein the pre-initialization buffer includes storage capacity equal to or greater than a maximum i-frame size that the device is configured to support.
14. The device of claim 11 , wherein the device further comprises a display, and wherein the one or more processors are further configured to:
detect that the video decoder has transitioned from the pre-initialized state to an initialized state, the initialized state representing a condition in which the video decoder can render, for display, the video frame data received by the communication interface;
retrieve, based on the detected transition, the stored i-frame data from the memory;
and
cause the video decoder to:
decode the retrieved i-frame data to form decoded i-frame data, and
render the decoded i-frame data for output via the display.
15. The device of claim 14 , wherein the one or more processors are further configured to:
determine that a remote device that generated the received video frame data supports an audio video profile with feedback (AVPF) mode;
determine that a predetermined time has elapsed since the retrieved i-frame was rendered for output via the display device; and
based on the predetermined time having elapsed since the retrieved i-frame data was rendered for display and the remote device supporting the AVPF mode:
generate an i-frame request, and
send, via the communication interface, the generated i-frame request to the remote device.
16. The device of claim 14 , wherein the video decoder is configured to:
use, while in the initialized state, the retrieved i-frame data to decode and render prediction frame data associated with the VT call.
17. The device of claim 11 , wherein the stored i-frame data is first i-frame data associated with a first i-frame, wherein the communication interface is configured to receive, via the communications channel, additional video frame data comprising second i-frame data associated with a second i-frame, and wherein the one or more processors are further configured to:
determine that the second i-frame was generated at a time more recent than the generation time of the first i-frame; and
based on the second i-frame having been generated at a time more recent than a generation time of the first i-frame, overwrite, in the memory, the first i-frame data with the second i-frame data.
18. The device of claim 11 , wherein the one or more processors are further configured to:
determine that a predetermined time has elapsed since the i-frame data was stored to the memory; and
based on the predetermined time having elapsed since the i-frame data was stored to the memory, determine that the stored i-frame data is no longer usable by the video decoder to render the video frame data of the VT call.
19. The device of claim 11 , wherein the one or more processors are further configured to:
initiate, in response to the communication interface receiving the video frame data of the VT call, a decoder initialization process with respect to the video decoder of the device, wherein the decoder initialization process is configured to transition the video decoder from the pre-initialized state to an initialized state, the initialized state representing a condition in which the video decoder can render, for display, the video frame data received by the communication interface.
20. The device of claim 11 , wherein to determine that the video frame data received by the communication interface comprises the i-frame data, the one or more processors are further configured to:
decapsulate a packet that includes the video frame data to obtain a video payload of the packet;
inspect a first byte of the video payload; and
determine, based on the inspected first byte, that the video frame data received by the communication interface comprises the i-frame data.
21. The device of claim 20 , wherein to determine that the received video frame data comprises the i-frame data, the one or more processors are further configured to:
determine, based on the inspected first byte indicating a decimal value of one hundred twenty eight (128), that the received video frame data comprises the i-frame data.
22. An apparatus for processing video frame data of a video telephony (VT) call, the apparatus comprising:
means for receiving the video frame data of the VT call over a communications channel;
means for determining that the received video frame data comprises i-frame data;
means for determining that a video decoder of the apparatus is in a pre-initialized state, the pre-initialized state representing a condition in which the video decoder cannot render the received video frame data for display; and
means for storing, based on the received video frame data comprising the i-frame data and the video decoder being in the pre-initialized state, the i-frame data to a buffer of the apparatus.
23. The apparatus of claim 22 , further comprising:
means for detecting that the video decoder has transitioned from the pre-initialized state to an initialized state, the initialized state representing a condition in which the video decoder can render the received video frame data for display;
means for retrieving, based on the detected transition, the stored i-frame data from the buffer of the apparatus;
means for decoding, while the video decoder is in the initialized state, based on the detected transition, the retrieved i-frame data to form decoded i-frame data; and
means for rendering, while the video decoder is in the initialized state, based on the detected transition, the decoded i-frame data for display.
24. The apparatus of claim 23 , further comprising:
means for determining that a remote device that generated the received video frame data supports an audio video profile with feedback (AVPF) mode;
means for determining that a predetermined time has elapsed since the retrieved i-frame data was rendered for display;
means for generating, based on the predetermined time having elapsed since the retrieved i-frame data was rendered for display and the remote device supporting the AVPF mode, an i-frame request; and
means for sending the generated i-frame request to the remote device over the communications channel.
25. The apparatus of claim 23 , further comprising:
means for using the retrieved i-frame data to decode and render prediction video frame data associated with the VT call while the video decoder is in the initialized state.
26. The apparatus of claim 22 , wherein a storage capacity of the buffer is equal to or greater than a maximum i-frame size that the device is configured to support.
27. A non-transitory computer-readable storage medium encoded with instructions that, when executed, cause one or more processors of a device for processing video frame data of a video telephony (VT) call to:
receive the video frame data of the VT call over a communications channel;
determine that the received video frame data comprises i-frame data;
determine that a video decoder of the device is in a pre-initialized state, the pre-initialized state representing a condition in which the video decoder cannot render the received video frame data for display; and
based on the received video frame data comprising the i-frame data and the video decoder being in the pre-initialized state, store the i-frame data to a buffer of the device.
28. The non-transitory computer-readable storage medium of claim 27 , wherein the stored i-frame data is first i-frame data associated with a first i-frame, further encoded with instructions that, when executed, cause the one or more processors of the device to:
receive, via a communication interface, additional video frame data comprising second i-frame data associated with a second i-frame;
determine that the second i-frame was generated at a time more recent than a generation time of the first i-frame; and
based on the second i-frame having been generated at a time more recent than the generation time of the first i-frame, overwrite, in the buffer of the device, the first i-frame data with the second i-frame data.
29. The non-transitory computer-readable storage medium of claim 27 , further encoded with instructions that, when executed, cause the one or more processors of the device to:
determine that a predetermined time has elapsed since the i-frame data was stored to the memory; and
based on the predetermined time having elapsed since the i-frame data was stored to the buffer, determine that the stored i-frame data is no longer usable by the video decoder to render the video data of the VT call.
30. The non-transitory computer-readable storage medium of claim 27 , further encoded with instructions that, when executed, cause the one or more processors of the device to:
initiate, in response to receiving the video frame data, a decoder initialization process with respect to the video decoder of the device, wherein the decoder initialization process is configured to transition the video decoder from the pre-initialized state to an initialized state, the initialized state representing a condition in which the video decoder can render the received video frame data for display.
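The first-byte inspection recited in claims 9, 10, 20, and 21 can be sketched as follows. The claims do not name a packet format, so the RTP framing used here for decapsulation is an assumption of this sketch, as are the function names `video_payload` and `is_iframe`. Only the comparison against the decimal value 128 is taken from the claims. RTP padding is not handled.

```python
I_FRAME_FIRST_BYTE = 128   # decimal value recited in claims 10 and 21
RTP_FIXED_HEADER_LEN = 12  # assumption: packets use RTP framing


def video_payload(packet: bytes) -> bytes:
    """Decapsulate an (assumed) RTP packet and return its video payload."""
    if len(packet) < RTP_FIXED_HEADER_LEN:
        raise ValueError("packet shorter than the fixed RTP header")
    csrc_count = packet[0] & 0x0F           # CC field: number of CSRC entries
    has_extension = bool(packet[0] & 0x10)  # X bit: header extension present
    offset = RTP_FIXED_HEADER_LEN + 4 * csrc_count
    if has_extension:
        if len(packet) < offset + 4:
            raise ValueError("truncated RTP header extension")
        # Extension length is a 16-bit count of 32-bit words after its 4-byte header.
        ext_words = int.from_bytes(packet[offset + 2:offset + 4], "big")
        offset += 4 + 4 * ext_words
    if len(packet) < offset:
        raise ValueError("truncated RTP header")
    return packet[offset:]


def is_iframe(packet: bytes) -> bool:
    """Return True when the first payload byte has the decimal value 128."""
    payload = video_payload(packet)
    return len(payload) > 0 and payload[0] == I_FRAME_FIRST_BYTE
```

A receiver could call `is_iframe` on each incoming packet and, while the decoder is still in the pre-initialized state, store only the packets for which it returns `True`.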