US12094476B2 · Active · Utility · PatentIndex 56

Systems, methods and apparatus for conversion from channel-based audio to object-based audio

Assignee: DOLBY LABORATORIES LICENSING CORP · Priority: Dec 2, 2019 · Filed: Dec 2, 2020 · Granted: Sep 17, 2024
Est. expiry: Dec 2, 2039 (~13.4 yrs left) · nominal 20-yr term from priority
Inventors: WARD MICHAEL C · SANCHEZ FREDDIE · FERSCH CHRISTOF JOSEPH
Classifications: H04S 2400/11 · H04S 2400/03 · G10L 19/167 · G10L 19/173 · G10L 19/008 · H04S 7/308 · H04S 3/008
PatentIndex Score: 56 · Cited by: 1 · References: 37 · Claims: 30

Abstract

Embodiments are disclosed for channel-based audio (CBA) (e.g., 22.2-ch audio) to object-based audio (OBA) conversion. The conversion includes converting CBA metadata to object audio metadata (OAMD) and reordering the CBA channels based on channel shuffle information derived in accordance with channel ordering constraints of the OAMD. The OBA with reordered channels is rendered in a playback device using the OAMD or in a source device, such as a set-top box or audio/video recorder. In an embodiment, the CBA metadata includes signaling that indicates a specific OAMD representation to be used in the conversion of the metadata. In an embodiment, pre-computed OAMD is transmitted in a native audio bitstream (e.g., AAC) for transmission (e.g., over HDMI) or for rendering in a source device. In an embodiment, pre-computed OAMD is transmitted in a transport layer bitstream (e.g., ISO BMFF, MPEG4 audio bitstream) to a playback device or source device.
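
The reordering step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the channel shuffle information is a permutation list in which entry i names the input channel that should land in output slot i; the patent text here does not specify the encoding, and `reorder_channels` and the sample data are hypothetical names.

```python
from typing import List, Sequence


def reorder_channels(
    channels: Sequence[Sequence[float]], shuffle: Sequence[int]
) -> List[Sequence[float]]:
    """Permute channels so that output[i] is channels[shuffle[i]].

    Assumption: the shuffle information is a full permutation of the
    input channel indices (hypothetical encoding, not from the patent).
    """
    if sorted(shuffle) != list(range(len(channels))):
        raise ValueError("shuffle must be a permutation of the channel indices")
    return [channels[src] for src in shuffle]


# Toy example: four one-sample channels and a made-up shuffle list.
cba = [[0.1], [0.2], [0.3], [0.4]]
shuffle_info = [2, 0, 3, 1]
reordered = reorder_channels(cba, shuffle_info)
```

Validating the permutation before applying it means a malformed shuffle list fails loudly instead of silently dropping or duplicating a channel.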

Claims

exact text as granted — not AI-modified
The invention claimed is: 
     
       1. A method comprising:
 receiving, by one or more processors of an audio processing apparatus, a bitstream including channel-based audio and associated channel-based audio metadata; 
 the one or more processors configured to:
 parse a signaling parameter from the channel-based audio metadata, the signaling parameter indicating one of a plurality of different object audio metadata (OAMD) representations, each one of the OAMD representations mapping one or more audio channels of the channel-based audio to one or more audio objects 
 convert the channel-based metadata into OAMD associated with the one or more audio objects using the OAMD representation that is indicated by the signaling parameter; 
 generate channel shuffle information based on channel ordering constraints of the OAMD; 
 reorder the one or more audio channels of the channel-based audio based on the channel shuffle information to generate reordered, channel-based audio; and 
 render the reordered, channel-based audio into rendered audio using the OAMD; or 
 encode the reordered, channel-based audio and the OAMD into an object-based audio bitstream and transmit the object-based audio bitstream to a playback device or source device. 
 
 
     
     
       2. The method of  claim 1 , wherein the bitstream is a native audio bitstream, and the method further comprises decoding the native audio bitstream to determine the channel-based audio and metadata. 
     
     
       3. The method of  claim 2 , wherein the native audio bitstream is an advanced audio coding (AAC) bitstream. 
     
     
       4. The method of  claim 1 , wherein the channel-based audio and the associated channel-based audio metadata are N.M channel-based audio and channel-based audio metadata associated with the N.M channel-based audio respectively, and wherein N is a positive integer greater than nine and M is a positive integer greater than or equal to zero. 
     
     
       5. The method of  claim 4 , wherein the channel-based audio is 22.2. 
     
     
       6. The method of  claim 1 , wherein the source device is a television set-top box or an audio/video receiver. 
     
     
       7. The method of  claim 1 , further comprising:
 determining a first set of channels of the channel-based audio that are capable of being represented by OAMD bed channels; 
 assigning OAMD bed channel labels to the first set of channels; 
 determining a second set of channels of the channel-based audio that are not capable of being represented by OAMD bed channels; and 
 assigning static OAMD position coordinates to the second set of channels. 
 
     
     
       8. The method of  claim 1 , wherein the OAMD includes dimensional trim data to lower loudness levels of one or more out-of-screen audio objects in the rendered audio. 
     
     
       9. The method of  claim 1 , wherein the OAMD includes object gains that allow for compensation of differences between downmix values of the channel-based audio and rendering of OAMD representations of the channel-based audio. 
     
     
       10. A method comprising:
 receiving, by one or more processors of an audio processing apparatus, a bitstream including channel-based audio and associated channel-based audio metadata; 
 the one or more processors configured to:
 encode the channel-based audio into a native audio bitstream; 
 parse a signaling parameter from the channel-based audio metadata, the signaling parameter indicating one of a plurality of different object audio metadata (OAMD) representations, each one of the OAMD representations mapping one or more audio channels of the channel-based audio to one or more audio objects; 
 convert the channel-based metadata into OAMD associated with the one or more audio objects using the OAMD representation that is indicated by the signaling parameter; 
 generate channel shuffle information based on channel ordering constraints of the OAMD; 
 generate a bitstream package that includes the native audio bitstream, the channel shuffle information and the OAMD, the channel shuffle information enabling reordering the one or more audio channels of the channel-based audio based on the channel shuffle information at a playback device or source device to generate reordered, channel based audio; 
 multiplex the bitstream package into a transport layer bitstream; and 
 transmit the transport layer bitstream to the playback device or the source device. 
 
 
     
     
       11. The method of  claim 10 , wherein the native audio bitstream is an advanced audio coding (AAC) bitstream. 
     
     
       12. The method of  claim 10 , wherein the channel-based audio and the associated channel-based audio metadata are N.M channel-based audio and channel-based audio metadata associated with the N.M channel-based audio, respectively, and wherein N is a positive integer greater than seven and M is a positive integer greater than or equal to zero. 
     
     
       13. The method of  claim 12 , wherein the channel-based audio is 22.2. 
     
     
       14. The method of  claim 10 , wherein the source device is a television set-top box or an audio/video receiver. 
     
     
       15. The method of  claim 10 , wherein channels in the channel-based audio that can be represented by OAMD bed channel labels use the OAMD bed channel labels, and channels in the channel-based audio that cannot be represented by OAMD bed channel labels use static object positions, where each static object position is described in OAMD position coordinates. 
     
     
       16. The method of  claim 10 , wherein the OAMD includes dimensional trim data to lower loudness levels of one or more out-of-screen audio objects in the rendered audio. 
     
     
       17. The method of  claim 10 , wherein the OAMD includes object gains that allow for compensation of differences between downmix values of the channel-based audio and rendering of OAMD representations of the channel-based audio. 
     
     
       18. The method of  claim 10 , wherein the transport bitstream is a moving pictures experts group (MPEG) audio bitstream that includes a signal that indicates the presence of OAMD in an extension field of the MPEG audio bitstream. 
     
     
       19. The method of  claim 18 , wherein the signal that indicates the presence of OAMD in the MPEG audio bitstream is included in a reserved field of metadata in the MPEG audio bitstream for signaling a surround sound mode. 
     
     
       20. A method comprising:
 receiving, by one or more processors of an audio processing apparatus, a transport layer bitstream including a bitstream package, the bitstream package comprising a native audio bitstream comprising encoded channel-based audio, channel shuffle information and object audio metadata (OAMD); 
 the one or more processors configured to:
 demultiplex the transport layer bitstream to determine the bitstream package; 
 decode the bitstream package to determine the channel-based audio, the channel shuffle information and the object audio metadata (OAMD); 
 reorder the audio channels of the channel-based audio based on the channel shuffle information to generate reordered, channel based audio; and 
 render the reordered, channel-based audio into rendered audio using the OAMD; 
 
 or
 encode the reordered, channel-based audio and the OAMD into an object-based audio bitstream and transmit the object-based audio bitstream to a source device. 
 
 
     
     
       21. The method of  claim 20 , wherein the native audio bitstream is an advanced audio coding (AAC) bitstream. 
     
     
       22. The method of  claim 20 , wherein the channel-based audio is N.M channel-based audio, and wherein N is a positive integer greater than seven and M is a positive integer greater than or equal to zero. 
     
     
       23. The method of  claim 22 , wherein the channel-based audio is 22.2. 
     
     
       24. The method of  claim 20 , further comprising:
 determining a first set of channels of the channel-based audio that are capable of being represented by OAMD bed channels; 
 assigning OAMD bed channel labels to the first set of channels; 
 determining a second set of channels of the channel-based audio that are not capable of being represented by OAMD bed channels; and 
 assigning static OAMD position coordinates to the second set of channels. 
 
     
     
       25. The method of  claim 20 , wherein the OAMD includes dimensional trim data to lower loudness levels of one or more out-of-screen objects in the rendered audio. 
     
     
       26. The method of  claim 20 , wherein the OAMD includes object gains that allow for compensation of differences between downmix values of the channel-based audio and rendering of OAMD representations of the channel-based audio. 
     
     
       27. The method of  claim 20 , wherein the transport bitstream is an moving pictures experts group (MPEG) audio bitstream that includes a signal that indicates the presence of OAMD in an extension field of the MPEG audio bitstream. 
     
     
       28. The method of  claim 20 , wherein the signal that indicates the presence of OAMD in the MPEG audio bitstream is included in a reserved field of a data structure in metadata of the MPEG audio bitstream for signaling a surround sound mode. 
     
     
       29. An apparatus comprising:
 one or more processors; and 
 a non-transitory, computer-readable storage medium having instructions stored thereon that when executed by the one or more processors, cause the one or more processors to perform the method of  claim 1 . 
 
     
     
       30. A non-transitory, computer-readable storage medium having instructions stored thereon that when executed by one or more processors, cause the one or more processors to perform the method of  claim 1 .
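
The bed-versus-static-object split recited in claims 7, 15 and 24 might be sketched as follows. Everything concrete in this sketch is an assumption made for illustration: the `BED_LABELS` set, the channel names, the coordinate values, and the "bed channels first" ordering constraint are hypothetical and are not taken from the patent or from any OAMD specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple

Position = Tuple[float, float, float]

# Hypothetical set of channel labels that a bed can represent.
BED_LABELS = {"L", "R", "C", "LFE", "Ls", "Rs"}


@dataclass
class OamdEntry:
    channel: str
    bed_label: Optional[str] = None      # set for bed-representable channels
    position: Optional[Position] = None  # set for static-object channels


def build_oamd(
    channel_names: Sequence[str], static_positions: Dict[str, Position]
) -> List[OamdEntry]:
    """Give each channel a bed label if one exists, else a static position."""
    entries = []
    for name in channel_names:
        if name in BED_LABELS:
            entries.append(OamdEntry(name, bed_label=name))
        else:
            entries.append(OamdEntry(name, position=static_positions[name]))
    return entries


def shuffle_beds_first(entries: Sequence[OamdEntry]) -> List[int]:
    """Derive a shuffle list under an assumed 'beds before objects' constraint.

    sorted() is stable, so relative order within each group is preserved.
    """
    return sorted(range(len(entries)), key=lambda i: entries[i].bed_label is None)


# Toy input: made-up channel names and coordinates.
names = ["L", "TpFL", "R", "TpFR", "C"]
positions = {"TpFL": (0.0, 0.0, 1.0), "TpFR": (1.0, 0.0, 1.0)}
oamd = build_oamd(names, positions)
shuffle = shuffle_beds_first(oamd)
```

With this toy input, the three bed-labelled channels move to the front and the two static-object channels follow, each group keeping its original relative order.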
