Disclosed are a multichannel signal encoding method, an encoding apparatus that performs the encoding method, a multichannel signal processing method, and a decoding apparatus that performs a decoding method. The decoding method comprises the steps of: identifying an N/2-channel downmix signal derived from an N-channel input signal; and generating an N-channel output signal from the identified N/2-channel downmix signal by using a plurality of OTT boxes. If the output signal contains no LFE channel, the number of OTT boxes may be equal to N/2, the number of channels of the downmix signal.
Description
Multi-channel signal processing method and multi-channel signal processing device performing the method
The present invention relates to a multi-channel signal processing method and a multi-channel signal processing apparatus for performing the method, and more particularly to a method and apparatus capable of compressing a multi-channel signal without degrading sound quality even when the number of channels of the multi-channel signal increases.
MPS (MPEG Surround) is a codec for coding multi-channel signals such as 5.1-channel and 7.1-channel signals. With MPS, a multi-channel signal can be compressed at a high compression ratio and transmitted.
However, MPS is constrained by backward compatibility in the encoding and decoding process. That is, the bitstream of a multi-channel signal generated through MPS must remain backward compatible so that an existing codec can reproduce it in mono or stereo form.
Therefore, even if a multi-channel signal having more channels than the number defined in MPS is input to MPS, the signal output from MPS and transmitted must still be expressed in mono or stereo. The decoder can then restore the multi-channel signal from the bitstream using the side information received from the encoder. In this case, the decoder restores the multi-channel signal using side information for upmixing.
Recently, however, as the communication environment has improved and transmission bandwidth has increased, the bandwidth allocated to a signal has also increased. For this reason, technology is developing toward preserving the sound quality of the original multi-channel signal rather than compressing it excessively to fit the bandwidth. Even so, a multi-channel signal with a very large number of channels still needs to be compressed for transmission.
Therefore, when processing an input signal having more channels than the number defined in the MPS standard, there is a demand for a method that reduces the amount of data through at least a certain level of compression while maintaining the quality of the multi-channel signal.
The present invention provides a method and apparatus for processing a multi-channel signal through an N-N/2-N structure.
A multi-channel signal processing method according to an embodiment of the present invention comprises: identifying an N/2-channel downmix signal derived from an N-channel input signal; and generating an N-channel output signal from the identified N/2-channel downmix signal using a plurality of OTT boxes, wherein, when the output signal contains no LFE channel, the number of OTT boxes may be equal to N/2, the number of channels of the downmix signal.
Each of the plurality of OTT boxes may generate a two-channel output signal using a one-channel downmix signal and a decorrelated signal generated by the decorrelator corresponding to that OTT box.
When the number N of channels of the output signal exceeds a preset number of channels M, the decorrelators may include a first decorrelator corresponding to channels up to M and a second decorrelator corresponding to channels beyond M, and the second decorrelator may reuse a filter set of the first decorrelator.
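The filter-set reuse described above can be pictured as an index mapping. The following is a minimal sketch only: the text states that the second decorrelator may reuse a filter set of the first, but not which one, so the modulo rule and the function names below are assumptions made for illustration.

```python
def decorrelator_filter_index(decorrelator_index, m):
    """Return the index of the filter set used by one decorrelator.

    decorrelator_index is 0-based. Indices below m stand for the "first"
    decorrelators, each with its own filter set. Indices at or above m stand
    for the "second" decorrelators, which reuse one of those m filter sets.
    The modulo mapping is a hypothetical choice, not taken from the source.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    if decorrelator_index < 0:
        raise ValueError("decorrelator_index must be non-negative")
    return decorrelator_index % m


def assign_filter_sets(num_decorrelators, m):
    """List the filter-set index for every decorrelator, in order."""
    return [decorrelator_filter_index(i, m) for i in range(num_decorrelators)]
```

With 12 decorrelators and M = 10, for example, the last two decorrelators would fall back to filter sets 0 and 1 under this assumed rule, so no new filter sets need to be designed for them.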
Among the plurality of OTT boxes, an OTT box whose output is an LFE channel may generate its two-channel output signal without using a decorrelated signal.
When a transmitted residual signal exists, each of the plurality of OTT boxes may generate a two-channel output signal using the residual signal, instead of the decorrelated signal, together with the one-channel downmix signal.
The generating of the N-channel output signal may use a pre-decorrelator matrix M1 and a mix matrix M2 to generate the N-channel output signal.
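The roles of the two matrices can be sketched for a single sample as follows. This is an illustration under stated assumptions, not the method defined by the source: the matrix shapes (M1 as N x N/2, M2 as N x N), the way the intermediate vector is split into direct and decorrelator inputs, and the `decorrelate` callable are all hypothetical, and a real decorrelator would be a filter with memory rather than a per-sample function.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def synthesize(m1, m2, downmix, decorrelate):
    """One-sample sketch of y = M2 * w, with w derived from v = M1 * x.

    downmix     : the N/2 input values x, one per downmix channel
    m1          : pre-decorrelator matrix, assumed shape N x N/2
    m2          : mix matrix, assumed shape N x N
    decorrelate : placeholder callable (index, value) -> value
    """
    half = len(downmix)
    v = mat_vec(m1, downmix)
    # Assumed layout: first half feeds the mix directly, second half
    # feeds the decorrelators.
    direct, decorr_in = v[:half], v[half:]
    w = direct + [decorrelate(i, s) for i, s in enumerate(decorr_in)]
    return mat_vec(m2, w)
```

In a toy case with two downmix channels, M1 can simply stack two identity matrices so that every downmix channel is passed both to the mix and to its decorrelator, and M2 then combines each direct signal with its decorrelated counterpart into two output channels.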
Each of the plurality of OTT boxes may use a channel level difference (CLD) in generating the N-channel output signal.
The number N of channels of the output signal may be an even number from 10 to 32.
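The channel-count constraints stated so far can be checked in a few lines. The sketch below assumes a hypothetical helper name and covers only the case the text specifies, namely an output signal without an LFE channel; the count for the LFE case is not given in this passage, so the helper declines to guess it.

```python
def num_ott_boxes(n, has_lfe=False):
    """Return the number of OTT boxes for an N-N/2-N configuration.

    Only the case without an LFE channel is stated in the text: the number
    of OTT boxes then equals N/2, the number of downmix channels.
    """
    if n % 2 != 0 or not 10 <= n <= 32:
        raise ValueError("N is expected to be an even number from 10 to 32")
    if has_lfe:
        # Not specified in this passage, so no value is invented here.
        raise NotImplementedError("OTT box count with an LFE channel is not specified")
    return n // 2
```

For a 24-channel output without an LFE channel, this gives 12 OTT boxes, matching the 12-channel downmix.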
A multi-channel signal processing method according to another embodiment of the present invention comprises: decoding an N/2-channel downmix signal encoded according to a first coding scheme; and generating an N-channel output signal from the N/2-channel downmix signal according to a second coding scheme, wherein, when the output signal does not include an LFE channel, the second coding scheme may use a number of one-to-two (OTT) boxes equal to N/2, the number of channels of the downmix signal.
A multi-channel signal processing apparatus according to an embodiment of the present invention includes a processor that executes a multi-channel signal processing method, wherein the processor identifies an N/2-channel downmix signal derived from an N-channel input signal and generates an N-channel output signal from the identified N/2-channel downmix signal using a plurality of OTT boxes, and wherein, when the output signal contains no LFE channel, the number of OTT boxes may be equal to N/2, the number of channels of the downmix signal.
Each of the plurality of OTT boxes may generate a two-channel output signal using a one-channel downmix signal and a decorrelated signal generated by the decorrelator corresponding to that OTT box.
When the number N of channels of the output signal exceeds a preset number of channels M, the decorrelators may include a first decorrelator corresponding to channels up to M and a second decorrelator corresponding to channels beyond M, and the second decorrelator may reuse a filter set of the first decorrelator.
Among the plurality of OTT boxes, an OTT box whose output is an LFE channel may generate its two-channel output signal without using a decorrelated signal.
When a transmitted residual signal exists, each of the plurality of OTT boxes may generate a two-channel output signal using the residual signal, instead of the decorrelated signal, together with the one-channel downmix signal.
The processor may generate the N-channel output signal using a pre-decorrelator matrix M1 and a mix matrix M2.
Each of the plurality of OTT boxes may use a channel level difference (CLD) in generating the N-channel output signal.
The number N of channels of the output signal may be an even number from 10 to 32.
A multi-channel signal processing apparatus according to another embodiment of the present invention includes a processor that executes a multi-channel signal processing method, wherein the processor decodes an N/2-channel downmix signal encoded according to a first coding scheme and generates an N-channel output signal from the N/2-channel downmix signal according to a second coding scheme, and wherein, when the output signal does not include an LFE channel, the second coding scheme may use a number of one-to-two (OTT) boxes equal to N/2, the number of channels of the downmix signal.
According to an embodiment of the present invention, by processing a multi-channel signal according to the N-N/2-N structure, a multi-channel signal having more channels than the number defined in MPS can be processed efficiently.
FIG. 1 is a diagram illustrating an encoding apparatus and a decoding apparatus according to an embodiment.
FIG. 2 is a diagram illustrating detailed components of an encoding apparatus according to an embodiment.
FIG. 3 is a diagram illustrating detailed components of an encoding apparatus according to another embodiment.
FIG. 4 is a diagram for describing an operation of a first encoding unit according to an embodiment.
FIG. 5 is a diagram illustrating detailed components of a decoding apparatus according to an embodiment.
FIG. 6 is a diagram illustrating detailed components of a decoding apparatus according to another embodiment.
FIG. 7 is a diagram for describing an operation of a second decoding unit according to an embodiment.
FIG. 8 is a diagram illustrating a spatial audio processing procedure for an N-N/2-N structure according to an embodiment.
FIG. 9 is a diagram illustrating a tree structure that performs spatial audio processing for the N-N/2-N structure according to an embodiment.
FIG. 10 is a diagram illustrating a process of generating a 24-channel output signal from a 12-channel downmix according to an embodiment.
FIG. 11 is a diagram expressing the process of FIG. 10 in terms of OTT boxes according to an embodiment.
FIG. 12 is a diagram expressing the process of FIG. 11 according to the MPS standard according to an embodiment.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. The following describes a process in which an MPS encoder generates an N/2-channel downmix signal from an N-channel input signal, and an MPS decoder generates an N-channel output signal using the N/2-channel downmix signal. Here, N/2 channels denotes a number of channels greater than the number defined in the existing MPS standard. For example, the MPS decoder according to an embodiment of the present invention may conform to the MPS standard as extended for the MPEG-H 3D Audio standard.
In the present invention, the encoding apparatus and the decoding apparatus each correspond to a multi-channel signal processing apparatus.
FIG. 1 is a diagram illustrating an encoding apparatus and a decoding apparatus according to an embodiment.
According to an embodiment of the present invention, the encoding apparatus 100 may downmix an N-channel input signal to generate an N/2-channel downmix signal. The decoding apparatus 101 may then generate an N-channel output signal using the N/2-channel downmix signal. Here, N may be 10 or more.
FIG. 2 is a diagram illustrating detailed components of an encoding apparatus according to an embodiment.
Referring to FIG. 2, the encoding apparatus may include a first encoding unit 201, a sampling rate converter 202, and a second encoding unit 203. The first encoding unit 201 is defined as an MPS encoder, and the second encoding unit 203 is defined as a USAC (Unified Speech and Audio Codec) encoder. That is, the first encoding unit 201 may downmix an N-channel input signal to generate an N/2-channel downmix signal.
The sampling rate converter 202 may then convert the sampling rate of the N/2-channel downmix signal. The sampling rate converter 202 may perform downsampling based on the bitrate allocated to the USAC encoder, that is, the second encoding unit 203. If a sufficiently high bitrate is allocated to the USAC encoder, the sampling rate converter 202 may be bypassed.
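The bypass decision of the sampling rate converter can be sketched as a simple rule. The text does not say what counts as a sufficiently high bitrate or which reduced rate is used, so the threshold, the reduced rate, and the function name below are all hypothetical values chosen only to make the example concrete.

```python
def select_core_sampling_rate(input_rate_hz, bitrate_bps,
                              bypass_bitrate_bps=512_000,
                              reduced_rate_hz=24_000):
    """Pick the sampling rate handed to the core (USAC) encoder.

    If the allocated bitrate reaches the assumed bypass threshold, the
    converter is bypassed and the input rate is kept. Otherwise the signal
    is downsampled to the assumed reduced rate, but never to a rate above
    its original one.
    """
    if input_rate_hz <= 0 or bitrate_bps <= 0:
        raise ValueError("rates must be positive")
    if bitrate_bps >= bypass_bitrate_bps:
        return input_rate_hz          # converter bypassed
    return min(input_rate_hz, reduced_rate_hz)
```

A decoder-side converter would need to know which branch was taken in order to restore the original rate, which is one reason the choice would have to be signalled or fixed by configuration in a real system.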
Thereafter, the second encoding unit 203 may encode the core band of the N/2-channel downmix signal whose sampling rate has been converted. The second encoding unit 203 thereby produces an encoded N/2-channel downmix signal. The encoded N/2-channel downmix signal may also be an M-channel signal, where M is less than or equal to N/2. Here, when the frequency band is extended through the SBR (Spectral Band Replication) applied in the USAC encoder, the core band means the low-frequency band to which the band extension has not been applied.
According to the existing MPS standard, the number of channels of the downmix signal output by the MPS encoder corresponding to the first encoding unit 201 is limited to 1 channel, 2 channels, or 5.1 channels. However, the first encoding unit 201 according to an embodiment of the present invention may output a downmix signal with more channels than the MPS standard defines. That is, the first encoding unit 201 may downmix an N-channel input signal to generate an N/2-channel downmix signal. Here, the number of channels N/2 of the downmix signal may be 1, 2, 5.1, or more than 5.1.
FIG. 3 is a diagram illustrating detailed components of an encoding apparatus according to another embodiment.
FIG. 3 shows an embodiment that has the same components as those described with reference to FIG. 2 but arranges them in a different order. Specifically, FIG. 2 shows an embodiment in which the sampling rate converter 202 is located between the first encoding unit 201 and the second encoding unit 203, whereas FIG. 3 shows an embodiment in which the first encoding unit 302 and the second encoding unit 303 are placed after the sampling rate converter 301.
FIG. 4 is a diagram for describing an operation of a first encoding unit according to an embodiment.
FIG. 4 illustrates a process of generating an N/2-channel downmix signal from an N-channel input signal. Referring to FIG. 4, the first encoding unit 401 may include a plurality of TTO boxes 402. Each of the TTO boxes 402 may downmix a two-channel input signal and output a one-channel downmix signal. That is, to downmix the N-channel input signal shown in FIG. 4 into an N/2-channel downmix signal, the first encoding unit 401 may include N/2 TTO boxes 402.
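The arrangement of N/2 TTO boxes can be sketched as pairwise downmixing. The example below is a simplified illustration under assumptions that the passage does not state: adjacent channels are paired, each pair is averaged, and a channel level difference in dB is computed per box from the channel energies. The function names are hypothetical.

```python
import math


def tto_box(ch_a, ch_b):
    """Downmix two channels (equal-length sample lists) into one.

    Returns the one-channel downmix and a channel level difference in dB.
    Averaging the pair and the energy-ratio CLD are assumed simplifications.
    """
    downmix = [0.5 * (a + b) for a, b in zip(ch_a, ch_b)]
    eps = 1e-12  # avoids division by zero and log of zero for silent channels
    energy_a = sum(a * a for a in ch_a)
    energy_b = sum(b * b for b in ch_b)
    cld_db = 10.0 * math.log10((energy_a + eps) / (energy_b + eps))
    return downmix, cld_db


def downmix_n_to_half(channels):
    """Apply N/2 TTO boxes to N channels, pairing adjacent channels."""
    if len(channels) % 2 != 0:
        raise ValueError("the number of input channels N must be even")
    results = [tto_box(a, b) for a, b in zip(channels[0::2], channels[1::2])]
    downmix = [d for d, _ in results]
    clds_db = [c for _, c in results]
    return downmix, clds_db
```

Ten input channels thus yield five downmix channels and five per-box parameters, which is the N to N/2 relationship described above.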
If the first encoding unit 401 followed the existing MPS standard, the downmix signal it generates could only have 1 channel, 2 channels, or 5.1 channels. According to an embodiment of the present invention, however, the first encoding unit 401 may generate an N/2-channel downmix signal from an N-channel input signal according to MPS. Here, N/2 may be not only 1, 2, or 5.1 channels but also more than 5.1 channels. When N is larger than the number of channels defined in MPS, the first encoding unit 401 needs to consider additional syntax for controlling MPS. For example, the first encoding unit 401 may define the additional syntax for controlling MPS by using a coding mode based on an arbitrary tree.
FIG. 5 is a diagram illustrating detailed components of a decoding apparatus according to an embodiment.
FIG. 5 illustrates a process of generating an N-channel output signal from an N/2-channel downmix signal. Referring to FIG. 5, the decoding apparatus may include a first decoding unit 501, a sampling rate converter 502, and a second decoding unit 503. The first decoding unit 501 may decode the encoded N/2-channel downmix signal to restore the N/2-channel downmix signal. Here, the first decoding unit 501 may be defined as a USAC decoder.
The sampling rate converter 502 may convert the sampling rate of the N/2-channel downmix signal. Specifically, the sampling rate converter 502 may convert an audio signal whose sampling rate was changed in the encoding apparatus back to its original sampling rate. In other words, the sampling rate converter 502 operates when sampling rate conversion was performed in FIG. 2 or FIG. 3. If no sampling rate conversion was performed in FIG. 2 or FIG. 3, the sampling rate converter 502 does not operate and may be bypassed.
Meanwhile, the second decoding unit 503 may upmix the N/2-channel downmix signal output from the sampling rate converter 502 to generate an N-channel output signal.
The downmix signal input to a conventional MPS decoder is limited to 1 channel, 2 channels, or 5.1 channels. However, the downmix signal input to the second decoding unit 503 according to an embodiment of the present invention may be extended to N/2 channels in addition to 1, 2, and 5.1 channels. The second decoding unit 503 may then upmix the N/2-channel downmix signal to generate an N-channel output signal. Here, since the N/2-channel downmix signal input to the second decoding unit 503 has at least 5.1 channels, N may be 10.2 channels or more.
FIG. 6 is a diagram illustrating detailed components of a decoding apparatus according to another embodiment.
Unlike FIG. 5, the decoding apparatus of FIG. 6 processes the audio signal in the order of the first decoding unit 601, the second decoding unit 602, and the sampling rate converter 603. The first decoding unit 601 may restore the N/2-channel downmix signal. The second decoding unit 602 may then upmix the N/2-channel downmix signal to generate an N-channel output signal. Thereafter, the sampling rate converter 603 may convert the sampling rate of the N-channel output signal generated by the second decoding unit 602.
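The difference between the two decoder arrangements comes down to the order in which the same three stages are composed. The sketch below uses stub stages (placeholders, not real USAC or MPS processing) to show one practical consequence: in the FIG. 5 order the sampling rate converter handles N/2 channels, whereas in the FIG. 6 order it handles N channels. All function names are hypothetical.

```python
def decode_fig5(bitstream, core_decode, resample, upmix):
    """FIG. 5 order: core decoding, sampling rate conversion, then upmixing."""
    return upmix(resample(core_decode(bitstream)))


def decode_fig6(bitstream, core_decode, resample, upmix):
    """FIG. 6 order: core decoding, upmixing, then sampling rate conversion."""
    return resample(upmix(core_decode(bitstream)))


def count_resampled_channels(decode, n_half):
    """Run one decode order with stub stages.

    Returns (channels seen by the sampling rate converter, output channels).
    """
    seen = []

    def core_decode(_bitstream):
        return [[0.0] for _ in range(n_half)]   # N/2 placeholder channels

    def resample(channels):
        seen.append(len(channels))
        return channels                          # stub: no real conversion

    def upmix(channels):
        # Stub: each downmix channel becomes two output channels.
        return [ch for ch in channels for _ in range(2)]

    output = decode(b"", core_decode, resample, upmix)
    return seen[0], len(output)
```

With a 12-channel downmix, both orders produce 24 output channels, but the converter processes 12 channels in the first order and 24 in the second.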
FIG. 7 is a diagram for describing an operation of a second decoding unit according to an embodiment.
The second decoding unit 701, described above with reference to FIGS. 5 and 6, may upmix the N/2-channel downmix signal to generate an N-channel output signal. The second decoding unit 701 may include a plurality of OTT boxes 702. Each OTT box 702 may upmix a one-channel downmix signal to generate a two-channel output signal in stereo form.
Accordingly, to generate an N-channel output signal by upmixing the N/2-channel downmix signal, the second decoding unit 701 may include N/2 OTT boxes 702.
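A bank of N/2 OTT boxes can be sketched as follows, using the channel level difference (CLD) that the summary names as an OTT parameter. This is a simplified illustration, not the decoder defined by the source or by the MPS standard: the CLD-to-gain conversion shown is a common energy-preserving form, the auxiliary signal stands for either a decorrelated or a residual signal and is simply added to one output and subtracted from the other, and inter-channel correlation parameters are ignored. The function names are hypothetical.

```python
import math


def cld_to_gains(cld_db):
    """Convert a CLD in dB to two gains whose squares sum to one."""
    ratio = 10.0 ** (cld_db / 10.0)
    return math.sqrt(ratio / (1.0 + ratio)), math.sqrt(1.0 / (1.0 + ratio))


def ott_box(downmix, cld_db, aux=None):
    """Upmix one channel to two.

    aux is an optional decorrelated or residual signal. With aux=None the
    box works from the downmix alone, as described for an LFE output.
    """
    g1, g2 = cld_to_gains(cld_db)
    if aux is None:
        aux = [0.0] * len(downmix)
    first = [g1 * m + a for m, a in zip(downmix, aux)]
    second = [g2 * m - a for m, a in zip(downmix, aux)]
    return first, second


def upmix_half_to_n(downmix_channels, clds_db, aux_signals=None):
    """Apply one OTT box per downmix channel: N/2 inputs, N outputs."""
    if len(downmix_channels) != len(clds_db):
        raise ValueError("one CLD value is needed per OTT box")
    if aux_signals is None:
        aux_signals = [None] * len(downmix_channels)
    output = []
    for dmx, cld, aux in zip(downmix_channels, clds_db, aux_signals):
        output.extend(ott_box(dmx, cld, aux))
    return output
```

Because exactly one OTT box is applied per downmix channel, the output always has twice as many channels as the input, which is the N/2 to N relationship stated above.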
If the second decoding unit 701 followed the existing MPS standard, the downmix signal that it can accept and process would be limited to 1 channel, 2 channels, or 5.1 channels. According to an embodiment of the present invention, however, the second decoding unit 701 may generate an N-channel output signal from an N/2-channel downmix signal according to MPS. Here, N may be 10.2 or more.
In this case, the second decoding unit 701 needs to consider additional syntax for controlling MPS. For example, the second decoding unit 701 may define the additional syntax for controlling MPS by using a coding mode based on an arbitrary tree.
The MPS decoder described with reference to FIGS. 8 to 12 relates to the second decoding unit 503 of FIG. 5 and the second decoding unit 602 of FIG. 6.
FIG. 8 illustrates a process of processing a multi-channel signal according to an N-N/2-N configuration.
FIG. 8 shows an N-N/2-N structure obtained by modifying the structure defined in MPEG Surround. In MPEG Surround, spatial synthesis may be performed in the decoder as shown in Table 1. In spatial synthesis, the input signals are transformed from the time domain into a non-uniform subband domain by a hybrid QMF (Quadrature Mirror Filter) analysis bank. Here, non-uniform corresponds to hybrid.
The decoder then operates in the hybrid subband domain. The decoder may generate output signals from the input signals by performing spatial synthesis based on the spatial parameters delivered from the encoder. The decoder may then use a hybrid QMF synthesis bank to transform the output signals from the hybrid subband domain back into the time domain.
FIG. 8 describes how the spatial synthesis performed by the decoder processes a multi-channel signal through mixing matrices. MPEG Surround basically defines a 5-1-5 structure, a 5-2-5 structure, a 7-2-7 structure, and a 7-5-7 structure, whereas the present invention proposes an N-N/2-N structure.
In the N-N/2-N structure, an N-channel input signal is converted into an N/2-channel downmix signal, and an N-channel output signal is then generated from the N/2-channel downmix signal. The decoder according to an embodiment of the present invention may generate the N-channel output signal by upmixing the N/2-channel downmix signal. In principle, the number of channels N in the N-N/2-N structure of the present invention is not limited. That is, the N-N/2-N structure may support not only channel structures supported by MPS but also channel structures of multichannel signals that MPS does not support.
In FIG. 8, N/2 denotes the number of channels of the downmix signal derived through MPS. NumInCh denotes the number of channels of the downmix signal, and NumOutCh denotes the number of channels of the output signal. That is, NumInCh is N/2 and NumOutCh is N.
In FIG. 8, the N/2-channel downmix signals (X_0 to X_(NumInCh-1)) and the residual signals (res) form the input vector X. Since NumInCh is N/2, X_0 to X_(NumInCh-1) denote the N/2-channel downmix signal. Since the number of OTT (One-To-Two) boxes is N/2, N, the number of channels of the output signal, must be even in order to process the N/2-channel downmix signal. According to an embodiment of the present invention, N may range from 10 to 32.
In FIG. 8, the decorrelators, decorrelated signals, and residual signals labeled from 1 to M (M = NumInCh - NumLfe) correspond to different OTT boxes. The reconstruction process for a multichannel signal to which the N-N/2-N structure is applied can be visualized as a tree structure.
Because the number of available decorrelators needs to be limited to a specific number (e.g., 10) in order to guarantee the orthogonality of the decorrelator output signals, some decorrelator indices may be repeated when N is 20. Therefore, according to a preferred embodiment of the present invention, N, the number of channels of the output signal in the N-N/2-N structure, needs to be less than twice the limited number (e.g., N < 20). If the output signal includes LFE channels, N needs to be smaller than a count slightly larger than twice the limited number, taking the number of LFE channels into account (e.g., N < 24).
The outputs of the decorrelators may be replaced by residual signals for a specific frequency region, depending on the bitstream. If an LFE channel is one of the outputs of an OTT box, no decorrelator is used for that upmix-based OTT box.
Here, d_1 to d_M denote the decorrelated signals output by the decorrelators D_1 to D_M, and res_1 to res_M denote the residual signals associated with the same OTT boxes.
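The channel bookkeeping described for FIG. 8 can be summarized in a short sketch. The Python function below is a hypothetical helper and not part of the MPS syntax: the name `nn2n_counts` and the key `numDecorrelators` are assumptions made for illustration. It assumes, as stated above, that N is even, that NumInCh = N/2, that one OTT box serves each downmix channel, and that M = NumInCh - NumLfe decorrelators are needed.

```python
def nn2n_counts(n_out: int, num_lfe: int = 0) -> dict:
    """Derive channel and box counts for an N-N/2-N configuration (illustrative only)."""
    if n_out <= 0 or n_out % 2 != 0:
        # N/2 OTT boxes each produce two channels, so N has to be even.
        raise ValueError("N must be a positive even number")
    num_in_ch = n_out // 2
    if not 0 <= num_lfe <= num_in_ch:
        raise ValueError("NumLfe must lie between 0 and NumInCh")
    return {
        "NumInCh": num_in_ch,                     # downmix channels
        "NumOutCh": n_out,                        # output channels
        "numOttBoxes": num_in_ch,                 # one OTT box per downmix channel
        "numDecorrelators": num_in_ch - num_lfe,  # M = NumInCh - NumLfe (assumed key name)
    }


if __name__ == "__main__":
    print(nn2n_counts(24, num_lfe=2))
```

For a 24-channel output with two LFE channels, this gives 12 downmix channels, 12 OTT boxes, and 10 decorrelators.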
(1) When a temporal shaping tool is not used
(2) When a temporal shaping tool is used
<When STP is used>
To synthesize the degree of decorrelation between the channels of the output signal, a diffuse signal is generated by the decorrelators for spatial synthesis. The generated diffuse signal may be mixed with the direct signal. In general, the temporal envelope of the diffuse signal does not match the envelope of the direct signal.
In this case, subband-domain temporal processing is used to shape the envelope of the diffuse signal portion of each output signal so that it matches the temporal shape of the downmix signal transmitted from the encoder. Such processing may be implemented by envelope estimation, such as calculating the envelope ratio between the direct signal and the diffuse signal, or by shaping the upper spectral part of the diffuse signal.
That is, the temporal energy envelopes of the portion corresponding to the direct signal and the portion corresponding to the diffuse signal can be estimated in the output signal generated by upmixing. The shaping factor may be calculated as the ratio between the temporal energy envelope of the direct signal portion and that of the diffuse signal portion.
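The ratio-based shaping factor can be illustrated with a minimal Python sketch. It is not the procedure of Equations 8 and 9. The function names, the square root that turns an energy ratio into an amplitude gain, and the small constant `eps` are all assumptions made for this example. Each signal is given as a list of time slots, and each time slot holds the subband samples of that slot.

```python
import math


def slot_energies(slots):
    """Return the broadband energy of each time slot (sum of |x|^2 over its subbands)."""
    return [sum(abs(x) ** 2 for x in slot) for slot in slots]


def shaping_factors(direct, diffuse, eps=1e-12):
    """Per-slot amplitude gain from the ratio of direct to diffuse envelope energy.

    eps (an assumption) only guards against division by zero in silent slots.
    """
    if len(direct) != len(diffuse):
        raise ValueError("direct and diffuse parts need the same number of time slots")
    e_direct = slot_energies(direct)
    e_diffuse = slot_energies(diffuse)
    return [math.sqrt(ed / (ef + eps)) for ed, ef in zip(e_direct, e_diffuse)]


def shape_diffuse(diffuse, factors):
    """Scale every sample of each diffuse time slot by that slot's shaping factor."""
    return [[g * x for x in slot] for g, slot in zip(factors, diffuse)]
```

After shaping, the per-slot energy of the diffuse part follows the per-slot energy of the direct part, which is the matching of temporal envelopes that the text describes.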
Meanwhile, to reduce the need for delay alignment of the transmitted original downmix signal with respect to the spatial upmix used to generate the output signal, the downmix of the spatial upmix may be calculated as an approximation of the transmitted original downmix signal.
For the N-N/2-N structure, the direct downmix signal for (NumInCh - NumLfe) may be defined by Equation 8 below.
The broadband envelopes of the downmix and the envelopes of the diffuse signal portion of each upmix channel may be estimated according to Equation 9 below, using the normalized direct energy.
<When GES is used>
When temporal shaping is performed on the diffuse signal portion of the output signal as described above, characteristic distortion may occur. Guided Envelope Shaping (GES) can improve temporal and spatial quality while resolving this distortion problem. The decoder processes the direct signal portion and the diffuse signal portion of the output signal separately, and when GES is applied, only the direct signal portion of the upmixed output signal is altered.
GES can restore the broadband envelope of the synthesized output signal. GES includes a modified upmixing process that follows flattening and reshaping of the envelope of the direct signal portion for each channel of the output signal.
For reshaping, side information on the parametric broadband envelope included in the bitstream may be used. The side information includes the envelope ratio between the envelope of the original input signal and the envelope of the downmix signal. In the decoder, the envelope ratio may be applied to the direct signal portion of each time slot in a frame, for each channel of the output signal. GES does not alter the diffuse signal portion of any output channel.
In Equation 11, the direct signal portion of the output signal y comprises the direct signal and the residual signal, and the diffuse signal portion of the output signal y comprises the diffuse signal. Overall, only the direct signal is processed by GES.
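The asymmetry between the two signal parts under GES can be shown with a deliberately simplified Python sketch. It assumes that the flattening and reshaping steps have already been folded into a single per-slot gain `env_ratio`, and it does not reproduce Equations 11 and 12. The function name `apply_ges` and its arguments are hypothetical.

```python
def apply_ges(direct, diffuse, env_ratio):
    """Scale only the direct part of one output channel, slot by slot.

    direct, diffuse: lists of time slots, each a list of subband samples.
    env_ratio: one gain per time slot (assumed to be decoded from the bitstream).
    The diffuse part is added back unchanged.
    """
    if not (len(direct) == len(diffuse) == len(env_ratio)):
        raise ValueError("all inputs need one entry per time slot")
    output = []
    for gain, direct_slot, diffuse_slot in zip(env_ratio, direct, diffuse):
        output.append([gain * d + f for d, f in zip(direct_slot, diffuse_slot)])
    return output
```

With `env_ratio` equal to 1 in every slot, the output is simply the sum of the direct and diffuse parts.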
The result of GES processing may be determined according to Equation 12 below.
Depending on the tree structure, GES may extract envelopes for the downmix signal on which spatial synthesis is performed, excluding the LFE channel, and for specific channels of the output signal upmixed from the downmix signal by the decoder.
<Definition of matrix M1 (pre-matrix)>
The size of matrix M1 depends on the number of channels of the downmix signal input to matrix M1 and the number of decorrelators used in the decoder. The elements of matrix M1, on the other hand, may be derived from the CLD and/or CPC parameters. M1 may be defined by Equation 13 below.
(1) Matrix R1
(2) Matrix G1
(3) Matrix H1
<Definition of matrix M2 (post-matrix)>
Then, the output of an arbitrary OTT box may be defined by Equation 21 below.
In this case, the post-gain matrix may be defined as in Equation 22 below.
Here, CLD and ICC may be defined by Equation 24 below.
<Definition of decorrelator>
In the N-N/2-N structure, the decorrelators may be implemented by reverberation filters in the QMF subband domain. A reverberation filter exhibits different filter characteristics depending on which of the hybrid subbands it currently corresponds to.
The reverberation filter is an IIR lattice filter. To generate mutually decorrelated orthogonal signals, the IIR lattice filters for different decorrelators have different filter coefficients.
The decorrelation filter consists of a plurality of all-pass (IIR) sections preceded by a constant frequency-dependent delay. The frequency axis may be divided into different regions corresponding to the QMF split frequencies. Within each region, the delay length and the length of the filter coefficient vectors are the same. The filter coefficients of a decorrelator that has a fractional delay due to additional phase rotation depend on the hybrid subband index.
As described above, the decorrelator filters have different filter coefficients to guarantee orthogonality between the decorrelated signals output from the decorrelators. The N-N/2-N structure requires N/2 decorrelators, while the number of distinct decorrelators may be limited to 10. In an N-N/2-N structure without LFE mode, when the number of OTT boxes, N/2, exceeds 10, the decorrelators may be reused for the OTT boxes beyond the tenth according to a modulo-10 operation.
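The modulo-10 reuse of decorrelators can be sketched in a few lines of Python. The function names and the constant `MAX_DECORRELATORS` are hypothetical; only the limit of 10 and the modulo rule come from the description above.

```python
MAX_DECORRELATORS = 10  # limit stated in the description


def decorrelator_index(ott_index: int, max_decorrelators: int = MAX_DECORRELATORS) -> int:
    """Map a zero-based OTT box index to the index of the decorrelator filter set it uses."""
    if ott_index < 0:
        raise ValueError("OTT box index must be non-negative")
    return ott_index % max_decorrelators


def assign_decorrelators(num_ott_boxes: int) -> list:
    """Return the decorrelator index for every OTT box of an N-N/2-N decoder."""
    return [decorrelator_index(i) for i in range(num_ott_boxes)]
```

For 12 OTT boxes, boxes 10 and 11 reuse the filter sets of boxes 0 and 1, so their decorrelated signals are no longer guaranteed to be mutually orthogonal with those of boxes 0 and 1.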
Table 6 below shows the decorrelator indices in the decoder of the N-N/2-N structure. Referring to Table 6, the indices of the N/2 decorrelators repeat in units of 10. That is, the 0th decorrelator and the 10th decorrelator have the same index. Specifically, when N, the number of channels of the output signal, exceeds a preset number of channels M, the decorrelators may include a first decorrelator corresponding to channels up to M and a second decorrelator corresponding to channels beyond M. The second decorrelator may reuse the filter set of the first decorrelator.
The N-N/2-N structure may be implemented by the syntax of Table 7 below.
Here, bsTreeConfig may be implemented according to Table 8 below. According to Table 8, a bsTreeConfig value of 7 indicates the configuration of the decoding apparatus of the N-N/2-N structure according to an embodiment of the present invention. The number of OTT boxes (numOttBoxes) is equal to the number of channels of the downmix signal (NumInCh), and the number of TTT boxes is 0.
When bsTreeConfig is 0, 1, 2, 3, 4, 5, or 6, the definition follows Table 40 of the MPS standard ISO/IEC 23003-1:2007, given here as Table 9.
In addition, bsNumInCh, which indicates the number of channels of the downmix signal in the N-N/2-N structure, may be implemented as shown in Table 10 below.
Here, NumInCh denotes the number of channels of the downmix signal input to the decoding apparatus of the N-N/2-N structure, and NumOutCh denotes the number of channels of the output signal obtained by upmixing the downmix signal. In the N-N/2-N structure, N_LFE, the number of LFE channels among the output signals, may be implemented as shown in Table 11 below. NumLfe denotes the number of LFE channels (N_LFE) in the N-N/2-N structure.
In the N-N/2-N structure, the channel order of the output signal may be implemented as shown in Table 12, according to the number of channels of the output signal and the number of LFE channels.
In Table 7, bsHasSpeakerConfig is a flag indicating whether the layout of the output signal to be actually reproduced differs from the channel order specified in Table 12. If bsHasSpeakerConfig == 1, audioChannelLayout, the loudspeaker layout used in actual reproduction, may be used for rendering.
audioChannelLayout represents the loudspeaker layout used in actual reproduction. If the output signal includes an LFE channel, the channel order of the LFE channel may be determined so as to satisfy two conditions: (i) the LFE channel is processed together with a non-LFE channel using an OTT box, and (ii) the LFE channel is located last in the channel list. For example, in the channel list L, Lv, R, Rv, Ls, Lss, Rs, Rss, C, LFE, Cvr, LFE2, the LFE channels are located at the end.
FIG. 9 is a diagram illustrating a tree structure that performs spatial audio processing for the N-N/2-N structure according to an embodiment.
The N-N/2-N structure shown in FIG. 8 may be expressed in tree form as shown in FIG. 9. In FIG. 9, every OTT box may regenerate a two-channel output signal based on the CLD, the ICC, a residual signal, and an input signal. The OTT boxes and their corresponding CLDs, ICCs, residual signals, and input signals may be numbered in the order in which they appear in the bitstream.
According to FIG. 9, there are N/2 OTT boxes. The decoder, which is a multichannel signal processing apparatus, may generate an N-channel output signal from an N/2-channel downmix signal using the N/2 OTT boxes. Here, the N/2 OTT boxes are not arranged in multiple layers. That is, the OTT boxes may perform upmixing in parallel, one for each channel of the N/2-channel downmix signal. In other words, no OTT box is connected to another OTT box.
The left tree structure in FIG. 9 shows the N-N/2-N tree structure without LFE channels, and the right tree structure shows the N-N/2-N tree structure with LFE channels. Every OTT box shown in FIG. 9 may regenerate a two-channel output signal by upmixing a one-channel downmix signal (M).
When the N-channel output signal does not include an LFE channel, the N/2 OTT boxes may generate the N-channel output signal using the residual signals (res) and the downmix signals (M). When the N-channel output signal includes an LFE channel, however, the OTT box that outputs the LFE channel may use only the downmix signal, without the residual signal.
Furthermore, when the N-channel output signal includes an LFE channel, the OTT boxes that do not output an LFE channel upmix the downmix signal using both CLD and ICC, whereas the OTT box that outputs the LFE channel may upmix the downmix signal using only CLD.
Likewise, when the N-channel output signal includes an LFE channel, the OTT boxes that do not output an LFE channel generate decorrelated signals through decorrelators, whereas the OTT box that outputs the LFE channel performs no decorrelation and therefore generates no decorrelated signal.
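The difference between a regular OTT box and an OTT box that outputs an LFE channel, together with the single-layer parallel arrangement, can be sketched in Python as follows. This is an illustration under stated assumptions, not the mixing rule of this document's equations: the gains derived from the CLD and the rotation derived from the ICC follow a commonly used parametric-stereo style formulation that is assumed here, and all function and parameter names are hypothetical. A residual signal, where available, could take the place of the decorrelated signal `d`.

```python
import math


def cld_gains(cld_db):
    """Split unit energy between two outputs according to a level difference in dB."""
    ratio = 10.0 ** (cld_db / 10.0)
    return math.sqrt(ratio / (1.0 + ratio)), math.sqrt(1.0 / (1.0 + ratio))


def ott_upmix(m, cld_db, icc=1.0, d=None, is_lfe=False):
    """Upmix one downmix channel m (list of samples) into two output channels.

    An LFE box uses the CLD only: no ICC, no decorrelated signal, no residual.
    """
    c1, c2 = cld_gains(cld_db)
    if is_lfe or d is None:
        return [c1 * s for s in m], [c2 * s for s in m]
    icc = max(-1.0, min(1.0, icc))
    alpha = 0.5 * math.acos(icc)
    beta = math.atan(math.tan(alpha) * (c2 - c1) / (c2 + c1))
    h11, h12 = c1 * math.cos(alpha + beta), c1 * math.sin(alpha + beta)
    h21, h22 = c2 * math.cos(beta - alpha), c2 * math.sin(beta - alpha)
    y1 = [h11 * s + h12 * t for s, t in zip(m, d)]
    y2 = [h21 * s + h22 * t for s, t in zip(m, d)]
    return y1, y2


def upmix_parallel(downmix, params):
    """Apply one OTT box per downmix channel; no box feeds another box."""
    outputs = []
    for m, p in zip(downmix, params):
        y1, y2 = ott_upmix(m, **p)
        outputs.extend([y1, y2])
    return outputs
```

Because every box reads exactly one downmix channel and writes exactly two output channels, N/2 downmix channels always yield N output channels, and the boxes can be evaluated independently of each other.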
FIG. 10 is a diagram illustrating a process of generating a 24-channel output signal from a 12-channel downmix according to an embodiment.
According to an embodiment of the present invention, an N/2-channel downmix signal may be generated from an N-channel input signal through MPS encoding, and an N-channel output signal may be generated from the N/2-channel downmix signal through MPS decoding.
In the existing MPS standard, however, the downmix signal output by the encoder has 1, 2, or 5.1 channels. The present invention is not limited to these. Supporting downmix channel counts that are not defined in the existing MPS standard requires additional syntax definitions.
In the MPS standard, the input/output relationship may be defined through bsTreeConfig as shown in Table 9. The decoding process between the input signal and the output signal is defined according to bsTreeConfig.
When bsTreeConfig is 0, it defines a process of generating a one-channel downmix signal from a six-channel (5.1-channel) input signal and generating a six-channel (5.1-channel) output signal from the one-channel downmix signal. For this, the decoder requires five OTT boxes, and a CLD (Channel Level Difference) may be applied to each OTT box.
The CLD input to an OTT box may be defined as defaultCld[0] to defaultCld[5] according to the position of the OTT box, and the CLD corresponding to the OTT box is enabled. That is, when a CLD is enabled, the CLD may be input to the OTT box. ottModeLfe indicates whether an LFE channel is output from the OTT box.
According to Table 9 as defined in the current MPS standard, only defaultCld[0] to defaultCld[5], corresponding to six OTT boxes, are defined. The current MPS standard therefore does not cover the case in which the input signal has more than 10 channels and a downmix of 5 or more channels is generated.
To address this, the present invention may use the reserved bits of the MPS standard to process input signals whose channel configuration differs from those defined in the existing MPS standard. For example, when N, the number of channels of the input signal, is 24 and the number of channels of the downmix signal is 12, the configuration may be defined as shown in Table 13.
FIG. 10 shows a decoder implemented according to Table 13. FIG. 10 illustrates a process of generating a 24-channel output signal that includes two LFE channels from a 12-channel downmix signal (x_0 to x_11).
Referring to vector x (1001) in FIG. 10, the 12-channel downmix signal (x_0 to x_11) and the residual signals (res_1 to res_11) are input; the residual signals are omitted from the description below. The decoder of FIG. 10 may input the 12-channel downmix signal to the decorrelators 1007 to generate decorrelated signals.
Vector v (1003) of FIG. 10 may be derived by applying matrix M1 (1002) to vector x (1001). Vector v (1003) may be determined according to Equation 25 below.
Equation 25 corresponds to Equation 1. When no residual signal (res) is present in Equation 25, x_M0 to x_M11 may be mapped to v_M0 to v_M11. The number of decorrelated signals derived may equal the number of downmix signals.
Vector w (1004) may be determined according to Equation 26 below.
Equation 26 corresponds to Equation 2. The decorrelators 1007 operate when no residual signal is present; that is, if no residual signal is present, a decorrelated signal may be generated. D() denotes the operation by which a decorrelator generates a decorrelated signal. In Equation 26, the selection coefficient is 0 if a residual signal is present and 1 otherwise. That is, when the coefficient is 1, a decorrelated signal may be generated according to Equation 15.
In FIG. 10, vector y (1006) may be derived by applying matrix M2 (1005) to vector w (1004) according to Equation 27. Vector y (1006) corresponds to the N-channel (N = 24) output signal.
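The choice between a decorrelated signal and a residual signal when forming vector w can be sketched in Python. The exact form of Equation 26 is not reproduced here. The function name `build_w`, the name `delta` for the selection coefficient, and the ordering of w (direct signals first, then one decorrelated or residual signal per OTT box) are assumptions made for illustration.

```python
def build_w(v, decorrelate, residuals):
    """Assemble vector w from the direct signals v.

    v: list of per-box signals (each a list of samples).
    decorrelate: callable standing in for D(); maps a signal to a decorrelated signal.
    residuals: list with one entry per box, either a residual signal or None.
    """
    if len(v) != len(residuals):
        raise ValueError("need one residual entry (or None) per OTT box")
    second_part = []
    for signal, res in zip(v, residuals):
        delta = 0 if res is not None else 1  # 0 if a residual is present, 1 otherwise
        silence = [0.0] * len(signal)
        d = decorrelate(signal) if delta == 1 else silence
        r = res if res is not None else silence
        second_part.append([delta * a + (1 - delta) * b for a, b in zip(d, r)])
    return list(v) + second_part
```

Writing the selection as `delta * D(v) + (1 - delta) * res` makes it explicit that each OTT box receives exactly one of the two signals, never a mixture.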
The process of deriving matrix M1 (1002) and matrix M2 (1005) follows from the description of FIG. 8. R1 for deriving matrix M1 (1002) is given by Equation 28 below, and R2 for deriving matrix M2 (1005) is given by Equation 29 below.
H_LL, H_LR, H_RL, and H_RR in Equation 29 may be derived from the CLD and ICC corresponding to each OTT box.
The present invention proposes an OTT-based MPS (MPEG Surround) decoder with a parallel structure that generates an N-channel output signal from an N/2-channel downmix signal according to newly defined bsTreeConfig information.
ë 11ì ì¼ì¤ììì ë°ë¥¸ ë 10ì ê³¼ì ì OTT ë°ì¤ë¡ ííí ëë©´ì´ë¤.FIG. 11 illustrates an OTT box of the process of FIG. 10, according to an exemplary embodiment.
ë 11ì ìíë©´, ê°ê°ì OTT ë°ì¤ë 1ì±ëì ë¤ì´ë¯¹ì¤ ì í¸ì ë¹ìê´ê¸°(D)를 íµí´ ìì±ë ë¹ìê´ì± ì í¸ë¥¼ ì´ì©íì¬ 2ì±ëì ì í¸ë¥¼ ìì±íë¤. OTT ë°ì¤ìë CLDì ëìíë defaultCld[0]~defaultCld[9]ì LFE ì±ëì ëìíë OttModelfe[0], OttModelfe[1]ì´ ì ë ¥ë ì ìë¤. ì를 ë¤ì´, ì¶ë ¥ ì í¸ì´ 22.2ì±ëì¸ ê²½ì° ì¶ë ¥ ì í¸ì LFE ì±ëì´ í¬í¨ë ì ìë¤. ê·¸ë¬ë©´, OttModelfe[0], OttModelfe[1]ì´ enableëë¤.Referring to FIG. 11, each OTT box generates two channels of signals using a downmix signal of one channel and an uncorrelated signal generated through the decorrelator (D). In the OTT box, defaultCld [0] to defaultCld [9] corresponding to the CLD, and OttModelfe [0] and OttModelfe [1] corresponding to the LFE channel may be input. For example, when the output signal is 22.2 channels, the LFE channel may be included in the output signal. OttModelfe [0] and OttModelfe [1] are then enabled.
ë 12ë ì¼ì¤ììì ë°ë¥¸ ë 11ì ê³¼ì ì MPS íì¤ì ë°ë¼ ííí ëë©´ì´ë¤.FIG. 12 illustrates a process of FIG. 11 according to an MPS standard according to an embodiment.
ë 12ì ìíë©´, 12ì±ëì ë¤ì´ë¯¹ì¤ ì í¸(M0-M11)ê° ê°ê°ì OTT ë°ì¤ì ì ë ¥ëë ê²½ì°ê° ëìëë¤. ê·¸ë¬ë©´, 24ì±ëì ì¶ë ¥ ì í¸(y)ê° ìì±ëë¤. ì¬ê¸°ì, CLDì ICCë ê° OTT ë°ì¤ì ì ë ¥ëë¤. ë 12ìì ìì°¨ ì í¸ê° OTT ë°ì¤ì ì ë ¥ëë ê²ì¼ë¡ ëìëìì¼ë, ìì°¨ ì í¸ê° ìë ê²½ì° ë¤ì´ë¯¹ì¤ ì í¸ë¡ë¶í° ë¹ìê´ê¸°ë¥¼ íµí´ ìì±ë ë¹ìê´ì± ì í¸ê° ìì°¨ ì í¸ ëì OTT ë°ì¤ì ì ë ¥ë ì ìë¤.According to FIG. 12, a case in which 12 channels of downmix signals M 0 to M 11 are input to each OTT box is illustrated. Then, the output signal y of 24 channels is generated. Here, CLD and ICC are also input to each OTT box. Although the residual signal is illustrated in FIG. 12 as being input to the OTT box, if there is no residual signal, an uncorrelated signal generated through the decorrelator from the downmix signal may be input to the OTT box instead of the residual signal.
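The parallel arrangement described for FIG. 11 and FIG. 12 can be sketched as follows. This is an illustration under stated assumptions, not the decoder of the embodiment: the plain delay used as the decorrelator D(), the per-box coefficient tuples, and all function names are hypothetical, and processing is shown on plain time-domain sample lists rather than on subband signals.

```python
def delay_decorrelator(signal, delay=3):
    """Hypothetical stand-in for D(): a plain delay (real decorrelators differ)."""
    padded = [0.0] * delay + list(signal)
    return padded[:len(signal)]


def ott_box(downmix, gains, residual=None, is_lfe=False):
    """Turn one downmix channel into two output channels.

    gains is (h_ll, h_lr, h_rl, h_rr). For an LFE box only the level
    coefficients h_ll and h_rl take effect, because neither a residual nor a
    decorrelated signal is used.
    """
    h_ll, h_lr, h_rl, h_rr = gains
    if is_lfe:
        second = [0.0] * len(downmix)
    elif residual is not None:
        second = residual                     # a transmitted residual replaces D()
    else:
        second = delay_decorrelator(downmix)  # no residual: use the decorrelated signal
    out1 = [h_ll * m + h_lr * s for m, s in zip(downmix, second)]
    out2 = [h_rl * m + h_rr * s for m, s in zip(downmix, second)]
    return out1, out2


def parallel_ott_upmix(downmix_channels, gains_per_box, residuals=None, lfe_boxes=()):
    """Run N/2 independent OTT boxes and return N output channels."""
    outputs = []
    for i, m in enumerate(downmix_channels):
        r = residuals[i] if residuals is not None else None
        out1, out2 = ott_box(m, gains_per_box[i], r, is_lfe=(i in lfe_boxes))
        outputs.extend([out1, out2])
    return outputs
```

With 12 downmix channels the function returns 24 output channels, matching the N = 24 case above. Because no box consumes the output of another box, the loop iterations are independent and could be executed in any order.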
A multi-channel audio signal processing method according to an embodiment of the present invention may include: identifying an N/2-channel downmix signal and a residual signal generated from an N-channel input signal; applying the N/2-channel downmix signal and the residual signal to a first matrix; outputting, through the first matrix, a first signal that is input to N/2 decorrelators corresponding to N/2 OTT boxes and a second signal that is passed to a second matrix without being input to the N/2 decorrelators; outputting a decorrelated signal from the first signal through the N/2 decorrelators; applying the decorrelated signal and the second signal to the second matrix; and generating an N-channel output signal through the second matrix.
When the N-channel output signal does not include an LFE channel, N/2 decorrelators may correspond to the N/2 OTT boxes.
When the number of decorrelators exceeds the reference value of a modulo operation, the decorrelator indices may be reused cyclically according to the reference value.
When the N-channel output signal includes an LFE channel, the number of decorrelators used may be N/2 minus the number of LFE channels, and the OTT box for the LFE channel may not use a decorrelator.
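The counting and index-reuse rules above can be illustrated with a short Python sketch. The function names are hypothetical, and the reference value of the modulo operation is passed in as a parameter because its concrete value is not stated in this passage.

```python
def decorrelator_count(n_output_channels, n_lfe_channels=0):
    """Number of decorrelators: N/2 OTT boxes minus the boxes that output LFE."""
    return n_output_channels // 2 - n_lfe_channels


def decorrelator_index(box_index, reference_value):
    """Cyclic reuse of decorrelator indices once box_index reaches reference_value."""
    return box_index % reference_value
```

For a 22.2-channel output (N = 24 with two LFE channels), `decorrelator_count(24, 2)` gives 10 decorrelators for 12 OTT boxes. With an assumed reference value of 10, OTT boxes 10 and 11 would map back to decorrelator indices 0 and 1.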
When no temporal shaping tool is used, a single vector including the second signal, the decorrelated signal derived from the decorrelators, and the residual signal derived from the decorrelators may be input to the second matrix.
When a temporal shaping tool is used, a vector corresponding to a direct signal, composed of the second signal and the residual signal derived from the decorrelators, and a vector corresponding to a diffuse signal, composed of the decorrelated signal derived from the decorrelators, may be input to the second matrix.
In generating the N-channel output signal, when subband domain temporal processing (STP) is used, the temporal envelope of the output signal may be shaped by applying a scale factor based on the diffuse signal and the direct signal to the diffuse-signal portion of the output signal.
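The two ways of feeding the second matrix can be compared in a small Python sketch. It is a simplified illustration for a single time and frequency sample: the layout of the vectors (second signal first, then one slot per decorrelator, each slot holding either a decorrelated value or a residual value) and the externally supplied diffuse scale factors are assumptions, and the way STP actually derives its scale factor is not modeled.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def single_input_vector(second, decorrelated, residual):
    """No temporal shaping: one vector w carrying all three kinds of signal.

    Assumption: each decorrelator slot carries either a decorrelated value or
    a residual value, with the unused one set to 0.
    """
    return list(second) + [d + r for d, r in zip(decorrelated, residual)]


def direct_and_diffuse_vectors(second, decorrelated, residual):
    """Temporal shaping: split the same content into direct and diffuse vectors."""
    w_direct = list(second) + list(residual)
    w_diffuse = [0.0] * len(second) + list(decorrelated)
    return w_direct, w_diffuse


def shaped_output(m2, w_direct, w_diffuse, diffuse_scale):
    """Apply M2 to both vectors and scale only the diffuse part per channel."""
    y_direct = mat_vec(m2, w_direct)
    y_diffuse = mat_vec(m2, w_diffuse)
    return [yd + s * yf for yd, yf, s in zip(y_direct, y_diffuse, diffuse_scale)]
```

Because the matrix product is linear, a scale factor of 1 on every channel reproduces the single-vector result exactly. Keeping the direct and diffuse parts separate until after the second matrix is what allows a shaping tool to rescale only the diffuse part.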
In generating the N-channel output signal, when guided envelope shaping (GES) is used, the envelope of the direct-signal portion may be flattened and reshaped for each channel of the N-channel output signal.
The size of the first matrix may be determined by the number of channels of the downmix signal to which the first matrix is applied and the number of decorrelators, and the elements of the first matrix may be determined by CLD parameters or CPC parameters.
A multi-channel audio signal processing method according to another embodiment of the present invention includes: identifying an N/2-channel downmix signal and an N/2-channel residual signal; and generating an N-channel output signal by inputting the N/2-channel downmix signal and the N/2-channel residual signal to N/2 OTT boxes, wherein the N/2 OTT boxes are arranged in parallel without being connected to one another, and, among the N/2 OTT boxes, an OTT box that outputs an LFE channel (1) receives only the downmix signal and not the residual signal, (2) uses the CLD parameter out of the CLD parameter and the ICC parameter, and (3) does not output a decorrelated signal through a decorrelator.
A multi-channel signal processing apparatus according to an embodiment of the present invention includes a processor that performs a multi-channel signal processing method, and the multi-channel signal processing method may include: identifying an N/2-channel downmix signal and a residual signal generated from an N-channel input signal; applying the N/2-channel downmix signal and the residual signal to a first matrix; outputting, through the first matrix, a first signal that is input to N/2 decorrelators corresponding to N/2 OTT boxes and a second signal that is passed to a second matrix without being input to the N/2 decorrelators; outputting a decorrelated signal from the first signal through the N/2 decorrelators; applying the decorrelated signal and the second signal to the second matrix; and generating an N-channel output signal through the second matrix.
When the N-channel output signal does not include an LFE channel, N/2 decorrelators may correspond to the N/2 OTT boxes.
When the number of decorrelators exceeds the reference value of a modulo operation, the decorrelator indices may be reused cyclically according to the reference value.
When the N-channel output signal includes an LFE channel, the number of decorrelators used may be N/2 minus the number of LFE channels, and the OTT box for the LFE channel may not use a decorrelator.
When no temporal shaping tool is used, a single vector including the second signal, the decorrelated signal derived from the decorrelators, and the residual signal derived from the decorrelators may be input to the second matrix.
When a temporal shaping tool is used, a vector corresponding to a direct signal, composed of the second signal and the residual signal derived from the decorrelators, and a vector corresponding to a diffuse signal, composed of the decorrelated signal derived from the decorrelators, may be input to the second matrix.
In generating the N-channel output signal, when subband domain temporal processing (STP) is used, the temporal envelope of the output signal may be shaped by applying a scale factor based on the diffuse signal and the direct signal to the diffuse-signal portion of the output signal.
In generating the N-channel output signal, when guided envelope shaping (GES) is used, the envelope of the direct-signal portion may be flattened and reshaped for each channel of the N-channel output signal.
The size of the first matrix may be determined by the number of channels of the downmix signal to which the first matrix is applied and the number of decorrelators, and the elements of the first matrix may be determined by CLD parameters or CPC parameters.
A multi-channel signal processing apparatus according to another embodiment of the present invention includes a processor that performs a multi-channel signal processing method, and the multi-channel signal processing method includes: identifying an N/2-channel downmix signal and an N/2-channel residual signal; and generating an N-channel output signal by inputting the N/2-channel downmix signal and the N/2-channel residual signal to N/2 OTT boxes, wherein the N/2 OTT boxes are arranged in parallel without being connected to one another, and, among the N/2 OTT boxes, an OTT box that outputs an LFE channel (1) receives only the downmix signal and not the residual signal, (2) uses the CLD parameter out of the CLD parameter and the ICC parameter, and (3) does not output a decorrelated signal through a decorrelator.
The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For convenience of explanation, a single processing device is sometimes described, but one of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.
The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.
The method according to the embodiments may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.
Although the embodiments have been described with reference to a limited number of embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents. Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.
Claims (18)
1. A multi-channel signal processing method comprising:
identifying an N/2-channel downmix signal derived from an N-channel input signal; and
generating an N-channel output signal from the identified N/2-channel downmix signal using a plurality of OTT boxes,
wherein, when the output signal has no LFE channel, the number of the plurality of OTT boxes is equal to N/2, the number of channels of the downmix signal.
2. The method of claim 1,
wherein each of the plurality of OTT boxes
generates a two-channel output signal using a one-channel downmix signal and a decorrelated signal generated by a decorrelator corresponding to that OTT box.
3. The method of claim 2,
wherein, when N, the number of channels of the output signal, exceeds a preset number of channels M,
the decorrelator includes a first decorrelator corresponding to channels up to and including M and a second decorrelator corresponding to channels exceeding M, and
the second decorrelator reuses a filter set of the first decorrelator.
4. The method of claim 2,
wherein an OTT box whose output is an LFE channel, among the plurality of OTT boxes, generates a two-channel downmix signal without using a decorrelated signal.
5. The method of claim 2,
wherein each of the plurality of OTT boxes,
when a transmitted residual signal exists, generates a two-channel output signal using the residual signal, in place of the decorrelated signal, and a one-channel downmix signal.
6. The method of claim 1,
wherein the generating of the N-channel output signal comprises
generating the N-channel output signal using a pre-decorrelator matrix M1 and a mix matrix M2.
7. The method of claim 1,
wherein each of the plurality of OTT boxes generates the N-channel output signal using a channel level difference (CLD).
8. The method of claim 1,
wherein N, the number of channels of the output signal, is an even number from 10 to 32.
9. A multi-channel signal processing method comprising:
decoding an N/2-channel downmix signal encoded according to a first coding scheme; and
generating an N-channel output signal from the N/2-channel downmix signal according to a second coding scheme,
wherein the second coding scheme,
when the output signal does not include an LFE channel, uses one-to-two (OTT) boxes equal in number to N/2, the number of channels of the downmix signal.
10. A multi-channel signal processing apparatus
comprising a processor that executes a multi-channel signal processing method,
wherein the processor
identifies an N/2-channel downmix signal derived from an N-channel input signal, and
generates an N-channel output signal from the identified N/2-channel downmix signal using a plurality of OTT boxes,
wherein, when the output signal has no LFE channel, the number of the plurality of OTT boxes is equal to N/2, the number of channels of the downmix signal.
11. The apparatus of claim 10,
wherein each of the plurality of OTT boxes
generates a two-channel output signal using a one-channel downmix signal and a decorrelated signal generated by a decorrelator corresponding to that OTT box.
12. The apparatus of claim 11,
wherein, when N, the number of channels of the output signal, exceeds a preset number of channels M,
the decorrelator includes a first decorrelator corresponding to channels up to and including M and a second decorrelator corresponding to channels exceeding M, and
the second decorrelator reuses a filter set of the first decorrelator.
13. The apparatus of claim 11,
wherein an OTT box whose output is an LFE channel, among the plurality of OTT boxes, generates a two-channel downmix signal without using a decorrelated signal.
14. The apparatus of claim 11,
wherein each of the plurality of OTT boxes,
when a transmitted residual signal exists, generates a two-channel output signal using the residual signal, in place of the decorrelated signal, and a one-channel downmix signal.
15. The apparatus of claim 10,
wherein the processor
generates the N-channel output signal using a pre-decorrelator matrix M1 and a mix matrix M2.
16. The apparatus of claim 10,
wherein each of the plurality of OTT boxes generates the N-channel output signal using a channel level difference (CLD).
17. The apparatus of claim 10,
wherein N, the number of channels of the output signal, is an even number from 10 to 32.
18. A multi-channel signal processing apparatus
comprising a processor that executes a multi-channel signal processing method,
wherein the processor
decodes an N/2-channel downmix signal encoded according to a first coding scheme, and
generates an N-channel output signal from the N/2-channel downmix signal according to a second coding scheme,
wherein the second coding scheme,
when the output signal does not include an LFE channel, uses one-to-two (OTT) boxes equal in number to N/2, the number of channels of the downmix signal.
Ref document number: 16752696
Country of ref document: EP
Kind code of ref document: A1
2017-08-18 NENP Non-entry into the national phase
Ref country code: DE
2018-03-14 122 EP: PCT application non-entry in European phase
Ref document number: 16752696
Country of ref document: EP
Kind code of ref document: A1