[0001]
[Field of Industrial Application] The present invention relates to a high-efficiency coding method and apparatus for bit-compressing digital signals such as digital audio signals, and to a transmission medium (including a recording medium) for transmitting the compressed data. In particular, it relates to a high-efficiency coding method and apparatus, and a transmission medium, in which the bit allocation amount of each processing block is varied according to the shape of the energy of that processing block.
[0002]
[Prior Art] Various techniques and apparatuses exist for high-efficiency coding of audio or speech signals. One example is transform coding, a blocking frequency-band division scheme in which a time-domain audio signal is divided into blocks of unit time, the time-axis signal of each block is converted (orthogonally transformed) into a frequency-axis signal and divided into a plurality of frequency bands, and each band is coded. Another is sub-band coding (SBC), a non-blocking frequency-band division scheme in which a time-domain audio signal is divided into a plurality of frequency bands and coded without being blocked per unit time. High-efficiency coding techniques and apparatuses that combine sub-band coding and transform coding have also been proposed. In that case, for example, the signal is first divided into bands by sub-band coding, the signal of each band is then orthogonally transformed into a frequency-domain signal by transform coding, and coding is applied to each orthogonally transformed band.
[0003] A filter such as a QMF (Quadrature Mirror Filter) is one example of the band-division filter used in the sub-band coding described above. It is described, for example, in R. E. Crochiere, "Digital coding of speech in subbands," Bell Syst. Tech. J., Vol. 55, No. 8, 1976. The QMF divides a band into two bands of equal bandwidth, and its characteristic feature is that no aliasing arises when the divided bands are later recombined. A technique for division into bands of equal bandwidth is described in Joseph H. Rothweiler, "Polyphase Quadrature filters - A new subband coding technique," ICASSP 83, Boston. The characteristic feature of the polyphase quadrature filter is that a signal can be divided into a plurality of bands of equal bandwidth in a single step.
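As a rough illustration of a two-band split whose aliasing cancels on recombination, the sketch below uses the two-tap Haar pair, the shortest filter pair that satisfies the quadrature-mirror condition. The filter choice and the names `qmf_analysis` and `qmf_synthesis` are assumptions made for brevity; they are not taken from the cited papers or from the apparatus described here.

```python
import math

SQRT2 = math.sqrt(2.0)

def qmf_analysis(x):
    """Split an even-length sequence into low and high bands at half the rate."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low, high

def qmf_synthesis(low, high):
    """Recombine the two bands; the aliasing terms of the two branches cancel."""
    x = []
    for l, h in zip(low, high):
        x.append((l + h) / SQRT2)
        x.append((l - h) / SQRT2)
    return x

signal = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.1, 0.3]
low, high = qmf_analysis(signal)
restored = qmf_synthesis(low, high)
```

Each band carries half as many samples as the input, so the total sample count is unchanged by the split. Practical QMF designs use much longer filters to obtain sharper band separation.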
[0004] Examples of the orthogonal transform mentioned above include transforms in which the input audio signal is divided into blocks of a predetermined unit time (frames) and the time axis is converted to the frequency axis by applying a fast Fourier transform (FFT), a discrete cosine transform (DCT), a modified DCT (MDCT), or the like to each block. The MDCT is described in J. P. Princen and A. B. Bradley (Univ. of Surrey, Royal Melbourne Inst. of Tech.), "Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation," ICASSP 1987.
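The time-domain aliasing cancellation that the MDCT relies on can be checked numerically. The following is a minimal sketch, assuming a sine window, 50% overlapped blocks, and a direct O(N^2) evaluation of the transform; the function names and the test signal are illustrative and do not come from the cited paper.

```python
import math

def sine_window(length):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block):
    """MDCT of 2N windowed samples, giving N coefficients."""
    n = len(block) // 2
    return [sum(block[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(2 * n))
            for k in range(n)]

def imdct(coeffs):
    """Inverse MDCT: N coefficients give 2N time-aliased samples (scale 2/N)."""
    n = len(coeffs)
    return [2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                          for k in range(n))
            for i in range(2 * n)]

def mdct_roundtrip(x, n):
    """Transform 50% overlapped blocks of 2N samples and overlap-add the inverses."""
    w = sine_window(2 * n)
    out = [0.0] * len(x)
    for start in range(0, len(x) - 2 * n + 1, n):
        block = [x[start + i] * w[i] for i in range(2 * n)]
        y = imdct(mdct(block))
        for i in range(2 * n):
            out[start + i] += y[i] * w[i]
    return out

N = 8
x = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(6 * N)]
y = mdct_roundtrip(x, N)
# Only samples covered by two overlapping blocks are reconstructed,
# so the first and last N samples are excluded from the comparison.
max_err = max(abs(a - b) for a, b in zip(x[N:-N], y[N:-N]))
```

Each block of 2N input samples yields only N coefficients, yet the overlap-add of neighbouring inverse transforms cancels the aliasing, so `max_err` stays at the level of floating-point rounding.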
[0005] Furthermore, as the frequency division width used when quantizing the frequency components obtained by frequency-band division, there is band division that takes human auditory characteristics into account. That is, the audio signal may be divided into a plurality of bands (for example, 25 bands) whose bandwidths widen toward higher frequencies, generally called critical bands. When the data of each band is coded, coding is performed with a predetermined bit allocation for each band or with an adaptive bit allocation for each band. For example, when the MDCT coefficient data obtained by MDCT processing is coded with such bit allocation, the MDCT coefficient data of each band obtained by the MDCT processing of each block is coded with an adaptively allocated number of bits.
[0006] The following two techniques and apparatuses are known for this bit allocation. In "Adaptive Transform Coding of Speech Signals," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, August 1977, bits are allocated on the basis of the magnitude of the signal in each band. In M. A. Krasner (MIT), "The critical band coder -- digital encoding of the perceptual requirements of the auditory system," ICASSP 1980, a technique and apparatus are described that use auditory masking to obtain the signal-to-noise ratio required for each band and perform fixed bit allocation.
[0007]
[Problem to Be Solved by the Invention] With the conventional high-efficiency coding techniques and apparatuses described above, when coding a digital audio signal that has an overtone relationship, such as the reproduced sound of a musical instrument, it is difficult to increase the concentration of bit allocation on the quantization blocks that contain the perceptually important overtone components. Bit allocation therefore cannot be adapted more closely to the input signal, and, particularly when the bit rate is low and few bits are available, perceptually good sound quality cannot be obtained.
[0008] The present invention has been made in view of these circumstances. Its object is to provide an efficient high-efficiency coding method and apparatus, and a transmission medium, that make better use of auditory characteristics, thereby preventing degradation of sound quality at low bit rates and improving sound quality at a given bit rate.
[0009]
[Means for Solving the Problem] The high-efficiency coding method and apparatus of the present invention are proposed to achieve the above object. An input signal is decomposed into a plurality of frequency-band components to obtain signal components in a plurality of two-dimensional blocks in time and frequency. For each two-dimensional block, a quantization coefficient representing a feature of the signal components in that block is obtained, and the bit allocation amount is determined on that basis. For each two-dimensional block, the signal components in the block are quantized to compress the information, and the compressed information of each two-dimensional block is output together with the information compression parameters of that block. When performing bit allocation adapted to the input signal, the overtone relationship of the input signal is detected, and the concentration of the bit allocation amount on each two-dimensional block containing overtone components is increased.
[0010] Here, when performing bit allocation adapted to the input signal, the overtone relationship of the input signal is detected on the basis of, for example, spectrum information of the input signal on the frequency axis or the quantization coefficients representing the signal components in the two-dimensional blocks. When detecting the overtone relationship, only the low-order overtone components and/or only the fundamental and first-overtone components may be detected on the basis of spectrum information of all or part of the frequency bands of the input signal and/or quantization information representing the signal components in the two-dimensional blocks. The frequency bands of the other overtone components are then calculated from these components, so that the overtone relationship over the entire frequency range can be detected. When performing bit allocation adapted to the input signal, the concentration of bit allocation is increased for all or some of the two-dimensional blocks containing overtone components, for all or some of the two-dimensional blocks containing even-order overtone components, or for all or some of the two-dimensional blocks containing odd-order overtone components. Furthermore, when determining the bit allocation amount adapted to the input signal, if the allocated bits fall short of the predetermined bit rate, the bit fraction is adjusted by preferentially adding bits to all or some of the two-dimensional blocks containing overtone components, or to all or some of the two-dimensional blocks containing even-order overtone components. Conversely, if the allocated bits exceed the predetermined bit rate, the bit fraction is adjusted by preferentially removing bits from all or some of the two-dimensional blocks that do not contain overtone components, or from all or some of the two-dimensional blocks that do not contain odd-order overtone components.
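The bit fraction adjustment described above can be sketched as a greedy loop. The text specifies only the priority rule (add bits to overtone blocks first, remove bits from non-overtone blocks first); the one-bit-at-a-time round-robin order, the `max_bits` ceiling, and the function name `adjust_bit_fraction` are assumptions introduced for illustration.

```python
def adjust_bit_fraction(bits, has_overtone, budget, max_bits=16):
    """Return a copy of `bits` whose sum matches `budget` where possible.

    bits[i]         -- bits currently assigned to block i
    has_overtone[i] -- True if block i was flagged as containing overtone components
    budget          -- total number of bits permitted by the bit rate
    """
    bits = list(bits)
    diff = budget - sum(bits)
    step = 1 if diff > 0 else -1
    # Shortfall: visit overtone blocks first. Excess: visit non-overtone blocks first.
    preferred = [i for i, flag in enumerate(has_overtone) if flag == (step > 0)]
    others = [i for i in range(len(bits)) if i not in preferred]
    for group in (preferred, others):
        changed = True
        while diff != 0 and changed:
            changed = False
            for i in group:
                if diff == 0:
                    break
                if 0 <= bits[i] + step <= max_bits:
                    bits[i] += step
                    diff -= step
                    changed = True
    return bits

flags = [False, True, False, True]
topped_up = adjust_bit_fraction([4, 4, 4, 4], flags, budget=19)
trimmed = adjust_bit_fraction([4, 4, 4, 4], flags, budget=13)
```

With a budget of 19 the three spare bits all go to the flagged blocks, and with a budget of 13 the three surplus bits are all taken from the unflagged blocks. The non-preferred group is touched only if the preferred group cannot absorb the whole difference.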
[0011] The transmission medium of the present invention, which includes recording media as well as transmission media, transmits or records the compressed information and the information compression parameters produced by compression coding with the high-efficiency coding method or apparatus of the present invention.
[0012]
[Operation] According to the present invention, when bit allocation is performed, the overtone relationship of the input signal is detected and the concentration of bit allocation on the two-dimensional quantization blocks containing the overtone components is increased. This enables efficient bit allocation that makes better use of auditory characteristics, and enables efficient compression without reducing block floating efficiency.
[0013] As a result, better sound quality can be obtained at the same bit rate, and equivalent sound quality can be obtained at a lower bit rate.
[0014]
[Embodiments] Embodiments of the present invention are described below with reference to the drawings.
[0015] In this embodiment, a technique for high-efficiency coding of an input digital signal such as an audio PCM signal, using sub-band coding (SBC), adaptive transform coding (ATC), and adaptive bit allocation (APC-AB), is described with reference to FIG. 1.
[0016] In the specific high-efficiency coding apparatus of this embodiment shown in FIG. 1, the input digital signal is divided into a plurality of frequency bands by filters or the like, an orthogonal transform is applied to each frequency band, and the resulting frequency-axis spectrum data is coded with bits allocated adaptively for each so-called critical band, which reflects the human auditory characteristics described later. In the high range, bands obtained by further subdividing the critical bands are used. The non-blocking frequency division performed by the filters or the like may, of course, use equal division widths.
[0017] Furthermore, in this embodiment, the block size (the block length of the orthogonal transform) is varied adaptively according to the input signal before the orthogonal transform, and floating processing is performed on blocks consisting of the signals of a critical band or, in the high range, of a band obtained by further subdividing a critical band. A critical band is a frequency band obtained by division that takes human auditory characteristics into account: it is the band occupied by a narrow-band noise when a pure tone is masked by narrow-band noise of the same intensity in the vicinity of the frequency of that pure tone. Critical bands become wider toward higher frequencies; the entire frequency range of 0 to 20 kHz is divided into, for example, 25 critical bands.
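To make the widening of the critical bands concrete, the sketch below maps a frequency to a band index. The text gives no band edges, so the table is an assumption: it uses the commonly tabulated Zwicker critical-band edges with 20 kHz appended as the upper limit of a 25th band. The name `critical_band_index` is likewise illustrative.

```python
from bisect import bisect_right

# Assumed band edges in Hz: 26 edges delimit 25 bands covering 0 to 20 kHz.
CRITICAL_BAND_EDGES_HZ = [
    0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
    2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000,
    15500, 20000,
]

def critical_band_index(freq_hz):
    """Return the 0-based index of the band whose range contains freq_hz."""
    if not 0 <= freq_hz < CRITICAL_BAND_EDGES_HZ[-1]:
        raise ValueError("frequency outside the tabulated range")
    return bisect_right(CRITICAL_BAND_EDGES_HZ, freq_hz) - 1

band_widths_hz = [hi - lo for lo, hi in
                  zip(CRITICAL_BAND_EDGES_HZ, CRITICAL_BAND_EDGES_HZ[1:])]
```

Under this table the lowest bands are 100 Hz wide and the highest spans several kilohertz, which is why the high range may be subdivided further before bit allocation.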
[0018] That is, in FIG. 1, an audio PCM signal of, for example, 0 to 22 kHz is supplied to an input terminal 10. This input signal is divided into a 0 to 11 kHz band and an 11 kHz to 22 kHz band by a band-division filter 11 such as a QMF, and the 0 to 11 kHz band signal is likewise divided into a 0 to 5.5 kHz band and a 5.5 kHz to 11 kHz band by a band-division filter 12 such as a QMF.
[0019] The 11 kHz to 22 kHz band signal from the band-division filter 11 is sent to an MDCT (Modified Discrete Cosine Transform) circuit 13, which is one example of an orthogonal transform circuit. The 5.5 kHz to 11 kHz band signal from the band-division filter 12 is sent to an MDCT circuit 14, and the 0 to 5.5 kHz band signal from the band-division filter 12 is sent to an MDCT circuit 15, so that each signal is subjected to MDCT processing. Each of the MDCT circuits 13 to 15 performs MDCT processing based on the block size (orthogonal transform block size) determined by block decision circuits 19, 20, and 21 provided for the respective bands.
[0020] Specific examples of the block sizes used in the MDCT circuits 13 to 15, as determined by the block decision circuits 19 to 21, are shown in FIGS. 2(A) and 2(B). FIG. 2(A) shows the case where the orthogonal transform block size is long (the orthogonal transform block size in long mode), and FIG. 2(B) shows the case where it is short (the orthogonal transform block size in short mode).
[0021] In the specific example of FIG. 2, each of the three filter outputs has two orthogonal transform block sizes. For the low-range 0 to 5.5 kHz band signal and the mid-range 5.5 kHz to 11 kHz band signal, a block contains 128 samples when the long block length is used (FIG. 2(A)) and 32 samples when the short block is selected (FIG. 2(B)). For the high-range 11 kHz to 22 kHz band signal, a block contains 256 samples when the long block length is used (FIG. 2(A)) and 32 samples when the short block is selected (FIG. 2(B)). Thus, when the short block is selected, the orthogonal transform blocks of all bands contain the same number of samples, which raises the time resolution toward the higher range and reduces the number of window types used for blocking.
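The effect of these block sizes on time resolution can be worked out numerically. The sketch below assumes a 44.1 kHz input sampling rate (the text states only a 0 to 22 kHz signal) and assumes that the band-division filters are critically sampled, so that the high band runs at half the input rate and the low and mid bands at a quarter of it. The names and the table layout are illustrative.

```python
SAMPLE_RATE_HZ = 44_100  # assumed input sampling rate

# (band, assumed decimation factor, long-block samples, short-block samples)
BANDS = [
    ("0-5.5 kHz", 4, 128, 32),
    ("5.5-11 kHz", 4, 128, 32),
    ("11-22 kHz", 2, 256, 32),
]

def block_duration_ms(samples, decimation, sample_rate_hz=SAMPLE_RATE_HZ):
    """Time span of a block of band samples, in milliseconds."""
    return 1000.0 * samples * decimation / sample_rate_hz

# Maps each band to (long-block duration, short-block duration) in ms.
durations = {
    band: (block_duration_ms(long_n, dec), block_duration_ms(short_n, dec))
    for band, dec, long_n, short_n in BANDS
}
```

Under these assumptions a long block spans the same time in every band (512 input sample periods), while a short block in the 11 kHz to 22 kHz band spans half the time of a short block in the lower bands, consistent with the higher time resolution in the high range noted above.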
[0022] Returning to FIG. 1, the information indicating the block sizes determined by the block decision circuits 19 to 21 is sent to adaptive bit allocation coding circuits 16, 17, and 18, described later, and is also output from output terminals 23, 25, and 27.
[0023] The frequency-domain spectrum data or MDCT coefficient data obtained by MDCT processing in the MDCT circuits 13 to 15 is grouped by critical band or, in the high range, by bands obtained by further dividing the critical bands, and is sent to the adaptive bit allocation coding circuits 16 to 18.
[0024] The adaptive bit allocation coding circuits 16 to 18 requantize (normalize and then quantize) each item of spectrum data (or MDCT coefficient data) according to the block size information and the number of bits allocated to each critical band or, in the high range, to each band obtained by further dividing a critical band.
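Requantization in the sense of "normalize and then quantize" can be sketched as follows. The sketch assumes that the scale factor is simply the largest magnitude in the block and that a uniform, symmetric quantizer is used; a practical coder would typically pick the scale factor from a fixed table. The names `requantize` and `dequantize` are illustrative.

```python
def requantize(values, word_length):
    """Normalize a block by its scale factor, then quantize to signed integer codes."""
    if word_length < 2:
        raise ValueError("word_length must be at least 2 bits")
    scale = max(abs(v) for v in values) or 1.0   # avoid dividing by zero for silence
    levels = (1 << (word_length - 1)) - 1        # largest positive code
    codes = [round(v / scale * levels) for v in values]
    return scale, codes

def dequantize(scale, codes, word_length):
    """Reverse the normalization and quantization (up to rounding error)."""
    levels = (1 << (word_length - 1)) - 1
    return [c * scale / levels for c in codes]

block = [0.5, -1.25, 0.1, 0.9]
scale, codes = requantize(block, word_length=4)
decoded = dequantize(scale, codes, word_length=4)
```

The decoder needs the scale factor and the word length in addition to the codes, which is why both are output alongside the coded data. The reconstruction error of each value is at most half a quantization step, `scale / (2 * levels)`.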
[0025] The data coded by the adaptive bit allocation coding circuits 16 to 18 is taken out via output terminals 22, 24, and 26. The adaptive bit allocation coding circuits 16 to 18 also obtain a scale factor, which indicates the signal magnitude with respect to which normalization was performed, and bit length information, which indicates the bit length with which quantization was performed. These are output from the output terminals 22, 24, and 26 at the same time.
[0026] From the outputs of the MDCT circuits 13 to 15 in FIG. 1, the energy of each critical band or, in the high range, of each band obtained by further dividing a critical band is obtained, for example by calculating the square root of the mean square of the amplitude values within the band. The scale factor itself may of course be used for the subsequent bit allocation; in that case no separate energy calculation is needed, which reduces the hardware scale. The peak value or mean value of the amplitude values may also be used in place of the energy of each band.
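The per-band measure described above (square root of the mean square of the amplitudes) and the peak-value alternative can be sketched as below. The representation of bands as a list of coefficient index boundaries, and the names `band_rms` and `band_peak`, are assumptions for illustration.

```python
import math

def band_rms(coeffs, band_edges):
    """Root-mean-square amplitude of each band; band_edges holds index boundaries."""
    result = []
    for lo, hi in zip(band_edges, band_edges[1:]):
        band = coeffs[lo:hi]
        result.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return result

def band_peak(coeffs, band_edges):
    """Peak absolute amplitude of each band, a cheaper stand-in for the energy."""
    return [max(abs(c) for c in coeffs[lo:hi])
            for lo, hi in zip(band_edges, band_edges[1:])]

coeffs = [3.0, -4.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
edges = [0, 2, 4, 8]          # three bands of 2, 2, and 4 coefficients
rms = band_rms(coeffs, edges)
peak = band_peak(coeffs, edges)
```

The peak value needs no multiplications or square root, which illustrates why reusing the scale factor or peak in place of the energy can save hardware.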
[0027] Next, the specific bit allocation technique used in the adaptive bit allocation coding circuits 16 to 18 is described.
[0028] The operation of the adaptive bit allocation coding circuit in this case is described with reference to FIG. 3. The magnitudes of the MDCT coefficients are obtained for each block, and the MDCT coefficients are supplied to an input terminal 801. The MDCT coefficients supplied to the input terminal 801 are passed to a per-band energy calculation circuit 803 and a spectrum smoothness calculation circuit 808. The per-band energy calculation circuit 803 calculates the signal energy of each critical band or, in the high range, of each band obtained by further subdividing a critical band. The energy of each band calculated by the per-band energy calculation circuit 803 is supplied to an energy-dependent bit allocation circuit 804, the spectrum smoothness calculation circuit 808, and an overtone component detection circuit 813, which serves as the overtone relationship detection means of the present invention.
[0029] The overtone component detection circuit 813 uses the per-band energies from the per-band energy calculation circuit 803 to detect the frequency bands of the input signal that are in an overtone relationship.
[0030] Overtones are described with reference to FIG. 4. Overtones are the frequency components of a periodic signal, such as a musical tone, that stand in integer-multiple relationships, shown as F0, 2F0, 3F0, 4F0, ... in the figure. F0 is called the fundamental or pitch, 2F0 the second overtone, 3F0 the third overtone, and 4F0 the fourth overtone. In other words, the component whose frequency is n times that of the fundamental is called the n-th overtone. These overtones can be divided into components at even multiples (2n) and odd multiples (2n+1) of the frequency of the fundamental F0 (n = 1, 2, 3, ...), which are called even-order overtones and odd-order overtones, respectively. The sound of a clarinet, for example, is characterized by even-order overtones that are markedly weaker than the odd-order overtones.
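The numbering of overtones and the even/odd split can be written out directly. In the sketch below the function names and the example fundamental of 220 Hz are assumptions for illustration; following the definition above, the odd-order group starts at 3F0 and therefore excludes the fundamental.

```python
def overtone_series(f0_hz, max_hz):
    """Return (order, frequency) pairs for n * F0 up to max_hz; order 1 is the fundamental."""
    series = []
    n = 1
    while n * f0_hz <= max_hz:
        series.append((n, n * f0_hz))
        n += 1
    return series

def split_even_odd(series):
    """Separate even-order (2n) and odd-order (2n+1, n >= 1) overtones."""
    even = [(n, f) for n, f in series if n % 2 == 0]
    odd = [(n, f) for n, f in series if n % 2 == 1 and n > 1]
    return even, odd

series = overtone_series(220, 1200)
even, odd = split_even_odd(series)
```

For a 220 Hz fundamental and a 1200 Hz limit, the even-order overtones fall at 440 Hz and 880 Hz and the odd-order overtones at 660 Hz and 1100 Hz.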
[0031] As a method of detecting the overtone components of the input signal, the overtone component detection circuit 813 uses, for example, the difference between the block floating coefficients of adjacent blocks as an index. If the frequency bands of the blocks at which the difference in block floating coefficients becomes large stand in an integer-multiple relationship, those blocks are likely to contain overtone components of the input signal. Because this technique takes differences of block floating coefficients rather than of MDCT coefficients, the amount of computation is small and the hardware scale can be reduced.
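One possible reading of this detection rule is sketched below. The text does not specify how "large" a difference must be or how the integer-multiple relationship is tested, so the peak criterion (a block exceeding both neighbours by a threshold), the use of block centre frequencies, the 10% tolerance, and all names are assumptions.

```python
def peak_blocks(floating, threshold):
    """Indices of blocks whose floating coefficient exceeds both neighbours by >= threshold."""
    peaks = []
    for i in range(1, len(floating) - 1):
        if (floating[i] - floating[i - 1] >= threshold
                and floating[i] - floating[i + 1] >= threshold):
            peaks.append(i)
    return peaks

def overtone_blocks(floating, centre_hz, threshold, tolerance=0.1):
    """Keep peaks whose centre frequency is close to an integer multiple of the lowest peak."""
    peaks = peak_blocks(floating, threshold)
    if len(peaks) < 2:
        return []
    base = centre_hz[peaks[0]]
    kept = []
    for i in peaks:
        ratio = centre_hz[i] / base
        if abs(ratio - round(ratio)) <= tolerance:
            kept.append(i)
    return kept if len(kept) >= 2 else []

centre_hz = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
harmonic = overtone_blocks([2, 9, 3, 9, 2, 8, 3, 2, 3, 2], centre_hz, threshold=4)
inharmonic = overtone_blocks([2, 9, 3, 2, 9, 2, 3, 2, 3, 2], centre_hz, threshold=4)
```

In the first example the peaks lie at 200, 400, and 600 Hz and are all retained; in the second they lie at 200 and 500 Hz, a ratio of 2.5, so no overtone relationship is reported. Only one value per block is examined, which reflects the low computational cost noted above.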
[0032] Overtone components can also be detected by using the MDCT coefficients directly instead of the block floating coefficients. This technique allows finer detection and is particularly effective when a single block contains a plurality of overtone components.
As another method of detecting the overtone components of the input signal, only the fundamental tone and the first overtone may be obtained, and the frequencies of all overtone components predicted from the frequency relationship between the two. This exploits the above-mentioned property that the frequencies of overtones are in an integer-multiple relationship. With this method there is no need to examine the differences between adjacent block floating coefficients or MDCT coefficients over the entire band; only the bands containing the fundamental tone and the first overtone need to be examined, so the amount of calculation is small and the hardware scale can be reduced.
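
A minimal sketch of this prediction is given below, assuming that the two lowest components have already been located and that the block boundaries are known in hertz. The function name and both arguments are hypothetical; the spacing between the two measured components is used as the step of the predicted series.

```python
import bisect

def predict_overtone_blocks(f0_hz, f1_hz, block_edges_hz):
    """Predict which blocks contain overtones from the two lowest components.

    f0_hz, f1_hz:   measured fundamental tone and next overtone (assumed inputs).
    block_edges_hz: ascending block boundaries; block k spans edges[k]..edges[k+1].
    """
    spacing = f1_hz - f0_hz  # equals F0 for an ideal overtone series
    if spacing <= 0:
        raise ValueError("f1_hz must be higher than f0_hz")
    top = block_edges_hz[-1]
    blocks = []
    freq = f0_hz
    while freq < top:
        idx = bisect.bisect_right(block_edges_hz, freq) - 1
        if idx >= 0 and idx not in blocks:
            blocks.append(idx)
        freq += spacing
    return blocks
```

With a fundamental at 250 Hz, the next overtone at 500 Hz, and 200 Hz wide blocks up to 1 kHz, the predicted components at 250, 500, and 750 Hz fall in blocks 1, 2, and 3 (counting from 0).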
Information on the bands containing the overtone components detected by the overtone component detection circuit 813 is supplied to the energy-dependent bit allocation circuit 804 and the bit fraction adjustment circuit 814.
The energy-dependent bit allocation circuit 804 performs bit allocation that depends on the energy of the input signal, using the energy from the band-by-band energy calculation circuit 803, the total usable bits from the total usable bit generation circuit 802, and the bands containing overtone components from the overtone component detection circuit 813. In this embodiment, the allocation uses a certain proportion of the 128 kbps. The higher the tonality of the input signal, that is, the greater the unevenness of its spectrum, the larger the proportion of the 128 kbps taken by this bit amount. The unevenness of the input signal is detected using, as an index, the sum of the absolute values of the differences between the block floating coefficients of adjacent blocks. The usable bit amount thus obtained is then allocated in proportion to the logarithm of the energy of each band. In doing so, more bits are allocated to the bands containing overtone components than to the other bands. Referring to FIG. 5, SP denotes the spectrum of the input signal and B1 to B8 denote quantization blocks. Of the overtone components of the input signal in this figure, the fundamental tone is contained in block B2, the second overtone in block B4, the third overtone in block B6, and the fourth overtone in block B8. L1 in the figure represents the level of quantization noise in each block when bits are allocated without considering the overtone relationship, and L2 represents the level of quantization noise in each block when bits are allocated in consideration of the overtone relationship. As a comparison of L1 and L2 shows, when bits are allocated in consideration of the overtone components, the quantization noise level in the blocks containing overtone components is lowered, and perceptually good sound quality can be obtained.
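
The two quantities used in this paragraph, the unevenness index and the log-energy allocation with emphasis on overtone blocks, can be sketched as below. The function names and the emphasis factor `boost` are hypothetical; the text only states that overtone bands receive more bits than other bands, not by how much.

```python
import math

def tonality_index(scale_factors):
    """Sum of absolute differences between adjacent block floating coefficients."""
    return sum(abs(b - a) for a, b in zip(scale_factors, scale_factors[1:]))

def energy_dependent_bits(energies, budget, harmonic_blocks, boost=1.5):
    """Split `budget` bits in proportion to log energy, favouring overtone blocks."""
    weights = []
    for i, e in enumerate(energies):
        w = max(math.log10(e), 0.0) if e > 0 else 0.0
        if i in harmonic_blocks:
            w *= boost  # hypothetical emphasis for blocks containing overtones
        weights.append(w)
    total = sum(weights)
    if total == 0:
        # No usable energy information: fall back to an even split.
        return [budget / len(energies)] * len(energies)
    return [budget * w / total for w in weights]
```

The result is fractional; rounding to whole bits and absorbing the remainder is the task of a later fraction adjustment stage.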
When bits are allocated in consideration of the overtone components, the allocation may also distinguish between even-order and odd-order overtones. As described above, overtones are classified into even-order and odd-order components. Some input signals, such as the sound of a clarinet, have odd-order overtones that are stronger than their even-order overtones. For such an input signal, perceptually good sound quality may be obtained by allocating more bits to the blocks containing odd-order overtones than to those containing even-order overtones. Conversely, some input signals have even-order overtones that are stronger than their odd-order overtones. For such an input signal, perceptually good sound quality may be obtained by allocating more bits to the blocks containing even-order overtones than to those containing odd-order overtones.
It is desirable to raise the concentration of bit allocation for all of the blocks containing the overtones detected by the overtone component detection circuit 813. However, perceptually good sound quality can also be obtained by applying this to only some of the blocks, for example only the blocks containing the fundamental tone and the low-order overtone components, and/or only the even-order overtones or only the odd-order overtones.
The bit allocation calculation circuit 805, which depends on the auditory permissible noise level, first obtains the permissible noise amount for each critical band, taking into account the so-called masking effect and the like, based on the spectrum data divided into the critical bands. It then allocates the bits that remain after subtracting the energy-dependent bits from the total usable bits, so as to give the auditory permissible noise spectrum.
The energy-dependent bit numbers and the bit numbers depending on the auditory permissible noise level obtained in this way are added by the adder 806 and supplied to the bit fraction adjustment circuit 814.
The bit fraction adjustment circuit 814 uses the bits from the adder 806, the total usable bits from the total usable bit generation circuit 802, and the bands containing overtone components from the overtone component detection circuit 813 to adjust the bit fraction so as to match the bit rate, which is 128 kbps in this embodiment.
For example, when the output of the adder 806 is a number of bits exceeding 128 kbps, possible methods include decreasing the bits one at a time, starting with the bands that have the least effect on hearing, for example from the block on the high-frequency side toward the low-frequency side, or setting the number of allocated bits to 0 starting from the block on the high-frequency side, that is, limiting the band. In this embodiment, the fraction is adjusted to a bit rate equal to 128 kbps by decreasing the bits one at a time from the high-frequency side toward the low-frequency side, giving priority to the blocks that do not contain overtone components. Referring to FIG. 5, assume for example that the number of bits exceeds 128 kbps by one bit. Looking first at block B9 on the highest-frequency side, this block contains an overtone component, so no bit is removed. Looking next at the adjacent block B8 on the low-frequency side, this block does not contain an overtone component, so one bit is removed. At this point the number of bits equals the bit rate of 128 kbps and the fraction adjustment is complete. If the number of bits is reduced successively from the high-frequency side toward the low-frequency side and a further reduction is still needed on reaching block B1, which contains the lowest frequency components, the process returns to block B9, which contains the highest frequency components, and the same operation is repeated. When bits are severely short, bits may also be removed from blocks containing overtone components. Alternatively, the overtones may be classified into odd-order and even-order overtones, and the bits decreased one at a time from the high-frequency side, giving priority to the blocks that do not contain all or some of the odd-order or even-order overtones.
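
The reduction procedure can be sketched as the loop below. The function name and arguments are hypothetical, blocks are indexed from 0 (so B9 is index 8), and the rule for when overtone blocks lose their protection (after a full pass removes nothing) is an assumption; the text only says that such blocks may be reduced when bits are severely short.

```python
def trim_excess_bits(bits, excess, harmonic_blocks):
    """Remove `excess` bits, one per block, scanning from high to low frequency."""
    bits = list(bits)
    protect = True  # overtone blocks are skipped while this is set
    while excess > 0:
        changed = False
        for i in range(len(bits) - 1, -1, -1):
            if excess == 0:
                break
            if bits[i] == 0 or (protect and i in harmonic_blocks):
                continue
            bits[i] -= 1
            excess -= 1
            changed = True
        if not changed:
            if not protect:
                break        # nothing left to remove anywhere
            protect = False  # assumed fallback: overtone blocks may now be cut
    return bits
```

With nine blocks of four bits each, an overtone in B9 but not in B8, and an excess of one bit, the scan skips B9 and removes one bit from B8, as in the example above.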
Conversely, when the output of the adder 806 is a number of bits below 128 kbps, possible methods include increasing the bits one at a time, starting with the bands that have the greatest effect on hearing, for example from the block on the low-frequency side toward the high-frequency side. In this embodiment, the number of bits is adjusted to equal 128 kbps by increasing the bits one at a time from the low-frequency side, giving priority to the blocks that contain overtone components. Referring to FIG. 5, assume for example that the number of bits falls short of 128 kbps by one bit. Looking first at block B1 on the lowest-frequency side, this block does not contain an overtone component, so no bit is added. Looking next at the adjacent block B2 on the high-frequency side, this block also does not contain an overtone component, so no bit is added. Looking next at the adjacent block B3 on the high-frequency side, this block contains an overtone component (the fundamental tone), so one bit is added. At this point the number of bits equals the bit rate of 128 kbps and the fraction adjustment is complete. If the number of bits is increased successively from the low-frequency side toward the high-frequency side and a further increase is still needed on reaching block B9, which contains the highest frequency band, the process returns to block B1, which contains the lowest frequency band, and the same operation is repeated. When a large surplus of bits remains, bits may also be added to blocks that do not contain overtone components. Alternatively, the overtones may be classified into odd-order and even-order overtones, and the bits increased one at a time from the low-frequency side, giving priority to the blocks that contain all or some of the odd-order or even-order overtones.
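
The complementary increase procedure can be sketched in the same way. The function name, the per-block cap `max_bits`, and the point at which blocks without overtones become eligible (after a full pass adds nothing) are assumptions made for illustration; blocks are indexed from 0, so B3 is index 2.

```python
def fill_spare_bits(bits, spare, harmonic_blocks, max_bits=16):
    """Add `spare` bits, one per block, scanning from low to high frequency."""
    bits = list(bits)
    prefer = True  # only overtone blocks receive bits while this is set
    while spare > 0:
        changed = False
        for i in range(len(bits)):
            if spare == 0:
                break
            if bits[i] >= max_bits or (prefer and i not in harmonic_blocks):
                continue
            bits[i] += 1
            spare -= 1
            changed = True
        if not changed:
            if not prefer:
                break       # every block is at the assumed cap
            prefer = False  # assumed fallback: other blocks may now receive bits
    return bits
```

With nine blocks of four bits each, the fundamental tone in B3, and one spare bit, the scan passes over B1 and B2 and adds the bit to B3.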
By using a bit fraction adjustment method that takes the overtone relationship into account in this way, the quantization noise in the bands containing overtone components can be reduced and perceptually good sound quality obtained. In particular, when the output of the adder 806 exceeds 128 kbps, high-frequency overtone components are deleted less often, which reduces the possibility of destroying the overtone structure of the input signal, so better sound quality can be obtained. The method is especially effective in a high-efficiency coding apparatus that uses a bit rate lower than the 128 kbps of this embodiment.
With the bits adjusted to the bit rate of 128 kbps in this way, the adaptive bit allocation coding circuits 16 to 18 of FIG. 1 requantize each item of spectrum data (or MDCT coefficient data) according to the number of bits allocated to each critical band or, in the high range, to each of the bands into which a critical band is further divided. The data encoded in this way is taken out via the output terminals 22, 24, and 26 of FIG. 1.
The auditory permissible noise spectrum calculation circuit within the bit allocation circuit 805, which depends on the auditory permissible noise spectrum, will now be described in more detail. The MDCT coefficients obtained by the MDCT circuits 13 to 15 are supplied to the permissible noise spectrum calculation circuit within the bit allocation circuit 805.
FIG. 6 is a block circuit diagram showing the schematic configuration of a specific example in which the permissible noise spectrum calculation circuit is described as a whole. In FIG. 6, spectrum data in the frequency domain from the MDCT circuits 13, 14, and 15 is supplied to the input terminal 521.
This frequency-domain input data is sent to the band-by-band energy calculation circuit 522, where the energy of each critical band is obtained, for example, by calculating the sum of the squared amplitude values within the band. The peak value or the average value of the amplitude values may be used instead of the energy of each band. The output of the energy calculation circuit 522, for example the spectrum of the sum value of each band, is generally called the Bark spectrum. FIG. 7 shows such a Bark spectrum SB for each critical band. To simplify the illustration, however, FIG. 7 represents the critical bands with 12 bands (B1 to B12).
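
The per-band calculation described above can be sketched as follows. The function names and the representation of the bands as a list of coefficient index boundaries (`band_edges`) are assumptions for illustration only.

```python
def band_energies(coeffs, band_edges):
    """Energy per band: sum of the squared amplitudes of its coefficients."""
    return [
        sum(c * c for c in coeffs[lo:hi])
        for lo, hi in zip(band_edges, band_edges[1:])
    ]

def band_peaks(coeffs, band_edges):
    """Alternative measure: peak absolute amplitude in each band."""
    return [
        max((abs(c) for c in coeffs[lo:hi]), default=0.0)
        for lo, hi in zip(band_edges, band_edges[1:])
    ]
```

For coefficients `[1.0, -2.0, 3.0, 0.5]` split into two bands of two coefficients each, the energies are 5.0 and 9.25 and the peaks are 2.0 and 3.0.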
Here, in order to take into account the influence of the Bark spectrum SB on so-called masking, a convolution process is applied in which the Bark spectrum SB is multiplied by a predetermined weighting function and the products are added. To this end, the output of the band-by-band energy calculation circuit 522, that is, each value of the Bark spectrum SB, is sent to the convolution filter circuit 523. The convolution filter circuit 523 comprises, for example, a plurality of delay elements that successively delay the input data, a plurality of multipliers (for example, 25 multipliers, one for each band) that multiply the outputs of these delay elements by filter coefficients (the weighting function), and a summing adder that takes the sum of the multiplier outputs.
Masking here refers to the phenomenon in which, owing to the characteristics of human hearing, one signal is masked by another signal and becomes inaudible. Masking effects include the temporal masking effect caused by an audio signal in the time domain and the simultaneous masking effect caused by a signal in the frequency domain. Because of these masking effects, any noise in a masked portion cannot be heard. For an actual audio signal, therefore, noise within the masked range is regarded as permissible noise.
As a specific example of the multiplication coefficients (filter coefficients) of the multipliers of the convolution filter circuit 523, when the coefficient of the multiplier M corresponding to an arbitrary band is 1, the outputs of the delay elements are multiplied by a coefficient of 0.15 at multiplier M−1, 0.0019 at multiplier M−2, 0.0000086 at multiplier M−3, 0.4 at multiplier M+1, 0.06 at multiplier M+2, and 0.007 at multiplier M+3, whereby the convolution of the Bark spectrum SB is performed. Here, M is an arbitrary integer from 1 to 25.
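
The convolution with these coefficients can be sketched as below. The coefficient values are those given in the text; the function name is hypothetical, and the sketch assumes one particular reading, namely that the coefficient of multiplier M+k weights the Bark spectrum value of band M+k when the output for band M is formed. The mirrored reading is equally possible from the text alone.

```python
# Offsets are relative to the band M whose output is being computed.
SPREAD = {-3: 0.0000086, -2: 0.0019, -1: 0.15, 0: 1.0, 1: 0.4, 2: 0.06, 3: 0.007}

def spread_bark_spectrum(sb):
    """Weighted sum of each band and its three neighbours on either side."""
    out = []
    for m in range(len(sb)):
        acc = 0.0
        for offset, coeff in SPREAD.items():
            j = m + offset
            if 0 <= j < len(sb):  # neighbours outside the spectrum are ignored
                acc += coeff * sb[j]
        out.append(acc)
    return out
```

Feeding in a spectrum that is zero except for a single band shows the weighting function directly in the output, which is a convenient check of the orientation chosen.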
Next, the output of the convolution filter circuit 523 is sent to the subtractor 524. The subtractor 524 obtains a level α corresponding to the permissible noise level, described later, in the convolved domain. The level α corresponding to the permissible noise level is the level that, as described later, becomes the permissible noise level of each critical band when deconvolution is performed.
The subtractor 524 is supplied with a permissible function (a function expressing the masking level) for obtaining the level α. The level α is controlled by increasing or decreasing this permissible function. The permissible function is supplied from the (n − ai) function generating circuit 525 described next.
That is, the level α corresponding to the permissible noise level can be obtained by the following equation, where i is the number assigned to the critical bands in order from the lowest band.
α = S − (n − ai)
In this equation, n and a are constants with a > 0, S is the intensity of the convolved Bark spectrum, and (n − ai) is the permissible function. As an example, n = 38 and a = −0.5 can be used.
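
The equation can be evaluated band by band as in the sketch below. The function name is hypothetical, the default constants are the example values quoted in the text, and it is assumed here that S and the permissible function are expressed on the same logarithmic scale so that they can be subtracted directly.

```python
def allowed_level(s, n=38.0, a=-0.5):
    """alpha_i = S_i - (n - a*i), with i counted from 1 at the lowest band."""
    return [s_i - (n - a * i) for i, s_i in enumerate(s, start=1)]
```

For two bands with S = 60 in each, the example constants give a permissible function of 38.5 and 39.0, and hence α = 21.5 and 21.0.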
The level α is obtained in this way, and this data is transmitted to the divider 526. The divider 526 deconvolves the level α in the convolved domain. Performing this deconvolution yields the masking threshold from the level α; that is, this masking threshold becomes the permissible noise spectrum. Although deconvolution requires complicated calculation, in this embodiment it is performed in simplified form using the divider 526.
Next, the masking threshold is transmitted to the subtractor 528 via the synthesizing circuit 527. The subtractor 528 is also supplied, via the delay circuit 529, with the output of the band-by-band energy detection circuit 522, that is, the Bark spectrum SB described above. The subtractor 528 therefore performs a subtraction between the masking threshold and the Bark spectrum SB, so that, as shown in the figure, the portion of the Bark spectrum SB at or below the level indicated by the masking threshold MS is masked. The delay circuit 529 is provided to delay the Bark spectrum SB from the energy detection circuit 522 in consideration of the delay in each circuit preceding the synthesizing circuit 527.
The output of the subtractor 528 is taken out via the permissible noise correction circuit 530 and the output terminal 531, and is sent, for example, to a ROM or the like (not shown) in which allocation bit number information is stored in advance. The ROM or the like outputs allocation bit number information for each band according to the output obtained from the subtractor 528 via the permissible noise correction circuit 530 (the level of the difference between the energy of each band and the output of the noise level setting means).
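
The table lookup can be pictured with the sketch below. Everything in it is hypothetical: the text does not give the contents of the ROM, so the table here simply holds k bits in entry k, and the step of 6 dB per bit is a common rule of thumb for uniform quantization rather than a value taken from the embodiment.

```python
import math

BITS_ROM = tuple(range(0, 17))  # hypothetical table: entry k holds k bits

def bits_from_level_difference(diff_db, step_db=6.0):
    """Look up the bit count for one band from (band energy - permissible noise)."""
    if diff_db <= 0:
        return BITS_ROM[0]  # the band already lies below the permissible noise
    index = min(int(math.ceil(diff_db / step_db)), len(BITS_ROM) - 1)
    return BITS_ROM[index]
```

A band whose energy exceeds the permissible noise by 13 dB would receive 3 bits under these assumptions, and a band at or below the permissible noise would receive none.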
The energy-dependent bits and the bits depending on the auditory permissible noise level are added in this way, and the resulting allocation bit number information is sent to the adaptive bit allocation coding circuits 16 to 18, where each item of frequency-domain spectrum data from the MDCT circuits 13 to 15 is quantized with the number of bits allocated to its band.
In summary, the adaptive bit allocation coding circuits 16 to 18 quantize the spectrum data of each band with the number of bits allocated according to the level of the difference between the energy or peak value of each critical band (or, in the high range, of each of the bands into which a critical band is further divided) and the output of the noise level setting means.
In the synthesis performed by the synthesizing circuit 527 described above, data representing the so-called minimum audible curve RC, which is a characteristic of human hearing as shown in FIG. 9 and is supplied from a minimum audible curve generating circuit, can be combined with the masking threshold MS. If the absolute noise level is at or below this minimum audible curve, the noise cannot be heard. The minimum audible curve differs, even for the same coding, depending on, for example, the playback volume. In a practical digital system, however, there is little difference in how music is fitted into, for example, a 16-bit dynamic range, so if the quantization noise in the frequency band to which the ear is most sensitive, around 4 kHz, is inaudible, quantization noise below the level of the minimum audible curve in the other frequency bands can be considered inaudible as well. Accordingly, assuming that the system is used in such a way that noise around 4 kHz at the limit of its dynamic range is inaudible, and that the permissible noise level is obtained by combining the minimum audible curve RC with the masking threshold MS, the permissible noise level in this case can extend up to the hatched portion in FIG. 9. In this embodiment, the 4 kHz level of the minimum audible curve is matched to the lowest level corresponding to, for example, 20 bits. FIG. 9 also shows the signal spectrum SS.
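
One way to picture the combination of the two curves is the sketch below. The text does not specify how the synthesizing circuit 527 combines them; taking the larger of the two values in each band is an assumption made here, on the reasoning that noise is inaudible if it lies below either curve. The function name is hypothetical.

```python
def permissible_noise(masking_threshold, min_audible):
    """Per-band permissible noise level, taken as the higher of the two curves."""
    return [max(ms, rc) for ms, rc in zip(masking_threshold, min_audible)]
```

For masking thresholds of 10, 30, and 5 and minimum audible levels of 20, 15, and 25 in three bands, the combined permissible levels are 20, 30, and 25.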
The permissible noise correction circuit 530 corrects the permissible noise level in the output of the subtractor 528 based on, for example, equal loudness curve information sent from the correction information output circuit 533. The equal loudness curve is a characteristic curve of human hearing obtained, for example, by finding the sound pressure at each frequency that sounds as loud as a 1 kHz pure tone and connecting the points with a curve; it is also called the equal loudness sensitivity curve. The equal loudness curve follows roughly the same shape as the minimum audible curve RC shown in FIG. 9. On the equal loudness curve, for example, a sound around 4 kHz is heard as loud as one at 1 kHz even when its sound pressure is 8 to 10 dB lower, whereas a sound around 50 Hz is not heard as equally loud unless its sound pressure is about 15 dB higher than at 1 kHz. It follows that noise exceeding the level of the minimum audible curve (the permissible noise level) should have a frequency characteristic given by a curve corresponding to the equal loudness curve. Correcting the permissible noise level in consideration of the equal loudness curve is thus well suited to the characteristics of human hearing.
The spectral shape that depends on the auditory allowable noise level described above is produced by a bit allocation that uses a certain proportion of the total usable bits, 128 kbps. This proportion decreases as the tonality of the input signal increases.
Next, the method of dividing the bit amount between the two bit allocation methods is described.
Returning to FIG. 3, the signal from the input terminal 801, to which the MDCT circuit output is supplied, is also fed to the spectrum smoothness calculation circuit 808, where the smoothness of the spectrum is calculated. In this embodiment, the smoothness is calculated as the sum of the absolute differences between adjacent absolute values of the signal spectrum, divided by the sum of the absolute values of the signal spectrum.
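The smoothness measure just described can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, and returning 0.0 for an all-zero spectrum is an assumption, since the text does not say how that case is handled.

```python
def spectrum_smoothness(spectrum):
    """Sum of |differences between adjacent magnitudes| / sum of magnitudes.

    `spectrum` is a sequence of real spectral coefficients (for example
    MDCT coefficients). A flat spectrum gives 0; a spectrum with isolated
    peaks gives a larger value.
    """
    magnitudes = [abs(x) for x in spectrum]
    total = sum(magnitudes)
    if total == 0:
        # Assumption: an all-zero spectrum is treated as perfectly flat.
        return 0.0
    adjacent_diff = sum(abs(b - a) for a, b in zip(magnitudes, magnitudes[1:]))
    return adjacent_diff / total
```

Note that the value grows as the spectrum becomes less flat, so a larger output corresponds to a rougher spectrum.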
The output of the spectrum smoothness calculation circuit 808 is supplied to the bit division ratio determination circuit 809, which determines the bit division ratio between the energy-dependent bit allocation and the bit allocation based on the auditory allowable noise spectrum. The larger the output value of the spectrum smoothness calculation circuit 808, the less smooth the spectrum is taken to be, and the more the allocation is weighted toward the auditory allowable noise spectrum rather than toward the energy-dependent allocation. The bit division ratio determination circuit 809 sends control outputs to the multipliers 811 and 812, which scale the energy-dependent bit allocation and the bit allocation based on the auditory allowable noise spectrum, respectively. If, for example, the spectrum is smooth and the output of the bit division ratio determination circuit 809 to the multiplier 811 takes the value 0.8 so as to weight the energy-dependent bit allocation, the output to the multiplier 812 is 1 - 0.8 = 0.2. The outputs of the two multipliers are summed by the adder 806 to give the final bit allocation information, which is output from the output terminal 807.
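The weighting by the two multipliers and the final addition can be sketched as below. This is a hedged illustration rather than the embodiment itself: the function and parameter names are hypothetical, and it assumes that both input allocations were computed per band against the same total bit budget, so that the weighted sum keeps that total. How the weight is derived from the smoothness value is not specified here and is left to the caller.

```python
def mix_bit_allocations(energy_bits, perceptual_bits, energy_weight):
    """Weighted sum of two per-band bit allocations.

    energy_weight plays the role of the control output to multiplier 811;
    (1 - energy_weight) plays the role of the control output to
    multiplier 812. The list comprehension stands in for adder 806.
    """
    if not 0.0 <= energy_weight <= 1.0:
        raise ValueError("energy_weight must lie in [0, 1]")
    if len(energy_bits) != len(perceptual_bits):
        raise ValueError("both allocations must cover the same bands")
    perceptual_weight = 1.0 - energy_weight
    return [
        energy_weight * e + perceptual_weight * p
        for e, p in zip(energy_bits, perceptual_bits)
    ]
```

Calling `mix_bit_allocations(energy_bits, perceptual_bits, 0.8)` reproduces the 0.8 and 0.2 split used as the example in the text.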
The bit allocation in these cases is illustrated in the figures: FIG. 10 shows the case where the signal spectrum is relatively flat, and FIG. 11 the case where the signal spectrum shows high tonality. FIGS. 12 and 13 show the corresponding quantization noise. In FIGS. 10 and 11, QS denotes the bit amount of the signal-level-dependent allocation and QN the bit amount of the allocation dependent on the auditory allowable noise level. In FIGS. 12 and 13, L denotes the signal level, NS the noise reduction due to the signal-level-dependent allocation, and NN the noise reduction due to the allocation dependent on the auditory allowable noise level.
First, in FIG. 12, which shows the case where the signal spectrum is relatively flat, the bit allocation dependent on the auditory allowable noise level serves to obtain a large signal-to-noise ratio across the whole band. Relatively few bits, however, are allocated to the low and high ranges, because the ear is less sensitive to noise in those bands. The allocation dependent on the signal energy level is small in amount, but in this case it is concentrated on the frequency regions of the low-to-middle range where the signal level is high, so as to produce a white noise spectrum.
By contrast, as shown in FIG. 13, when the signal spectrum shows high tonality, the amount of bit allocation dependent on the signal energy level increases, and the resulting reduction in quantization noise is used to lower the noise in an extremely narrow band. The bit allocation dependent on the auditory allowable noise level is less sharply concentrated than this.
As shown in FIG. 13, the sum of these two bit allocations improves the characteristics for input signals with isolated spectral components.
FIG. 14 shows a basic high-efficiency decoding apparatus of the embodiment of the present invention for decoding signals encoded by the high-efficiency coding apparatus shown in FIG. 1.
In FIG. 14, the quantized MDCT coefficients of each band are supplied to the decoding apparatus input terminals 122, 124 and 126, and the block size information used is supplied to the input terminals 123, 125 and 127. The decoding circuits 116, 117 and 118 undo the bit allocation using the adaptive bit allocation information.
Next, the IMDCT (inverse modified discrete cosine transform) circuits 113, 114 and 115 convert the frequency-domain signals into time-domain signals. The time-domain signals of these partial bands are decoded into a full-band signal by the IQMF (inverse QMF) circuits 112 and 111 and sent to the output terminal 110.
Next, the transmission medium of the embodiment of the present invention transmits signals encoded by the high-efficiency coding apparatus of the embodiment described above; transmission here includes recording. Examples of the recording medium of the present invention therefore include disc-shaped recording media such as optical discs, magneto-optical discs and magnetic discs on which the encoded signal is recorded, tape-shaped recording media such as magnetic tape on which the encoded signal is recorded, and semiconductor memories or IC cards in which the encoded signal is stored. Where the transmission medium of the present invention is a transmission system, examples of the transmission medium include electric wires, optical cables and radio waves.
The present invention is not limited to this embodiment. For example, the one recording/reproducing medium and the other recording/reproducing medium described above need not be integrated, and they can be connected by a data transfer line or the like. The invention is also applicable to signal processing apparatus not only for audio PCM signals but also for digital speech signals, digital video signals and the like. The minimum audible curve synthesis described above may also be omitted. In that case the minimum audible curve generation circuit 532 and the synthesizing circuit 527 in FIG. 6 are unnecessary, and the output of the subtractor 526 is passed directly to the subtractor 528.
Bit allocation methods are also many and varied: in the simplest case, a fixed bit allocation, a simple bit allocation based on the energy of each band of the signal, or a bit allocation combining fixed and variable components can be used.
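As one possible reading of a simple bit allocation based on the energy of each band, the sketch below shares a bit budget among bands in proportion to each band's level in decibels. Everything specific here is an assumption made for illustration: the proportional rule, the `floor_db` parameter, the fallback to an equal (fixed) split when no band exceeds the floor, and the fractional bit counts, which a real coder would round.

```python
import math


def energy_proportional_allocation(band_energies, total_bits, floor_db=0.0):
    """Share total_bits among bands in proportion to level above floor_db.

    band_energies: hypothetical linear energies, one per band.
    Returns fractional bit counts; rounding is left to the caller.
    """
    if not band_energies:
        return []
    levels = [
        max(10.0 * math.log10(e) - floor_db, 0.0) if e > 0 else 0.0
        for e in band_energies
    ]
    total_level = sum(levels)
    if total_level == 0:
        # Assumption: with no band above the floor, fall back to a fixed,
        # equal split of the budget.
        return [total_bits / len(band_energies)] * len(band_energies)
    return [total_bits * level / total_level for level in levels]
```

A combined fixed and variable allocation could then be formed as a weighted sum of an equal split and this energy-based result.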
A method is also conceivable in which the bit allocation is changed according to whether the input signal has a harmonic (overtone) relationship. For example, when the overtone component detection circuit 813 in FIG. 3 cannot detect a clear harmonic relationship, the configuration may be such that no output is made to the energy-dependent bit allocation circuit 804 or the bit fraction adjustment circuit 814.
Various modifications such as those described above are conceivable for the embodiment of the present invention.
[0078]
EFFECTS OF THE INVENTION As is clear from the above description, the present invention provides the following effects. By allocating bits in consideration of the harmonic relationship of the input signal, bits can be allocated adaptively to the input signal, so that better perceived sound quality is obtained at the same bit rate, and degradation of sound quality can be prevented at low bit rates.
FIG. 1 is a block circuit diagram showing a configuration example of the high-efficiency coding apparatus of the embodiment of the present invention.
FIG. 2 is a diagram showing a specific example of orthogonal transform block sizes in the frequency and time domains in the apparatus of the embodiment.
FIG. 3 is a block circuit diagram showing a configuration example of the bit allocation function of the embodiment of the present invention.
FIG. 4 is a diagram showing the harmonic relationship of an input signal.
FIG. 5 is a diagram showing the change in the amount of quantization noise in each block with and without the harmonic relationship of the input signal taken into account.
FIG. 6 is a block circuit diagram showing a configuration example of the auditory masking threshold calculation function of the embodiment of the present invention.
FIG. 7 is a diagram showing masking by the signal of each critical band.
FIG. 8 is a diagram showing the masking threshold due to the signal of each critical band.
FIG. 9 is a diagram showing the information spectrum, the masking threshold and the minimum audible limit.
FIG. 10 is a diagram showing the signal-level-dependent bit allocation and the auditory-allowable-noise-level-dependent bit allocation for an information signal with a flat signal spectrum.
FIG. 11 is a diagram showing the signal-level-dependent bit allocation and the auditory-allowable-noise-level-dependent bit allocation for an information signal whose signal spectrum has high tonality.
FIG. 12 is a diagram showing the quantization noise level for an information signal with a flat signal spectrum.
FIG. 13 is a diagram showing the quantization noise level for an information signal with high tonality.
FIG. 14 is a block circuit diagram showing a configuration example of the high-efficiency decoding apparatus of the embodiment of the present invention.
801 MDCT circuit output input terminal
802 Total usable bit generation circuit
803 Per-band energy calculation circuit
804 Energy-dependent bit allocation circuit
805 Auditory allowable noise level-dependent bit allocation circuit
806 Adder
807 Per-band bit allocation amount output terminal
808 Spectrum smoothness calculation circuit
809 Bit division ratio determination circuit
811, 812 Multipliers
813 Overtone component detection circuit
814 Bit fraction adjustment circuit
Continuation of the front page
(51) Int. Cl.6 / Identification code / Office reference number / FI / Technical indication location
H03M 7/02 9382-5K
7/30 A 9382-5K
H04B 14/00 E