[0001]
TECHNICAL FIELD The present invention relates to an audio signal encoding method and apparatus, and to an encoding and decoding system, applied to encoding methods that employ gain control, such as ATRAC3. In particular, it relates to an audio signal encoding method and apparatus, and an encoding and decoding system, for suppressing pre-echo noise, which occurs before an attack signal (a sharply rising signal), and post-echo noise, which occurs after a release signal (a sharply falling signal).
[0002]
BACKGROUND OF THE INVENTION The classical problems of pre-echo noise and post-echo noise are well known in transform coding of audio signals (see, for example, JP 2000-259197 A). FIG. 19(a) is a block diagram showing the configuration of a conventional audio signal encoding and decoding apparatus in which pre-echo and post-echo occur because no gain control is performed. FIG. 19(b) is a block diagram showing the configuration of a conventional audio signal encoding and decoding apparatus in which gain control is performed to suppress pre-echo and post-echo.
[0003] Consider the encoder shown in FIG. 19(a), which comprises a modified discrete cosine transform (hereinafter, MDCT) processing unit 101 and a quantizer 102, and the decoder, which comprises an inverse quantizer 103 and an inverse MDCT processing unit 104. When a typical transform block containing an attack signal (that is, a signal containing a sharp rise in sound) is input to the encoder, the quantizer 102 introduces noise, and the decoder then distributes that noise evenly over the time domain of the restored block. In other words, pre-echo noise and post-echo noise arise in the restored transform block because the quantization noise generated by quantizing the attack signal is spread evenly across the transform block that contains the attack signal. An attack signal is characterized by a period of small amplitude that precedes the sudden increase in signal strength, with all of this lying within the transform block. In general, after a sudden surge the amplitude of a signal tapers off within a short time. The portion of the signal that contains this sharp fall in sound is called a release signal, and a release signal, like an attack signal, also gives rise to pre-echo noise and post-echo noise.
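The spreading of quantization noise across a transform block can be illustrated with a small sketch. It is a simplified model rather than the encoder of FIG. 19(a): it substitutes a plain orthonormal DCT for the MDCT, and the 32-sample block, the quantizer step of 0.25, and all function names are assumptions made for illustration.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples (stand-in for the MDCT)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct(): orthonormal DCT-III."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def quantize(coeffs, step):
    """Uniform scalar quantizer: round each coefficient to a multiple of step."""
    return [step * round(c / step) for c in coeffs]

# One hypothetical transform block: 24 silent samples, then an 8-sample attack.
block = [0.0] * 24 + [1.0, -0.9, 0.8, -0.7, 0.6, -0.5, 0.4, -0.3]
decoded = idct(quantize(dct(block), step=0.25))

# The input is exactly zero before the attack, so anything here is pre-echo.
pre_attack_noise = max(abs(v) for v in decoded[:24])
print(f"largest error in the silent region: {pre_attack_noise:.3f}")
```

Because the quantization error is added to the transform coefficients, the inverse transform spreads it over all 32 samples, including the silent samples that precede the attack.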
[0004] Pre-echo is perceptually more annoying than post-echo for the following two reasons. First, unlike post-echo, which tends to be hidden beneath the high-intensity portion of the attack signal, pre-echo far exceeds in magnitude the small-amplitude signal on which it is superimposed. Second, in the human ear the backward masking duration, in which the masked sound lies ahead of the attack portion in time, is far shorter than the forward masking duration, in which the masked sound lies behind the attack portion in time. Specifically, the forward masking duration is about 200 msec, whereas the backward masking duration is about 5 msec.
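The asymmetry between the two masking durations can be made concrete with a little arithmetic. The sketch below treats each masking duration as a hard time window, which is a deliberate simplification; the block length of 1024 samples, the 44.1 kHz sampling rate, and the attack position are hypothetical values chosen only for illustration.

```python
FORWARD_MASKING_MS = 200.0   # masking after the attack (value from the text)
BACKWARD_MASKING_MS = 5.0    # masking before the attack (value from the text)

def audible_pre_echo_ms(attack_offset_ms):
    """Part of the pre-attack noise that starts earlier than backward masking covers."""
    return max(0.0, attack_offset_ms - BACKWARD_MASKING_MS)

def audible_post_echo_ms(tail_ms):
    """Part of the post-attack noise that lasts longer than forward masking covers."""
    return max(0.0, tail_ms - FORWARD_MASKING_MS)

# Hypothetical block: 1024 samples at 44.1 kHz with the attack at sample 768.
sample_rate = 44100
block_ms = 1024 / sample_rate * 1000
attack_offset_ms = 768 / sample_rate * 1000
tail_ms = block_ms - attack_offset_ms

print(f"unmasked pre-echo:  {audible_pre_echo_ms(attack_offset_ms):.1f} ms")
print(f"unmasked post-echo: {audible_post_echo_ms(tail_ms):.1f} ms")
```

Under these assumptions, about 12.4 ms of the noise ahead of the attack falls outside the backward masking window, while all of the noise after the attack stays inside the forward masking window.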
[0005] Three known methods exist for dealing with this problem:
(a) the block size switching method,
(b) the temporal noise shaping method, and
(c) the gain control method.
[0006] The block size switching method divides a normal transform block into shorter sub-blocks and applies a shorter transform to each sub-block. This helps to confine the spread of pre-echo noise to the shorter sub-block. The temporal noise shaping method exploits time-frequency duality by applying predictive coding to the spectral data, since an attack signal is characterized by its non-uniformity in the time domain. Finally, the gain control method modifies the signal in the time domain (that is, controls its gain) by using a correction function (modification function, MF) to amplify the signal strength of small-amplitude portions and/or suppress the signal strength of large-amplitude portions.
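For the block size switching method described above, the benefit of shorter sub-blocks can be sketched by counting how many samples ahead of an attack share a transform block with it. The block lengths and the attack position below are hypothetical, and the function name is illustrative only.

```python
def pre_echo_span(attack_index, block_len):
    """Number of samples before the attack that lie in the same transform block."""
    return attack_index % block_len

attack_index = 800                                # hypothetical attack position
long_span = pre_echo_span(attack_index, 1024)     # one long transform block
short_span = pre_echo_span(attack_index, 128)     # eight short sub-blocks

print(long_span, short_span)
```

With a single 1024-sample block, noise can spread over the 800 samples that precede the attack. With 128-sample sub-blocks, only the 32 samples that share a sub-block with the attack are exposed.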
[0007] One example of a gain control method according to the prior art is shown in FIG. 19(b). Here, a gain control unit 105 and an inverse gain control unit 106 are additionally provided in the encoder and decoder of FIG. 19(a). When a transform block containing the same attack signal as in FIG. 19(a) is input, the gain control unit 105 amplifies the weak signal and/or suppresses the strong signal. Information about this amplification and/or suppression is sent to the decoder together with the encoded signal. Based on this information, the inverse gain control unit 106 suppresses the portions that were amplified by the gain control unit 105 and/or amplifies the portions that were suppressed. The quantizer 102 still introduces noise, and that noise is still distributed evenly over the time domain of the restored block. After the inverse gain control unit 106, however, the noise is suppressed in the region where the signal originally had a small amplitude, compared with the example of FIG. 19(a) without gain control.
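The effect of the inverse gain control on the noise can be sketched with a toy model. It assumes that the quantizer adds noise of one flat amplitude across the whole block, as the text describes, and that the decoder divides each section by the gain the encoder applied. The section peaks, the noise level, and the gains are hypothetical values.

```python
def decoded_noise(noise_level, gains):
    """Noise amplitude per section after the decoder undoes the encoder gains."""
    return [noise_level / g for g in gains]

# Peak amplitude of four sections in one block: three quiet sections, then the attack.
peaks = [0.01, 0.01, 0.01, 1.0]
noise_level = 0.05               # assumed flat quantization noise across the block
gains = [32.0, 32.0, 32.0, 1.0]  # hypothetical gains: amplify only the quiet sections

noise_without = decoded_noise(noise_level, [1.0] * 4)  # no gain control, as in FIG. 19(a)
noise_with = decoded_noise(noise_level, gains)         # gain control, as in FIG. 19(b)

for p, n0, n1 in zip(peaks, noise_without, noise_with):
    print(f"peak {p:5.2f}  noise without {n0:.4f}  noise with {n1:.4f}")
```

In this model the noise in the quiet sections drops by the same factor by which those sections were amplified, while the noise under the attack is unchanged.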
[0008]
PROBLEMS TO BE SOLVED BY THE INVENTION A proper method for addressing the pre-echo and post-echo problems must take new factors into account. It is important to capture the onset of the rise of the attack signal so that the small-amplitude signal preceding the attack signal is amplified neither too early nor too late. If the signal is amplified too early, pre-echo and post-echo are not removed effectively. If the signal is amplified too late, there is a risk of amplifying the attack signal itself, which makes the pre-echo problem even worse. Just as an attack signal contributes to the generation of pre-echo noise, a release signal also contributes to the generation of pre-echo and post-echo noise, so the effect of such a release signal must be suppressed as well.
[0009] Similarly, so that the increase (surge) in signal energy does not place an excessive burden on the subsequent spectral quantization, the increase in signal strength that results from amplifying the small-amplitude signal must be balanced by attenuating the large-amplitude signal to an appropriate degree. The attenuation must not be excessive, so that the accuracy with which the attenuated signal is recovered at the decoder is not reduced. In the worst case, excessive attenuation can introduce additional artifacts that are more annoying than the post-echo it is intended to remove. A method for setting the range over which attenuation should be applied is therefore equally important.
[0010] Finally, a proper gain control method must adapt to signals such as speech, which are characterized by an abundance of attack and release signals. This is because such signals are highly susceptible to the artifacts introduced by excessive amplification and excessive attenuation. The attacks and releases in such signals tend to have less abrupt (that is, more natural) transitions. The gain control method therefore needs to fall back to a less abrupt gain control when it detects these natural transitions in signal strength.
[0011] Next, taking all of the above factors into account, a correction function can be generated and multiplied with the frame to achieve gain control. This correction function must not cause the sample signals of the preceding subband frame and the current subband frame to overflow.
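The multiplication of a frame by the correction function, together with the overflow constraint, can be sketched as follows. The sketch assumes one constant gain per section and a full-scale amplitude of 1.0; the frame contents, the gains, and the function names are hypothetical.

```python
def apply_correction(samples, gains, section_len):
    """Multiply each sample by the correction function value of its section."""
    return [s * gains[i // section_len] for i, s in enumerate(samples)]

def overflows(samples, full_scale=1.0):
    """True if any sample exceeds the representable range (assumed to be +/-1.0)."""
    return any(abs(s) > full_scale for s in samples)

# Hypothetical frame of two 4-sample sections: a quiet section, then an attack.
frame = [0.01, -0.02, 0.015, -0.01, 0.6, -0.9, 0.7, -0.4]

safe = apply_correction(frame, [16.0, 1.0], section_len=4)
too_hot = apply_correction(frame, [64.0, 1.0], section_len=4)

print(overflows(safe), overflows(too_hot))
```

A gain of 16 lifts the quiet section to a peak of 0.32 and stays in range, whereas a gain of 64 would push the same section to 1.28 and overflow, so such a correction function would have to be rejected or reduced.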
[0012] An object of the present invention is to solve the above problems and to provide an audio signal encoding method and apparatus, and an encoding and decoding system, that can reliably suppress pre-echo noise and post-echo noise by generating the correction function more accurately and performing gain control with it.
[0013]
MEANS FOR SOLVING THE PROBLEMS An audio signal encoding method according to a first aspect of the present invention includes a step of calculating a correction function for each frame based on an input audio signal and controlling the gain of the audio signal according to the calculated correction function, and a step of obtaining an encoded bit stream signal by performing orthogonal transform processing and encoding processing on the gain-controlled audio signal for each pair of two mutually adjacent frames joined in cascade. The method is characterized in that the gain control step includes:
a step of dividing the input audio signal into a plurality of sections, each shorter in time than a frame, and calculating the absolute value of the peak of each of the divided sections;
a step of identifying as a correction point, based on the audio signal having the divided sections, the section containing the start position of an attack signal, which is a portion of the signal containing a sharp rise in sound;
a step of identifying as a correction point, based on the audio signal having the divided sections, the section containing the end position of a release signal, which is a portion of the signal containing a sharp fall in sound;
a step of calculating, based on the audio signal having the divided sections, the target peak value desired for the current frame to be processed when gain control is performed according to the correction function; and
a step of calculating, based on the identified section containing the start position of the attack signal, the identified section containing the end position of the release signal, and the calculated target peak value, a correction function consisting of the correction function values for the sections of the current frame.
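The first of the steps above, dividing a frame into sections and taking the absolute peak of each, can be sketched as follows. Equal-length sections and the function name are assumptions; the text only requires that each section be shorter than a frame.

```python
def section_peaks(frame, num_sections):
    """Split a frame into equal sections and return each section's absolute peak."""
    if num_sections <= 0 or len(frame) % num_sections != 0:
        raise ValueError("frame length must be a positive multiple of num_sections")
    size = len(frame) // num_sections
    return [max(abs(s) for s in frame[i:i + size])
            for i in range(0, len(frame), size)]

# Hypothetical 8-sample frame divided into 4 sections of 2 samples each.
frame = [0.1, -0.2, 0.05, 0.0, 0.9, -1.0, 0.3, -0.1]
print(section_peaks(frame, 4))  # [0.2, 0.05, 1.0, 0.3]
```

The later steps (identifying correction points, choosing the target peak value, and computing the correction function values) all operate on this list of per-section peaks rather than on individual samples.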
[0014] In the above audio signal encoding method, the step of identifying the section containing the start position of the attack signal includes a step of determining whether the current section contains the start position of an attack signal, based on the ratio between the absolute value of the peak of the section that follows the current section to be processed and the absolute value of each peak of a predetermined number of sections up to and including the current section, and a step of assigning to the current section, when it is determined to contain the start position of the attack signal, a predetermined first value indicating that it contains the start position of an attack signal.
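The ratio test for the attack onset can be sketched as below. The text does not give the threshold, the number of sections compared, or how the individual ratios are combined, so the `ratio` and `lookback` parameters, the requirement that every ratio exceed the threshold, and the flag value 1 are all assumptions made for illustration.

```python
ATTACK_FLAG = 1  # stands in for the "predetermined first value" (assumed encoding)

def is_attack_onset(peaks, i, lookback=3, ratio=4.0):
    """True when the peak of section i+1 exceeds ratio times each peak of the
    lookback sections ending at section i (hypothetical decision rule)."""
    if i + 1 >= len(peaks):
        return False  # the next section would belong to the following frame
    start = max(0, i - lookback + 1)
    return all(peaks[i + 1] > ratio * p for p in peaks[start:i + 1])

def mark_attacks(peaks, lookback=3, ratio=4.0):
    """Assign ATTACK_FLAG to every section judged to contain an attack onset."""
    return [ATTACK_FLAG if is_attack_onset(peaks, i, lookback, ratio) else 0
            for i in range(len(peaks))]

# Hypothetical per-section peaks: quiet, quiet, quiet, then a loud passage.
peaks = [0.01, 0.02, 0.01, 0.9, 0.8, 0.7]
print(mark_attacks(peaks))  # [0, 0, 1, 0, 0, 0]
```

Note that the flagged section is the last quiet one, immediately before the jump, which matches the idea that the section containing the onset becomes the correction point.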
[0015] In the above audio signal encoding method, the step of identifying the section containing the end position of the release signal includes a step of determining whether the current section contains the end position of a release signal, based on the ratio between the largest peak absolute value found between the current section to be processed and the correction point preceding the current section, and the absolute value of each peak of a predetermined number of sections from the current section onward, and a step of assigning to the current section, when it is determined to contain the end position of the release signal, a predetermined second value indicating that it contains the end position of a release signal.
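The corresponding test for the end of a release can be sketched in the same way. As before, the threshold `ratio`, the number of sections examined (`lookahead`), the rule that every ratio must exceed the threshold, and the flag value 2 are assumptions that the text does not specify.

```python
RELEASE_FLAG = 2  # stands in for the "predetermined second value" (assumed encoding)

def is_release_end(peaks, i, prev_point, lookahead=2, ratio=4.0):
    """True when the largest peak in sections prev_point..i exceeds ratio times
    each peak of the lookahead sections starting at section i (hypothetical rule)."""
    ahead = peaks[i:i + lookahead]
    if len(ahead) < lookahead:
        return False  # not enough sections left in this frame to decide
    recent_max = max(peaks[prev_point:i + 1])
    return all(recent_max > ratio * p for p in ahead)

# Hypothetical per-section peaks: a loud passage that dies away quickly.
peaks = [0.9, 0.8, 0.1, 0.05, 0.04]
flags = [RELEASE_FLAG if is_release_end(peaks, i, prev_point=0) else 0
         for i in range(len(peaks))]
print(flags)
```

In this sketch `prev_point` is the index of the preceding correction point, here taken to be the start of the frame.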
[0016] Further, in the above audio signal encoding method, the step of calculating the target peak value calculates the target peak value desired when the audio signal of the current frame to be processed is gain-controlled according to the correction function, based on the largest peak absolute value between each pair of mutually adjacent correction points, the assigned first and second values, and the absolute value of the peak of the first section of the frame that follows the current frame.
[0017] Furthermore, in the above audio signal encoding method, the step of calculating the correction function calculates the correction function value for each section between the section containing the start position of the attack signal and the correction point preceding that section so that the largest peak absolute value between that section and the preceding correction point becomes equal to the target peak value. It likewise calculates the correction function value for each section between the section containing the start position of the release signal and the correction point preceding that section so that the largest peak absolute value between that section and the preceding correction point becomes equal to the target peak value.
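The rule that each span between correction points is scaled so that its largest peak equals the target peak value can be sketched as follows. The sketch assumes one constant gain per span, treats each span as ending at (and including) a correction-point section, and takes the target peak value as given; these choices and the function name are illustrative assumptions.

```python
def correction_function(peaks, points, target_peak):
    """Return one gain per section so that the largest peak of every span
    between correction points is scaled to target_peak (assumed span layout)."""
    gains = []
    start = 0
    for end in sorted(points) + [len(peaks) - 1]:
        if end < start:
            continue  # the last correction point is already the final section
        span = peaks[start:end + 1]
        largest = max(span)
        gain = target_peak / largest if largest > 0 else 1.0
        gains.extend([gain] * len(span))
        start = end + 1
    return gains

# Hypothetical section peaks with an attack onset flagged at section 2.
peaks = [0.01, 0.02, 0.01, 0.9, 0.8, 0.7]
gains = correction_function(peaks, points=[2], target_peak=0.5)
print([round(g, 3) for g in gains])
```

With these values the quiet span is amplified by a factor of about 25 and the loud span is attenuated by a factor of about 0.556, so both spans peak at 0.5 after gain control.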
[0018] In the above audio signal encoding method, the gain control step includes at least one of a step of detecting, based on the audio signal having the divided sections, a first signal portion contained in speech that falls naturally with a predetermined gentle gradient, and a step of detecting, based on the audio signal having the divided sections, a second signal portion contained in speech that rises naturally with a predetermined gentle gradient. In the current frame to be processed, based on the identified section containing the start position of the attack signal and the identified section containing the end position of the release signal, the gain control step calculates the correction function for the current frame by stopping gain control by the correction function for the detected first signal portion, while calculating the correction function for the second signal portion by reducing the correction function calculated based on the identified attack signal.
[0019] Further, in the above audio signal encoding method, the step of detecting the first signal portion determines that the first signal portion exists when the number of sections that are correction points in the frame preceding the current frame is smaller than a predetermined first threshold and the current frame contains no attack signal. The step of detecting the second signal portion determines that the second signal portion exists when the number of sections separating the section containing the start position of the attack signal from the section containing the largest peak absolute value found between that section and the next correction point following it is larger than a predetermined second threshold.
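The two threshold tests can be sketched directly from their description. The threshold values, the inclusive treatment of the next correction point, and the function names are assumptions; the text states only that both thresholds are predetermined.

```python
def has_natural_fall(prev_frame_points, current_has_attack, first_threshold=2):
    """First signal portion: few correction points in the preceding frame and
    no attack signal in the current frame."""
    return prev_frame_points < first_threshold and not current_has_attack

def has_natural_rise(peaks, attack_section, next_point, second_threshold=2):
    """Second signal portion: the largest peak between the attack onset and the
    next correction point lies more than second_threshold sections after the onset."""
    span = peaks[attack_section:next_point + 1]
    peak_section = attack_section + span.index(max(span))
    return peak_section - attack_section > second_threshold

# Hypothetical peaks: a slow, speech-like rise versus an abrupt attack.
slow_rise = [0.01, 0.2, 0.4, 0.6, 0.9, 0.5]
abrupt = [0.01, 0.9, 0.8, 0.7, 0.6, 0.5]

print(has_natural_rise(slow_rise, attack_section=0, next_point=5))  # True
print(has_natural_rise(abrupt, attack_section=0, next_point=5))     # False
```

In the slow rise the maximum is reached four sections after the onset, which exceeds the assumed threshold of two, so the gain control would be softened for that portion. The abrupt attack peaks one section after the onset and is handled normally.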
[0020] Furthermore, in the above audio signal encoding method, the step of calculating the correction function further includes a step of correcting the correction function value for the first section of the second frame when, in the first and second frames that are the two mutually adjacent frames, the absolute value of the peak of the first frame and the absolute value of the peak of the second frame of the gain-controlled audio signal differ, so that the two peak absolute values become equal.
[0021] In the above audio signal encoding method, the gain control step further includes a step of, after the gain control processing, correcting the audio signal of the first frame by multiplying it by the correction function value for the first section of the second frame, so that the audio signals of the first and second frames, which are the two mutually adjacent frames, are substantially continuous with each other. The step of calculating the correction function further includes a step of correcting the correction function value for the first section of the second frame, based on the smallest of the correction function values for the sections of the first frame that amplify the audio signal, so as to prevent the audio signal of the first frame, which should be gain-controlled by a correction function that amplifies the audio signal, from being attenuated by the above multiplication and correction.
[0022] Further, in the above audio signal encoding method, the step of calculating the correction function further includes a step of correcting the correction function values for the sections of the current frame located after the section of the last correction point in the current frame to be processed when the largest peak absolute value in those sections differs from the target peak value, so that the largest peak absolute value becomes equal to the target peak value.
[0023] An audio signal encoding method according to a second aspect of the present invention includes a step of calculating a correction function for each frame based on an input audio signal and controlling the gain of the audio signal according to the calculated correction function, and a step of obtaining an encoded bit stream signal by performing orthogonal transform processing and encoding processing on the gain-controlled audio signal for each pair of two mutually adjacent frames joined in cascade. The method is characterized in that the gain control step includes:
a step of dividing the input audio signal into a plurality of sections, each shorter in time than a frame;
a step of identifying as a correction point, based on the audio signal having the divided sections, the section containing the start position of an attack signal, which is a portion of the signal containing a sharp rise in sound;
a step of identifying as a correction point, based on the audio signal having the divided sections, the section containing the end position of a release signal, which is a portion of the signal containing a sharp fall in sound;
at least one of a step of detecting, based on the audio signal having the divided sections, a first signal portion contained in speech that falls naturally with a predetermined gentle gradient, and a step of detecting, based on the audio signal having the divided sections, a second signal portion contained in speech that rises naturally with a predetermined gentle gradient; and
a step of calculating the correction function for the current frame to be processed, based on the identified section containing the start position of the attack signal and the identified section containing the end position of the release signal, by stopping gain control by the correction function for the detected first signal portion, while calculating the correction function for the second signal portion by reducing the correction function calculated based on the identified attack signal.
[0024] In the above audio signal encoding method, the step of detecting the first signal portion determines that the first signal portion exists when the number of sections that are correction points in the frame preceding the current frame is smaller than a predetermined first threshold and the current frame contains no attack signal. The step of detecting the second signal portion determines that the second signal portion exists when the number of sections separating the section containing the start position of the attack signal from the section containing the largest peak absolute value found between that section and the next correction point following it is larger than a predetermined second threshold.
An audio signal encoding apparatus according to a third aspect of the present invention includes gain control means for calculating a correction function for each frame based on an input audio signal and gain-controlling the audio signal according to the calculated correction function, and means for obtaining an encoded bitstream signal by performing orthogonal transform processing and encoding processing on the gain-controlled audio signal for every two mutually adjacent, cascade-connected frames. The gain control means includes: section peak value calculating means for dividing the input audio signal into a plurality of sections, each shorter in duration than a frame, and calculating the absolute peak value of each divided section; first identifying means for identifying as a correction point, based on the audio signal having the divided sections, the section containing the start position of an attack signal, which is a portion of the signal containing a sudden rise in sound; second identifying means for identifying as a correction point, based on the audio signal having the divided sections, the section containing the end position of a release signal, which is a portion of the signal containing a sudden fall in sound; target peak value calculating means for calculating, based on the audio signal having the divided sections, the target peak value desired in the current frame to be processed when gain control is performed according to the correction function; and correction function calculating means for calculating a correction function consisting of the correction function values for the sections of the current frame, based on the section containing the start position of the identified attack signal, the section containing the end position of the identified release signal, and the calculated target peak value.
In the above audio signal encoding apparatus, the first identifying means determines whether the current section to be processed contains the start position of an attack signal, based on the ratio of the absolute peak value of the section following the current section to the absolute peak value of each of a predetermined number of sections at or before the current section. When it determines that the current section contains the start position of the attack signal, it assigns to the current section a predetermined first value indicating that the section contains the start position of an attack signal.
Further, in the above audio signal encoding apparatus, the second identifying means determines whether the current section to be processed contains the end position of a release signal, based on the ratio of the maximum absolute peak value between the current section and the correction point preceding it to the absolute peak value of each of a predetermined number of sections at or after the current section. When it determines that the current section contains the end position of the release signal, it assigns to the current section a predetermined second value indicating that the section contains the end position of a release signal.
Further, in the above audio signal encoding apparatus, the target peak value calculating means calculates the target peak value desired when the audio signal of the current frame to be processed is gain-controlled according to the correction function, based on the maximum absolute peak value between each pair of mutually adjacent correction points, the assigned first and second values, and the absolute peak value of the first section of the frame following the current frame.
Still further, in the above audio signal encoding apparatus, the correction function calculating means calculates the correction function value for each section between the section containing the start position of the attack signal and the correction point preceding that section so that the maximum absolute peak value between that section and the preceding correction point becomes equal to the target peak value. It likewise calculates the correction function value for each section between the section containing the start position of the release signal and the correction point preceding that section so that the maximum absolute peak value between that section and the preceding correction point becomes equal to the target peak value.
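The equalization rule in this paragraph can be illustrated with a minimal sketch. It assumes that a correction function value is a plain linear gain held constant over a section, which the text does not state, and the names `correction_gains`, `section_peaks`, `correction_points`, and `target_peak` are hypothetical.

```python
from typing import List


def correction_gains(section_peaks: List[float],
                     correction_points: List[int],
                     target_peak: float) -> List[float]:
    """Return one linear gain per section (illustrative assumption).

    Each span runs from the preceding correction point (or section 0)
    up to, but not including, the next correction point. Its gain is
    chosen so that the largest absolute peak in the span is scaled to
    target_peak. Sections after the last correction point keep unity
    gain in this sketch.
    """
    gains = [1.0] * len(section_peaks)
    start = 0
    for point in sorted(correction_points):
        span = section_peaks[start:point]
        max_peak = max(span, default=0.0)
        if max_peak > 0.0:  # leave silent spans untouched
            gain = target_peak / max_peak
            for i in range(start, point):
                gains[i] = gain
        start = point
    return gains
```

For example, with section peaks `[0.125, 0.25, 0.125, 0.5, 0.5]`, a single correction point at section 3, and a target peak of 0.5, the three sections before the correction point receive a gain of 2.0 and the remaining sections stay at 1.0.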
Further, in the above audio signal encoding apparatus, the gain control means further includes at least one of first detecting means for detecting, based on the audio signal having the divided sections, a first signal portion that falls naturally at a predetermined gentle gradient, as found in uttered speech, and second detecting means for detecting, based on the audio signal having the divided sections, a second signal portion that rises naturally at a predetermined gentle gradient, as found in uttered speech. The gain control means calculates the correction function for the current frame to be processed by, based on the section containing the start position of the identified attack signal and the section containing the end position of the identified release signal, stopping gain control by the correction function for the detected first signal portion while reducing the correction function calculated from the identified attack signal to obtain the correction function for the second signal portion.
Further, in the above audio signal encoding apparatus, the first detecting means determines that the first signal portion is present when the number of sections that are correction points in the frame preceding the current frame is smaller than a predetermined first threshold and the current frame contains no attack signal. The second detecting means determines that the second signal portion is present when the number of sections separating the section containing the start position of the attack signal from the section containing the maximum absolute peak value between that section and the next correction point is larger than a predetermined second threshold.
Still further, in the above audio signal encoding apparatus, when, for the first and second frames that are the two mutually adjacent frames, the absolute peak value of the first frame and the absolute peak value of the second frame of the gain-controlled audio signal differ, the correction function calculating means corrects the correction function value for the first section of the second frame so that the two absolute peak values become equal.
Further, in the above audio signal encoding apparatus, the gain control means further includes a multiplier that, after the gain control processing, corrects the audio signal of the first frame by multiplying it by the correction function value for the first section of the second frame, so that the audio signals of the first and second frames, which are the two mutually adjacent frames, are substantially continuous with each other. The correction function calculating means corrects the correction function value for the first section of the second frame, based on the smallest of the correction function values for the sections of the first frame that amplify the audio signal, so as to prevent the audio signal of the first frame, which is to be gain-controlled by a correction function that amplifies it, from being attenuated by the multiplication.
Further, in the above audio signal encoding apparatus, when the maximum absolute peak value in the sections of the current frame located after the section of the last correction point in the current frame to be processed differs from the target peak value, the correction function calculating means corrects the correction function values for those subsequent sections of the current frame so that the maximum absolute peak value becomes equal to the target peak value.
Further, an audio signal encoding apparatus according to a fourth aspect of the present invention includes gain control means for calculating a correction function for each frame based on an input audio signal and gain-controlling the audio signal according to the calculated correction function, and means for obtaining an encoded bitstream signal by performing orthogonal transform processing and encoding processing on the gain-controlled audio signal for every two mutually adjacent, cascade-connected frames. The gain control means includes: means for dividing the input audio signal into a plurality of sections, each shorter in duration than a frame; third identifying means for identifying as a correction point, based on the audio signal having the divided sections, the section containing the start position of an attack signal, which is a portion of the signal containing a sudden rise in sound; fourth identifying means for identifying as a correction point, based on the audio signal having the divided sections, the section containing the end position of a release signal, which is a portion of the signal containing a sudden fall in sound; at least one of third detecting means for detecting, based on the audio signal having the divided sections, a first signal portion that falls naturally at a predetermined gentle gradient, as found in uttered speech, and fourth detecting means for detecting, based on the audio signal having the divided sections, a second signal portion that rises naturally at a predetermined gentle gradient, as found in uttered speech; and correction function calculating means for calculating the correction function for the current frame to be processed by, based on the section containing the start position of the identified attack signal and the section containing the end position of the identified release signal, stopping gain control by the correction function for the detected first signal portion while reducing the correction function calculated from the identified attack signal to obtain the correction function for the second signal portion.
In the above audio signal encoding apparatus, the third detecting means determines that the first signal portion is present when the number of sections that are correction points in the frame preceding the current frame is smaller than a predetermined first threshold and the current frame contains no attack signal. The fourth detecting means determines that the second signal portion is present when the number of sections separating the section containing the start position of the attack signal from the section containing the maximum absolute peak value between that section and the next correction point is larger than a predetermined second threshold.
An audio signal encoding and decoding system according to a fifth aspect of the present invention includes the above audio signal encoding apparatus and an audio signal decoding apparatus. The audio signal decoding apparatus includes means for obtaining an audio signal composed of a plurality of frames by decoding the bitstream signal encoded by the encoding apparatus and performing inverse orthogonal transform processing, and means for performing inverse gain control on the obtained audio signal using a correction function that is the inverse of the above correction function and outputting the inverse-gain-controlled audio signal.
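The relationship between gain control in the encoder and inverse gain control in the decoder can be sketched as follows. This is an illustration only: it assumes a piecewise-constant linear gain per section and leaves out the orthogonal transform, quantization, and any smoothing between sections. The function names and `section_len` are hypothetical.

```python
from typing import List


def gain_control(samples: List[float], gains: List[float],
                 section_len: int) -> List[float]:
    """Encoder side: multiply each sample by the gain of its section."""
    return [s * gains[i // section_len] for i, s in enumerate(samples)]


def inverse_gain_control(samples: List[float], gains: List[float],
                         section_len: int) -> List[float]:
    """Decoder side: divide by the same gains to undo the encoder step."""
    return [s / gains[i // section_len] for i, s in enumerate(samples)]


samples = [0.1, -0.2, 0.4, 0.3]
gains = [4.0, 0.5]  # one gain per section of two samples
restored = inverse_gain_control(gain_control(samples, gains, 2), gains, 2)
```

In this idealized round trip `restored` matches `samples`. In the system described here, quantization noise added between the two steps would be scaled by the inverse gain as well, which is how the noise ahead of an attack is attenuated.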
Further, an encoding method according to another aspect of the present invention is a method for determining, for a frame consisting of a plurality of time samples of a digital audio signal, a correction level alevcode, which is a gain coefficient for amplification or attenuation, and a correction position aloccode, which is the position of that correction level, so as to suppress pre-echo and post-echo and to minimize artifacts. The method includes the steps of:
(a) dividing the frame consisting of a plurality of time samples into a plurality of sections, each consisting of an equal number of time samples;
(b) buffering a predetermined number of sections of the frame following the current frame (the future frame), in order to handle release signals and attack signals at frame boundaries more effectively;
(c) calculating the absolute peak value MaxPeak of the signal level over all time samples in each section;
(d) detecting the start position of the rise of an attack signal using a predetermined attack signal criterion;
(e) marking as an attack point the detected start position, at which amplification needs to be performed;
(f) detecting the end position of the fall of a release signal using a predetermined release signal criterion;
(g) marking the detected end position as a release point and specifying the desired attenuation gain coefficient for each release point;
(h) locating the maximum absolute peak value MaxPeak between each pair of mutually adjacent correction points (all attack points and release points), storing that absolute peak value in the peak value InterModMaxPeak, and storing its position in the peak position InterModMaxLoc;
(i) calculating the target peak value CurrFramePeak of the signal level, which serves as the target for equalizing the signal;
(j) checking whether the signal exhibits a natural fall;
(k) handling the natural fall;
(l) calculating the correction levels for all marked correction points using the target peak value CurrFramePeak of the signal level;
(m) checking whether the attack signal exhibits a natural rise;
(n) handling the natural rise;
(o) merging consecutive correction points that have the same correction-level gain coefficient;
(p) checking whether the gain-controlled signal level of the current frame consisting of a plurality of time samples differs significantly from the gain-controlled signal level of the frame preceding the current frame (which likewise consists of a plurality of time samples);
(q) handling the difference;
(r) limiting the correction level of the first correction point of the current frame in order to prevent excessive attenuation of the frame preceding the current frame;
(s) checking whether the signal level at the end of the gain-controlled frame is weak compared with the portion of the frame before that end; and
(t) amplifying (or attenuating) the end so that it becomes equal to the target peak value CurrFramePeak of the signal level.
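Steps (a) and (c) above can be sketched as below. The function name `section_max_peaks` and its parameters are hypothetical, and the choice to raise an error when the frame does not divide evenly is an assumption made for this illustration.

```python
from typing import List


def section_max_peaks(frame: List[float], num_sections: int) -> List[float]:
    """Split a frame into equal-length sections and return the
    absolute peak value (MaxPeak) of each section."""
    if num_sections <= 0 or len(frame) % num_sections != 0:
        raise ValueError("frame length must be a multiple of num_sections")
    section_len = len(frame) // num_sections
    return [
        max(abs(x) for x in frame[i * section_len:(i + 1) * section_len])
        for i in range(num_sections)
    ]


max_peak = section_max_peaks([0.1, -0.5, 0.2, 0.3, -0.9, 0.0, 0.4, 0.4], 4)
```

Here `max_peak` holds one value per section, `[0.5, 0.3, 0.9, 0.4]`, which is the input that the later detection steps operate on.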
In the above audio signal encoding method, the attack signal criterion is based on the ratio of the absolute peak value MaxPeak of a potential attack point to each of the maximum absolute peak values MaxPeak of a predetermined number of sections preceding the potential attack point. When the number of sections preceding the potential attack point is insufficient to reach the predetermined number, the criterion is based, instead of on the absolute peak value MaxPeak, on the absolute peak value PrevFramePeak of the gain-controlled preceding frame consisting of a plurality of time samples. The criterion requires that this ratio exceed a predetermined threshold and that the separation between the potential attack point and the preceding attack point exceed a predetermined number of sections.
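A minimal sketch of the attack criterion follows. The defaults for `lookback`, `ratio_threshold`, and `min_separation` are hypothetical placeholders, since the text only says they are predetermined. Substituting PrevFramePeak only for the missing sections, rather than for all of them, is one reading of the text and is an assumption here.

```python
from typing import List, Optional


def is_attack_point(max_peak: List[float], idx: int, prev_frame_peak: float,
                    last_attack_idx: Optional[int], lookback: int = 3,
                    ratio_threshold: float = 2.0,
                    min_separation: int = 2) -> bool:
    """Check whether section idx qualifies as an attack point."""
    preceding = max_peak[max(0, idx - lookback):idx]
    # Too few sections precede the candidate: fall back to PrevFramePeak
    # for the missing ones (assumed interpretation).
    missing = lookback - len(preceding)
    references = [prev_frame_peak] * missing + preceding
    # Comparing products instead of dividing avoids division by zero.
    rises = all(max_peak[idx] > ratio_threshold * ref for ref in references)
    far_enough = (last_attack_idx is None
                  or idx - last_attack_idx > min_separation)
    return rises and far_enough
```

With section peaks `[0.1, 0.1, 0.1, 0.9, 0.9]` and these placeholder thresholds, section 3 is accepted as an attack point, while section 4 is rejected because one of its preceding sections is already at the same level.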
Further, in the above audio signal encoding method, the marking operation for an attack point consists of assigning a predetermined positive value to the correction level alevcode of the attack point and storing the index of the section of the attack point in aloccode.
Further, in the above audio signal encoding method, the release signal criterion is the ratio of the peak value InterModMaxPeak preceding a potential release point to each of the maximum absolute peak values MaxPeak of a predetermined number of sections at or after the potential release point, and requires that this ratio exceed a predetermined threshold. The threshold is variable depending on whether the correction point preceding the potential release point is of the attack type or the release type.
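The release criterion can be sketched in the same style. The two threshold defaults and `lookahead` are hypothetical. The text states only that the threshold depends on the type of the preceding correction point, not which case uses the larger value. Rejecting a candidate when too few sections are buffered is likewise an assumption.

```python
from typing import List


def is_release_point(max_peak: List[float], idx: int,
                     inter_mod_max_peak: float, prev_point_is_attack: bool,
                     lookahead: int = 3,
                     threshold_after_attack: float = 4.0,
                     threshold_after_release: float = 2.0) -> bool:
    """Check whether section idx qualifies as a release point."""
    following = max_peak[idx:idx + lookahead]
    if len(following) < lookahead:
        return False  # not enough buffered sections to decide (assumed)
    threshold = (threshold_after_attack if prev_point_is_attack
                 else threshold_after_release)
    # The preceding peak must dominate every following section.
    return all(inter_mod_max_peak > threshold * p for p in following)
```

For section peaks `[0.8, 0.8, 0.1, 0.1, 0.1, 0.1]` with a preceding peak of 0.8, section 2 passes the test under these placeholder values. Section 1 does not, because the signal is still at full level there.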
Still further, in the above audio signal encoding method, the marking operation consists of setting the correction level alevcode of the release point according to the magnitude of the peak value InterModMaxPeak found before the release point, and storing the index of the section in the correction position aloccode of the release point.
Further, in the above audio signal encoding method, the target peak value CurrFramePeak is based on the absolute peak values InterModMaxPeak between all mutually adjacent correction points, the coefficients of all correction levels alevcode of the current frame, and the absolute peak value MaxPeak of the first section of the future frame.
Further, in the above audio signal encoding method, the criteria for checking for the natural fall are that the total number of corrections performed in the preceding frame is smaller than a predetermined threshold and that all correction points marked in the current frame are of the release signal type.
Still further, in the above audio signal encoding method, the operation for handling the natural fall is the removal of all release points found.
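The natural-fall check and its handling can be combined in a short sketch. The `ModPoint` record, its `kind` field, and the default `max_prev_mods` are hypothetical, as the text does not say how correction points are stored or what the threshold is. The alevcode values in the example are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModPoint:
    aloccode: int  # index of the section holding the correction point
    alevcode: int  # correction level (placeholder values in this sketch)
    kind: str      # "attack" or "release"


def handle_natural_fall(points: List[ModPoint], prev_frame_mod_count: int,
                        max_prev_mods: int = 1) -> List[ModPoint]:
    """Drop all release points when the frame shows a natural fall."""
    natural_fall = (
        bool(points)
        and prev_frame_mod_count < max_prev_mods
        and all(p.kind == "release" for p in points)
    )
    if natural_fall:
        return [p for p in points if p.kind != "release"]
    return list(points)
```

A frame that contains only release points and follows a frame with no corrections ends up with no correction points at all, so the gradual decay is left without gain control.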
Further, in the above audio signal encoding method, the correction levels for all the correction points are calculated based on the target peak value CurrFramePeak and the peak value InterModMaxPeak preceding each correction point. When the peak value InterModMaxPeak is zero, a predetermined small non-zero value is used in its place in the calculation.
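The zero-substitution rule can be illustrated as below. The sketch assumes the quantity of interest is the plain ratio of CurrFramePeak to InterModMaxPeak. The text says only that the level is based on the two values, so the mapping from this ratio to an alevcode value is left out. The names `correction_ratio` and `epsilon` and the default of `1e-6` are hypothetical.

```python
def correction_ratio(curr_frame_peak: float, inter_mod_max_peak: float,
                     epsilon: float = 1e-6) -> float:
    """Ratio of the target peak to the preceding peak, with a small
    non-zero stand-in when the preceding peak is exactly zero."""
    denominator = inter_mod_max_peak if inter_mod_max_peak != 0 else epsilon
    return curr_frame_peak / denominator
```

With a target of 0.5 and a preceding peak of 0.25 the ratio is 2.0. When the preceding peak is 0.0, the call returns a large finite value instead of raising `ZeroDivisionError`.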
Further, in the above audio signal encoding method, the criterion for checking for the natural rise is based on the distance between the correction position aloccode of each attack point and the peak position InterModMaxLoc that follows the attack point.
Still further, in the above audio signal encoding method, the operation for handling the natural rise decrements the correction level alevcode of the attack point by a predetermined amount.
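The natural-rise check and the decrement can be sketched together. The defaults for `max_distance` and `decrement` are hypothetical placeholders for the predetermined values, and the use of a simple "greater than" test on the section distance is an assumption.

```python
def soften_natural_rise(alevcode: int, aloccode: int, inter_mod_max_loc: int,
                        max_distance: int = 2, decrement: int = 1) -> int:
    """Reduce the attack point's correction level when the following
    peak lies far from the attack position, i.e. the rise is gradual."""
    if inter_mod_max_loc - aloccode > max_distance:
        return alevcode - decrement
    return alevcode
```

An attack marked at section 2 whose following peak is at section 6 has its level reduced from 3 to 2. If the peak is at section 3, the rise is treated as abrupt and the level is unchanged.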
Further, in the above audio signal encoding method, the criterion for evaluating the difference between the gain-controlled preceding frame and the gain-controlled current frame is based on the ratio of the target peak value CurrFramePeak to the peak value PrevFramePeak of the preceding frame.
Further, in the above audio signal encoding method, the operation for coping with the difference is to insert a new correction point whose correction level alevcode is derived from the calculated ratio and whose correction position aloccode is equal to the index of the first segment of the frame.
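A minimal sketch of this frame-difference handling is given here. The ratio CurrFramePeak / PrevFramePeak and the insertion at the first segment come from the text; the decision threshold, the log2 mapping from ratio to alevcode, and the representation of a correction point as an `(aloccode, alevcode)` tuple are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[int, int]  # (aloccode, alevcode), a hypothetical representation


def insert_leading_point(points: List[Point], curr_frame_peak: float,
                         prev_frame_peak: float,
                         threshold: float = 2.0) -> List[Point]:
    """Insert a correction point at segment 0 when the two frames differ a lot."""
    ratio = curr_frame_peak / prev_frame_peak  # both peaks assumed positive
    if 1.0 / threshold < ratio < threshold:
        return list(points)  # levels are close enough: nothing to do
    alevcode = round(math.log2(ratio))  # assumed mapping from ratio to level
    return [(0, alevcode)] + list(points)
```

The sketch does not handle the case where a correction point already exists at segment 0; a fuller version would have to merge the two.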
Furthermore, in the above audio signal encoding method, the operation of limiting the level of the first correction point is based on the minimum correction level alevcode among all attack points of the frame preceding the current frame.
Also, in the above audio signal encoding method, the criterion for checking for the weak ending is based on the ratio of the target peak value CurrFramePeak to MaxPeak, the maximum absolute peak value over all segments that follow the last correction point.
Further, in the above audio signal encoding method, the operation of amplifying the ending is to insert a new correction point whose correction level alevcode is derived from the calculated ratio and whose correction position aloccode is equal to the index of the last segment of the frame.
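The weak-ending check and the amplification of the ending can be sketched in the same style. MaxPeak over the segments after the last correction point, the ratio CurrFramePeak / MaxPeak, and the insertion at the last segment follow the text; the threshold `weak_ratio`, the log2 mapping, and the tuple representation are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[int, int]  # (aloccode, alevcode), a hypothetical representation


def append_trailing_point(points: List[Point], segment_peaks: Sequence[float],
                          curr_frame_peak: float,
                          weak_ratio: float = 4.0) -> List[Point]:
    """Insert a correction point at the last segment when the ending is weak."""
    # Points are assumed to be sorted by aloccode.
    last_loc = points[-1][0] if points else -1
    tail = segment_peaks[last_loc + 1:]
    if not tail:
        return list(points)
    max_peak = max(abs(p) for p in tail)  # MaxPeak of the text
    if max_peak == 0:
        return list(points)  # a silent ending is left alone in this sketch
    ratio = curr_frame_peak / max_peak
    if ratio < weak_ratio:
        return list(points)  # ending is not weak
    alevcode = round(math.log2(ratio))  # assumed mapping from ratio to level
    return list(points) + [(len(segment_peaks) - 1, alevcode)]
```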
[0054]
BEST MODE FOR CARRYING OUT THE INVENTION Embodiments of the present invention are described below with reference to the drawings. In this specification, a MiniDisc recording/reproducing system for ATRAC3 is described as one embodiment, but the encoding method of the present invention can also be applied to other encoders that use gain control.
FIG. 1 is a block diagram showing the configuration of a MiniDisc recording/reproducing system according to an embodiment of the present invention. This embodiment is characterized in particular by the detailed internal configuration of the audio encoder 2.
In FIG. 1, the A/D converter 1 performs A/D conversion on an audio input signal to produce a digitized audio sample signal. The audio encoder 2 then compresses and encodes the converted audio sample signal using a gain control method described in detail later, and generates an ATRAC3 bitstream signal. Next, the MiniDisc recording device 3 modulates the ATRAC3 bitstream signal into a predetermined modulated recording signal and records that signal on the MiniDisc 4. On the playback side, the MiniDisc reproducing device 5 reproduces the recorded signal from the MiniDisc 4 and demodulates it to recover the ATRAC3 bitstream signal. The audio decoder 6 then decodes the compressed ATRAC3 bitstream signal into a digital audio signal while controlling its gain with an inverse gain control method that reverses the above gain control method. Finally, the D/A converter 7 performs D/A conversion on the digital audio signal and outputs an audio output signal.
The gain control and inverse gain control described above make it possible to suppress the pre-echo noise and post-echo noise that arise when a digital audio signal is encoded and decoded. In this embodiment the recording/reproducing system is configured to record the ATRAC3 bitstream signal on the MiniDisc 4, but the present invention is not limited to this; a recording/reproducing system or a transmission system may be configured using another recording medium or a transmission medium.
FIG. 2 is a block diagram showing the detailed configuration of the audio encoder 2 of FIG. 1. In FIG. 2, the audio encoder 2 is a typical transform coder that uses the ATRAC3 audio compression technique; within it, the gain detection unit 14 and the gain control units 15-1 to 15-4 have the detailed configurations that characterize this embodiment. ATRAC3 is basically a hybrid coding method that combines subband coding and transform coding, designed for a stream of audio sample signals sampled at 44.1 kHz.
As shown in FIG. 2, the audio encoder 2 uses a two-stage QMF filter 10 to split each frame of consecutive audio sample signals, containing 1024 samples in the time domain, into four subband signals, and then outputs, for each subband, a signal divided into frames of 256 sample signals each in the time direction (hereinafter called a subband frame). More specifically, the QMF splitting filter 10 comprises a low-pass filter 11 and a high-pass filter 12, which split the audio sample signal into a low-band signal and a high-band signal of equal width, and filter banks 13-1 to 13-4, which further halve the low-band and high-band signals in frequency. The output of the QMF filter 10 is therefore four downsampled subband frames, each containing 256 sample signals. On the time axis each has half the length of a transform block for the MDCT, and they carry the audio signals of the four frequency bands 0 to 5.5 kHz, 5.5 kHz to 11 kHz, 11 kHz to 16.5 kHz, and 16.5 kHz to 22 kHz, respectively.
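The band edges and frame lengths quoted above follow from simple arithmetic on the 44.1 kHz sampling rate. The sketch below only reproduces that arithmetic; it does not implement the QMF filters, and the function name `subband_layout` is an assumption.

```python
from typing import List, Tuple

SAMPLE_RATE_HZ = 44100   # sampling rate stated in the text
NUM_SUBBANDS = 4         # two QMF stages, each halving the band
FRAME_SAMPLES = 1024     # samples per input frame


def subband_layout(sample_rate_hz: int = SAMPLE_RATE_HZ,
                   num_subbands: int = NUM_SUBBANDS,
                   frame_samples: int = FRAME_SAMPLES
                   ) -> Tuple[List[Tuple[float, float]], int]:
    """Return the (low, high) edges of each subband in Hz and the frame length."""
    nyquist_hz = sample_rate_hz / 2
    width_hz = nyquist_hz / num_subbands
    edges = [(k * width_hz, (k + 1) * width_hz) for k in range(num_subbands)]
    # Critical downsampling: each subband keeps 1/num_subbands of the samples.
    samples_per_subband_frame = frame_samples // num_subbands
    return edges, samples_per_subband_frame
```

Each band is 5512.5 Hz wide, which the text rounds to 5.5 kHz, and each subband frame holds 1024 / 4 = 256 samples.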
A circuit unit that performs joint stereo processing may additionally be provided after each of the filter banks 13-1 to 13-4. Such a joint stereo circuit unit converts, for example, an audio signal input as a stereo signal into the average of the right-channel and left-channel signals, gain data for the right-channel signal, and gain data for the left-channel signal, and can thereby achieve further compression of the audio signal.
Next, a correction function is generated and gain control is performed on each subband frame. Here, the gain coefficients (also called correction function values) that define the amount of gain control (that is, the amplification factor or attenuation factor) for each audio sample signal in a frame, collected frame by frame, are called a "correction function". The signal of each subband frame output from the filter banks 13-1 to 13-4 is input both to the gain detection unit 14 and to the gain control units 15-1 to 15-4. The gain detection unit 14, which comprises a buffer memory 14a and a correction function calculation unit 14b, calculates and outputs a correction function for each subband frame; each of the gain control units 15-1 to 15-4 then performs gain control on its subband frame using the generated correction function. The following description, given with reference to FIG. 4, focuses on the subband frame of the 0 to 5.5 kHz band output from the filter bank 13-1.
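As a rough illustration of a correction function, the sketch below expands one gain coefficient per segment into one value per sample and multiplies it into a subband frame. The description later divides a subband frame into 32 segments, which gives 8 samples per segment for a 256-sample frame; the piecewise-constant expansion and the function names are assumptions, and a practical encoder may smooth the transitions between segments.

```python
from typing import List, Sequence

SAMPLES_PER_FRAME = 256  # subband frame length stated in the text


def expand_correction_function(segment_gains: Sequence[float],
                               samples_per_frame: int = SAMPLES_PER_FRAME
                               ) -> List[float]:
    """Repeat each per-segment gain so that there is one value per sample."""
    n_segments = len(segment_gains)
    if n_segments == 0 or samples_per_frame % n_segments:
        raise ValueError("frame length must be a multiple of the segment count")
    per_segment = samples_per_frame // n_segments
    return [g for g in segment_gains for _ in range(per_segment)]


def apply_gain_control(frame: Sequence[float],
                       correction_function: Sequence[float]) -> List[float]:
    """Multiply every sample by its correction function value."""
    if len(frame) != len(correction_function):
        raise ValueError("frame and correction function lengths differ")
    return [x * g for x, g in zip(frame, correction_function)]
```

With 32 segment gains of which the first 8 are 4.0 and the rest 1.0, the first 64 samples of the frame are amplified by 4 and the remaining 192 are left unchanged.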
FIG. 4 is a block diagram showing the generation of transform blocks by the gain detection unit 14 and the gain control unit 15-1, which precede the MDCT processing unit 16-1. Here, within the input audio signal, which consists of a stream of audio sample signals, the n-th audio frame in the i-th subband (i = 0, 1, 2, 3) is denoted "subband frame [i][n]". In FIG. 4 the subband index is i = 0. The correction function calculation unit 14b is shown as 14b-a when it calculates the correction function of the (n-1)-th subband frame [i][n-1], as 14b-b when it calculates that of the n-th subband frame [i][n], and as 14b-c when it calculates that of the (n+1)-th subband frame [i][n+1]; these are one and the same correction function calculation unit 14b, drawn separately for convenience.
In FIG. 4, the n-th subband frame [i][n] to be gain controlled is stored in the buffer memory 14a together with the following subband frame [i][n+1] so that the correction function can be calculated. The correction function calculation unit 14b-b calculates the correction function MF[i][n] for subband frame [i][n] from the two subband frames [i][n] and [i][n+1] and the peak value PrevFramePeak[i] of the gain-controlled subband frame [i][n-1]. The multiplier MP1b then multiplies subband frame [i][n] by the correction function MF[i][n] and outputs the resulting subband frame [i][n] to the cascade connection operator CO1 and the multiplier MP2b. At the same time, the resulting subband frame [i][n] is sent to the correction function calculation unit 14b-c as the peak value PrevFramePeak[i], for use in the calculation for the next subband frame [i][n+1]. Likewise, the correction function calculation unit 14b-a calculates the correction function MF[i][n-1] for subband frame [i][n-1] from the subband frames [i][n-1] and [i][n] and the peak value PrevFramePeak[i] of the gain-controlled subband frame [i][n-2]. The multiplier MP1a then multiplies subband frame [i][n-1] by the correction function MF[i][n-1] and outputs the resulting subband frame [i][n-1] to the multiplier MP2a. At the same time, the resulting subband frame [i][n-1] is sent to the correction function calculation unit 14b-b as the peak value PrevFramePeak[i], for use in the calculation for the next subband frame [i][n]. In the same way, the correction function calculation unit 14b-c calculates the correction function MF[i][n+1] for subband frame [i][n+1] from the subband frames [i][n+1] and [i][n+2] and the peak value PrevFramePeak[i] of the gain-controlled subband frame [i][n]. The multiplier MP1c then multiplies subband frame [i][n+1] by the correction function MF[i][n+1] and outputs the resulting subband frame [i][n+1] to the cascade connection operator CO2 and the multiplier MP2c.
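The data dependencies in this paragraph (one frame of look-ahead, plus a peak value fed back from the previous gain-controlled frame) can be sketched as a loop. The actual calculation of the correction function is not reproduced; it is passed in as a hypothetical callback `compute_mf`, and taking the maximum absolute sample of the gain-controlled frame as PrevFramePeak is an assumption consistent with the later statement that PrevFramePeak is the maximum value of the preceding subband frame.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]
# Hypothetical signature: (current frame, next frame, PrevFramePeak) -> MF
ComputeMF = Callable[[Frame, Frame, float], Sequence[float]]


def run_gain_control(frames: Sequence[Frame], compute_mf: ComputeMF,
                     initial_prev_peak: float = 0.0) -> List[List[float]]:
    """Gain-control each frame of one subband that has a following frame."""
    prev_frame_peak = initial_prev_peak
    controlled_frames: List[List[float]] = []
    for n in range(len(frames) - 1):  # the last frame has no look-ahead yet
        mf = compute_mf(frames[n], frames[n + 1], prev_frame_peak)
        controlled = [x * g for x, g in zip(frames[n], mf)]  # multiplier MP1
        controlled_frames.append(controlled)
        # Feed the peak of the gain-controlled frame back for frame n + 1.
        prev_frame_peak = max(abs(x) for x in controlled)
    return controlled_frames
```

Because the peak is measured after gain control, each correction function depends on the decisions made for the frame before it, which is why the frames must be processed in order.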
Prior to the modified discrete cosine transform (MDCT) processing in the MDCT processing unit 16-1, two consecutive subband frames are concatenated while the continuity of the correction function at the subband frame boundary is guaranteed. The multiplier MP2a normalizes the subband frame [i][n-1], which has been gain controlled by the correction function MF[i][n-1] and output from the multiplier MP1a, by multiplying it by the gain coefficient (that is, the correction function value) MF[i][n][0] of the first segment of the correction function MF[i][n] for the next subband frame [i][n], and outputs the resulting subband frame [i][n-1] to the cascade connection operator CO1. (In this embodiment, a segment, also called a partition, is one of the 32 segments into which one subband frame is divided.) The cascade connection operator CO1 then concatenates the subband frame [i][n-1] output from the multiplier MP2a with the subband frame [i][n] output from the multiplier MP1b and outputs the result to the MDCT processing unit 16-1. Similarly, the multiplier MP2b normalizes the subband frame [i][n] output from the multiplier MP1b by multiplying it by the gain coefficient MF[i][n+1][0] of the first segment of the correction function MF[i][n+1] for the next subband frame [i][n+1], and outputs the result to the cascade connection operator CO2. The cascade connection operator CO2 then concatenates the subband frame [i][n] output from the multiplier MP2b with the subband frame [i][n+1] output from the multiplier MP1c and outputs the result to the MDCT processing unit 16-1. That is, the ATRAC3 standard requires the value of the correction function MF[i][n] to be equal to 1 (that is, a gain of 1) at the end point of the subband frame [i][n]. Therefore, when the correction function MF[i][n] is joined to the next correction function MF[i][n+1], multiplying by the gain coefficient MF[i][n+1][0] of the first segment of the next correction function normalizes the two to the same level at the boundary between subband frames [i][n] and [i][n+1]. The two joined subband frames then form a complete transform block for the MDCT, in which adjacent blocks overlap by 50%, and are output to the MDCT processing unit 16-1.
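The three elements of FIG. 4 described above can be sketched as plain list operations: MP1 applies the correction function, MP2 rescales the earlier frame by the first gain coefficient of the later frame, and CO joins the two frames into one 512-sample transform block. The correction functions are taken here as per-sample lists, so `curr_mf[0]` stands in for MF[i][n][0]; that representation and the function names are assumptions.

```python
from typing import List, Sequence


def mp1(frame: Sequence[float], mf: Sequence[float]) -> List[float]:
    """Gain control: multiply the frame by its own correction function."""
    return [x * g for x, g in zip(frame, mf)]


def mp2(controlled_prev: Sequence[float], next_first_gain: float) -> List[float]:
    """Normalize the earlier frame by the first gain of the later frame."""
    return [x * next_first_gain for x in controlled_prev]


def co(prev_part: Sequence[float], curr_part: Sequence[float]) -> List[float]:
    """Cascade connection: the earlier frame followed by the later frame."""
    return list(prev_part) + list(curr_part)


def build_transform_block(prev_frame: Sequence[float], prev_mf: Sequence[float],
                          curr_frame: Sequence[float], curr_mf: Sequence[float]
                          ) -> List[float]:
    """Form one MDCT transform block from two consecutive subband frames."""
    prev_controlled = mp1(prev_frame, prev_mf)
    curr_controlled = mp1(curr_frame, curr_mf)
    prev_normalized = mp2(prev_controlled, curr_mf[0])
    return co(prev_normalized, curr_controlled)
```

If the earlier correction function ends at 1 and the later one starts at 4, the earlier frame is scaled by 4 as a whole, so the overall gain is continuous across the frame boundary.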
The multipliers MP1a, MP1b, MP1c, MP2a, MP2b, and MP2c and the cascade connection operators CO1 and CO2 described above with reference to FIG. 4 are, in this embodiment, included in the gain control unit 15-1. FIG. 4 shows only the processing for the first band, but the processing for the second to fourth bands is carried out in the same way.
Referring again to FIG. 2, the MDCT processing unit 16-1 performs the modified discrete cosine transform (MDCT) on the signal of each set of at least two subband frames that have been gain controlled in one frequency band and concatenated into a transform block for MDCT processing, and outputs the resulting spectral information signal to the tone component encoder 17-1. The tone component encoder 17-1, which comprises a tone component detection unit, a scale factor processing unit, a bit allocation unit, and the like, separates the MDCT-processed information signal into a tone component signal and a non-tonal spectrum signal and outputs them to the tone component quantizer 18-1 and the spectrum quantizer 19-1, respectively. The tone component quantizer 18-1 and the spectrum quantizer 19-1 encode (quantize) the tone component signal and the non-tonal spectrum signal separately. The bitstream multiplexer 20, which comprises a Huffman encoder and a multiplexer, then compresses the signals encoded by the tone component quantizer 18-1 and the spectrum quantizer 19-1 using a plurality of (for example, 14) Huffman tables. It then multiplexes the tone component and spectrum information from the tone component encoders 17-1 to 17-4, each encoded tone component, each encoded non-tone component, and side information that conforms to the ATRAC3 standard and includes the correction function data output from the gain detection units 14-1 to 14-4 (which, as described in detail later, comprise data on the correction positions within the subband frame, data on the correction levels, and data on the number of change points), thereby obtaining and outputting an ATRAC3 bitstream signal.
FIG. 3 is a block diagram showing the detailed configuration of the audio decoder 6 of FIG. 1. First, the bitstream demultiplexer 21 separates the ATRAC3 bitstream signal, which has been read from a recording medium such as the MiniDisc 4 or from a transmission medium and then demodulated, into a tone component code string and a spectrum code string (non-tone component code string) for each subband, and side information including the correction function data. The tone component dequantizer 22-1 and the spectrum dequantizer 23-1 then decode (dequantize) the respective code strings into spectral coefficients, and the tone component decoding unit 24-1 combines the separately decoded tone component signal and non-tonal spectrum signal. The same applies to the tone component code strings and spectrum code strings of the other subbands. Next, the inverse MDCT processing units 25-1 to 25-4 generate time-domain subband frames by the inverse MDCT for each frequency band. Based on the correction function data in the side information input from the bitstream demultiplexer 21, the inverse gain control units 26-1 to 26-4 then apply inverse gain control to each subband frame with a correction function that is the inverse of the one used for gain control; that is, they attenuate the portions that were amplified by gain control and amplify the portions that were attenuated. They output the inverse-gain-controlled subband frame of each band to the QMF synthesis filter 27. Finally, the QMF synthesis filter 27 combines the subband frames of the individual bands (frequency bands) and outputs the synthesized digital audio signal.
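The relationship between gain control and inverse gain control can be sketched as a multiplication on the encoder side and a division on the decoder side. This ignores the MDCT, quantization, and the boundary normalization between frames, all of which sit between the two steps in the actual system; the function names are assumptions.

```python
from typing import List, Sequence


def gain_control(frame: Sequence[float], mf: Sequence[float]) -> List[float]:
    """Encoder side: multiply each sample by its correction function value."""
    return [x * g for x, g in zip(frame, mf)]


def inverse_gain_control(frame: Sequence[float],
                         mf: Sequence[float]) -> List[float]:
    """Decoder side: undo the gain with the inverse correction function."""
    if any(g == 0 for g in mf):
        raise ValueError("correction function values must be non-zero")
    # Amplified portions are attenuated and attenuated portions are amplified.
    return [x / g for x, g in zip(frame, mf)]
```

Quantization noise introduced between the two steps is divided by the same gain, so it is reduced in exactly those portions that the encoder amplified; this is the mechanism by which the pre-echo and post-echo noise is suppressed.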
As described above, the MiniDisc recording/reproducing system of FIG. 1 is an audio signal encoding and decoding system comprising the audio encoder 2, which is an audio signal encoding apparatus, and the audio decoder 6, which is an audio signal decoding apparatus. The audio encoder 2 obtains an encoded bitstream signal by comprising: the gain detection unit 14, which calculates a correction function for each frame based on the input audio signal; the gain control units 15-1 to 15-4, which apply gain control to the audio signal in accordance with the calculated correction function; the MDCT processing units 16-1 to 16-4, which perform orthogonal transform processing on the gain-controlled audio signal for each pair of concatenated, mutually adjacent frames; and the tone component encoders 17-1 to 17-4, the tone component quantizers 18-1 to 18-4, and the spectrum quantizers 19-1 to 19-4, which perform encoding processing on the orthogonally transformed signal. The audio decoder 6 obtains an audio signal consisting of a plurality of frames by comprising: the tone component dequantizers 22-1 to 22-4, the spectrum dequantizers 23-1 to 23-4, and the tone component decoders 24-1 to 24-4, which decode the bitstream signal encoded by the audio encoder 2; and the inverse MDCT processing units 25-1 to 25-4, which perform inverse orthogonal transform processing on the decoded signal. The audio signal decoding apparatus further comprises the inverse gain control units 26-1 to 26-4, which apply inverse gain control to the obtained audio signal using a correction function that is the inverse of the above correction function, and which obtain and output the inverse-gain-controlled audio signal.
The invention disclosed in this specification improves on and extends the invention entitled "Audio signal encoding method and apparatus, and encoding and decoding system" disclosed by the present applicant in Japanese Patent Application No. 2001-188067 (hereinafter called "the invention of the prior application"). Whereas the invention of the prior application is computationally economical, the present invention raises the computational cost but provides correspondingly better quality. The key differences between the two inventions, and the points improved and extended in the present invention, are listed below.
(1) In its attack signal criterion, the invention of the prior application used the peak value MaxPeak, which is the maximum value of each of the last eight segments of the subband frame preceding the current subband frame to be processed. The present invention uses a single peak value PrevFramePeak, which is the maximum value of the preceding subband frame.
(2) The invention of the prior application made no use of any segment from the future subband frame that follows the current subband frame. Instead, it addressed the problem of attack signals at frame boundaries (described later) by shifting the last segment of the current subband frame to produce a first segment of the future subband frame. That first segment of the future subband frame was called the "pseudo-future" segment. The present invention uses the peak value MaxPeak, which is the maximum value of each of the first eight segments of the future subband frame.
(3) The invention of the prior application used a fixed release signal criterion. The present invention uses a variable release signal criterion.
(4) For each correction point to be gain controlled, the invention of the prior application calculated a correction level as soon as the point was found. Each newly calculated correction level was added to all preceding correction levels of the subband frame. The present invention first marks all correction points, initially giving each one a correction level that serves only as an indicator distinguishing its type (either the start position of the rise of an "attack signal" or the end position of the fall of a "release signal"). The desired signal level for the subband frame is then calculated. Finally, the correction levels are calculated on the basis of that desired signal level.
(5) The present invention incorporates special criteria for checking for natural transitions of signal strength in spoken voice (including gradual increases and gradual decreases in signal strength; hereinafter called "natural ascent" and "natural descent"), and adjusts the correction levels to suit those natural transitions. The invention of the prior application does not do this.
(6) The invention of the prior application does not check whether the signal level of the gain-controlled current subband frame differs greatly from the signal level of the gain-controlled preceding subband frame. The present invention performs this check and corrects the situation as appropriate.
(7) The present invention limits the correction level of the first segment of the current subband frame to prevent excessive attenuation of the preceding subband frame. The invention of the prior application does not do this.
(8) The invention of the prior application does not control post-echo directly by checking whether the ending of the gain-controlled current subband frame is weak. The present invention performs this check and corrects the situation as appropriate.
Max Peak was used. The present invention uses a single peak value PrevFramePeak that is the maximum of the preceding subband frame. (2) The invention of the prior application does not use any division from the future subband frame following the current subband frame. It addressed the problem of attack signals at inter-frame boundaries (discussed below) by shifting the last partition of the current subband frame to produce the first partition of the future subband frame. The first section of the future subband frame was called the "pseudo future" section. The present invention utilizes the peak value MaxPeak, which is the maximum value of each of the first eight sections of the future subband frame. (3) The invention of the prior application used a fixed release signal reference. The present invention uses a variable release signal reference. (4) In the invention of the prior application, for each correction point to be gain controlled, the correction level was calculated each time they were found. Each new correction level calculated was added to all previous correction levels of the subband frame in question. The present invention marks all the correction points and first gives them a correction level that is merely an indicator to determine the type (either the start position of the rising edge of the "attack signal" or the falling edge of the "release signal"). End position). The desired signal level for that subband frame is then calculated. Finally, a correction level is calculated based on the desired signal level. (5) The present invention includes natural transitions of signal strength in uttered speech (including a gradual increase and a gradual decrease in signal strength. Hereinafter, those are referred to as "natural ascent" and "natural descent". ".") And incorporate a special criterion to adjust the correction level to the natural transition. The invention of the earlier application does not do this. 
(6) The invention of the prior application does not check whether the signal level of the current gain-controlled subband frame and the signal level of the preceding gain-controlled subband frame are significantly different. The present invention does this and corrects the situation as appropriate. (7) The present invention limits the modification level of the first partition of the current subband frame to prevent excessive attenuation of the preceding subband frame. The invention of the earlier application does not do this. (8) The invention of the prior application does not directly control the post echo by checking whether or not the end of the current subband frame whose gain is controlled is weak. The present invention does this and corrects the situation as appropriate.
The features of the correction function calculation method according to the embodiment of the present invention are described below. The correction function is generated in accordance with the ATRAC3 standard. The 256 audio samples of each subband, which are spectral components of the input signal, are divided into 32 sections (partitions), each grouping 8 consecutive audio samples. For any section in a subband frame, a correction point requiring gain control, namely the start position of the rise of an attack signal (hereinafter "attack point") or the end position of the fall of a release signal (hereinafter "release point"), can be specified by setting the variables below. FIG. 5 shows how several correction points are specified by setting these variables and how the correction function MF[i][n] is generated. adjust_num[i] is defined as the total number of correction points in the subband frame of subband i; in this example adjust_num[i] = 4. The correction position aloccode[i][j] is the index (0, 1, ..., 31) of the section of the j-th correction point in the subband frame of subband i (in the example of FIG. 5, j = 0, 1, ..., adjust_num[i] - 1, i.e. up to 3). The correction level alevcode[i][j] is a gain coefficient representing the magnitude of the correction at the correction point corresponding to aloccode[i][j], expressed as the exponent of a power of 2. The ATRAC3 standard allocates 3, 5 and 4 bits to adjust_num[i], aloccode[i][j] and alevcode[i][j], respectively.
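The frame layout described above can be sketched in Python. This is a minimal illustration, not the patent's implementation: the function names section_peaks and gain_control_bits are hypothetical, and the bit count assumes the three fields are simply concatenated, which the text does not state.

```python
SAMPLES_PER_FRAME = 256
SECTIONS_PER_FRAME = 32
SAMPLES_PER_SECTION = SAMPLES_PER_FRAME // SECTIONS_PER_FRAME  # 8 samples

# Bit widths given in the text for the ATRAC3 gain-control fields.
ADJUST_NUM_BITS = 3
ALOCCODE_BITS = 5
ALEVCODE_BITS = 4


def section_peaks(samples):
    """Return the peak absolute sample value of each of the 32 sections."""
    if len(samples) != SAMPLES_PER_FRAME:
        raise ValueError("expected 256 samples per subband frame")
    return [
        max(abs(s) for s in samples[k * SAMPLES_PER_SECTION:(k + 1) * SAMPLES_PER_SECTION])
        for k in range(SECTIONS_PER_FRAME)
    ]


def gain_control_bits(adjust_num):
    """Bits for adjust_num plus that many (position, level) pairs.

    Assumes plain concatenation of the fields (an assumption, not from the text).
    """
    if not 0 <= adjust_num < 2 ** ADJUST_NUM_BITS:
        raise ValueError("adjust_num does not fit in 3 bits")
    return ADJUST_NUM_BITS + adjust_num * (ALOCCODE_BITS + ALEVCODE_BITS)
```

The 5-bit position field is exactly wide enough to address the 32 sections, and the 3-bit count field caps the number of correction points at 7.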
When an attack signal is detected, the position of the section in which the signal amplitude increases sharply is marked as the correction point (attack point) for that attack signal, and the relatively small-amplitude portion from the preceding correction point up to the attack point is amplified. When a release signal is detected, the position of the first section of the portion in which small amplitude continues after the signal amplitude has decreased sharply is marked as the correction point (release point) for that release signal, and the relatively large-amplitude portion from the preceding correction point up to the release point is attenuated. According to the ATRAC3 standard, the correction levels alevcode[i][j] are all positive integers and the value 4 is defined to correspond to "no gain control", so the correction actually applied to the audio samples, i.e. the magnitude of the amplification or attenuation, is 2^(alevcode[i][j] - 4). The transition between two different correction levels is logarithmic.
For example, referring to FIG. 5, the first correction point (j = 0) is an attack point with correction position aloccode[i][0] = 4 and correction level alevcode[i][0] = 5. For this first correction point, the correction function is extended so that its effect reaches the beginning of the subband frame. Accordingly, the amplitude from the first section (written as the 0th section in this embodiment) to the fourth section (written as the 3rd section in this embodiment) is amplified by a factor of 2. Similarly, the second correction point (j = 1) is a release point with correction position aloccode[i][1] = 13 and correction level alevcode[i][1] = 2. This correction level acts on the sections between the preceding first correction point and the second correction point, so the amplitude of those sections is multiplied by 2^-2, i.e. attenuated to 1/4. At the final point following the fourth correction point (j = 3), the correction level must be 1. The reason, as explained with reference to the figures, is to make it easy to join the correction function with the correction function MF[i][n+1] of the next subband frame: continuity of the correction function is then achieved simply by multiplying the whole of MF[i][n] by the correction function value (i.e. the correction level) of the first section of MF[i][n+1].
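The example above can be reproduced with a short Python sketch. It uses only the two correction points whose values are given in the text (positions 4 and 13, levels 5 and 2). The function name correction_function is hypothetical, and the sketch uses a step function, omitting the logarithmic transition between levels because the text does not detail it.

```python
SECTIONS_PER_FRAME = 32
NO_GAIN_CONTROL_LEVEL = 4  # alevcode value that means "no gain control"


def correction_function(aloccode, alevcode):
    """Return one gain value per section of a single subband frame.

    Each level applies to the sections from the preceding correction point
    up to, but not including, its own position. Sections from the last
    correction point onward keep a gain of 1.
    """
    gains = [1.0] * SECTIONS_PER_FRAME
    start = 0
    for loc, lev in zip(aloccode, alevcode):
        for k in range(start, loc):
            gains[k] = 2.0 ** (lev - NO_GAIN_CONTROL_LEVEL)
        start = loc
    return gains


# First two correction points of the example: attack at 4, release at 13.
mf = correction_function([4, 13], [5, 2])
```

With these inputs, sections 0 to 3 are amplified by 2, sections 4 to 12 are attenuated to 1/4, and sections 13 to 31 are left at unity gain.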
Accordingly, the correction function calculation unit 14b calculates the correction function value for each section between the section containing the start position of an attack signal and the correction point preceding that section such that the absolute value of the maximum peak between them, InterModMaxPeak[i][j], becomes equal to a predetermined target peak value CurrFramePeak[i]. Likewise, it calculates the correction function value for each section between the section containing the start position of a release signal and the correction point preceding that section such that the absolute value of the maximum peak between them, InterModMaxPeak[i][j], becomes equal to the target peak value CurrFramePeak[i]. In the flowcharts described later, this calculation of correction function values is performed in the correction level alevcode calculation of step S23.
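One way to read this requirement is that the gain 2^(alevcode - 4) should bring InterModMaxPeak as close as possible to CurrFramePeak. The Python sketch below follows that reading. Rounding to the nearest power of 2 and clipping to the 4-bit range 0 to 15 are assumptions made for illustration, and the function name correction_level is hypothetical; the patent's step S23 may differ.

```python
import math

NO_GAIN_CONTROL_LEVEL = 4  # alevcode value that means "no gain control"
MAX_LEVEL = 15             # largest value that fits in the 4-bit alevcode field


def correction_level(inter_mod_max_peak, curr_frame_peak):
    """Pick the alevcode whose gain best maps the region peak onto the target.

    Rounding and clipping are illustrative assumptions, not from the text.
    """
    if inter_mod_max_peak <= 0 or curr_frame_peak <= 0:
        return NO_GAIN_CONTROL_LEVEL
    exponent = round(math.log2(curr_frame_peak / inter_mod_max_peak))
    return max(0, min(MAX_LEVEL, exponent + NO_GAIN_CONTROL_LEVEL))
```

A region whose peak is half the target receives level 5 (gain 2), and a region whose peak is four times the target receives level 2 (gain 1/4), matching the levels used in the FIG. 5 example.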
Because audio encoding is performed one subband frame at a time, an attack signal occurring at the very beginning of a subband frame can be handled only if gain control is applied to the preceding subband frame. This is known as the "boundary attack signal" problem: while the "current" subband frame is being processed, an extra subband frame must be buffered so that the attack signal in the first section of the "future" subband frame can be measured. For more effective release detection, the present invention uses the first eight sections of the future subband frame. Detection of attack signals and release signals is performed from the first (0th) section to the last (31st) section of the subband frame.
To detect the start position of the rise of an attack signal in a given section, the peak value MaxPeak of that section, i.e. the absolute value of the peak of the audio samples in the section, is compared with the peak value MaxPeak of each of the eight sections preceding it. The section is classified as an attack signal only when its peak value MaxPeak exceeds the largest peak value MaxPeak of the preceding eight sections by a predetermined threshold ratio (2 in this embodiment). When the section is near the beginning of the subband frame and fewer than eight sections precede it, the absolute peak value of the already gain-controlled subband frame preceding the current subband frame (peak value PrevFramePeak[i]) must be taken into account. Even if the ratio requirement is met, the section is treated as a valid attack point only if its separation distance (number of sections) from the preceding attack point is sufficiently large, because the benefit of leaving more bits for quantization outweighs the benefit of gain control on a very fine scale. This embodiment adopts a separation distance of 1 section.
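The attack test just described can be sketched in Python as follows. The function name is_attack_point and its parameters are hypothetical. Two details are assumptions where the text is silent: the comparison is strict, and near the frame start PrevFramePeak is simply added to the shortened look-back window.

```python
ATTACK_RATIO = 2     # threshold ratio used in the embodiment
LOOKBACK = 8         # number of preceding sections compared
MIN_SEPARATION = 1   # sections required since the previous attack point


def is_attack_point(peaks, k, prev_frame_peak, last_attack=None):
    """Decide whether section k of the current subband frame starts an attack.

    peaks holds the per-section MaxPeak values of the current frame.
    """
    history = list(peaks[max(0, k - LOOKBACK):k])
    if len(history) < LOOKBACK:
        # Fewer than eight sections precede k: also consider the peak of
        # the already gain-controlled preceding frame.
        history.append(prev_frame_peak)
    if peaks[k] <= ATTACK_RATIO * max(history):
        return False
    return last_attack is None or k - last_attack >= MIN_SEPARATION
```

For a frame that is flat at 10 except for a jump to 50 in section 12, the function flags section 12 and rejects section 13, whose look-back window now contains the louder section.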
To detect a release signal, which generally follows an attack signal, InterModMaxPeak[i][j], the largest peak value MaxPeak between the last correction point before the current section and the current section, must exceed the largest peak value MaxPeak of the eight sections from the current section onward by a threshold ratio. The threshold depends on whether the preceding correction point is an attack point or a release point. If it is an attack point, a larger threshold is used (attenuation factor attn_ftr = 4 in this embodiment); if it is a release point, a smaller threshold is used (attn_ftr = 3 in this embodiment). As FIG. 6 shows, a release signal, unlike an attack signal, tends to fall steeply at first and then quickly settle to a more gradual rate. The variable threshold divides (segments) the release signal more effectively into regions to which different levels of gain control can be applied. As a result, post-echo is suppressed better.
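The variable release threshold can be sketched in Python as below. The function name is_release_point and its parameters are hypothetical. The sketch assumes that the eight-section look-ahead window starts at the current section and that the peak list already includes the first eight sections of the future frame; both are readings of the text rather than confirmed details.

```python
LOOKAHEAD = 8
ATTN_AFTER_ATTACK = 4   # attn_ftr when the preceding point is an attack point
ATTN_AFTER_RELEASE = 3  # attn_ftr when the preceding point is a release point


def is_release_point(peaks, k, inter_mod_max_peak, prev_point_is_attack):
    """Decide whether section k is a release point.

    peaks holds the 32 section peaks of the current frame followed by the
    first eight section peaks of the future frame (40 values in total).
    """
    attn_ftr = ATTN_AFTER_ATTACK if prev_point_is_attack else ATTN_AFTER_RELEASE
    following = peaks[k:k + LOOKAHEAD]
    return inter_mod_max_peak > attn_ftr * max(following)
```

A drop from 100 to 30 is not a release directly after an attack point (100 is not more than 4 x 30), but it is one after an earlier release point (100 exceeds 3 x 30), which is how the smaller threshold splits a decaying tail into several regions.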
For every attack point, a value of 1 is assigned to the correction level alevcode[i][j] as its indicator (or marker). For every release point, a negative value indicating the magnitude of the preceding peak value InterModMaxPeak[i][j] is assigned to the correction level alevcode[i][j] as its indicator. The correction levels alevcode[i][j] assigned here are not the values actually used for gain control.
After the last (31st) section has been processed, the target peak value CurrFramePeak[i] is calculated on the basis of all the correction levels found so far. This is the "desired" signal level toward which every signal region between the correction points of the current frame is amplified or attenuated.
Before the actual correction levels are calculated, the signal is checked for "natural descent", i.e. whether the signal strength shows signs of the natural decay (fading) characteristic of spoken speech. If the preceding subband frame shows little or no sign of an attack and the correction points of the current subband frame consist only of release points, the frame is classified as a "natural descent". For a valid "natural descent", all release points are removed.
Next, for the "indicator" correction levels assigned earlier, the actual correction levels alevcode[i][j] are calculated for all correction points using the target peak value CurrFramePeak[i]. Thus, the correction function calculation unit 14b identifies, on the basis of the audio signal, the section containing the start position of an attack signal, which is a portion of the signal containing a sharp rise in sound, as a correction point; identifies, on the basis of the audio signal, the section containing the end position of a release signal, which is a portion of the signal containing a sharp fall in sound, as a correction point; calculates, on the basis of the audio signal, the target peak value of the current frame to be processed that is desired when gain control is performed according to the correction function; and calculates the correction function, consisting of a correction function value for each section of the current frame, on the basis of the identified section containing the start position of the attack signal, the identified section containing the end position of the release signal, and the calculated target peak value.
When the signal exhibits the "natural ascent" characteristic of spoken speech, i.e. when the signal strength increases gradually, the correction level of the attack point is adjusted. As shown in FIG. 7(a), when the signal amplitude increases slowly, the section in which the signal reaches its peak is well separated from the preceding attack point. If the correction level (amplification) calculated for a signal with this speech-like characteristic is applied as is, artifacts may be produced, as shown in the figure. Therefore, when a valid "natural ascent" exists, the processing of this embodiment decreases the correction level alevcode[i][j] by 1 (FIG. 7(c)). In this embodiment, for an attack signal to be classified as a "natural ascent", a separation distance of 5 sections is required between the correction position aloccode[i][j] and the subsequent peak position InterModMaxLoc[i][j+1].
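The natural-ascent adjustment reduces to a single comparison, sketched below in Python. The function name adjust_for_natural_ascent is hypothetical, and treating the 5-section separation as a minimum ("at least 5") is an assumption about how the criterion is applied.

```python
NATURAL_ASCENT_SEPARATION = 5  # sections between attack point and next peak


def adjust_for_natural_ascent(alevcode_j, aloccode_j, next_peak_loc):
    """Lower the attack point's correction level by 1 for a slow rise.

    next_peak_loc plays the role of InterModMaxLoc[i][j+1], the section in
    which the signal reaches its peak after the attack point.
    """
    if next_peak_loc - aloccode_j >= NATURAL_ASCENT_SEPARATION:
        return alevcode_j - 1
    return alevcode_j
```

Lowering the level by 1 halves the amplification applied ahead of the attack point, since each level step corresponds to a factor of 2.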
Because the current subband frame is concatenated with both the preceding and the following subband frame for the MDCT, the signal strength of the gain-controlled current subband frame, which can be represented by the target peak value CurrFramePeak[i], needs to be comparable to the signal strength of the gain-controlled preceding subband frame, represented by the peak value PrevFramePeak[i]. If one value is much larger than the other, a special correction point is introduced in the first section of the current subband frame. As an exception, this correction is not performed when gain control is applied to neither subband frame; in that case the difference between the signal levels of the two subband frames is regarded as "natural".
The correction level for the first section of the correction function of each subband frame (shown in FIG. 4 as the correction function value MF[i][n][0]) also extends over the entire preceding subband frame, so it must not cause excessive attenuation of the preceding subband frame. Therefore, in addition to the usual clipping applied to the other correction levels to make them meet the criteria, the correction level of the first section undergoes further clipping derived from the minimum correction level among all attack points of the preceding subband frame.
The correction level alevcode[i][j] of each correction point corrects only the signal between the correction position aloccode[i][j] of that point and the correction position aloccode[i][j-1] of the preceding correction point. Consequently, the signal that comes after the last correction point in time may be output from the gain control units 15-1 to 15-4 uncorrected even when it requires gain control (FIGS. 8(a) and 8(b)). In practice, this is a very common cause of post-echo. To avoid this problem, the largest peak value MaxPeak over all sections from the last correction point onward is checked against the magnitude of the target peak value CurrFramePeak[i]; if there is a significant difference, a new correction point is introduced at the last section, as in FIG. 8(c).
The correction function calculation method according to the embodiment of the present invention is described below using the gain control method of ATRAC3 as an example, but the method can also be generalized to encoders that perform gain control processing based on other gain control methods. FIGS. 9 to 18 show flowcharts of the correction function calculation process executed by the correction function calculation unit 14b.
In step S1 of FIG. 9, the sample signals of the sections of two subband frames, namely the sections of the subband frame to be processed next and the first eight sections of the subband frame following it, are first read from the buffer memory 14a. Together with the peak value PrevFramePeak[i] of the already gain-controlled preceding subband frame, they serve as the subband frame data used to calculate the correction function for the subband frame being processed. The buffer memory 14a is assumed to store the audio samples to be encoded, which are input sequentially in time series from the filter banks 13-1 to 13-4. Next, in step S2, i is initialized as the subband index in order to start the main loop. The index i is incremented in step S32. The processing is repeated until the index i exceeds the number of subbands max_band, as determined in step S33. The number of subbands max_band denotes the largest index (i.e. 3) of the subbands i (i = 0, 1, 2, 3) to which the audio encoder 2 applies gain control.
Step S3 is an initialization process. Referring to the subroutine of FIG. 14, step S41 first checks the condition that determines the number of correction points prev_adjust_num[i] of the preceding subband frame. This number indicates whether a correction existed in the preceding subband frame of the same subband i. If the number of correction points adjust_num[i] of the preceding subband frame is 0, or if only one correction point exists and it occurs at the very beginning of the preceding subband frame (in this embodiment, in one of the first three segments of the subband frame; aloccode[i][0] < 3), prev_adjust_num[i] is set to 0 in step S42. If these conditions are not met, prev_adjust_num[i] is set in step S43 to the number of correction points adjust_num[i] of the preceding subband frame.
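The decision of steps S41 to S43 can be sketched in Python as follows. This is only an illustrative sketch: the function name, the argument names, and the default `early_limit=3` (taken from the condition aloccode[i][0] < 3 of this embodiment) are assumptions introduced here and are not part of the original description.

```python
def init_prev_adjust_num(adjust_num_prev, first_loc, early_limit=3):
    """Sketch of steps S41-S43 (hypothetical helper, not from the source).

    adjust_num_prev: adjust_num[i] of the preceding subband frame.
    first_loc:       aloccode[i][0] of the preceding subband frame
                     (ignored when there is no correction point).
    """
    # No correction point at all in the preceding frame (step S42).
    if adjust_num_prev == 0:
        return 0
    # Exactly one correction point, located in the first few segments (step S42).
    if adjust_num_prev == 1 and first_loc < early_limit:
        return 0
    # Otherwise carry the count over unchanged (step S43).
    return adjust_num_prev
```

For example, a preceding frame with a single correction point in segment 2 is treated as having none, whereas a single correction point in segment 3 is carried over.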
Step S44 initializes the correction positions aloccode[i][k] and the correction levels alevcode[i][k]. Because the present invention may generate more intermediate correction points than the maximum permitted by the ATRAC3 standard (7), it is important to allocate storage for this intermediate data. The correction function calculation process of the present embodiment allows up to 20 intermediate correction points. If the number of intermediate correction points still exceeds the standard even after adjacent correction points assigned the same correction level have been merged (step S24), the detected intermediate correction points are reduced in step S25 to obtain final correction point data that complies with the ATRAC3 standard.
In step S45, the peak values MaxPeak[i][k] are generated for 0 ≤ k < 40 from the subband frame of subband i, which is one of the spectral components of the input signal and consists of 256 audio sample signals. MaxPeak[i][k] is defined as the maximum absolute value among the eight consecutive audio sample signals starting at the (k × 8)-th sample. For 31 < k < 40, MaxPeak[i][k] is obtained from the eight consecutive audio sample signals starting at the ((k - 32) × 8)-th sample of the future subband frame. Each such group of eight consecutive audio sample signals constitutes the "k-th segment" of the subband frame, and MaxPeak[i][k] is therefore called the peak value of the segment.
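As an illustration of step S45, the following Python sketch computes the 40 segment peak values of one subband from the current frame and the future frame. The function name `segment_peaks` and its parameters are hypothetical; only the segment length of 8 samples, the 256-sample frame, and the 8 look-ahead segments come from the description above.

```python
def segment_peaks(current, future, seg_len=8, lookahead=8):
    """Sketch of step S45 for one subband (hypothetical helper).

    current: samples of the subband frame being processed (256 in the text).
    future:  samples of the following subband frame.
    Returns a list equivalent to MaxPeak[i][0..39] when len(current) == 256.
    """
    def peak(samples, k):
        # Maximum absolute value over the k-th group of seg_len samples.
        return max(abs(s) for s in samples[k * seg_len:(k + 1) * seg_len])

    n_current = len(current) // seg_len           # 32 segments for 256 samples
    peaks = [peak(current, k) for k in range(n_current)]
    # Segments 32..39 are taken from the start of the future frame.
    peaks += [peak(future, k) for k in range(lookahead)]
    return peaks
```

With a 256-sample frame this yields 32 peaks for the current frame followed by 8 look-ahead peaks, so index 32 corresponds to the first segment of the future subband frame.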
Step S46 resets three variables. The number of correction points adjust_num[i] is the total number of correction points found in the subband frame of subband i. While each segment is examined in turn in the loop of steps S5 to S19, the peak value InterModMaxPeak[i][k] is defined as the largest peak value MaxPeak in the period between the current segment and the (k - 1)-th correction point. In the target peak value CurrFramePeak calculation process (step S22) performed after that loop has finished, it is regarded as the largest peak value MaxPeak between each pair of adjacent correction points. The peak position InterModMaxLoc[i][k] is the index of the segment in which that largest peak value InterModMaxPeak[i][k] is located.
Referring again to FIG. 9, step S4 starts the inner loop of the flow. This loop iterates over the segment index j. The index j is incremented by 1 in step S14 of FIG. 10; when step S15 finds that the index j has reached 32, the inner loop ends and the process proceeds to step S20 of FIG. 12.
In the inner loop, if step S5 finds that the peak value InterModMaxPeak[i][adjust_num[i]] is smaller than the peak value MaxPeak[i][j] of the latest segment, step S6 updates InterModMaxPeak[i][adjust_num[i]] and the peak position InterModMaxLoc[i][adjust_num[i]] to the latest segment's peak value MaxPeak[i][j] and its position, segment j, respectively, and the process proceeds to step S7. Otherwise, the process proceeds directly to step S7.
Step S7 checks the condition that determines the value of the attenuation coefficient attn_ftr in steps S8 and S9. Essentially, when the preceding correction point is an attack point (alevcode[i][adjust_num[i] - 1] > 0), attn_ftr is set to 4 in step S9, in anticipation that the later release signal criterion (step S16 of FIG. 11) will find a steeper release signal in which the signal level is strongly attenuated (in this embodiment, to less than one quarter). When the preceding correction point is already a release point, or at least is not an attack point (alevcode[i][adjust_num[i] - 1] ≤ 0), attn_ftr is set to 3 in step S8, so that a subsequent release correction point has to be established under a more relaxed criterion (in this embodiment, a drop to less than one third). The principle behind step S7 is illustrated in FIG. 6, which shows an attack signal that gradually tapers off over time. Using different values of attn_ftr makes it easier to establish multiple release points, and therefore allows different levels of amplification to be applied where post-echo control appears necessary.
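The choice made in steps S7 to S9 can be sketched in Python as below. The function name is hypothetical, and the handling of the case in which no correction point has been found yet (`prev_level is None`) is an assumption of this sketch, since the description does not state it.

```python
def attenuation_factor(prev_level):
    """Sketch of steps S7-S9 (hypothetical helper).

    prev_level: alevcode[i][adjust_num[i] - 1], the marker of the preceding
                correction point, or None if there is none yet (assumption:
                this case is treated like "not an attack point").
    """
    # Preceding point is an attack point: expect a steep release (step S9).
    if prev_level is not None and prev_level > 0:
        return 4
    # Preceding point is a release point or not an attack point (step S8).
    return 3
```

The returned value is the factor by which a later segment peak is multiplied before it is compared with InterModMaxPeak in the release criterion of step S16.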
After the attenuation coefficient attn_ftr has been determined, the correction function calculation unit 14b checks whether the attack signal criterion of step S10 of FIG. 10 is satisfied. Specifically, it determines whether the peak value MaxPeak[i][j+1] of the (j+1)-th segment is at least twice the peak value MaxPeak[i][k] of the k-th segment for j - 7 ≤ k ≤ j. Note, however, that when the lower limit of the segment index k is less than 0 (that is, when j ≤ 7), the segment peak value MaxPeak[i][k] is not taken from the segments of the preceding subband frame of subband i; instead, it is replaced by the peak value PrevFramePeak[i] (see step S31), which is defined as the absolute peak value of the preceding, already gain-controlled subband frame of subband i.
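A minimal Python sketch of the attack criterion of step S10 is given below. The function name and parameters are hypothetical. The sketch assumes that the factor-of-two condition must hold for every one of the eight preceding segments, and that any window index below 0 is replaced by PrevFramePeak[i]; both are interpretations of the text rather than statements from it.

```python
def is_attack(max_peak, j, prev_frame_peak, window=8, ratio=2):
    """Sketch of the attack test of step S10 (hypothetical helper).

    max_peak:        segment peaks MaxPeak[i][0..39] of one subband.
    j:               index of the current segment (0 <= j <= 31).
    prev_frame_peak: PrevFramePeak[i] of the preceding, gain-controlled frame.
    """
    next_peak = max_peak[j + 1]
    for k in range(j - window + 1, j + 1):        # k = j-7 .. j
        # Indices before the frame start fall back to PrevFramePeak[i].
        reference = max_peak[k] if k >= 0 else prev_frame_peak
        if next_peak < ratio * reference:
            return False
    return True
```

For instance, with a flat level of 10 and a jump to 25 in segment 12, the test succeeds for j = 11 and fails for j = 10.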
If the condition of step S10 is satisfied, the current segment is a potential correction point. To save bits, however, step S11 imposes the additional condition that the new correction point must be sufficiently far from the segment of the preceding correction point when that point is an attack point. The present embodiment requires a separation of one segment. If this separation is not present, the process proceeds to step S14 to determine whether the next segment is a correction point.
If both criteria of steps S10 and S11 are satisfied, the correction coefficient expo is set to 1 in step S12 to mark the new correction point, the j-th segment, as an attack point. Step S13 then updates several variables: the correction level alevcode[i][adjust_num[i]] is set to the correction coefficient expo, the correction position aloccode[i][adjust_num[i]] is set to the segment index j, and the number of correction points adjust_num[i] is incremented by 1. In addition, for the incremented adjust_num[i], the peak value InterModMaxPeak[i][adjust_num[i]] is set to the segment peak value MaxPeak[i][j+1], and its peak position InterModMaxLoc[i][adjust_num[i]] is set to j+1. Note that the correction level alevcode[i][j] set in step S13 is merely a marker and is not the amplification actually applied to the signal.
Accordingly, in the process of steps S10 to S13, which identifies the segment containing the start position of an attack signal, the correction function calculation unit 14b determines whether the current segment to be processed contains the start position of an attack signal on the basis of the ratio between the absolute peak value of the next segment following the current segment and the absolute peak value of each of a predetermined number (eight) of segments up to and including the current segment. When it determines that the current segment contains the start position of the attack signal, it assigns to the current segment a predetermined first value (namely 1) indicating that the segment contains the start position of an attack signal.
If the attack signal criterion of step S10 is not satisfied, the flow proceeds to the release signal criterion of step S16. Here it is determined whether, for k with j ≤ k ≤ j+7, the segment peak value MaxPeak[i][k] multiplied by the attenuation coefficient attn_ftr is smaller than the peak value InterModMaxPeak[i][adjust_num[i]] between the preceding correction point and the current segment. Note that for k > 31 the segment peak value MaxPeak[i][k] is derived from the segments of the future subband frame as defined in step S45. If the condition of step S16 is satisfied, the process proceeds to step S17; otherwise it proceeds to step S14 to determine whether the next segment is a correction point.
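The release criterion of step S16 can be illustrated with the following Python sketch. The function name and arguments are hypothetical, and the sketch assumes that the comparison must hold for all eight segments k = j .. j+7.

```python
def is_release(max_peak, j, inter_mod_max_peak, attn_ftr, window=8):
    """Sketch of the release test of step S16 (hypothetical helper).

    max_peak:           segment peaks MaxPeak[i][0..39]; indices above 31
                        belong to the future subband frame (step S45).
    j:                  index of the current segment (0 <= j <= 31).
    inter_mod_max_peak: InterModMaxPeak[i][adjust_num[i]].
    attn_ftr:           3 or 4, as chosen in steps S8 and S9.
    """
    # Every scaled peak in the look-ahead window must stay below the largest
    # peak seen since the preceding correction point.
    return all(attn_ftr * max_peak[k] < inter_mod_max_peak
               for k in range(j, j + window))
```

With attn_ftr = 4 the signal must have dropped below one quarter of the earlier peak, and with attn_ftr = 3 below one third, which matches the two criteria described for steps S8 and S9.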
In step S17, the value of the correction coefficient expo is set according to the magnitude of the peak value InterModMaxPeak[i][adjust_num[i]]. The higher this peak value, the more negative the correction coefficient expo becomes, so that the large signal portion preceding the release point is attenuated more strongly. In this embodiment, for signal strengths represented with 16 bits (-32768 to 32767), expo is set to 0 when the magnitude of InterModMaxPeak[i][adjust_num[i]] is less than 4000, to -1 when it is 4000 or more and less than 20000, and to -2 when it is 20000 or more. Step S18 updates several variables in almost the same way as step S13, with a small difference. The correction level alevcode[i][adjust_num[i]], the correction position aloccode[i][adjust_num[i]], and the number of correction points adjust_num[i] are set as in step S13. In step S18, however, the peak value InterModMaxPeak[i][adjust_num[i]] is set to the segment peak value MaxPeak[i][j], and the peak position InterModMaxLoc[i][adjust_num[i]] is set to j. As in step S13, the value assigned to the correction level alevcode merely marks the correction point as a release point; the actual value of the correction level is determined later.
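The threshold mapping of step S17 can be written as the short Python sketch below. The function name is hypothetical; the thresholds 4000 and 20000 and the marker values 0, -1, and -2 are those given for this embodiment with 16-bit signal strengths.

```python
def release_marker(inter_mod_max_peak):
    """Sketch of step S17 (hypothetical helper).

    Maps the largest peak preceding a release point to the marker value
    stored in alevcode; this is a marker, not the final correction level.
    """
    if inter_mod_max_peak < 4000:
        return 0
    if inter_mod_max_peak < 20000:
        return -1
    return -2
```

A louder passage before the release point therefore receives a more negative marker.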
Accordingly, in the process of steps S16 to S18, which identifies the segment containing the end position of a release signal, the correction function calculation unit 14b determines whether the current segment to be processed contains the end position of a release signal on the basis of the ratio between the largest absolute peak value InterModMaxPeak[i][j] between the current segment and the correction point preceding it, and the absolute peak value of each of a predetermined number (eight) of segments from the current segment onward. When it determines that the current segment contains the end position of the release signal, it assigns to the current segment a predetermined second value (0, -1, or -2) indicating that the segment contains the end position of a release signal.
Step S19 checks whether the correction points found so far have already exceeded the total storage allocated to them (that is, 20 points). If not, the flow proceeds to step S14 to determine whether the next segment is a correction point. Otherwise, the process ends the inner loop over the index j and determines the peak value CurrFramePeak[i]. The peak value CurrFramePeak[i] represents the "target peak value", which is the target signal level of the gain control for the current subband frame.
Step S20 checks whether the number of correction points prev_adjust_num[i] of the preceding subband frame is 0 and all correction points found in the current subband frame are of the "release signal" type. If this condition is true, the correction function calculation unit 14b concludes that the audio sample signals in the subband frame are undergoing a "natural fall". This phenomenon is common in speech signals, and applying gain control to such signals in order to prevent post-echo causes undesirable artifacts. In this case the flow therefore resets the number of correction points adjust_num[i] of the current subband frame to zero, sets the attenuation limit coefficient min_amp[i] to 4 (step S21), and proceeds to step S32 of FIG. 13. The purpose of the attenuation limit coefficient min_amp[i] is to mitigate excessive attenuation; it is described in detail later.
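The "natural fall" check of steps S20 and S21 can be sketched in Python as follows. The function names are hypothetical. The sketch relies on the marker convention described earlier (attack points are marked with 1, release points with 0, -1, or -2), and it assumes that a frame without any correction point is not treated as a natural fall; the text does not state how that case is handled.

```python
def is_natural_fall(prev_adjust_num, markers):
    """Sketch of the test of step S20 (hypothetical helper).

    prev_adjust_num: prev_adjust_num[i] of the preceding subband frame.
    markers:         alevcode markers of the correction points found in the
                     current subband frame.
    """
    if not markers:
        # Assumption of this sketch: no correction points, no natural fall.
        return False
    return prev_adjust_num == 0 and all(m <= 0 for m in markers)


def handle_step_s20(prev_adjust_num, markers, min_amp):
    """Return (adjust_num, min_amp) after steps S20 and S21."""
    if is_natural_fall(prev_adjust_num, markers):
        return 0, 4                     # step S21: drop all points, min_amp = 4
    return len(markers), min_amp        # gain control may still be needed
```

A frame whose only correction points are release points, following a frame without corrections, thus ends up with no correction points and an attenuation limit coefficient of 4.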
If no "natural fall" is present, however, the current subband frame may still require gain control, so the process proceeds to the target peak value CurrFramePeak calculation process of step S22 of FIG. 12. Step S22 checks whether any correction points have been marked and sets the target peak value CurrFramePeak[i] accordingly.
Referring to the subroutine of step S22 shown in FIG. 15, when step S51 determines that correction points exist in the current subband frame, the target peak value CurrFramePeak[i] of the current subband frame is calculated in three stages beginning at step S52. The expression in step S52 can be analyzed as follows. For a peak value InterModMaxPeak[i][k] that is followed by an attack point (alevcode[i][k] > 0), the candidate target peak value is the inter-correction-point peak value InterModMaxPeak[i][k] itself. For a peak value InterModMaxPeak[i][k] that is followed by a release point (alevcode[i][k] ≤ 0), the candidate is InterModMaxPeak[i][k] attenuated according to the correction level alevcode[i][k]. The target peak value CurrFramePeak[i] resulting from step S52 is the maximum of all these candidates for 0 ≤ k < adjust_num[i]. It is then compared in step S53 with the last peak value InterModMaxPeak[i][adjust_num[i]] (which is not followed by any correction point) and in step S54 with the peak value MaxPeak[i][32] of the first segment of the future subband frame, and the largest of these is selected as the final target peak value CurrFramePeak[i]. Step S54 is essential as a safeguard against an imminent attack close to the start position (that is, the 0th segment) of the next subband frame.
If, on the other hand, step S51 determines that no correction points exist at all, the target peak value CurrFramePeak[i] is simply derived as the maximum of the peak values MaxPeak[i][k] of all segments in the current subband frame and the peak value MaxPeak[i][32] of the first segment derived from the future subband frame. Accordingly, in the target peak value CurrFramePeak calculation process of step S22, the correction function calculation unit 14b calculates the target peak value CurrFramePeak[i] that is desired when the audio signal of the current frame to be processed is gain-controlled according to the correction function. The calculation is based on the largest absolute peak value InterModMaxPeak[i][j] between each pair of adjacent correction points, on the assigned first value (the correction level alevcode[i][j] assigned in step S13) and second value (the correction level alevcode[i][j] assigned in step S18), and on the absolute peak value MaxPeak[i][32] of the first segment of the next frame following the current frame to be processed.
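The target peak calculation of step S22 can be sketched in Python as below. The function names and arguments are hypothetical. The description only says that a peak followed by a release point is "attenuated by" its correction level; modelling this as multiplication by 2 raised to the (non-positive) marker value is an assumption of this sketch, suggested by the base-2 logarithm used later for the correction levels.

```python
def target_peak(inter_peaks, markers, first_future_peak):
    """Sketch of steps S52-S54 when correction points exist (hypothetical).

    inter_peaks:       InterModMaxPeak[i][0..adjust_num[i]]; entry k precedes
                       correction point k, the last entry has no point after it.
    markers:           alevcode[i][0..adjust_num[i]-1] (1 = attack,
                       0, -1, -2 = release).
    first_future_peak: MaxPeak[i][32].
    """
    candidates = []
    for peak, marker in zip(inter_peaks, markers):
        if marker > 0:
            candidates.append(peak)                   # peak before an attack
        else:
            candidates.append(peak * 2.0 ** marker)   # assumed attenuation
    candidates.append(inter_peaks[len(markers)])      # step S53
    candidates.append(first_future_peak)              # step S54
    return max(candidates)


def target_peak_no_points(max_peak):
    """Sketch of the branch taken when no correction points exist."""
    # All 32 segments of the current frame plus the first future segment.
    return max(max_peak[:33])
```

In this sketch a loud passage before a release point contributes only its attenuated level, while the look-ahead peak MaxPeak[i][32] can still raise the target when an attack is imminent.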
Once the target peak value CurrFramePeak[i] has been determined, its magnitude is checked in step S56. If it is less than 200, the process concludes that the overall signal strength is too small for gain control to be of any benefit, and the bits are instead reserved so that they can be put to better use during spectral quantization. Step S57 therefore resets the number of correction points adjust_num[i] of the subband frame, and the process returns to the flow of FIG. 12 and proceeds to step S23.
If the target peak value CurrFramePeak[i] is significant (NO in step S56), the process returns and the correction level alevcode calculation process is executed in step S23. Referring to the subroutine of step S23 shown in FIG. 16, step S61 starts a new inner loop, indexed by j, to determine the actual correction level alevcode[i][j] for each correction point in turn.
In this loop, the correction function calculation unit 14b checks in step S62 whether the current correction point is located in segment 0. If so, the variable temp is set in step S63 to the larger of the segment peak value MaxPeak[i][0] and the peak value PrevFramePeak[i] of the preceding subband frame, and a lower limit of 1 is then imposed on temp in step S64. If the result of step S62 is NO, the peak value InterModMaxPeak[i][j] preceding the current correction point is checked in step S65. If it is greater than 0, temp is set equal to InterModMaxPeak[i][j] itself in step S66; if it is 0, temp is set to 1 (step S67) to prevent a division by zero in step S68. Following step S64, S66, or S67, the correction coefficient expo1 is calculated in step S68 as the ratio between the target peak value CurrFramePeak[i] and the variable temp, expressed by the integer part of the base-2 logarithm. Step S69 checks the sign of expo1. If it is 0 or more, the integer part of expo1 is set as the correction coefficient expo in step S70; if it is negative, the integer part of the value obtained by subtracting 0.5 from expo1 is set as the correction coefficient expo. This last step is important, because different rounding and truncation methods have been observed to have a large effect on the audibility of artifacts.
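The computation of steps S68 to S70 can be sketched in Python as below. The function name is hypothetical. The sketch assumes that expo1 is the real-valued base-2 logarithm of CurrFramePeak[i] / temp and that "integer part" means truncation toward zero; the text leaves both points open. It also assumes a positive target peak, which holds after the check of step S56.

```python
import math


def correction_exponent(curr_frame_peak, temp):
    """Sketch of steps S64/S67 to S70 (hypothetical helper).

    curr_frame_peak: target peak value CurrFramePeak[i] (assumed > 0).
    temp:            reference peak chosen in steps S63, S66, or S67.
    """
    temp = max(temp, 1)                        # lower limit of 1, avoids x / 0
    expo1 = math.log2(curr_frame_peak / temp)  # assumed real-valued log2 ratio
    if expo1 >= 0:
        return int(expo1)                      # truncate non-negative values
    return int(expo1 - 0.5)                    # shift negatives by 0.5, then truncate
```

Under these assumptions non-negative exponents are truncated, whereas negative exponents are effectively rounded to the nearest integer, which illustrates the asymmetric rounding that the description singles out as important.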
Step S72 checks whether the amplification needs to be reduced. This step is indispensable for coping with the "natural rise" of an attack signal, a phenomenon commonly present in speech signals, in the same sense as the "natural fall" treated in step S20. The "natural rise" phenomenon is illustrated in the drawings. The essence of this condition is that if the peak position InterModMaxLoc[i][j+1] following the current attack point is very "far" from the attack point itself, it can be concluded that a "natural rise" has been encountered. If step S72 determines that a "natural rise" exists, step S73 sets the correction level alevcode[i][j] to expo-1. Otherwise, step S74 sets the correction level alevcode[i][j] to the correction coefficient expo. In this embodiment, the criterion of step S72 uses a separation distance of five sections and additionally requires that the correction position aloccode[i][j] lie before the last six sections of the subband frame.
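The "natural rise" test of steps S72 to S74 can be sketched as follows. This is only an illustration: the function name is hypothetical, the frame length of 32 sections is an assumption (suggested by the reference to section 31 elsewhere in the text), and "before the last six sections" is read as a section index below 26.

```python
NATURAL_RISE_DISTANCE = 5   # separation threshold in sections, from the text
TAIL_SECTIONS = 6           # the point must lie before the last six sections
SECTIONS_PER_FRAME = 32     # assumption made for this illustration

def correction_level(expo, attack_loc, next_peak_loc):
    """Return expo - 1 when a natural rise is detected, otherwise expo."""
    # The peak that follows the attack point is "far" from it.
    far = (next_peak_loc - attack_loc) > NATURAL_RISE_DISTANCE
    # The correction position is not in the last six sections.
    early = attack_loc < SECTIONS_PER_FRAME - TAIL_SECTIONS
    return expo - 1 if (far and early) else expo
```

Reducing the level by one halves the amplification applied ahead of a slowly rising speech onset, which is the behaviour the paragraph above motivates.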
Therefore, the correction function calculation unit 14b executes, based on the audio signal, a step of detecting a first signal portion, contained in speech, that falls naturally with a predetermined gentle slope, and step S72 of detecting a second signal portion, contained in speech, that rises naturally with a predetermined gentle slope. For the current frame to be processed, based on the section containing the start position of the identified attack signal and the section containing the end position of the identified release signal, the unit calculates the correction function of the current frame by stopping the gain control by the correction function for the detected first signal portion, while calculating the correction function for the second signal portion by reducing the correction function calculated from the identified attack signal. The correction function calculation unit 14b may be configured to execute only one of the step of detecting the first signal portion and the step of detecting the second signal portion. In the process of detecting the first signal portion, the first signal portion is determined to exist when the number of sections that are correction points in the frame preceding the current frame, prev_adjust_num[i], is smaller than a predetermined first threshold (under the condition of step S41, prev_adjust_num[i] = 0) and the current frame contains no attack signal. In the process of detecting the second signal portion, the second signal portion is determined to exist when the number of sections separating the section containing the start position of the attack signal from the section containing the maximum absolute peak value InterModMaxPeak[i][j], found between that section and the next correction point, is greater than a predetermined second threshold (namely, 5).
Step S75 limits the calculated correction level alevcode[i][j] to the range from -3 to 11, as required by the ATRAC3 standard. Next, the correction function calculation unit 14b increments the inner-loop index j by 1 in step S76, and then determines in step S77 whether all correction points have been processed. If the correction level alevcode[i][j] of a further correction point remains to be calculated, the process returns to step S62; otherwise it returns from the subroutine and proceeds to step S24.
In step S24, the number of correction points is reduced by merging consecutive points that share the same correction level alevcode. Specifically, when two consecutive correction points share the same correction level alevcode, the first of them can be removed, because the alevcode value of the second correction point can define the correction function value for the sections between the correction point preceding the removed first point and the second point. As redundant correction points are removed, the number of correction points adjust_num[i], the correction positions aloccode[i][j], and the correction levels alevcode[i][j] must be updated and associated with the remaining correction points again. Next, in step S25, in order to conform to the ATRAC3 standard, the number of correction points adjust_num[i] is limited to 7 or fewer. Any eighth or later correction point that still exists after the merging of step S24 is therefore removed.
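The merging of step S24 and the limit of step S25 can be sketched in a few lines of Python. The function name and the list-based representation of aloccode and alevcode for one subband are assumptions made for illustration only.

```python
def merge_points(aloccode, alevcode, max_points=7):
    """Merge consecutive points with equal levels, then keep at most max_points."""
    locs, levs = [], []
    for loc, lev in zip(aloccode, alevcode):
        if levs and levs[-1] == lev:
            # Same level as the previous point: drop the earlier point and
            # keep the later position, whose level covers the merged span.
            locs[-1] = loc
        else:
            locs.append(loc)
            levs.append(lev)
    # Remove the eighth and later points.
    return locs[:max_points], levs[:max_points]
```

The length of the returned lists plays the role of the updated adjust_num[i].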
Next, the inter-subband-frame difference limiting process of step S26 is executed. It checks whether a correction point needs to be placed in section 0 in order to bring the signal level of the preceding subband frame to a level similar to the target peak value CurrFramePeak[i]. As shown in FIG. 4, after the series of corrections has been applied there is an encoding step in which the preceding subband frame is concatenated with the current subband frame for the MDCT, so this difference limiting process between subband frames is indispensable.
Referring to the subroutine of step S26 in FIG. 17, step S81 answers YES if either the preceding subband frame or the current subband frame contains a correction point and, under the ATRAC3 standard, there is still room to accommodate an additional correction point (that is, adjust_num[i] ≤ 6). If the answer is YES, steps S82 and S83 determine the value of the variable temp in the same manner as steps S63 and S64. Step S84 compares the amplitude of temp with the amplitude of the target peak value CurrFramePeak[i]. If temp is larger than CurrFramePeak[i] and CurrFramePeak[i] is relatively small (the criterion here being whether it is less than 500), the correction coefficient expo1 is calculated as the base-2 logarithm of twice the ratio of CurrFramePeak[i] to temp. Otherwise, in order to produce a smaller amplification coefficient, expo1 is calculated as the base-2 logarithm of the ratio of CurrFramePeak[i] to temp. Steps S87 to S89 check the magnitude of expo1 and calculate the correction coefficient expo accordingly in steps S88 and S89. These steps are similar to steps S69 to S71 and follow the same rounding and truncation logic. Step S90 applies the lower limit (-3) and the upper limit (11) to the correction coefficient expo, as required by the ATRAC3 standard.
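A minimal sketch of the calculation in steps S84 to S90 follows. The function name is hypothetical. The first branch is read here as the base-2 logarithm of twice the peak ratio, which is an interpretation of the translated text; "integer part" is again assumed to mean truncation toward zero, and both peaks are assumed to be positive.

```python
import math

def boundary_expo(curr_frame_peak, temp, small_peak=500):
    """Correction coefficient for a point at the frame boundary (a sketch)."""
    ratio = curr_frame_peak / temp
    if temp > curr_frame_peak and curr_frame_peak < small_peak:
        # Small target peak below the reference: log2 of twice the ratio.
        expo1 = math.log2(2 * ratio)
    else:
        # Otherwise the plain ratio, giving a smaller coefficient.
        expo1 = math.log2(ratio)
    # Same sign-dependent rounding as in steps S69 to S71.
    expo = int(expo1) if expo1 >= 0 else int(expo1 - 0.5)
    # Limits of -3 and 11 stated in the text.
    return max(-3, min(11, expo))
```

With this reading, a quiet target frame that follows a louder one is attenuated by one level less than the plain ratio would give.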
Therefore, in step S26, for first and second frames that are two mutually adjacent frames, when the absolute peak value PrevFramePeak[i] of the first frame of the gain-controlled audio signal differs from the absolute peak value CurrFramePeak[i] of the second frame, the correction function calculation unit 14b corrects the correction function value for the first section of the second frame so that the two absolute peak values become equal.
Step S91 represents a second condition that must be satisfied before this additional correction point can be inserted. The condition is YES when the current subband frame has no correction point, or when no correction point producing the same correction coefficient expo already exists at the beginning of the subband frame. If so, step S92 inserts the new correction point, shifts all existing correction points back by one, and increments adjust_num[i] by 1, and the flow proceeds to step S27.
When step S91 answers NO, this implies that the transition is essentially gradual regardless of any difference in signal strength between the current and past subband frames. A correction point joining the concatenated subband frames is therefore unnecessary, and the flow proceeds to step S27.
If the first correction point is determined to be an attenuation point (that is, a release point), step S27 imposes a lower limit of -min_amp[i] - 1, thereby mitigating potentially excessive attenuation of the preceding subband frame, which tends to cause artifacts. The attenuation limit coefficient min_amp[i] is defined as the minimum correction level alevcode[i][k] applied to the attack points of the preceding subband frame. In the stage following the processing of the gain control units 15-1 to 15-4, the audio signal of the first frame is multiplied by the correction function value for the first section of the second frame, so that the audio signals of the first and second frames, which are the two mutually adjacent frames, are substantially continuous with each other. Therefore, in step S27, based on the minimum correction function value min_amp[i] that amplifies the audio signal among the correction function values for the sections of the first frame, the correction function calculation unit 14b corrects the correction function value for the first section of the second frame so as to prevent the audio signal of the first frame, which should be gain-controlled by a correction function that amplifies it, from being attenuated by that multiplication.
Next, in step S28, the end-portion gain control process of the subband frame is executed to mark the start position of post-echo control. Referring to FIG. 18, step S101 checks whether an additional correction point can still be accepted (whether the number of correction points adjust_num[i] is smaller than the upper limit set by the ATRAC3 standard) and whether the current last correction point is already very close to the end of the subband frame (that is, whether or not it lies before the last three sections of the subband frame). If the conditions permit, the process proceeds to step S102, where the maximum value MaxValue is obtained from the peak values MaxPeak[i][k] of all sections after the last correction position of the subband frame. Step S103 then calculates the correction coefficient expo as the ratio of the target peak value CurrFramePeak[i] to the maximum value MaxValue, expressed as the integer part of its base-2 logarithm. Finally, the process proceeds to step S104, where a new correction with the correction coefficient expo is inserted at section 31. This step is also illustrated in the drawings.
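Steps S101 to S104 can be sketched as follows. The function name and the list representation are hypothetical. The limit of 7 points is taken from step S25, the lower bound of 1 on MaxValue is an added guard against division by zero, and the rounding reuses the sign-dependent rule of the earlier steps; these three choices are assumptions.

```python
import math

def tail_correction(max_peak, aloccode, alevcode, curr_frame_peak,
                    max_points=7, tail=3):
    """Append a correction point at the last section when the conditions allow."""
    sections = len(max_peak)
    last_loc = aloccode[-1] if aloccode else -1
    # S101: no room left, or the last point is already within the last 3 sections.
    if len(aloccode) >= max_points or last_loc >= sections - tail:
        return aloccode, alevcode
    # S102: largest section peak after the last correction position.
    max_value = max(max(max_peak[last_loc + 1:]), 1)
    # S103: base-2 logarithm of the ratio to the target peak, then rounding.
    expo1 = math.log2(curr_frame_peak / max_value)
    expo = int(expo1) if expo1 >= 0 else int(expo1 - 0.5)
    # S104: new point at the final section (index 31 for a 32-section frame).
    return aloccode + [sections - 1], alevcode + [expo]
```

For a 32-section frame the new point lands at index 31, matching the section named in the text.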
Therefore, in step S28, when the maximum absolute peak value MaxValue over the sections of the current frame located after the section aloccode[i][adjust_num[i]-1] of the last correction point in the current frame to be processed differs from the target peak value CurrFramePeak[i], the correction function calculation unit 14b corrects the correction function values for those later sections of the current frame so that MaxValue becomes equal to CurrFramePeak[i].
If the condition of step S101 is not satisfied, or after the new correction point has been established in step S104 by determining its variables, the process proceeds to step S29, where the value of the attenuation limit coefficient min_amp[i] is derived. To repeat, the attenuation limit coefficient min_amp[i] is needed to mitigate excessive attenuation in step S27.
Step S30 adds 4 to all correction levels alevcode[i][j] to match the representation required by the ATRAC3 standard. Step S31 calculates the correction function of the subband frame from the calculated correction positions and correction levels. Step S31 also calculates PrevFramePeak[i], the peak value of the subband frame after gain control by the correction function (that is, the gain-controlled peak value of the subband frame that "precedes" the next subband frame), for use in calculating the correction function of the next subband frame. The index i is then incremented by 1 in step S32, and if the answer in step S33 is YES, the subband frame of another subband (i+1) is processed.
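Steps S30 and S31 can be sketched as follows, under strong simplifying assumptions. The function names are hypothetical. The gain is taken to be piecewise constant, with each level applying as a factor of 2 raised to that level to the sections from the previous correction point up to, but not including, its own position; sections after the last point are left at unity gain, and any interpolation between levels is ignored. None of these details is stated in this passage.

```python
def encode_levels(alevcode, offset=4):
    """S30: shift levels so that -3..11 becomes 1..15."""
    return [lev + offset for lev in alevcode]

def section_gains(aloccode, alevcode, sections=32):
    """Piecewise-constant gain per section (an assumed simplification)."""
    gains = [1.0] * sections
    start = 0
    for loc, lev in zip(aloccode, alevcode):
        for s in range(start, loc):
            gains[s] = 2.0 ** lev
        start = loc
    return gains

def gain_controlled_peak(max_peak, gains):
    """Peak of the frame after gain control, kept for the next frame."""
    return max(p * g for p, g in zip(max_peak, gains))
```

The value returned by gain_controlled_peak corresponds to PrevFramePeak[i] as used when the next subband frame is processed.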
When all subbands of the audio frame have been processed, step S34 checks whether all audio frames of the input signal (and their subband frames) have been processed. If subband frames remain to be processed, the process returns to step S1 and reads the sample signal of the next subband frame from the buffer memory 14. When all subband frames have been processed, the correction function calculation process ends.
The calculated correction function is sent to the gain control units 15-1 to 15-4, where it is multiplied with the audio sample signal, and it is also sent to the bitstream multiplexer as side information.
As described above, according to the audio signal encoding method and apparatus and the encoding and decoding system of the embodiment of the present invention, the correction function is generated more accurately and the gain control is executed with it, so that pre-echo noise and post-echo noise can be reliably suppressed. Further, according to the audio signal encoding method and apparatus of this embodiment, artifacts caused by gain control can be removed by specifically incorporating criteria for speech.
The present invention has the following specific effects, corresponding to the eight improvements and extensions described above. First, the freer use of the future subband frame that follows the current subband frame yields better detection of release signals while continuing to achieve control of "boundary attacks" (see improvement and extension item (2)). In addition, the variable release-signal criterion and the direct control of post-echo more effectively support the division and control of release signals that last a long time (see improvement and extension items (3) and (8)). Furthermore, the use of the target peak value CurrFramePeak helps adjust the signal to a desired level on the scale of the whole subband frame and, indirectly, results in gain-controlled subband frames that vary more smoothly than those gain-controlled according to the invention of the prior application (see improvement and extension item (4)). This is further reinforced by the attempt to set the signal levels of two consecutive subband frames to similar levels so that they do not differ greatly (see improvement and extension items (1), (6), and (7)). Finally, by detecting natural rises and natural falls, special consideration is given to speech signals, which are vulnerable to artifacts caused by excessive control (see improvement and extension item (5)).
Despite the extra code and computation, the present algorithm is feasible for LSI implementation because of its low computation and storage requirements.
Although MDCT processing is performed in the above embodiment, the present invention is not limited to this, and various other orthogonal transform processes may be performed.
In the above embodiment, the correction function is calculated based on two subband frames, but the present invention is not limited to this. The correction function may be calculated based on at least two subband frames, or on at least two frames that are not divided into subbands.
[0130]
EFFECTS OF THE INVENTION As described in detail above, according to the audio signal encoding method or apparatus of the present invention, the following steps are executed: a step of calculating a correction function for each frame based on the input audio signal and performing gain control on the audio signal according to the calculated correction function; and a step of obtaining an encoded bitstream signal by applying orthogonal transform processing and encoding processing to the gain-controlled audio signal for every two mutually adjacent, concatenated frames. The gain control step includes: a step of dividing the input audio signal into a plurality of sections each shorter than a frame and calculating the absolute peak value of each section; a step of identifying as a correction point, based on the audio signal having the divided sections, the section containing the start position of an attack signal, which is a signal portion containing an abrupt rise in sound; a step of identifying as a correction point, based on the audio signal having the divided sections, the section containing the end position of a release signal, which is a signal portion containing an abrupt fall in sound; a step of calculating, based on the audio signal having the divided sections, the target peak value desired for the current frame to be processed when it is gain-controlled according to the correction function; and a step of calculating, based on the section containing the start position of the identified attack signal, the section containing the end position of the identified release signal, and the calculated target peak value, a correction function consisting of the correction function values for the sections of the current frame.
Therefore, unlike the invention of the prior application, which determines for each section whether a correction point exists and calculates its correction level, the present invention first determines the presence or absence of correction points, then calculates the target peak value of the frame, and only afterwards calculates the correction levels, which makes possible gain control that takes the whole frame into account. As a result, the correction function can be generated more accurately and the gain control executed with it, so that pre-echo noise and post-echo noise can be reliably suppressed.
ãï¼ï¼ï¼ï¼ãã¾ããæ¬çºæã«ä¿ããªã¼ãã£ãªä¿¡å·ã®ç¬¦å·
åæ¹æ³åã¯è£
ç½®ã«ããã°ãå
¥åããããªã¼ãã£ãªä¿¡å·ã«
åºã¥ãã¦ãã¬ã¼ã æ¯ã«ä¿®æ£é¢æ°ãè¨ç®ãã¦ãè¨ç®ããã
ä¿®æ£é¢æ°ã«å¾ã£ã¦ä¸è¨ãªã¼ãã£ãªä¿¡å·ã«å¯¾ãã¦å©å¾å¶å¾¡
ããã¹ãããã¨ãä¸è¨å©å¾å¶å¾¡ããããªã¼ãã£ãªä¿¡å·ã«
対ãã¦ãç¸¦ç¶æ¥ç¶ãããäºãã«é£æ¥ããï¼ã¤ã®ãã¬ã¼ã
æ¯ã«ç´äº¤å¤æå¦çãè¡ããã¤ç¬¦å·åå¦çãè¡ããã¨ã«ã
ã符å·åããããããã¹ããªã¼ã ä¿¡å·ãå¾ãã¹ãããã¨
ãå®è¡ããä¸è¨å©å¾å¶å¾¡ããã¹ãããã¯ãå
¥åããããª
ã¼ãã£ãªä¿¡å·ããã¬ã¼ã ã®æéãããçãæéã®è¤æ°ã®
åºåã«åå²ããã¹ãããã¨ãä¸è¨åå²ãããåºåãæã
ããªã¼ãã£ãªä¿¡å·ã«åºã¥ãã¦ãæ¥æ¿ãªé³ã®ç«ä¸ãããå«
ãä¿¡å·ã®é¨åã§ããã¢ã¿ãã¯ä¿¡å·ã®éå§ä½ç½®ãå«ãåºå
ãä¿®æ£ãã¤ã³ãã¨ãã¦èå¥ããã¹ãããã¨ãä¸è¨åå²ã
ããåºåãæãããªã¼ãã£ãªä¿¡å·ã«åºã¥ãã¦ãæ¥æ¿ãªé³
ã®ç«ä¸ãããå«ãä¿¡å·ã®é¨åã§ãããªãªã¼ã¹ä¿¡å·ã®çµäº
ä½ç½®ãå«ãåºåãä¿®æ£ãã¤ã³ãã¨ãã¦èå¥ããã¹ããã
ã¨ãä¸è¨åå²ãããåºåãæãããªã¼ãã£ãªä¿¡å·ã«åºã¥
ãã¦ãçºè©±é³å£°ã«å«ã¾ããæå®ã®ãããããªå¾é
ã§èªç¶
ã«éä¸ãã第ï¼ã®ä¿¡å·é¨åãæ¤åºããã¹ãããã¨ãä¸è¨
åå²ãããåºåãæãããªã¼ãã£ãªä¿¡å·ã«åºã¥ãã¦ãçº
話é³å£°ã«å«ã¾ããæå®ã®ãããããªå¾é
ã§èªç¶ã«ä¸æã
ã第ï¼ã®ä¿¡å·é¨åãæ¤åºããã¹ãããã¨ã®å°ãªãã¨ãä¸
æ¹ãå«ã¿ãä¸è¨å©å¾å¶å¾¡ããã¹ãããã¯ãå¦çãã¹ãç¾
å¨ã®ãã¬ã¼ã ã«ããã¦ãä¸è¨èå¥ãããã¢ã¿ãã¯ä¿¡å·ã®
Further, the audio signal encoding method or apparatus of the present invention includes a gain control step of calculating a correction function for each frame based on an input audio signal and applying gain control to the audio signal according to the calculated correction function, and a step of generating an encoded bit stream signal by performing orthogonal transform processing and encoding processing on the gain-controlled audio signal for every two adjacent, cascade-connected frames. The gain control step includes: dividing the input audio signal into a plurality of sections, each shorter than a frame; identifying as correction points, based on the divided audio signal, the section containing the start position of an attack signal, which is a signal portion containing a sharp rise of the sound, and the section containing the end position of a release signal, which is a signal portion containing a sharp fall of the sound; detecting, based on the divided audio signal, a first signal portion contained in spoken voice that falls naturally at a predetermined gentle slope, and/or a second signal portion contained in spoken voice that rises naturally at a predetermined gentle slope; and calculating the correction function for the current frame to be processed, based on the section containing the start position of the identified attack signal and the section containing the end position of the identified release signal, by stopping gain control by the correction function for the detected first signal portion while reducing the correction function calculated from the identified attack signal to obtain the correction function for the second signal portion.
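The steps summarized above can be illustrated with a minimal Python sketch. It is not the procedure of the flowcharts in FIGS. 9 to 18: the function names, the thresholds (`sharp`, `gentle`), the reduction factor (`relief`), and the gain ceiling (`max_gain`) are all hypothetical values chosen for illustration only, and the correction function is simplified to one gain value per section.

```python
from typing import List, Optional

EPS = 1e-9  # guards against division by zero in silent sections


def section_peaks(frame: List[float], n_sections: int) -> List[float]:
    """Divide one frame into equal sections and return each section's peak magnitude."""
    size = len(frame) // n_sections
    return [max(abs(x) for x in frame[i * size:(i + 1) * size])
            for i in range(n_sections)]


def find_attack(peaks: List[float], sharp: float) -> Optional[int]:
    """Index of the first section whose peak rises sharply (assumed criterion)."""
    for i in range(1, len(peaks)):
        if peaks[i] > sharp * max(peaks[i - 1], EPS):
            return i
    return None


def find_release(peaks: List[float], sharp: float, start: int) -> Optional[int]:
    """Index of the first section after `start` whose peak falls sharply."""
    for i in range(start + 1, len(peaks)):
        if peaks[i] * sharp < peaks[i - 1]:
            return i
    return None


def correction_gains(peaks: List[float], sharp: float = 4.0, gentle: float = 1.5,
                     relief: float = 0.5, max_gain: float = 8.0) -> List[float]:
    """Return one gain per section; 1.0 means gain control is not applied."""
    gains = [1.0] * len(peaks)
    attack = find_attack(peaks, sharp)
    if attack is None:
        return gains
    # Sections before the attack are amplified toward the attack level.
    for i in range(attack):
        gains[i] = min(peaks[attack] / max(peaks[i], EPS), max_gain)
        # Gentle natural rise: reduce the amplification instead of applying it fully.
        if i > 0 and peaks[i - 1] < peaks[i] <= gentle * peaks[i - 1]:
            gains[i] = 1.0 + (gains[i] - 1.0) * relief
    release = find_release(peaks, sharp, attack)
    if release is not None:
        level_before = peaks[release - 1]
        for i in range(release, len(peaks)):
            # Gentle natural fall: gain control is stopped (gain stays 1.0).
            gently_falling = peaks[i] < peaks[i - 1] <= gentle * peaks[i]
            if not gently_falling:
                gains[i] = min(level_before / max(peaks[i], EPS), max_gain)
    return gains
```

For example, with the section peaks `[0.1, 0.12, 0.1, 1.0, 0.9, 0.1, 0.08, 0.07]`, the sketch amplifies the quiet sections before the attack, halves the extra gain on the gently rising second section, and leaves the gently falling tail after the release untouched.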
Therefore, according to the present invention, the natural rise and natural fall in signal level that are characteristic of spoken voice are detected, and excessive amplification and excessive attenuation are prevented from being applied to the audio signal, so that the occurrence of artifacts in an audio signal that is spoken voice can be suppressed. As a result, the correction function can be generated more accurately, gain control can be executed, and pre-echo noise and post-echo noise can be reliably suppressed.
FIG. 1 is a block diagram showing the configuration of a mini disc recording and reproducing system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the detailed configuration of the audio encoder 2.
FIG. 3 is a block diagram showing the detailed configuration of the audio decoder 6.
FIG. 4 is a block diagram showing the generation of transform blocks in the gain detection unit 14 and the gain control unit 15-1 that precede the MDCT processing unit 16-1.
FIG. 5 is a graph showing an input audio sample signal, the correction points placed by the correction function calculation unit 14b in processing that audio sample signal, and the resulting correction function.
FIG. 6 is a schematic diagram showing the criterion for an attack signal and the criterion for a release signal with respect to the input audio sample signal.
FIG. 7(a) is a diagram showing an audio sample signal including an attack signal that exhibits a natural rise, FIG. 7(b) is a diagram showing a signal including artifacts generated by amplifying the signal of (a), and FIG. 7(c) is a diagram showing a signal in which the artifacts are reduced by relaxing the amplification of (b).
FIG. 8(a) is a diagram showing the end of a subband frame including an attack signal and a release signal, FIG. 8(b) is a diagram showing the subband frame after gain control has been applied to the attack signal and the release signal of (a), and FIG. 8(c) is a diagram showing the subband frame in which the weak end of (b) is amplified to counter post-echo.
FIG. 9 is a flowchart showing a first part of the correction function calculation process executed by the correction function calculation unit 14.
FIG. 10 is a flowchart showing a second part of the correction function calculation process executed by the correction function calculation unit 14.
FIG. 11 is a flowchart showing a third part of the correction function calculation process executed by the correction function calculation unit 14.
FIG. 12 is a flowchart showing a fourth part of the correction function calculation process executed by the correction function calculation unit 14.
FIG. 13 is a flowchart showing a fifth part of the correction function calculation process executed by the correction function calculation unit 14.
FIG. 14 is a flowchart showing the subroutine of the initialization process in step S3.
FIG. 15 is a flowchart showing the subroutine of the process for calculating the target peak value CurrFramePeak in step S22.
FIG. 16 is a flowchart showing the subroutine of the process for calculating the correction level alevcode in step S23.
FIG. 17 is a flowchart showing the subroutine of the process for limiting the level difference between subband frames in step S26.
FIG. 18 is a flowchart showing the subroutine of the gain control process for the end portion of a subband frame in step S28.
FIG. 19(a) is a block diagram showing the configuration of a conventional audio signal encoding and decoding apparatus in which pre-echo and post-echo occur because no gain control is performed, and FIG. 19(b) is a block diagram showing the configuration of a conventional audio signal encoding and decoding apparatus that performs gain control to suppress pre-echo and post-echo.
DESCRIPTION OF SYMBOLS
1 ... A/D converter
2 ... audio encoder
3 ... mini disc recording device
4 ... mini disc
5 ... mini disc reproducing device
6 ... audio decoder
7 ... D/A converter
10 ... QMF division filter
11 ... low-pass filter
12 ... high-pass filter
13-1 to 13-4 ... filter bank
14 ... gain detection unit
14a ... buffer memory
14b, 14b-a, 14b-b, 14b-c ... correction function calculation unit
15-1 to 15-4 ... gain control unit
16-1 to 16-4 ... MDCT processing unit
17-1 to 17-4 ... tone component encoder
18-1 to 18-4 ... tone component quantizer
19-1 to 19-4 ... spectrum quantizer
20 ... bit stream multiplexer
21 ... bit stream demultiplexer
22-1 to 22-4 ... tone component dequantizer
23-1 to 23-4 ... spectrum dequantizer
24-1 to 24-4 ... tone component decoder
25-1 to 25-4 ... inverse MDCT processing unit
26-1 to 26-4 ... inverse gain control unit
27 ... QMF synthesis filter
CO1, CO2 ... cascade connection operator
MP1a, MP1b, MP1c, MP2a, MP2b, MP2c ... multiplier
Continuation of front page
(72) Inventor: Sua Hong Neo, c/o Panasonic Singapore Research Institute Co., Ltd., Block 1022, No. 04-3530, Tai Seng Avenue, Tai Seng Industrial Estate, Singapore 534415
F-term (reference): 5D045 DA20; 5J064 BA16 BB07 BB12 BC01 BC02 BC06 BC07 BC09 BC12 BC16 BC25 BC29 BD03