The invention provides a stereo vocal cancellation method and a related device that cancel the vocals in two stereo channel signals, provide two corresponding output signals, and increase the stereo effect of those output signals. The method comprises the following steps: generating a mono signal from a combination of the two channel signals; performing vocal cancellation on each of the two channel signals according to the difference between that channel signal and the mono signal; and applying low-frequency and high-frequency compensation to generate the two output signals. Because the method performs vocal cancellation on each channel separately, the two output signals also differ substantially in frequency bands other than the high-frequency band, which increases the stereo effect.
Stereo vocal cancellation method and related device
Technical field
The invention provides a method and a related device for increasing the stereo effect when vocals are cancelled, in particular a method and a related device that increase the stereo effect by performing vocal cancellation on each channel signal separately.
Background art
With the advancement and spread of information and electronic technology, forms of entertainment in modern society are becoming more and more diverse. For example, an accompaniment system, also known as karaoke, can play the background music of a song so that users can sing along with it without a live band and enjoy a professional-level entertainment environment. Generally speaking, however, the songs released by the entertainment industry all contain sung vocals. To meet the needs of accompaniment systems, the information industry has therefore developed vocal cancellation techniques, which try to attenuate and remove the vocals from a song and leave the background music for the accompaniment system to use.
Please refer to FIG. 1, which is a schematic diagram of the functional blocks involved when a player 10 performs vocal cancellation using the known technology. Generally speaking, modern playback systems can play stereo sound with two (or more) channels, using different speaker modules in the player to play the signals of the different channels so that the user hears a more immersive playback. The player 10 is provided with a sound source circuit 12 that supplies two channel signals (such as left and right channels), a signal module 14 that performs vocal cancellation, and two speaker modules 16A, 16B that play the stereo sound. The sound source circuit 12 can be an optical disc reading mechanism that uses a read head 18 to access the song data on an optical disc 20 and parse it (for example, by suitably demodulating and decoding it). To support stereo playback, modern entertainment companies also record the signals of the different channels on the entertainment media they provide (such as optical discs storing songs). After the sound source circuit 12 accesses the data on the optical disc 20, it can recover the two stereo channel signals PLi and PRi. The signal module 14 performs vocal cancellation on the channel signals PLi, PRi to generate the output signals PLo, PRo respectively. The speaker modules 16A, 16B can each be equipped with their own digital-to-analog converter, power amplifier, speaker and other circuits to convert the output signals PLo, PRo into sound waves and play them.
To perform vocal cancellation, the known signal module 14 is provided with two high-pass modules 26A, 26B, a low-pass module 28 and a vocal cancellation module 22. The high-pass modules 26A and 26B high-pass filter the channel signals PLi and PRi respectively to generate the corresponding high-pass signals PLh and PRh; the low-pass module 28 low-pass filters a signal Ps to generate the corresponding low-pass signal Pl. The vocal cancellation module 22 subtracts the two channel signals PLi and PRi from each other to generate a vocal-cancelled intermediate signal PVC. Mixing (adding) the high-pass signal PLh corresponding to the channel signal PLi, the low-pass signal Pl and the intermediate signal PVC produces the output signal PLo; mixing the high-pass signal PRh corresponding to the channel signal PRi, the low-pass signal Pl and the intermediate signal PVC produces the output signal PRo.
To explain the principle of vocal cancellation in the known technology above, please refer to FIG. 2 (together with FIG. 1). FIG. 2 is a schematic diagram of typical spectra of the stereo channel signals and related signals. In each spectrum of FIG. 2 the horizontal axis is frequency and the vertical axis is the magnitude of the spectrum (for example, its absolute value).
Generally speaking, as those skilled in the art know, songs provided by the entertainment industry create a stereo effect by mixing different background music signals into the different channel signals, while the vocal signal, being the main signal, is usually mixed into every channel signal at equal intensity. As a result, when the user plays the signals of the different channels through the different speaker modules of the player, the vocals seem to come from directly in front (because their components in the two channels are equal), while the different background music in the different channels gives the user a stereo effect, as if the sources of the background music surrounded the user. In FIG. 2, the spectrum Vf represents the spectrum of the vocal signal, and the differing spectra Lmf and Rmf represent the spectra of the different background music signals. As described above, adding the background music spectrum Lmf to the vocal spectrum Vf forms the spectrum Lf of one stereo channel signal, and adding the background music spectrum Rmf to the vocal spectrum Vf forms the spectrum Rf of the other channel signal. The spectra of the channel signals PLi, PRi obtained by the sound source circuit 12 in FIG. 1, for example, can be as shown by the spectra Lf, Rf. Owing to the physiological limits of human voice production, the voice cannot go very low in frequency or exceed a certain high frequency, so the spectrum of a vocal signal is usually confined to a certain frequency band. The frequencies fl and fh marked in FIG. 2 represent the lower and upper frequency limits of the vocal signal, and the vocal spectrum Vf is concentrated in the mid-frequency band BM between fl and fh. In contrast to the vocal spectrum Vf, which is confined to the mid-frequency band BM, the combined spectrum of the various instruments in the background music extends over a wider frequency range; as shown in FIG. 2, the background music spectra Lmf and Rmf are present even in the low-frequency band BL below the frequency fl and in the high-frequency band BH above the frequency fh. Consequently, besides the mid-frequency band BM where the vocal spectrum lies, the spectra Lf and Rf of the channel signals also extend into the low-frequency band BL and the high-frequency band BH.
Because the vocal signal has equal components in each channel signal, the known signal module 14 (see FIG. 1) subtracts the two channel signals PLi and PRi from each other in the vocal cancellation module 22 to remove the vocal signal common to the two channels and generate the vocal-cancelled intermediate signal PVC. In the process of subtracting the channel signals PLi and PRi, however, the signal components of PLi and PRi in the low-frequency band BL and the high-frequency band BH are also subtracted from each other, whereas the intent of vocal cancellation is of course still to retain the components of the background music that extend into the low-frequency band BL and the high-frequency band BH. The signal module 14 therefore also uses the high-pass modules 26A, 26B and the low-pass module 28 to perform high-frequency and low-frequency compensation. The high-pass module 26A extracts the components of the channel signal PLi that belong to the high-frequency band BH as the high-pass signal PLh. The signal source Ps of the low-pass module 28 can be either of the channel signals PLi and PRi; the low-pass module 28 extracts the components of the signal Ps in the low-frequency band BL as the low-pass signal Pl, which amounts to placing the low-frequency components of the channel signal in the low-pass signal Pl. Mixing the high-pass signal PLh, the low-pass signal Pl and the intermediate signal PVC together compensates for the high-frequency and low-frequency components that the intermediate signal PVC lost during vocal cancellation and yields the output signal PLo.
In the same way, after the high-pass module 26B extracts the high-frequency components of the channel signal PRi as the high-pass signal PRh, the signal module 14 can use the high-pass signal PRh and the low-pass signal Pl to apply high-frequency and low-frequency compensation to the intermediate signal PVC and generate the output signal PRo. Generally speaking, the components of each channel signal in the low-frequency band BL are relatively non-directional, and differences between the low-frequency components of the two channel signals PRi and PLi do little to create a stereo effect, so the signal module 14 uses the same low-pass signal Pl for the low-frequency compensation of both output signals PLo and PRo. In contrast, the components of the channel signals PRi and PLi in the high-frequency band BH are more directional, and the difference between the two channel signals in the high-frequency band lets the user perceive the stereo effect more clearly. The signal module 14 therefore performs high-frequency compensation with the high-pass signals PRh, PLh obtained by high-pass filtering the two channel signals PRi, PLi respectively, so that the difference between the high-frequency components of the output signals PRo, PLo conveys the stereo effect. To sum up, after the known signal module 14 receives the two channel signals PLi and PRi, it uses the single intermediate signal PVC generated by the vocal cancellation module 22 as the basic vocal cancellation result for both channels, and then uses the low-pass signal Pl and the high-pass signals PLh and PRh as low-frequency and high-frequency compensation to generate the two output signals PLo, PRo, which serve as the vocal-cancelled versions of the channel signals PLi and PRi. In this way the known signal module 14 attenuates the vocals in the two channel signals PLi, PRi and tries to preserve the stereo effect of the background music in the two output signals PLo, PRo.
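The known signal flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not specify the filter design, so the one-pole filters, the coefficients `alpha_low` and `alpha_high`, the function names, and the choice of PLi as the low-pass source Ps are all hypothetical.

```python
def one_pole_lowpass(signal, alpha):
    """Assumed filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


def one_pole_highpass(signal, alpha):
    """Assumed filter: the input minus its one-pole low-pass output."""
    low = one_pole_lowpass(signal, alpha)
    return [x - l for x, l in zip(signal, low)]


def prior_art_cancel(pli, pri, alpha_low=0.05, alpha_high=0.8):
    """Known method: one shared intermediate signal PVC for both outputs."""
    pvc = [l - r for l, r in zip(pli, pri)]    # PVC = PLi - PRi
    pl = one_pole_lowpass(pli, alpha_low)      # Pl, with Ps assumed to be PLi
    plh = one_pole_highpass(pli, alpha_high)   # PLh from PLi
    prh = one_pole_highpass(pri, alpha_high)   # PRh from PRi
    plo = [a + b + c for a, b, c in zip(pvc, pl, plh)]  # PLo = PVC + Pl + PLh
    pro = [a + b + c for a, b, c in zip(pvc, pl, prh)]  # PRo = PVC + Pl + PRh
    return plo, pro
```

Because `pvc` and `pl` are shared by both outputs in this sketch, the two returned lists can differ only through the two high-pass signals.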
Please refer to FIG. 3, which is a schematic diagram of the related signal spectra when the signal module 14 in FIG. 1 is operating. Continuing the spectrum example of FIG. 2, if the spectra of the channel signals PLi and PRi in FIG. 1 are the spectra Lf and Rf in FIG. 2 respectively, then after the signal module 14 operates, the spectra of the output signals PLo, PRo are as shown by the spectra PLof and PRof in FIG. 3 respectively; in each spectrum of FIG. 3 the horizontal axis is again frequency and the vertical axis is the spectrum magnitude (for example, its absolute value). The frequencies fl, fh, the low-frequency band BL, the mid-frequency band BM and the high-frequency band BH marked in FIG. 3 have the same meanings as in FIG. 2 and the related description. To compare the two spectra PLof and PRof, FIG. 3 also draws the spectrum PRof as a dashed line in the same coordinate system as the solid-line spectrum PLof.
Because the output signals PLo and PRo produced by the known signal module 14 in FIG. 1 both contain the same intermediate signal PVC and the same low-pass signal Pl, and differ only in that the two output signals receive high-frequency compensation from different high-pass signals PLh and PRh, a comparison in FIG. 3 shows that the main difference between the spectra PLof and PRof of the output signals PLo and PRo is concentrated in the high-frequency band BH; the components of the two spectra PLof, PRof in the mid-frequency band BM and the low-frequency band BL are almost identical. Although the high-frequency components of a signal are more directional for stereo purposes, most of the signal energy of the spectra PLof and PRof is still concentrated in the mid- and low-frequency bands BM and BL, and relatively little signal energy lies in the high-frequency band BH, so the overall difference between the spectra PLof and PRof is in fact small. When the player 10 plays the output signals PLo, PRo, the stereo effect it can deliver is greatly reduced because the difference between the two is small. This is one of the shortcomings of the known technology. In other words, in the known signal module 14 of FIG. 1, the output signals PLo and PRo of the two channels both use the same intermediate signal PVC as the basic vocal-cancelled signal and only use the different high-pass signals PLh, PRh for high-frequency compensation. The difference between the two output signals PLo and PRo is therefore confined to the high-frequency, low-energy part, and not enough of the difference between the original channel signals PLi and PRi is extracted to produce a pronounced stereo effect. As a result, the stereo effect of the two channel signals PLi, PRi is greatly reduced after vocal cancellation, and the user cannot enjoy accompaniment music with a surround sound effect.
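This limitation can be written out in one line. Taking the known mix as described (PLo is the sum of PVC, Pl and PLh; PRo is the sum of PVC, Pl and PRh) and treating the symbols of FIG. 1 as time-domain signals, which is an assumption of this sketch, the shared terms cancel in the difference of the two outputs:

```latex
\begin{aligned}
\mathrm{PLo} - \mathrm{PRo}
  &= (\mathrm{PVC} + \mathrm{Pl} + \mathrm{PLh}) - (\mathrm{PVC} + \mathrm{Pl} + \mathrm{PRh}) \\
  &= \mathrm{PLh} - \mathrm{PRh}
\end{aligned}
```

Only the high-pass signals survive in the difference, which matches the observation that the two output spectra differ mainly in the high-frequency band BH.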
Summary of the invention
Therefore, the purpose of the present invention is to provide a better vocal cancellation method and related device, so that the channel signals of the different channels still maintain a considerable degree of signal difference after vocal cancellation, producing a better stereo effect and overcoming the shortcomings of the known technology.
In the known technology, a single intermediate signal is generated from the two channel signals as the main vocal-cancelled signal; the output signals of the two channels are then generated from this intermediate signal after low-frequency compensation and differing high-frequency compensation. Because the output signals of the two channels are both based on the same intermediate signal, the difference between them is limited to the high-frequency components, so the two output signals cannot deliver a good stereo effect.
In the present invention, a mono signal is generated as the average of the two channel signals, and the difference between each channel signal and this mono signal is used as the intermediate signal corresponding to that channel signal after vocal cancellation; after low-frequency compensation and the corresponding high-frequency compensation are applied to the intermediate signal of each channel signal, the output signal corresponding to that channel signal is generated. In this method, because the intermediate signal of each channel is generated from the difference between that channel's signal and the mono signal, the intermediate signals underlying the output signals of the channels also differ from one another. As a result, even after vocal cancellation, the differences between the channel signals in the low- and mid-frequency bands are preserved to a considerable extent, so the output signals of the channels deliver a better stereo effect and the user can enjoy accompaniment music with a pronounced surround sound effect.
Description of the drawings
FIG. 1 is a schematic diagram of the functional blocks related to vocal cancellation in a known player.
FIG. 2 is a schematic diagram of typical spectra of the stereo channel signals and related signals.
FIG. 3 is a schematic diagram of the spectra of the related output signals after the player in FIG. 1 performs vocal cancellation.
FIG. 4 is a schematic diagram of the functional blocks related to vocal cancellation as implemented by the present invention in a player.
FIG. 5 is a schematic diagram of the spectra of the related output signals after the player in FIG. 4 performs vocal cancellation.
FIG. 6 is a schematic diagram of the function of the signal module in FIG. 4 implemented as program code.
Description of reference symbols
10, 30: player;
12, 32: sound source circuit;
14, 34: signal module;
16A-16B, 36A-36B: speaker module;
18, 38: read head;
20, 40: optical disc;
22, 42A-42B: vocal cancellation module;
26A-26B, 46A-46B: high-pass module;
28, 48: low-pass module;
50: mono processing module;
52A, 52B: mixing unit;
100: program code;
PRi, PLi, Ri, Li: channel signal;
PVC, LVC, RVC: intermediate signal;
PLh, PRh, Lh, Rh: high-pass signal;
Pl, Sl: low-pass signal;
PLo, PRo, Lo, Ro: output signal;
Ps, S: signal;
M: mono signal;
Lmf, Rmf, Vf, Lf, Rf, PLof, PRof: spectrum;
BL, BM, BH: frequency band;
fl, fh: frequency.
Detailed description
Please refer to FIG. 4, which is a schematic diagram of the functional blocks related to vocal cancellation when the technique of the present invention is implemented in a player 30. The player 30 is provided with a sound source circuit 32, a signal module 34, and speaker modules 36A, 36B for playing stereo sound. The sound source circuit 32 can be an optical disc reading mechanism that uses a read head 38 to read the signal data of a song from an optical disc 40 and recover the stereo channel signals Li, Ri. The signal module 34 implements the vocal cancellation function of the present invention, generating the vocal-cancelled two-channel output signals Lo, Ro from the two channel signals Li, Ri. The signal module 34 is provided with a mono processing module 50 and a low-pass module 48; corresponding to the two channel signals Li, Ri, it is also provided with two vocal cancellation modules 42A, 42B and two high-pass modules 46A, 46B. The speaker modules 36A, 36B can each be provided with a digital-to-analog converter, a power amplifier, a speaker and so on, to convert the output signals Lo, Ro into sound waves and play them.
Vocal cancellation by the signal module 34 of the present invention can be described as follows. The mono processing module 50 in the signal module 34 calculates the average of the two channel signals Li and Ri to generate a mono signal M; in other words, M = (Li + Ri)/2. The present invention then uses this mono signal M to perform vocal cancellation on each channel signal separately. In the vocal cancellation module 42A, which corresponds to the channel signal Li, the mono signal M is subtracted from the channel signal Li, so that the difference between the channel signal Li and the mono signal M forms the intermediate signal LVC (that is, LVC = Li - M). Likewise, the vocal cancellation module 42B, which corresponds to the channel signal Ri, forms the intermediate signal RVC from the difference between the channel signal Ri and the mono signal M (that is, RVC = Ri - M).
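The three formulas in this paragraph (M = (Li + Ri)/2, LVC = Li - M, RVC = Ri - M) can be tried on a few samples. The following is a minimal sketch, not the patent's program code 100: the function names and the sample values are hypothetical, and the test data assumes the idealized case in which the vocal is mixed into both channels at exactly equal intensity.

```python
def mono_signal(left, right):
    """M = (Li + Ri) / 2, computed sample by sample."""
    return [(l + r) / 2 for l, r in zip(left, right)]


def cancel_vocals(left, right):
    """Return (LVC, RVC) with LVC = Li - M and RVC = Ri - M."""
    mono = mono_signal(left, right)
    lvc = [l - m for l, m in zip(left, mono)]
    rvc = [r - m for r, m in zip(right, mono)]
    return lvc, rvc


# Hypothetical data: one vocal shared by both channels,
# plus different background music in each channel.
vocal = [0.5, -0.25, 0.75, 0.0]
music_l = [0.25, 0.5, -0.5, 1.0]
music_r = [-0.25, 0.0, 0.5, 0.5]
li = [v + m for v, m in zip(vocal, music_l)]
ri = [v + m for v, m in zip(vocal, music_r)]

lvc, rvc = cancel_vocals(li, ri)
print(lvc)  # [0.25, 0.25, -0.5, 0.25]
print(rvc)  # [-0.25, -0.25, 0.5, -0.25]
```

With these values the shared vocal drops out of both intermediate signals: running the same functions on the music alone gives the same result.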
As discussed with FIG. 2 and the related description, the vocal part is usually mixed into each stereo channel signal at equal intensity, so the mono signal M generated in the present invention as the average of the two channel signals Li and Ri should contain the same vocal part as each channel signal. The present invention uses the vocal cancellation module corresponding to each channel signal to subtract this mono signal from that channel signal, thereby performing vocal cancellation on each channel signal separately and attenuating the vocal part of each channel signal. Unlike the known technology, the present invention performs vocal cancellation on each channel signal individually, so the intermediate signals produced from the different channel signals naturally differ. In the embodiment of FIG. 4, the intermediate signal LVC produced by vocal cancellation of the channel signal Li equals (Li - M), while the intermediate signal RVC produced by vocal cancellation of the other channel signal Ri equals (Ri - M), which naturally differs from the intermediate signal LVC. As discussed above, the stereo effect is conveyed by the signal differences between the channel signals; in the present invention, the signal difference between the two channel signals Li and Ri that originally created the stereo effect is still retained in the vocal-cancelled intermediate signals LVC and RVC. The present invention mainly uses this signal difference between the intermediate signals LVC and RVC to deliver, after vocal cancellation, a more pronounced and richer stereo effect than the known technology. Note that, as shown in FIG. 1, the known vocal cancellation technology uses a single vocal cancellation module even for the channel signals of different channels and takes a single intermediate signal as the basic vocal cancellation result. In contrast, the present invention performs vocal cancellation separately on the channel signals of the different channels and generates different intermediate signals, which better preserves the signal differences in the channel signals that originally conveyed the stereo effect.
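A short derivation makes the difference between the two intermediate signals explicit. Substituting M = (Li + Ri)/2 into LVC = Li - M and RVC = Ri - M, and writing each channel as background music plus a shared vocal, gives the following. The time-domain names Lm, Rm and V are hypothetical; they stand for the signals whose spectra the text calls Lmf, Rmf and Vf, and the model assumes the vocal is mixed equally into both channels.

```latex
\begin{aligned}
\mathrm{LVC} &= \mathrm{Li} - \tfrac{1}{2}(\mathrm{Li} + \mathrm{Ri}) = \tfrac{1}{2}(\mathrm{Li} - \mathrm{Ri}), \\
\mathrm{RVC} &= \mathrm{Ri} - \tfrac{1}{2}(\mathrm{Li} + \mathrm{Ri}) = \tfrac{1}{2}(\mathrm{Ri} - \mathrm{Li}) = -\mathrm{LVC}, \\
\text{with } \mathrm{Li} &= \mathrm{Lm} + \mathrm{V},\quad \mathrm{Ri} = \mathrm{Rm} + \mathrm{V}: \qquad
\mathrm{LVC} = \tfrac{1}{2}(\mathrm{Lm} - \mathrm{Rm}).
\end{aligned}
```

Under this idealized model the vocal term V cancels in both intermediate signals, and LVC and RVC differ across the whole band rather than only at high frequencies.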
As shown in FIG. 4, after the intermediate signals LVC and RVC have been generated from the channel signals Li and Ri respectively, the signal module 34 applies high-frequency and low-frequency compensation to them to generate the output signals Lo and Ro. The high-pass module 46A extracts the part of the channel signal Li that lies in the high-frequency band (mainly the part above the mid-frequency band of the human voice; see FIG. 2 and the related description) as the high-pass signal Lh, and the low-pass module 48 extracts the part of a signal S that lies in the low-frequency band as the low-pass signal Sl. The signal S can be either of the channel signals Li and Ri, or the mono signal M. The mixing unit 52A adds the high-pass signal Lh corresponding to the channel signal Li, the intermediate signal LVC, and the low-pass signal Sl; this is equivalent to applying high-frequency and low-frequency compensation to the intermediate signal LVC, and it produces the output signal Lo corresponding to the channel signal Li (that is, Lo = LVC + Sl + Lh).

Similarly, the high-pass module 46B extracts the part of the channel signal Ri that lies in the high-frequency band as the high-pass signal Rh, which provides high-frequency compensation for the intermediate signal RVC. The mixing unit 52B adds the intermediate signal RVC corresponding to the channel signal Ri, the high-pass signal Rh, and the low-pass signal Sl, thereby applying high-frequency and low-frequency compensation to RVC and forming the output signal Ro (that is, Ro = RVC + Sl + Rh).
Please continue to refer to FIG. 5 (together with FIG. 4). FIG. 5 is a schematic diagram of the spectra of the output signals Lo and Ro generated by the signal module 34 of FIG. 4; the horizontal axis is frequency and the vertical axis is spectral magnitude. Continuing the example of FIG. 2, if the spectra of the channel signals Li and Ri in FIG. 4 are the spectra Lf and Rf of FIG. 2, then the spectra of the output signals Lo and Ro are the spectra Lof and Rof of FIG. 5. (For ease of comparison, FIG. 5 draws the spectrum Rof as a dashed line in the same coordinate system as the solid-line spectrum Lof; the frequencies fl and fh and the frequency bands BL, BM and BH of FIG. 2 are also marked in FIG. 5.)

As FIG. 5 shows, because the invention performs vocal cancellation on each channel signal separately, the signal differences between the original channel signals in the low-frequency band BL and the mid-frequency band BM are retained between the output signals Lo and Ro. The output signals therefore differ not only in the high-frequency band BH but also in the low- and mid-frequency bands. As a result, when the player 30 plays the output signals Lo and Ro of the two channels through the speaker modules 36A and 36B respectively, the user hears music that is richer and more stereophonic than with the known technology, and enjoys a better sing-along environment.
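The retained difference can be made explicit by substituting the definitions given with FIG. 4 into the output expressions. The short derivation below is only a sketch: it assumes ideal linear filters and takes M as the plain average of Li and Ri, as in the embodiment, with Lh and Rh the high-pass signals and Sl the shared low-pass signal.

```latex
\begin{aligned}
M &= \tfrac{1}{2}\,(L_i + R_i) \\
L_{VC} &= L_i - M = \tfrac{1}{2}\,(L_i - R_i), \qquad
R_{VC} = R_i - M = \tfrac{1}{2}\,(R_i - L_i) \\
L_o &= L_{VC} + S_l + L_h, \qquad
R_o = R_{VC} + S_l + R_h \\
L_o - R_o &= (L_i - R_i) + (L_h - R_h)
\end{aligned}
```

Under these assumptions the common low-pass term Sl cancels in the difference, and the inter-channel difference Li - Ri appears in every band, with the additional contribution Lh - Rh in the high-frequency band BH.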
Each functional block of the signal module 34 in FIG. 4 can be implemented as a hardware circuit, as firmware, or as software. For example, players generally include a programmable signal processing circuit, so the invention can be implemented as firmware: the program code implementing the technique is stored in a memory of the player (for example, a non-volatile memory), and the vocal cancellation function is realized when the signal processing circuit executes that code. Likewise, computers often use a playback program together with suitable peripheral devices (such as a sound card or an optical disc drive) to play songs and music; the invention can be implemented in software within such a playback program to cancel vocals and provide background music for sing-along use.

Please refer to FIG. 6 (together with FIG. 4). The program code 100 of FIG. 6 implements the vocal cancellation function of the invention. The array variables x_L and x_R represent the channel signals Li and Ri of the two channels (see FIG. 4), and the array variable Mono represents the mono signal M. The subroutine Hi_Pass implements the high-pass filter module and Low_Pass implements the low-pass module; the array variables h_L and h_R represent the high-pass signals Lh and Rh, the array variable low represents the low-pass signal Sl, and the array variables L_out and R_out represent the output signals Lo and Ro. The integer index j selects the j-th element of an array variable, that is, the sample of the corresponding signal at the j-th time point. As the program code 100 shows, the mono signal represented by Mono is the average of the channel signals held in x_L and x_R, and the results of high-pass filtering x_L and x_R are stored in h_L and h_R respectively. The channel signal Ri represented by x_R serves as the signal S of FIG. 4, and the low-pass signal Sl produced by low-pass filtering it is represented by the variable low.

Finally, the intermediate signals LVC and RVC are computed by the operations x_L[j]-Mono[j] and x_R[j]-Mono[j] in the program code 100; adding the low-frequency compensation variable low and the high-frequency compensation variables h_L and h_R yields the output signals after vocal cancellation, which are stored in the variables L_out and R_out respectively.
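FIG. 6 is not reproduced in this text, so the following is only a sketch of what the description of the program code 100 implies, written in Python. The variable names x_L, x_R, Mono, h_L, h_R, low, L_out and R_out follow the description; `low_pass` and `hi_pass` stand in for the Low_Pass and Hi_Pass subroutines. The first-order filter design, the `alpha` coefficients, and the function name `vocal_cancel` are assumptions for illustration, since the text does not specify the filters.

```python
def low_pass(x, alpha=0.1):
    """First-order (one-pole) low-pass filter; alpha is an assumed smoothing factor."""
    y, state = [], 0.0
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y


def hi_pass(x, alpha=0.9):
    """Assumed complementary high-pass: the input minus a low-passed copy of itself."""
    return [s - l for s, l in zip(x, low_pass(x, alpha))]


def vocal_cancel(x_L, x_R):
    """Per-sample processing following the description of program code 100."""
    n = len(x_L)
    Mono = [(x_L[j] + x_R[j]) / 2 for j in range(n)]  # mono signal M
    h_L = hi_pass(x_L)   # high-pass signal Lh
    h_R = hi_pass(x_R)   # high-pass signal Rh
    low = low_pass(x_R)  # low-pass signal Sl, with S = Ri as in the description
    # Intermediate signal (x[j] - Mono[j]) plus low- and high-frequency compensation.
    L_out = [x_L[j] - Mono[j] + low[j] + h_L[j] for j in range(n)]
    R_out = [x_R[j] - Mono[j] + low[j] + h_R[j] for j in range(n)]
    return L_out, R_out


# Hypothetical input samples for the two channels.
x_L = [0.5, -0.5, 0.25, 0.0]
x_R = [0.0, 0.25, -0.25, 0.5]
L_out, R_out = vocal_cancel(x_L, x_R)
```

In this sketch every component is mixed at unity gain. A practical implementation would likely choose the filter cutoffs around the vocal band discussed with FIG. 2 and might scale the compensation signals.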
In the known vocal cancellation technology, the output signals of the different channels all use the same intermediate signal as the main result of vocal cancellation. Apart from the signal difference introduced by high-frequency compensation, the output signals show no obvious difference in the low- and mid-frequency bands, so the channel output signals produced by the prior art cannot present a good stereo effect. By contrast, the present invention performs vocal cancellation on the signal of each channel separately, so the signal differences between the original channel signals are preserved fairly completely in the output signals of the channels. When the output signals of the different channels are played through different speaker modules, they present a better stereo effect than the prior art, and the user enjoys better stereo music in a sing-along system. Besides the optical disc player of FIG. 4, the technique of the invention can be applied to other kinds of players; for example, the sound source circuit of FIG. 4 can be a network module that obtains song data signals through a wired or wireless network and decodes them into the channel signals.
The above describes only preferred embodiments of the present invention; all equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of this patent.
Claims (12) (translated from Chinese)

1. A method for producing stereo sound during vocal cancellation, for providing a first output signal and a second output signal according to a first channel signal and a second channel signal, the method comprising:
generating a mono signal according to a result of synthesizing the first channel signal and the second channel signal;
a first high-pass filtering step of high-pass filtering the first channel signal according to a preset high-frequency band to generate a corresponding first high-pass signal, so that the signal frequencies of the first high-pass signal are substantially concentrated in the high-frequency band;
a second high-pass filtering step of high-pass filtering the second channel signal according to the high-frequency band to generate a corresponding second high-pass signal, so that the signal frequencies of the second high-pass signal are substantially concentrated in the high-frequency band;
a first vocal cancellation step of generating a first intermediate signal according to the difference between the first channel signal and the mono signal;
a second vocal cancellation step of generating a second intermediate signal according to the difference between the second channel signal and the mono signal;
a first mixing step of mixing the first intermediate signal and the first high-pass signal to generate the first output signal; and
a second mixing step of mixing the second intermediate signal and the second high-pass signal to generate the second output signal, so that the first output signal and the second output signal also differ substantially in the portions whose frequencies lie outside the high-frequency band.

2. The method of claim 1, further comprising:
generating a low-pass signal according to a preset low-frequency band, so that the signal frequencies of the low-pass signal are substantially concentrated in the low-frequency band;
wherein the first mixing step mixes the first intermediate signal, the first high-pass signal and the low-pass signal to generate the first output signal, and the second mixing step mixes the second intermediate signal, the second high-pass signal and the low-pass signal to generate the second output signal.

3. The method of claim 2, wherein generating the low-pass signal according to the low-frequency band comprises low-pass filtering the first channel signal or the second channel signal according to the low-frequency band.

4. The method of claim 2, wherein generating the low-pass signal according to the low-frequency band comprises low-pass filtering the mono signal according to the low-frequency band.

5. The method of claim 1, wherein the frequency range of the high-frequency band is higher than the frequency range of the human voice.

6. A player comprising:
a sound source circuit for providing a first channel signal and a second channel signal; and
a signal module for performing vocal cancellation processing on the first channel signal and the second channel signal and providing a stereo first output signal and second output signal, the signal module comprising:
a mono processing module for generating a mono signal according to a result of synthesizing the first channel signal and the second channel signal;
a first high-pass module for high-pass filtering the first channel signal according to a preset high-frequency band to generate a corresponding first high-pass signal, so that the signal frequencies of the first high-pass signal are substantially concentrated in the high-frequency band;
a second high-pass module for high-pass filtering the second channel signal according to the high-frequency band to generate a corresponding second high-pass signal, so that the signal frequencies of the second high-pass signal are substantially concentrated in the high-frequency band;
a first vocal cancellation module for generating a first intermediate signal according to the difference between the first channel signal and the mono signal;
a second vocal cancellation module for generating a second intermediate signal according to the difference between the second channel signal and the mono signal;
a first mixing unit for mixing the first intermediate signal and the first high-pass signal to generate the first output signal; and
a second mixing unit for mixing the second intermediate signal and the second high-pass signal to generate the second output signal, so that the first output signal and the second output signal also differ substantially in the portions whose frequencies lie outside the high-frequency band.

7. The player of claim 6, further comprising:
a low-pass module for generating a low-pass signal according to a preset low-frequency band, the signal frequencies of the low-pass signal being substantially concentrated in the low-frequency band;
wherein the first mixing unit mixes the first intermediate signal, the first high-pass signal and the low-pass signal to generate the first output signal, and the second mixing unit mixes the second intermediate signal, the second high-pass signal and the low-pass signal to generate the second output signal.

8. The player of claim 7, wherein the low-pass module low-pass filters the first channel signal or the second channel signal according to the low-frequency band to generate the low-pass signal.

9. The player of claim 7, wherein the low-pass module low-pass filters the mono signal according to the low-frequency band to generate the low-pass signal.

10. The player of claim 6, wherein the frequency range of the high-frequency band is higher than the frequency range of the human voice.

11. The player of claim 6, wherein the sound source circuit can read signals from an optical disc to form the first channel signal and the second channel signal.

12. The player of claim 6, further comprising:
a first speaker module for converting the first output signal into sound waves and playing them; and
a second speaker module for converting the second output signal into sound waves and playing them.
Publication: CN100353813C (application CNB031557627A), Stereo human voice cancellation method and related device
Priority date: 2003-09-01
Filing date: 2003-09-01
Family ID: 34598192
Status: Expired - Fee Related

Legal events
Owner name: MEDIATEK INC.
Free format text: FORMER OWNER: YANGZHI SCIENCE + TECHNOLOGY CO. LTD.
Effective date: 20050408
2005-05-11 C10 Entry into substantive examination
2005-05-11 C41 Transfer of patent application or patent right or utility model
2005-05-11 SE01 Entry into force of request for substantive examination
2005-05-11 TA01 Transfer of patent application right
Effective date of registration: 20050408
Address after: Hsinchu Science Industrial Park, Hsinchu County, Taiwan
Applicant after: MEDIATEK Inc.
Address before: Taipei County of Taiwan Province
Applicant before: ALI CORPORATION
2007-12-05 C14 Grant of patent or utility model
2007-12-05 GR01 Patent grant
2023-09-15 CF01 Termination of patent right due to non-payment of annual fee
Granted publication date: 20071205