Showing content from https://patents.google.com/patent/CN103618986B/en below:

CN103618986B - Method and device for extracting the acoustic image body of a sound source in 3D space

Publication number: CN103618986B
Authority: CN; China
Prior art keywords: acoustic image; centerdot; source; sound; eta
Prior art date: 2013-11-19
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Active

Application number

CN201310580928.7A

Other languages

Chinese (zh)

Other versions

CN103618986A (en

Inventor

æ±æ¸¸

é»èè¹

çæ

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Shenzhen Xinyidai Information Technology Research Institute Co Ltd

Original Assignee

Shenzhen Xinyidai Information Technology Research Institute Co Ltd

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2013-11-19

Filing date

2013-11-19

Publication date

2015-09-30

2013-11-19 Application filed by Shenzhen Xinyidai Information Technology Research Institute Co Ltd

2013-11-19 Priority to CN201310580928.7A (patent CN103618986B)

2014-03-05 Publication of CN103618986A

2014-06-04 Priority to US14/422,070 (patent US9646617B2)

2014-06-04 Priority to PCT/CN2014/079177 (patent WO2015074400A1)

2015-09-30 Application granted

2015-09-30 Publication of CN103618986B

Status: Active

2033-11-19 Anticipated expiration

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/002—Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction

Landscapes

Engineering & Computer Science (AREA)
Physics & Mathematics (AREA)
Signal Processing (AREA)
Acoustics & Sound (AREA)
Mathematical Physics (AREA)
Computational Linguistics (AREA)
Health & Medical Sciences (AREA)
Audiology, Speech & Language Pathology (AREA)
Human Computer Interaction (AREA)
Multimedia (AREA)
Stereophonic System (AREA)

Abstract

The invention provides a method and a device for extracting the acoustic image body of a sound source in 3D space, comprising: determining the spatial position of the acoustic image of the sound source; determining, according to the obtained spatial position (ρ, μ, η), the loudspeakers in the spatial vicinity of the acoustic image; and calculating the correlation of the channel signals of the selected loudspeakers in the horizontal and vertical directions to obtain and store the parameter set {IC_H, IC_V, Min{IC_H, IC_V}} of the acoustic image body, where Min{IC_H, IC_V} is the smaller of IC_H and IC_V. The expression parameters obtained for the acoustic image body provide a technical basis for accurately restoring the size of the acoustic image in a 3D live audio system and address the problem that acoustic images restored by current 3D audio are too narrow.

Description

Method and device for extracting the acoustic image body of a sound source in 3D space

Technical field

The invention belongs to the field of acoustics and relates in particular to a method and a device for extracting the acoustic image body of a sound source in 3D space.

Background art

At the end of 2009, the 3D film "Avatar" topped the box office in more than 30 countries, and by early September 2010 its cumulative worldwide box office had exceeded 2.7 billion US dollars. "Avatar" achieved this result because it used new 3D special-effects production technology that had a striking effect on the senses. Its vivid pictures and realistic sound not only impressed audiences but also led the industry to declare that "film has entered the 3D era". It is also expected to give rise to further technologies and standards for film and television production, recording, and playback. At the International Consumer Electronics Show held in Las Vegas, USA, in January 2010, the new televisions shown by the major manufacturers raised new expectations: 3D has become the new focus of competition among the world's major television manufacturers. A better audiovisual experience requires a 3D sound field that is synchronized with the 3D video content; only then is the listening experience truly immersive. Early 3D audio systems (such as Ambisonics) were structurally complex and placed high demands on capture and playback equipment, so they were difficult to popularize. In recent years, Japan's NHK introduced the 22.2-channel system, which reproduces the original 3D sound field with 24 loudspeakers. In 2011, MPEG began work on an international standard for 3D audio, aiming to reproduce the 3D sound field with fewer loudspeakers or with headphones while achieving a certain coding efficiency, so that the technology can reach ordinary household users. 3D audio and video technology has thus become a research hotspot in the multimedia field and an important direction for further development.

However, conventional 3D audio focuses only on restoring the spatial position of the sound source or the physical sound field; it does not restore the size of the acoustic image of the sound source, in particular the acoustic image body. To achieve a better listening effect, the size of the acoustic image must be restored accurately. To simplify processing in systems such as codecs, expression parameters for the acoustic image body are also needed, so that the original acoustic image can be restored faithfully after processing by the 3D audio system.

Summary of the invention

To address the deficiencies of the prior art, the invention proposes a method and a device for extracting the acoustic image body of a sound source in 3D space.

The technical solution of the invention provides a method for extracting the acoustic image body of a sound source in 3D space, comprising the following steps:

Step 1, determining the spatial position of the acoustic image of the sound source, implemented as follows,

A time-frequency transform is applied to the signal of each channel, and the same sub-band division is applied to every channel. With the listener as the origin of a spherical coordinate system, for a loudspeaker located at horizontal angle μ_i and elevation angle η_i, let the vector p_i(k, n) denote the time-frequency representation of the corresponding signal,

$$p_i(k,n) = g_i(k,n) \cdot \begin{bmatrix} \cos\mu_i \cdot \cos\eta_i \\ \sin\mu_i \cdot \cos\eta_i \\ \sin\eta_i \end{bmatrix}$$

where i is the loudspeaker index, k is the frequency band index, n is the time-domain frame index, and g_i(k, n) is the intensity information of the frequency-domain point;

The horizontal angle μ and elevation angle η of the acoustic image are calculated by the following formulas,

$$\tan\mu(k,n) = \frac{\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i}$$

$$\tan\eta(k,n) = \frac{\sqrt{\left[\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i\right]^2 + \left[\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i\right]^2}}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\eta_i}$$

where N is the total number of loudspeakers, i takes the values 1, 2, ..., N, and μ(k, n) and η(k, n) are the horizontal angle μ and elevation angle η of the acoustic image in the k-th frequency band of the n-th frame;

The distance ρ from the acoustic image to the origin of the spherical coordinate system is taken as the average distance from all loudspeakers to the listener;

Step 2, determining, according to the spatial position (ρ, μ, η) of the acoustic image obtained in step 1, the loudspeakers in the spatial vicinity of the acoustic image;

Step 3, calculating the correlation, in the horizontal and vertical directions, of the channel signals of the loudspeakers selected in step 2, implemented as follows:

The selected loudspeakers are divided into a left part and a right part according to the acoustic image position. With the median vertical plane containing the acoustic image and the listener as the projection plane, the sums of the components of the left-side and right-side signals perpendicular to this projection plane are calculated separately and denoted P_L and P_R. The correlation IC_H of the left and right signals is calculated as follows,

$$IC_H = \frac{\operatorname{cov}(P_L, P_R)}{\sqrt{\operatorname{cov}(P_L, P_L) \cdot \operatorname{cov}(P_R, P_R)}}$$

The selected loudspeakers are divided into an upper part and a lower part according to the acoustic image position. With the plane containing the acoustic image and the listener as the projection plane, the sums of the components of the upper-side and lower-side signals perpendicular to this projection plane are calculated separately and denoted P_U and P_D. The correlation IC_V of the upper and lower signals is calculated as follows,

$$IC_V = \frac{\operatorname{cov}(P_U, P_D)}{\sqrt{\operatorname{cov}(P_U, P_U) \cdot \operatorname{cov}(P_D, P_D)}}$$

Step 4, obtaining and storing the parameter set {IC_H, IC_V, Min{IC_H, IC_V}} of the acoustic image body, where Min{IC_H, IC_V} is the smaller of IC_H and IC_V.

The invention correspondingly also provides a device for extracting the acoustic image body of a sound source in 3D space, comprising the following units:

A spatial position extraction unit, for determining the spatial position of the acoustic image of the sound source, implemented as follows,

$$p_i(k,n) = g_i(k,n) \cdot \begin{bmatrix} \cos\mu_i \cdot \cos\eta_i \\ \sin\mu_i \cdot \cos\eta_i \\ \sin\eta_i \end{bmatrix}$$

where i is the loudspeaker index, k is the frequency band index, n is the time-domain frame index, and g_i(k, n) is the intensity information of the frequency-domain point;

The horizontal angle μ and elevation angle η of the acoustic image are calculated by the following formulas,

$$\tan\mu(k,n) = \frac{\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i}$$

$$\tan\eta(k,n) = \frac{\sqrt{\left[\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i\right]^2 + \left[\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i\right]^2}}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\eta_i}$$

The distance ρ from the acoustic image to the origin of the spherical coordinate system is taken as the average distance from all loudspeakers to the listener;

A loudspeaker selection unit, for determining, according to the spatial position (ρ, μ, η) of the acoustic image obtained by the spatial position extraction unit, the loudspeakers in the spatial vicinity of the acoustic image;

A correlation extraction unit, for calculating the correlation, in the horizontal and vertical directions, of the channel signals of the loudspeakers selected by the loudspeaker selection unit, implemented as follows,

$$IC_H = \frac{\operatorname{cov}(P_L, P_R)}{\sqrt{\operatorname{cov}(P_L, P_L) \cdot \operatorname{cov}(P_R, P_R)}}$$

$$IC_V = \frac{\operatorname{cov}(P_U, P_D)}{\sqrt{\operatorname{cov}(P_U, P_U) \cdot \operatorname{cov}(P_D, P_D)}}$$

The acoustic image body of a sound source refers to the size of the acoustic image, relative to the listener, in three dimensions of 3D space: front-back (depth), left-right (length), and up-down (height). For multichannel 3D audio systems, the invention describes the size of the acoustic image body in these three dimensions by using the correlation between different channels. The expression parameters obtained for the acoustic image body provide a technical basis for accurately restoring the size of the acoustic image in a 3D live audio system and address the problem that acoustic images restored by current 3D audio are too narrow.

Brief description of the drawings

Fig. 1 is a schematic diagram of the loudspeaker positions and the signal calculation relationship in an embodiment of the invention.

Detailed description of the embodiment

The invention is further described below with reference to the drawing and an embodiment.

The technical solution of the invention can be implemented by those skilled in the art as an automated process using computer software. The flow of the embodiment is as follows:

Step 1, determine the spatial position of the acoustic image of the sound source. With the listener as the coordinate origin, the spherical coordinates of a loudspeaker can be written (ρ, μ, η), where ρ is the distance from the loudspeaker to the origin of the spherical coordinate system, μ is the horizontal angle, and η is the elevation angle, as shown in Fig. 1.

With the listener as the reference point, each channel signal of the multichannel system is orthogonally decomposed to obtain its components on the X, Y and Z axes of the 3D Cartesian coordinate system. The components of each channel are the decomposition of the original single sound source in that channel. Therefore, once the X, Y and Z components of every channel have been obtained, adding the components axis by axis gives the components of the original single sound source at the listener position. The embodiment is implemented as follows:

First, a time-frequency transform is applied to the signal of each channel and the same sub-band division is applied to every channel; existing techniques can be used for both.

Because there are generally several loudspeakers, the spherical coordinates (ρ, μ, η) of each loudspeaker are subscripted with its index and written (ρ_i, μ_i, η_i). For a loudspeaker at horizontal angle μ_i and elevation angle η_i, a vector p_i(k, n) can represent the time-frequency representation of the loudspeaker's channel signal, as in formula (1):

$$p_i(k,n) = g_i(k,n) \cdot \begin{bmatrix} \cos\mu_i \cdot \cos\eta_i \\ \sin\mu_i \cdot \cos\eta_i \\ \sin\eta_i \end{bmatrix} \qquad (1)$$

where i is the loudspeaker index, k is the frequency band index, n is the time-domain frame index, and g_i(k, n) is the intensity information of the frequency-domain point. The direction of the acoustic image is likewise split into a horizontal angle μ and an elevation angle η, which are calculated by formulas (2) and (3):

$$\tan\mu(k,n) = \frac{\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i} \qquad (2)$$

$$\tan\eta(k,n) = \frac{\sqrt{\left[\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i\right]^2 + \left[\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i\right]^2}}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\eta_i} \qquad (3)$$

This gives the horizontal angle μ and elevation angle η of the acoustic image. Because the loudspeakers are generally arranged around the listener, the distance ρ from the acoustic image to the origin of the spherical coordinate system is taken approximately as the mean of the distances ρ_i from all loudspeakers to the listener; usually ρ = ρ_1 = ρ_2 = ... = ρ_N.
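The decomposition and summation in step 1 can be sketched as follows. This is a minimal illustration for a single time-frequency point (k, n), not the patent's implementation: the function names, the use of degrees, and the tuple layout `(gain, azimuth, elevation)` are assumptions made for the example. It builds the vector of formula (1) for each loudspeaker and adds the vectors axis by axis; the three resulting sums are the sums that appear in formulas (2) and (3).

```python
from math import cos, sin, radians


def speaker_vector(gain, azimuth_deg, elevation_deg):
    """Formula (1): scale the loudspeaker's unit direction by g_i(k, n)."""
    mu, eta = radians(azimuth_deg), radians(elevation_deg)
    return (
        gain * cos(mu) * cos(eta),  # X component
        gain * sin(mu) * cos(eta),  # Y component
        gain * sin(eta),            # Z component
    )


def summed_components(speakers):
    """Add the X, Y and Z components of all channels for one (k, n) point.

    `speakers` is a list of (gain, azimuth_deg, elevation_deg) tuples
    (hypothetical layout chosen for this sketch).
    """
    x = y = z = 0.0
    for gain, azimuth_deg, elevation_deg in speakers:
        vx, vy, vz = speaker_vector(gain, azimuth_deg, elevation_deg)
        x += vx
        y += vy
        z += vz
    return (x, y, z)


def mean_distance(distances):
    """Distance rho of the acoustic image: mean of the loudspeaker distances."""
    return sum(distances) / len(distances)
```

For two equally loud loudspeakers placed symmetrically at +30 and -30 degrees in the horizontal plane, the Y and Z sums cancel and only the X sum remains, which matches the intuition of an image centered between them.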

Step 2, determine the loudspeakers in the spatial vicinity of the acoustic image.

After the spatial position (ρ, μ, η) of the acoustic image to be reconstructed has been determined, the loudspeakers near it are found from its position. In a specific implementation, the loudspeakers (ρ_i, μ_i, η_i) can first be sorted by distance to the acoustic image, from nearest to farthest, and the nearest ones selected. The number can be chosen flexibly according to the actual situation; 4 to 8 is generally appropriate.
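The sort-and-pick selection of step 2 might look like the sketch below. The text does not say which distance measure is used, so the Euclidean distance between Cartesian positions is an assumption here, as are the function names, the loudspeaker labels, and the dictionary layout.

```python
from math import cos, sin, radians, dist


def to_cartesian(rho, azimuth_deg, elevation_deg):
    """Convert spherical coordinates (rho, mu, eta) to Cartesian X, Y, Z."""
    mu, eta = radians(azimuth_deg), radians(elevation_deg)
    return (
        rho * cos(mu) * cos(eta),
        rho * sin(mu) * cos(eta),
        rho * sin(eta),
    )


def nearest_speakers(image, speakers, count=4):
    """Return the names of the `count` loudspeakers closest to the image.

    `image` is (rho, azimuth_deg, elevation_deg); `speakers` maps a
    loudspeaker name to the same kind of tuple (hypothetical layout).
    Euclidean distance is an assumed metric, not stated in the source.
    """
    image_xyz = to_cartesian(*image)
    ranked = sorted(
        speakers,
        key=lambda name: dist(to_cartesian(*speakers[name]), image_xyz),
    )
    return ranked[:count]
```

The default of 4 follows the lower end of the 4 to 8 range suggested in the text; a caller would raise `count` for denser layouts.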

Step 3, calculate the correlation, in the horizontal and vertical directions, of the channel signals of the loudspeakers selected in step 2. This correlation represents the size of the acoustic image in the horizontal and vertical directions.

The selected loudspeakers are divided into a left part and a right part according to the acoustic image position. Let P_i be the frequency-domain value of the i-th channel of the sound source. With the median vertical plane containing the acoustic image and the listener as the projection plane, the sums of the components of the left-side and right-side signals perpendicular to this plane are calculated separately, giving P_L and P_R. That is, from the loudspeakers selected in step 2, take all loudspeakers to the left of the acoustic image position, obtain the component of each loudspeaker's frequency-domain value P_i perpendicular to the projection plane, and sum these components to obtain P_L; likewise, take all loudspeakers to the right of the acoustic image position, obtain the component of each loudspeaker's frequency-domain value P_i perpendicular to the projection plane, and sum these components to obtain P_R. The correlation IC_H of the left and right signals is calculated as in formula (4):

$$IC_H = \frac{\operatorname{cov}(P_L, P_R)}{\sqrt{\operatorname{cov}(P_L, P_L) \cdot \operatorname{cov}(P_R, P_R)}} \qquad (4)$$
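A normalized correlation of the kind used for IC_H can be sketched as below. Several points are assumptions made for illustration: the covariance is taken over a sequence of real-valued samples of P_L and P_R (for example, the bins of one band or a run of consecutive frames), the covariance product is normalized by its square root as in a standard correlation coefficient, and a side with zero variance is treated as uncorrelated. The function names are hypothetical.

```python
from math import sqrt


def covariance(a, b):
    """Population covariance of two equal-length real sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def inter_channel_correlation(a, b):
    """cov(a, b) / sqrt(cov(a, a) * cov(b, b)), in the range [-1, 1]."""
    denominator = sqrt(covariance(a, a) * covariance(b, b))
    if denominator == 0.0:
        # Assumption: a constant (e.g. silent) side carries no correlation.
        return 0.0
    return covariance(a, b) / denominator
```

The same function applies unchanged to the upper and lower sums P_U and P_D when computing IC_V.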

Likewise, the selected loudspeakers are divided into an upper part and a lower part according to the acoustic image position. With the plane containing the acoustic image and the listener as the projection plane (this plane is perpendicular to the median vertical plane above), the sums of the components of the upper-side and lower-side signals perpendicular to this plane are calculated separately, giving P_U and P_D. That is, from the loudspeakers selected in step 2, take all loudspeakers above the acoustic image position, obtain the component of each loudspeaker's frequency-domain value P_i perpendicular to the projection plane, and sum these components to obtain P_U; likewise, take all loudspeakers below the acoustic image position, obtain the component of each loudspeaker's frequency-domain value P_i perpendicular to the projection plane, and sum these components to obtain P_D. The correlation IC_V of the upper and lower signals is then calculated as in formula (5):

$$IC_V = \frac{\operatorname{cov}(P_U, P_D)}{\sqrt{\operatorname{cov}(P_U, P_U) \cdot \operatorname{cov}(P_D, P_D)}} \qquad (5)$$
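One possible reading of the left/right and upper/lower splits is sketched below for a single time-frequency point. The geometry is an interpretation rather than something the text spells out: the X axis is assumed to point to the front and positive horizontal angles to the left, each loudspeaker's value is projected onto the unit normal of the relevant projection plane, the sign of that projection decides the side, and magnitudes are accumulated per side. All names are hypothetical.

```python
from math import cos, sin, radians


def unit_vector(azimuth_deg, elevation_deg):
    """Unit direction for horizontal angle mu and elevation angle eta."""
    mu, eta = radians(azimuth_deg), radians(elevation_deg)
    return (cos(mu) * cos(eta), sin(mu) * cos(eta), sin(eta))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def split_and_project(image_az_deg, image_el_deg, speakers):
    """Return the sums P_L, P_R, P_U, P_D for one time-frequency point.

    `speakers` is a list of (azimuth_deg, elevation_deg, value) tuples,
    where `value` stands for the real-valued P_i (assumed layout).
    """
    mu, eta = radians(image_az_deg), radians(image_el_deg)
    # Normal of the vertical plane through image and listener (points left).
    left_normal = (-sin(mu), cos(mu), 0.0)
    # Normal of the plane through image and listener that is perpendicular
    # to that vertical plane (points up).
    up_normal = (-cos(mu) * sin(eta), -sin(mu) * sin(eta), cos(eta))
    sums = {"L": 0.0, "R": 0.0, "U": 0.0, "D": 0.0}
    for azimuth_deg, elevation_deg, value in speakers:
        direction = unit_vector(azimuth_deg, elevation_deg)
        horizontal = dot(direction, left_normal)
        vertical = dot(direction, up_normal)
        if horizontal > 0:
            sums["L"] += value * horizontal
        elif horizontal < 0:
            sums["R"] += value * -horizontal
        if vertical > 0:
            sums["U"] += value * vertical
        elif vertical < 0:
            sums["D"] += value * -vertical
    return sums
```

A loudspeaker lying exactly in a projection plane contributes nothing to either side of that split, since it has no component perpendicular to the plane.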

This gives the expression parameters of the acoustic image size in the horizontal and vertical directions. Because human perception of distance is not very sensitive, the distance parameter can be represented by the smaller of IC_H and IC_V, i.e. Min{IC_H, IC_V}.

With the above method, the acoustic image body of each frequency band of each frame can be obtained from the horizontal angle μ and elevation angle η of the acoustic image of that band and frame.

In a specific implementation, the extracted acoustic image body can be represented by the parameter set {IC_H, IC_V, Min{IC_H, IC_V}} and stored for use in restoring the acoustic image of the sound source.
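A small container for the stored parameter set could look like the following sketch. The class name, field names, and the idea of keying the stored values by (band index k, frame index n) are assumptions made for illustration; the source only states that the set {IC_H, IC_V, Min{IC_H, IC_V}} is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcousticImageBody:
    """Parameter set for one frequency band of one frame."""
    ic_h: float  # left/right correlation
    ic_v: float  # upper/lower correlation

    @property
    def ic_depth(self):
        """Front/back parameter: the smaller of IC_H and IC_V."""
        return min(self.ic_h, self.ic_v)

    def as_tuple(self):
        return (self.ic_h, self.ic_v, self.ic_depth)


def build_table(ic_h_by_bin, ic_v_by_bin):
    """Combine per-bin IC_H and IC_V dictionaries keyed by (k, n)."""
    return {
        key: AcousticImageBody(ic_h_by_bin[key], ic_v_by_bin[key])
        for key in ic_h_by_bin
        if key in ic_v_by_bin
    }
```

Deriving the third parameter from the first two, rather than storing it separately, keeps the set consistent by construction.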

The technical solution of the invention can also be implemented as a device using software modularization. The embodiment of the invention correspondingly provides a device for extracting the acoustic image body of a sound source in 3D space, comprising the following units:

A spatial position extraction unit, for determining the spatial position of the acoustic image of the sound source, implemented as follows,

$$p_i(k,n) = g_i(k,n) \cdot \begin{bmatrix} \cos\mu_i \cdot \cos\eta_i \\ \sin\mu_i \cdot \cos\eta_i \\ \sin\eta_i \end{bmatrix}$$

where i is the loudspeaker index, k is the frequency band index, n is the time-domain frame index, and g_i(k, n) is the intensity information of the frequency-domain point;

The horizontal angle μ and elevation angle η of the acoustic image are calculated by the following formulas,

$$\tan\mu(k,n) = \frac{\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i}$$

$$\tan\eta(k,n) = \frac{\sqrt{\left[\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i\right]^2 + \left[\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i\right]^2}}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\eta_i}$$

where N is the total number of loudspeakers, i takes the values 1, 2, ..., N, and μ(k, n) and η(k, n) are the horizontal angle μ and elevation angle η of the acoustic image;

The distance ρ from the acoustic image to the origin of the spherical coordinate system is taken as the average distance from all loudspeakers to the listener;

A loudspeaker selection unit, for determining, according to the spatial position (ρ, μ, η) of the acoustic image obtained by the spatial position extraction unit, the loudspeakers in the spatial vicinity of the acoustic image;

A correlation extraction unit, for calculating the correlation, in the horizontal and vertical directions, of the channel signals of the loudspeakers selected by the loudspeaker selection unit, implemented as follows,

$$IC_H = \frac{\operatorname{cov}(P_L, P_R)}{\sqrt{\operatorname{cov}(P_L, P_L) \cdot \operatorname{cov}(P_R, P_R)}}$$

$$IC_V = \frac{\operatorname{cov}(P_U, P_D)}{\sqrt{\operatorname{cov}(P_U, P_U) \cdot \operatorname{cov}(P_D, P_D)}}$$

An acoustic image body characteristic storage unit, for obtaining and storing the parameter set {IC_H, IC_V, Min{IC_H, IC_V}} of the acoustic image body, where Min{IC_H, IC_V} is the smaller of IC_H and IC_V. IC_H, IC_V and Min{IC_H, IC_V} identify the characteristics of the acoustic image in the left-right (length), up-down (height), and front-back (depth) dimensions respectively.

The above example only illustrates an implementation of the method of the invention. Variations and substitutions that a person familiar with the art can readily conceive within the technical scope disclosed by the invention all fall within the scope of protection defined by the claims.

Claims (2)

1. A method for extracting the acoustic image body of a sound source in 3D space, characterized in that it comprises the following steps:

Step 1, determining the spatial position of the acoustic image of the sound source, implemented as follows,

$$p_i(k,n) = g_i(k,n) \cdot \begin{bmatrix} \cos\mu_i \cdot \cos\eta_i \\ \sin\mu_i \cdot \cos\eta_i \\ \sin\eta_i \end{bmatrix}$$

where i is the loudspeaker index, k is the frequency band index, n is the time-domain frame index, and g_i(k, n) is the intensity information of the frequency-domain point;

The horizontal angle μ and elevation angle η of the acoustic image are calculated by the following formulas,

$$\tan\mu(k,n) = \frac{\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i}$$

$$\tan\eta(k,n) = \frac{\sqrt{\left[\sum_{i=1}^{N} g_i(k,n) \cdot \cos\mu_i \cdot \cos\eta_i\right]^2 + \left[\sum_{i=1}^{N} g_i(k,n) \cdot \sin\mu_i \cdot \cos\eta_i\right]^2}}{\sum_{i=1}^{N} g_i(k,n) \cdot \sin\eta_i}$$

The distance ρ from the acoustic image to the origin of the spherical coordinate system is taken as the average distance from all loudspeakers to the listener;

Step 2, determining, according to the spatial position (ρ, μ, η) of the acoustic image obtained in step 1, the loudspeakers in the spatial vicinity of the acoustic image;

Step 3, calculating the correlation, in the horizontal and vertical directions, of the channel signals of the loudspeakers selected in step 2, implemented as follows,

The selected loudspeakers are divided into a left part and a right part according to the acoustic image position. With the median vertical plane containing the acoustic image and the listener as the projection plane, the sums of the components, perpendicular to this projection plane, of the signals produced by all loudspeakers to the left of the acoustic image position and by all loudspeakers to its right are calculated separately and denoted P_L and P_R. The correlation IC_H of the left and right signals is calculated as follows,

$$IC_H = \frac{\operatorname{cov}(P_L, P_R)}{\sqrt{\operatorname{cov}(P_L, P_L) \cdot \operatorname{cov}(P_R, P_R)}}$$

The selected loudspeakers are divided into an upper part and a lower part according to the acoustic image position. With the plane containing the acoustic image and the listener as the projection plane, the sums of the components, perpendicular to this projection plane, of the signals produced by all loudspeakers above the acoustic image position and by all loudspeakers below it are calculated separately and denoted P_U and P_D. The correlation IC_V of the upper and lower signals is calculated as follows,

$$IC_V = \frac{\operatorname{cov}(P_U, P_D)}{\sqrt{\operatorname{cov}(P_U, P_U) \cdot \operatorname{cov}(P_D, P_D)}}$$

Step 4, obtaining and storing the parameter set {IC_H, IC_V, Min{IC_H, IC_V}} of the acoustic image body, where Min{IC_H, IC_V} is the smaller of IC_H and IC_V.

2. A device for extracting the acoustic image body of a sound source in 3D space, characterized in that it comprises the following units: