Wave-Field-Synthesis

 

Wave Field Synthesis

Wave field synthesis (WFS) is a spatial audio reproduction technique. Unlike conventional techniques, it does not depend on psychoacoustic effects such as the perception of phantom sound sources. Instead, the sound field is reconstructed physically. To this end, the synthesis emulates natural wave fronts according to Huygens' principle by assembling them from elementary waves. A computer-controlled synthesis drives each individual loudspeaker membrane in the arrays arranged around the listener at precisely the moment at which the wave front of a virtual point source would pass through that membrane's position in space. As shown in the animation, the original wave front is thereby restored physically.
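The timing rule described above can be illustrated with a minimal sketch. It only computes when the wave front of a virtual point source would reach each loudspeaker; the function name, the two-dimensional layout and the speed of sound of 343 m/s are assumptions made for this illustration and are not taken from a particular WFS implementation.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees Celsius


def driving_delays(source, speakers, c=SPEED_OF_SOUND):
    """Return, per loudspeaker, the time in seconds at which the wave front
    of a virtual point source at `source` reaches that loudspeaker."""
    return [math.dist(source, spk) / c for spk in speakers]


# Hypothetical layout: five loudspeakers on the x axis, 0.5 m apart,
# and a virtual source 2 m behind the middle of the line.
speakers = [(x * 0.5, 0.0) for x in range(-2, 3)]
source = (0.0, -2.0)
delays = driving_delays(source, speakers)
```

The loudspeaker closest to the virtual source fires first and the outer ones follow later, which is what bends the synthesized wave front into the curvature of the virtual source.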

WFS principle

Mathematical basis

Prof. Berkhout invented the procedure in 1988 at the University of Delft, based on the Kirchhoff-Helmholtz integral. The integral states that if the sound pressure and the sound particle velocity are known at every point on the surface of a source-free volume, the sound pressure at any point within this volume is determined. Once all surface values are known, the sound pressure can therefore be restored at every point inside the volume.
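In one common notation the Kirchhoff-Helmholtz integral can be written as shown here. The symbols are assumptions of this sketch and do not come from the text: P is the sound pressure in the frequency domain, V_n the normal component of the particle velocity, S the surface of the source-free volume with outward normal n, r_S a point on S, \rho_0 the density of air and k = \omega / c the wavenumber, with time convention e^{j\omega t}. Sign conventions differ between textbooks.

```latex
P(\mathbf{r},\omega) = \oint_S \left[
    G(\mathbf{r}|\mathbf{r}_S,\omega)\,\frac{\partial P(\mathbf{r}_S,\omega)}{\partial n}
  - P(\mathbf{r}_S,\omega)\,\frac{\partial G(\mathbf{r}|\mathbf{r}_S,\omega)}{\partial n}
\right] dS,
\qquad
G(\mathbf{r}|\mathbf{r}_S,\omega) = \frac{e^{-jk|\mathbf{r}-\mathbf{r}_S|}}{4\pi\,|\mathbf{r}-\mathbf{r}_S|},
\qquad
\frac{\partial P}{\partial n} = -j\omega\rho_0 V_n
```

The first term corresponds to a layer of monopole sources driven by the normal particle velocity, the second to a layer of dipole sources driven by the pressure.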

The required surface values can be derived from the audio signal together with the position of the original source. Implementing this approach in practice is very demanding. First, each individual secondary source must be smaller than the wavelength at the upper limit of the transmitted frequency range. Second, the entire surface of the volume, i.e. all the walls of the reproduction area, must be densely covered with such individually controlled sources. Moreover, the volume must not contain any sound sources of its own. That can be achieved only in an anechoic room, because otherwise reflections would act as mirror sound sources and violate this condition.

According to the Rayleigh II integral, the sound pressure at a point A within a half-space is already determined if only the pressure distribution on a plane is known.

An acoustic field arises on both sides of this plane. If the rearward sound is suppressed, the result is radiation into a half-space.
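For reference, the Rayleigh II integral is commonly written in the following form. The notation is an assumption of this sketch: S is the plane, r_S a point on it, \Delta r the distance from r_S to the point A, \varphi the angle between the normal of the plane and the line from r_S to A, and k the wavenumber.

```latex
P(\mathbf{r}_A,\omega) = \frac{1}{2\pi} \iint_S
    P(\mathbf{r}_S,\omega)\,
    \frac{1 + jk\,\Delta r}{\Delta r}\,
    \cos\varphi\,
    \frac{e^{-jk\,\Delta r}}{\Delta r}\; dS,
\qquad
\Delta r = |\mathbf{r}_A - \mathbf{r}_S|
```

Only the pressure on the plane appears in the integrand, which is why the particle velocity no longer has to be known.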

 

Physical principle

According to Huygens' principle, every point on a wave front can be regarded as the starting point of an elementary wave. Any arbitrary wave front can therefore be synthesized from such elementary waves. If these secondary sources are arranged in a plane, they can create virtual sound sources in front of or behind that reproduction plane. Such a loudspeaker arrangement realizes the principle of the "acoustic curtain". Engineers proposed a similar idea as early as the 1930s:

acoustic curtain

Imagine a plane wall pierced with holes all over, through which the sound passes. If each of these holes were closed off by a loudspeaker driven by its own microphone on the other side of the wall, conditions would not change fundamentally. Provided the number of elementary waves is sufficient, the synthesized wave front does not differ from a natural wave front. Its starting point represents a virtual sound source that behaves like a natural source at that position.

The idea of such a perfect reconstruction of the wave front was abandoned in the 1930s solely because transmitting so many channels was never considered feasible. That would still be a problem today. A closer look at the individual holes, however, shows that the signal shape hardly differs from one hole to the next. Only the arrival time and the amplitude differ, owing to the different distances from the virtual starting point. With today's digital signal processors these delays are easy to produce. Each virtual sound source therefore requires the transmission of only one mono signal.
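The observation that only time and amplitude differ between the loudspeaker signals can be sketched as follows. This is a deliberately simplified illustration and not a complete WFS driving function: it rounds delays to whole samples, uses a plain 1/r gain and omits the filtering and weighting a real renderer would apply. The function name, the sample rate of 48 kHz and the 0.1 m gain clamp are assumptions.

```python
import math


def render_feeds(mono, source, speakers, fs=48000, c=343.0):
    """Derive one feed per loudspeaker from a single mono signal by applying
    a distance-dependent delay (in whole samples) and a 1/r gain.
    Assumes a non-empty list of loudspeaker positions."""
    distances = [math.dist(source, spk) for spk in speakers]
    delays = [round(r / c * fs) for r in distances]
    length = max(delays) + len(mono)
    feeds = []
    for r, delay in zip(distances, delays):
        gain = 1.0 / max(r, 0.1)  # clamp to avoid huge gains very close to a speaker
        feed = [0.0] * delay + [gain * s for s in mono]
        feed += [0.0] * (length - len(feed))  # pad all feeds to the same length
        feeds.append(feed)
    return feeds


# Hypothetical example: a two-sample signal, one speaker 3.43 m from the source.
feeds = render_feeds([1.0, 0.5], (0.0, -3.43), [(0.0, 0.0)])
```

In this example the signal reaches the loudspeaker 480 samples (10 ms) late and scaled by 1/3.43, while its shape is unchanged.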

In contrast to the phantom sound sources of conventional reproduction methods, this virtual sound source does not move along with the listener when he or she moves within the playback area. As with a real source, we always localize the virtual sound source at its virtual starting point. Many virtual sound sources can of course be rendered simultaneously. Each source needs its own mono transmission channel, and its position must be known for the synthesis.

The virtual sound source can be located in front of or behind the loudspeaker arrangement, so both convex and concave wave fronts can be produced. The only restriction is that the listener cannot be placed between the loudspeakers and the virtual source. In that region the generating field would produce wrong interaural time differences and therefore a wrong perception of the source position.

For that reason, a loudspeaker arrangement in front of the listener cannot create an immersive impression on its own. The simulated virtual sources may lie behind or in front of the loudspeaker wall, but not behind the listener. This could be avoided by building loudspeaker walls on all sides of the audience, which would, however, require an almost prohibitive effort.

In search of a practicable solution, the developers compromised by giving up reproduction in elevation. Reducing the loudspeakers to a single line around the listener was an acceptable solution that could already be realized in the 1990s. Our localization in azimuth is based mainly on time differences, which such horizontal loudspeaker lines reconstruct perfectly.

WFS with horizontal loudspeaker rows

As shown above, the sound field in the recording room is not formed by the direct wave of the source alone. A complicated pattern of room reflections creates the spatial impression. For true spatial audio, all these reflections must be recreated with the correct timing and direction in addition to the direct wave. Fortunately, the signal content of the original source and of its reflections is the same.

Each reflection could therefore be restored simply by calculating its time and level for each individual loudspeaker. This model-based approach is easy to calculate from the room geometry and the reflection factors, but a single source in the recording room causes a huge number of reflections. Each reflection is in turn the starting point for many new mirror sources, so many thousands of virtual sources must be calculated in time for each individual loudspeaker.
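The model-based idea can be sketched for the simplest case: the four first-order mirror sources of a rectangular room seen from above. The function name, the room coordinates (walls at x = 0, x = lx, y = 0 and y = ly) and the single frequency-independent reflection factor of 0.8 are assumptions for this illustration.

```python
def first_order_mirror_sources(source, room, reflection=0.8):
    """Return the four first-order mirror sources of `source` in a
    rectangular room of size `room` = (lx, ly), each as (position, gain).
    The source is mirrored once at each of the four walls."""
    sx, sy = source
    lx, ly = room
    return [
        ((-sx, sy), reflection),          # wall at x = 0
        ((2 * lx - sx, sy), reflection),  # wall at x = lx
        ((sx, -sy), reflection),          # wall at y = 0
        ((sx, 2 * ly - sy), reflection),  # wall at y = ly
    ]


# Hypothetical example: a source at (1, 2) in a 5 m by 4 m room.
mirrors = first_order_mirror_sources((1.0, 2.0), (5.0, 4.0))
```

Each mirror source can then be treated like any other virtual source. Mirroring the mirror sources again yields the higher-order reflections, which is where the number of sources grows so quickly.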

The impulse-response-based solution appears more practicable for this purpose. It requires the spatial impulse response of the recording room, which consists of many impulse responses recorded at different positions in the recording room for different source positions. As shown in the small animation, the impulse response depends on both the source position and the recording position.

In principle, impulse responses for all possible positions would be needed for each primary source with respect to each individual loudspeaker. By means of interpolation and extrapolation, however, any particular impulse response can be calculated from a sufficient set of measurements. In this way the impulse response of each direct source can be calculated for each loudspeaker position. By convolving the source signal with these impulse responses, one per loudspeaker, the procedure restores the correct position of the source and of all its mirror sources in the recording room. If such an approach could be realized for loudspeaker walls all around the listener, this holophony approach would restore the original sound field in its entirety.
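The convolution step can be sketched as follows, assuming the per-loudspeaker impulse responses are already available as lists of samples. The direct-form convolution is written out for clarity; a practical renderer would more likely use a fast, FFT-based method. The function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two non-empty sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def speaker_feeds(signal, impulse_responses):
    """Convolve one mono source signal with one impulse response per
    loudspeaker and return the resulting loudspeaker feeds."""
    return [convolve(signal, h) for h in impulse_responses]


# Hypothetical example: the second response contains a direct sound and
# one reflection two samples later at half the level.
feeds = speaker_feeds([1.0, 2.0], [[1.0], [1.0, 0.0, 0.5]])
```

The second feed contains the source signal followed by its delayed, attenuated copy, which is how a mirror source ends up in the loudspeaker signal.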

The associated calculations are very complex, however, and still close to the limit of the available computing power. For moving sources in particular, the computational cost is hardly manageable today, even if the principle is limited to horizontal loudspeaker rows.

Within the European "CARROUSO" project, MPEG-4 was adopted as a suitable transmission standard. But even conventional, dry-recorded audio signals can be reproduced using the wave field synthesis principle, provided they are embedded in a suitable virtual environment.

 

Advantages of the procedure

Wave field synthesis does not rely on the uncertain, psychoacoustically based detection of phantom sources. The virtual sound sources, produced by a sufficient number of elementary waves, are indistinguishable from natural point sources. The loudspeaker itself no longer serves as the point of reference. A sound source can therefore be perceived closer to the audience than the loudspeakers themselves. That allows a "you are here" perception instead of only "they are around you". Because the sound sources have a stable position in space, the listener can walk past the virtual sources. Just as in the original sound field, head movements and Doppler effects support the detection of the source position. For fast-moving sources, the frequency shift arises during synthesis just as it does with natural sources, without any change to the source signal itself.

The more precise localization of the sound source and of its strong early reflections makes the reproduction clearly more authentic. More accurate distance cues give an incomparably better impression of depth in the acoustic scene.

The procedure offers a distinct advantage for conventional multi-channel productions. Virtual sound sources called "virtual panning spots", fed with the signals of the conventional multi-channel recording, are positioned far beyond the walls of the actual playback room. This reduces the influence of the listener position, because the relative changes in position become smaller. The result is an enlarged sweet spot that now covers almost the entire playback area. The procedure is thus not merely compatible; it significantly improves the reproduction of conventional recordings.
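The geometric reason for the larger sweet spot can be illustrated with a little arithmetic. The sketch assumes a listener who starts directly in front of a source and then steps sideways; the function name and the example distances are hypothetical.

```python
import math


def direction_change_deg(source_distance, listener_shift):
    """Angle in degrees by which the direction to a source changes when a
    listener who faced it head-on moves `listener_shift` metres sideways."""
    return math.degrees(math.atan2(listener_shift, source_distance))


# A listener moves 1 m sideways.
near = direction_change_deg(2.0, 1.0)   # loudspeaker 2 m away
far = direction_change_deg(20.0, 1.0)   # virtual panning spot 20 m away
```

For the loudspeaker at 2 m the direction changes by roughly 27 degrees, for the virtual panning spot at 20 m by only about 3 degrees, so the reproduced image stays far more stable as the listener moves.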

Neighbouring diaphragms move in phase in the lower frequency range. The air can therefore no longer escape to the side, as is inevitable with single loudspeakers. Two-dimensional loudspeaker arrays in particular therefore work into a much better matched radiation load, which significantly increases efficiency. High sound pressure can now be produced with very small diaphragm excursions.

The efficiency of signal transmission improves as well. Each sound source creates its own reflection pattern in the recording room, but the signal content stays the same. Transmitting the pure audio information in a single mono channel is therefore sufficient. The spatial sound field in the recording room arises solely from the large number of spatially distributed mirror sound sources. Depending on their positions in the room, their signals reach the listener position with different delays. In addition, the reflection factors of the surfaces influence the frequency response and level of these mirror sources. The signal itself, however, always remains the mono source content. Conventional methods try to reduce the spatial distribution of these mirror sources to a few audio channels, which always causes a significant loss of spatial information. Synthesis on the reproduction side is much more effective, because in principle it allows all sound sources to be mapped to their correct positions.

Furthermore, different language versions of a recording can share common instrumental and noise tracks, because no signal mix is transmitted. Only the text or vocal track has to be exchanged. If spatially separate sources are active one after another rather than simultaneously, a single channel can carry them in sequence, with the associated position data following that channel. In addition, placing the individual microphones close to the source avoids many of the recording problems that are caused by differing travel times in conventional productions.

 

Remaining problems

The most perceptible difference from the original sound field is still the reduction to the horizontal plane of the listener. With the help of a 2.5D synthesis operator it is possible to compensate for the errors in the impulse response that arise when a source lies outside the loudspeaker plane. Our perception nevertheless exposes this trick mercilessly. Precisely because the reproduction is otherwise nearly perfect, the reduction to the horizontal plane remains clearly audible in the test setups.

A further problem to this day is the high cost. A large number of loudspeakers have to be spaced very closely, otherwise spatial aliasing effects become audible. They arise because the unlimited number of secondary sources assumed by the mathematical approach cannot be realized. The discretization causes narrow-band notches in the frequency response within the rendition area. On the other hand, the size of the loudspeaker array determines the range that can be rendered and the lower cut-off frequency, yet the transducer spacing cannot be increased. This has justified the reduction to horizontal lines to this day, although wave field synthesis is in principle not restricted to that plane.

If the loudspeaker arrangement is not closed around the listener, a shadow wave forms as an unwanted signal at the last of the elementary waves. This "truncation effect" can be reduced by somewhat lowering the levels of the outer loudspeakers. With sound sources in front of the loudspeaker arrangement, however, the shadow wave runs ahead of the wave front and unfortunately becomes perceptible.
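Both effects can be put into rough numbers. The sketch uses the commonly quoted worst-case estimate for the spatial aliasing frequency, f = c / (2 * spacing), and a half-cosine ramp as one possible way of lowering the levels of the outer loudspeakers. Neither the formula nor the window shape comes from the text above; both are assumptions, as are the function names.

```python
import math


def aliasing_frequency(spacing, c=343.0):
    """Worst-case estimate in Hz of the frequency above which spatial
    aliasing can occur for loudspeakers `spacing` metres apart."""
    return c / (2.0 * spacing)


def taper_gains(n_speakers, n_taper):
    """Gains for a line of loudspeakers in which the outermost `n_taper`
    speakers at each end are faded with a half-cosine ramp.
    Assumes 2 * n_taper <= n_speakers."""
    gains = [1.0] * n_speakers
    for i in range(n_taper):
        g = 0.5 * (1.0 - math.cos(math.pi * (i + 1) / (n_taper + 1)))
        gains[i] = g
        gains[n_speakers - 1 - i] = g
    return gains


f_alias = aliasing_frequency(0.1715)  # loudspeakers 17.15 cm apart
gains = taper_gains(8, 3)
```

With loudspeakers about 17 cm apart the estimate lies near 1 kHz, which shows why the spacing has to be so tight. The tapered gains rise smoothly from the ends of the line towards full level in the middle.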

Wave field synthesis can improve audio reproduction so much because the virtual sources can be localized far more stably than the phantom sound sources of traditional reproduction methods. At the level of the listener, sound sources beyond the WFS loudspeaker arrangement can be reproduced very authentically, but reproduction inside the rendition area has two fundamental problems:
First, in the region between the generating loudspeakers and the virtual source, the generating field is superimposed on the generated field. In addition, the concave shape of the wave fronts causes errors in the detection of interaural time differences in this region, so the sound waves appear to travel in the wrong direction. For that reason the listener should not be positioned in this region. These artifacts cannot yet be eliminated, only reduced. Second, when virtual sound sources in the audience area are combined with a visual image, the acoustic impression matches the visual one for only a single listener position. This is explained in the following:

The sketch shows a loudspeaker line controlled by wave field synthesis and arranged around the audience (1a). The listeners in the auditorium (1b) watch the depicted sound source (1d) on the screen (1c). For one particular listener position (1e) it is possible to build up a virtual sound source (1f) whose perceived position matches the image. For all other listeners in the auditorium, the perceived position deviates from that of the pictured source. The problem can be solved by a coupled shifting of the virtual source positions and their associated levels, as described in patent application DE 10 2006 054 961 A1.

A further fundamental problem is that the reproduction room must be heavily damped. Otherwise the secondary sources created by reflections violate the condition of a source-free volume. The loudspeaker lines reconstruct the acoustics of the recording room, including early reflections and reverberation. Reflections in the reproduction room are therefore additional, disturbing signals that can be tolerated only within certain limits.

Sound sources close to the listener produce a longer initial time delay gap (ITDG) in the signal. This gap between the direct wave and the arrival of the first strong reflection is an important cue for nearby sources. Horizontal loudspeaker lines, however, radiate waves that are undirected in elevation, and these are mostly reflected back into the playback room too early by the floor and the ceiling. That is a misleading cue when a truly nearby source is to be rendered.
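Why nearby sources have a longer initial time delay gap can be seen from a simple geometric sketch. It assumes that the first strong reflection comes from the floor and that source and listener are at the same height. The function name and the example values are hypothetical.

```python
import math


def itdg_floor_reflection(distance, height, c=343.0):
    """Initial time delay gap in seconds between the direct sound and a
    single floor reflection, for a source and a listener `distance` metres
    apart and both `height` metres above the floor."""
    direct = distance
    reflected = math.hypot(distance, 2.0 * height)  # path via the mirrored source
    return (reflected - direct) / c


near_gap = itdg_floor_reflection(1.0, 1.2)   # source 1 m from the listener
far_gap = itdg_floor_reflection(10.0, 1.2)   # source 10 m from the listener
```

In this example the gap is about 4.7 ms for the source at 1 m and under 1 ms for the source at 10 m. A reflection of the reproduction room that arrives earlier than the expected gap therefore contradicts the impression of a nearby source.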

Despite these remaining problems, WFS installations have been working with great success in several cinemas and in the event sector for some years. To this day, however, wave field synthesis has not achieved a real breakthrough in home audio. Besides the high cost, the lack of acceptance for loudspeaker arrays arranged all around the listener seems to be crucial. One attempt to resolve these problems is described in simple terms on the Holophony page….

 

 

last update 2010-02-01