--- 1 June 2016 ---

The Motivation

Telephone conferences are an essential communication tool for maintaining productivity in distributed teams, with business partners, and in scientific research. They allow a group to discuss topics and also help maintain personal relationships. Telephone conferences are easy to set up and, if enabled, can be joined with an ordinary telephone, often by calling a specific telephone number and entering a code or password. Participants can thus join from almost anywhere, using the established telecommunication network infrastructure and without specialized equipment.

A telephone conferencing system (i.e., a centralized conferencing bridge) receives the mono-channel transmission from each participant, mixes the signals together into one mono channel, and transmits the mixed signal to each participant. It is thus hard for a listener to identify individual speakers and understand them clearly, which is especially problematic for telephone conferences with a large number of participants. It is also hard to signal the desire to speak without interrupting others. This slows down the conversation and distracts from the underlying goal of a joint information exchange.
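The mixing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of any particular conferencing bridge: the function name `mix_mono` is hypothetical, each participant is assumed to deliver an equally long frame of float samples in the range -1.0 to 1.0, and hard clipping is assumed as the overflow strategy.

```python
def mix_mono(streams):
    """Mix equally long mono frames into a single mono frame.

    `streams` is a list with one frame per participant; each frame is a
    list of float samples in [-1.0, 1.0] (an assumption of this sketch).
    """
    if not streams:
        return []
    length = len(streams[0])
    if any(len(frame) != length for frame in streams):
        raise ValueError("all frames must have the same length")
    # Sum the samples that belong to the same point in time.
    mixed = [sum(samples) for samples in zip(*streams)]
    # Hard-clip the sum so it stays in the valid sample range.
    return [max(-1.0, min(1.0, sample)) for sample in mixed]


alice = [0.5, -0.25]
bob = [0.25, -0.25]
print(mix_mono([alice, bob]))
```

After the summation, nothing in the output indicates which sample energy came from which participant, which is why the listener has to rely on voice characteristics alone to tell speakers apart.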

A spatial presentation of the virtual telephone conferencing space for each participant is a promising option to overcome these limitations. Here, all participants are seated around a virtual table, so a listener can use the spatial information to differentiate individual speakers. This makes it easier to identify speakers, which is otherwise difficult when speakers sound similar, and to follow a specific speaker when cross-talk occurs. A spatial presentation can be created by adding spatial cues (i.e., frequency shifts, delay, and reflections) individually for each ear of a listener. Presenting the resulting stereo signal via a pair of headphones creates the virtual telephone conferencing space. Besides reducing the effort of speaker identification, this might allow listeners to attribute degradations, such as background noise and keyboard typing noise, to individual participants.
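To make the idea of per-ear cues more concrete, here is a rough Python sketch that places one mono speaker at a given angle using only two simplified cues: a level difference between the ears (constant-power panning) and a delay for the ear facing away from the speaker. It is a strong simplification and not a full spatial renderer: frequency-dependent cues and reflections are left out, and the function name `spatialize`, the 8 kHz sample rate, and the maximum delay of 0.7 ms are assumptions chosen for illustration.

```python
import math


def spatialize(mono, azimuth_deg, sample_rate=8000, max_delay_s=0.0007):
    """Return (left, right) sample lists for a mono frame.

    azimuth_deg runs from -90 (fully left) to +90 (fully right).
    All parameter defaults are illustrative assumptions.
    """
    azimuth = math.radians(azimuth_deg)
    # Level cue: constant-power panning between the two ears.
    pan = (azimuth + math.pi / 2) / 2  # 0 (left) .. pi/2 (right)
    gain_left = math.cos(pan)
    gain_right = math.sin(pan)
    # Time cue: the ear facing away from the speaker hears the sound later.
    delay = int(round(abs(math.sin(azimuth)) * max_delay_s * sample_rate))
    pad = [0.0] * delay
    left = [gain_left * sample for sample in mono]
    right = [gain_right * sample for sample in mono]
    if azimuth_deg > 0:
        # Speaker on the right: delay the left ear, pad the right to match.
        left, right = pad + left, right + pad
    else:
        # Speaker on the left (or centered): delay the right ear.
        left, right = left + pad, pad + right
    return left, right


left, right = spatialize([1.0, 0.5, 0.25], azimuth_deg=90)
print(len(left), len(right))
```

In a conference, each participant's signal would be processed this way with its own angle around the virtual table, and the resulting left and right channels would then be summed per ear for each listener.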

In fact, spatial presentation for telephone conferencing has shown promising results and has already been implemented several times. However, the available implementations are limited because they require specialized software or even hardware to participate in a spatialized telephone conference.

For more details about the mechanics of 3D audio, take a look at “The Theory of Spatial Rendering”.