My wife and I were talking about the BP oil spill this weekend, and I had some thoughts:
- In the Seattle area, much of the business nowadays is high tech. The actual work is done on a computer. Theoretically, there is no reason this couldn’t be done on a laptop at home.
- However, most people still commute into work five days a week. At many jobs, at least half the day is spent in meetings.
- If people were able to attend at least some of these meetings remotely, this would greatly cut down on the amount of commuting that is necessary. In addition, a lot of business trips could be avoided. This would greatly reduce the amount of fossil fuels used around here.
- Yet people hate attending meetings remotely. The reason that is always cited: It is usually difficult to impossible to understand what is being said by the people in the conference room.
Let’s look at a typical conference room (and if you haven’t experienced this situation in your business life, consider yourself lucky). There is a large table, made of a hard material like wood or Formica. There is a single conference phone in the middle of the table, which acts as both speaker and microphone. The conference phone is connected to the outside world via phone lines, which have a 4 kHz bandwidth, mono. The conference phone usually shares the table with a number of laptops, each with its own fan, and is often located close to a projector. The room usually has at least one wall that is highly reflective to sound, due to a whiteboard being there. In addition, it is common to have one wall be made of glass windows to the outside world (or to elsewhere in the office). Glass reflects sound to a greater degree than your typical drywall found in offices.
The resulting sound is HORRIBLE, especially when you dial into the meeting. The room is reverberant, the table is reverberant, and the fans from the laptops and projector add a noise floor that the reverberant voices need to be heard above. People who are not right next to the conference phone have their voices swallowed by the early reflections. Compressing all of this info into a mono 4 kHz stream makes it difficult to differentiate one voice from another. The people in the room don't have this problem, because the spatial information from both ears allows them to discriminate between sound sources.
To add insult to injury, the codecs used by the phone companies for voice transmission assume that the sound they are transmitting comes from a single human voice. Instead of transmitting a downsampled version of the voice, the codec separates the voice into a set of filter coefficients (representing the resonances of the vocal tract) and a “residual” signal that is used to excite that filter at a given pitch (representing a glottal pulse from the larynx). Trying to transmit several voices speaking at the same time through this model would be like a single person trying to speak in several voices at once, and adding reverb to each of these voices. The result would be a burbling, blippy mess, which anyone who owns a cell phone would instantly recognize.
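To make the filter-plus-residual idea concrete, here is a toy linear-prediction sketch in Python. It is not any particular phone codec; the 8 kHz sample rate, the single 500 Hz resonance, the 100 Hz pulse train, and all function names are assumptions chosen purely for illustration. It builds a synthetic "voice" from one resonance excited by a pulse train, then recovers the filter coefficients and the residual from the waveform alone.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of x, treating samples outside x as zero."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] so that x[n] ~ sum(a[k] * x[n - k])."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        reflection = (r[i] - sum(a[k] * r[i - k] for k in range(1, i))) / err
        updated = a[:]
        updated[i] = reflection
        for k in range(1, i):
            updated[k] = a[k] - reflection * a[i - k]
        a = updated
        err *= 1.0 - reflection * reflection
    return a[1:]

def residual(x, coeffs):
    """What is left of x after subtracting the linear prediction."""
    out = []
    for n in range(len(x)):
        prediction = sum(c * x[n - k]
                         for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(x[n] - prediction)
    return out

# Illustrative assumptions: one vocal-tract resonance, one steady pitch.
SAMPLE_RATE = 8000      # Hz
PITCH_PERIOD = 80       # samples between pulses (100 Hz pitch)
RESONANCE_HZ = 500.0
POLE_RADIUS = 0.95

theta = 2.0 * math.pi * RESONANCE_HZ / SAMPLE_RATE
true_coeffs = [2.0 * POLE_RADIUS * math.cos(theta), -POLE_RADIUS ** 2]

# Synthesize the "voice": a pulse train driving a two-pole resonator.
voice = []
for n in range(800):
    pulse = 1.0 if n % PITCH_PERIOD == 0 else 0.0
    past1 = voice[n - 1] if n >= 1 else 0.0
    past2 = voice[n - 2] if n >= 2 else 0.0
    voice.append(pulse + true_coeffs[0] * past1 + true_coeffs[1] * past2)

# Analysis: estimate the filter, then compute the residual.
estimated = levinson_durbin(autocorrelation(voice, 2), 2)
excitation = residual(voice, estimated)

signal_energy = sum(v * v for v in voice)
residual_energy = sum(e * e for e in excitation)
print("true coefficients:     ", true_coeffs)
print("estimated coefficients:", estimated)
print("residual / signal energy:", residual_energy / signal_energy)
```

For this single clean voice, the residual carries only a small fraction of the signal's energy, which is why the model compresses well. The model has one filter and one pitch per frame, so a mix of several reverberant talkers no longer fits it neatly.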
What is the point of my rant? I believe that if corporations prioritized high-quality audio reproduction and transmission for meetings, then the number of meetings that people have to drive and/or fly to could be greatly reduced. The primary mode of real-time communication between people is speech. The current phone networks are designed around the assumption that speech is being transmitted between two people, and the bandwidth and codecs used for this transmission are inadequate for communication between groups of people, where many of those people share the same acoustic space.
In order to allow conference calls to be tolerable, the issue of reverberation within the conference room needs to be dealt with. There are a variety of solutions that would improve the situation:
- Conference rooms could be acoustically treated. Adding sound-absorbing panels can cut down reflections from a wall.
- More microphones, of higher quality, can be used. By having microphones closer to the people speaking, the ratio of direct to reverberant sound can be increased.
- People can use individual wireless headsets. This would put the microphone for each person at an ideal location (i.e. right in front of their mouth), largely eliminating the influence of room reverberation. In addition, this would eliminate the need for a speaker in the room, which gets around the feedback issues that can happen with speakerphones.
- Instead of using phone lines, the audio codec could be built into software that allows audio and video conferencing. This allows the codec to be optimized for transmission of several simultaneous voices, without being squeezed into a 4 kHz bandwidth. A higher-bandwidth stereo codec, such as MP3 or AAC, would allow easy localization and discrimination of voices within the typical conference room environment.
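As a rough back-of-the-envelope check on the first three suggestions, the Python sketch below uses two textbook approximations: Sabine's formula for reverberation time, and the critical-distance estimate for an omnidirectional source. The room dimensions, absorption coefficients, and microphone distances are made-up illustrative values, not measurements, and Sabine's formula is only a coarse estimate for a small room.

```python
import math

def sabine_rt60(volume_m3, surfaces):
    """Sabine estimate of reverberation time: RT60 = 0.161 * V / A,
    where A is the sum of (area in m^2 * absorption coefficient)."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

def critical_distance(volume_m3, rt60_s):
    """Distance (m) at which direct and reverberant levels are about equal,
    for an omnidirectional source."""
    return 0.057 * math.sqrt(volume_m3 / rt60_s)

def direct_to_reverberant_db(distance_m, volume_m3, rt60_s):
    """Approximate direct-to-reverberant ratio in dB at a given mic distance."""
    return 20.0 * math.log10(critical_distance(volume_m3, rt60_s) / distance_m)

# Hypothetical 6 m x 4 m x 3 m conference room (all values are assumptions).
VOLUME = 6.0 * 4.0 * 3.0
untreated = [(24.0, 0.10),   # floor
             (24.0, 0.10),   # ceiling
             (60.0, 0.05)]   # walls: drywall, glass, whiteboard
treated = [(24.0, 0.10),     # floor
           (24.0, 0.10),     # ceiling
           (40.0, 0.05),     # remaining bare wall
           (20.0, 0.80)]     # 20 m^2 of absorbing panels

rt_untreated = sabine_rt60(VOLUME, untreated)
rt_treated = sabine_rt60(VOLUME, treated)

# A talker 1.5 m from the table phone versus a headset mic 5 cm from the mouth.
drr_table_untreated = direct_to_reverberant_db(1.5, VOLUME, rt_untreated)
drr_table_treated = direct_to_reverberant_db(1.5, VOLUME, rt_treated)
drr_headset_untreated = direct_to_reverberant_db(0.05, VOLUME, rt_untreated)

print(f"RT60 untreated: {rt_untreated:.2f} s, treated: {rt_treated:.2f} s")
print(f"DRR at table phone, untreated: {drr_table_untreated:.1f} dB")
print(f"DRR at table phone, treated:   {drr_table_treated:.1f} dB")
print(f"DRR at headset, untreated:     {drr_headset_untreated:.1f} dB")
```

Under these assumptions, treating the room helps, but moving the microphone close to the mouth helps far more, since each halving of the distance buys roughly 6 dB of direct sound over the reverberation.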
None of the technology described above is new or radical. The big change in the corporate world would be the prioritization of audio quality, as a way of making online meetings a viable alternative to in-person meetings. If corporations were encouraged to switch over to online meetings, as a way of reducing fossil fuel consumption, the need for higher quality acoustics and codecs would become immediately apparent. Better acoustics = better world.