Immersive Audio: A Look Inside the Next Generation of Sound

Erminia Fiorino
Monday, June 03, 2019 11:43

In our everyday lives, we experience sound at all distances and from all directions. Our ears are able to detect and differentiate between tone, pitch, volume and location. In short, we hear sound three-dimensionally.

Historically, sound for broadcasting and cinema has been produced two-dimensionally. Until recently, recording and mixing in what’s known as immersive audio (also called spatial or 3D audio) was little more than a topic of discussion among techies.

Before we explore the next generation of sound further, let’s go over the basics.

What is Immersive Audio?

Immersive audio mimics a true-to-life experience by allowing the viewer to hear in all directions. It does this by adding a third dimension, height (sounds overhead), and can involve three building blocks: channels, ambisonics, and audio objects.

Channels

An audio channel can be defined as a collection of audio samples coming from or going to a single loudspeaker or group of loudspeakers. A typical home cinema setup uses a 5.1 system: five full-range channels (front left, center, front right, left surround, and right surround) plus one subwoofer channel for low-frequency effects. Immersive audio extends this kind of layout with height channels, or simulates a larger speaker array than is physically present.
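To make the idea of a channel as "a collection of samples for one loudspeaker" concrete, here is a minimal sketch that splits an interleaved 5.1 stream into per-channel sample lists. The channel order and the `deinterleave` helper are assumptions made for illustration; real files and devices may order channels differently.

```python
# Assumed channel order for a 5.1 stream (a common convention, not taken
# from the article): left, right, center, low-frequency effects, surrounds.
CHANNELS_5_1 = ["L", "R", "C", "LFE", "Ls", "Rs"]


def deinterleave(samples, channel_names):
    """Split an interleaved sample list into one list per channel."""
    n = len(channel_names)
    if len(samples) % n != 0:
        raise ValueError("sample count is not a whole number of frames")
    # Every n-th sample, starting at index i, belongs to channel i.
    return {name: samples[i::n] for i, name in enumerate(channel_names)}


# Two frames of made-up sample values, six samples per frame.
frames = [0.1, 0.2, 0.3, 0.0, 0.4, 0.5,
          0.6, 0.7, 0.8, 0.0, 0.9, 1.0]
per_channel = deinterleave(frames, CHANNELS_5_1)
```

Each entry in `per_channel` is the signal that one loudspeaker (or the subwoofer) would play.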

Ambisonics

Ambisonics represents audio as a full 360-degree sound field that can respond to the listener's movement. For example, if a listener turns their head in a given direction, the audio will change to “reflect the movement.”
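The head-movement behavior can be sketched with first-order ambisonics, where a sound is stored as four components (W, X, Y, Z) rather than as speaker feeds. The following is an illustrative sketch, not a production encoder: the function names, the angle convention (azimuth counterclockwise from straight ahead), and the unscaled W component are all assumptions.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order components (W, X, Y, Z).

    Assumed convention: azimuth is measured counterclockwise from straight
    ahead (positive = listener's left), and W carries the sample unscaled.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                  # omnidirectional part
    x = sample * math.cos(az) * math.cos(el)    # front/back
    y = sample * math.sin(az) * math.cos(el)    # left/right
    z = sample * math.sin(el)                   # up/down
    return (w, x, y, z)


def compensate_head_yaw(components, head_yaw_deg):
    """Rotate the sound field opposite to the listener's head yaw."""
    w, x, y, z = components
    yaw = math.radians(head_yaw_deg)
    x_rot = x * math.cos(yaw) + y * math.sin(yaw)
    y_rot = -x * math.sin(yaw) + y * math.cos(yaw)
    # W (omnidirectional) and Z (height) are unaffected by a yaw rotation.
    return (w, x_rot, y_rot, z)


# A source directly to the listener's left...
source_on_left = encode_first_order(1.0, azimuth_deg=90.0)
# ...ends up straight ahead once the listener turns 90 degrees to the left.
after_turn = compensate_head_yaw(source_on_left, head_yaw_deg=90.0)
```

Because the rotation is applied to the whole sound field, every source in the scene shifts consistently with a single operation per sample.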

Audio Objects

An audio object consists of one audio source accompanied by metadata that conveys the spatial location of that sound, so that a playback system can reproduce it from the intended direction.
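As a rough sketch of "audio plus position metadata", the example below models an object and renders it to a plain stereo pair using constant-power panning. The `AudioObject` class, its field names, and the speaker angle of 30 degrees are hypothetical choices for illustration; actual object-based formats define their own metadata and renderers.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    """One audio source plus metadata describing where it should be heard."""
    samples: list
    azimuth_deg: float          # assumed: positive = listener's left
    elevation_deg: float = 0.0  # carried as metadata; unused by stereo


def stereo_gains(obj, speaker_angle_deg=30.0):
    """Constant-power left/right gains for speakers at +/- speaker_angle_deg."""
    clamped = max(-speaker_angle_deg, min(speaker_angle_deg, obj.azimuth_deg))
    # Map +angle (full left) to 0 and -angle (full right) to pi/2.
    theta = (1.0 - clamped / speaker_angle_deg) * math.pi / 4.0
    return math.cos(theta), math.sin(theta)


def render_stereo(obj):
    """Turn the object into (left, right) sample pairs for this layout."""
    g_left, g_right = stereo_gains(obj)
    return [(s * g_left, s * g_right) for s in obj.samples]


# A made-up object placed straight ahead of the listener.
crowd_cheer = AudioObject(samples=[0.5, -0.25], azimuth_deg=0.0)
left_gain, right_gain = stereo_gains(crowd_cheer)
stereo_pairs = render_stereo(crowd_cheer)
```

The design point is that the object itself never mentions speakers. A different renderer could read the same metadata, including the elevation this stereo sketch ignores, and drive a larger layout.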

The results are impressive, yet the broadcast industry has been slow to adopt the next generation format—even with the increased use across virtual reality and on the silver screen. Perhaps the only exception would be with sports broadcasting, which has been making strides to improve realism with these enhancements.

Sports Broadcasting and the Next Generation of Sound

Sports broadcasting has so far been considered the only suitable broadcast use case for immersive audio. Other entertainment and television producers just aren’t seeing the value of it yet. And even for sports broadcasters willing to take the plunge, there are still some hurdles to overcome.

Capturing sound from high-speed objects can be difficult, and doing so is a strenuous manual process for the sound supervisor. However, new object-audio capture equipment and automated workflow systems are being developed to tackle these challenges, and some have already hit the market.

Two recent examples of immersive sound in sports broadcasting come from the National Hot Rod Association (NHRA) and SBS:

1. Dolby Atmos and the NHRA

In 2017, the NHRA looked to Dolby to help bring its televised presence to life, offering a more immersive experience for those at home. After a year of planning, site visits, audio capture, strategizing and testing, a Dolby Atmos system for live viewing was created using a hybrid production strategy. Since then, it’s been implemented for broadcasts.

2. SBS and the 2018 FIFA World Cup

SBS, the South Korean television and radio network, transmitted ultra-high-definition coverage of the 2018 FIFA World Cup with immersive sound to avid soccer fans. Broadcasting over the ATSC 3.0 standard, SBS used Fraunhofer’s MPEG-H audio format, making it the “world’s first regular broadcast of immersive and interactive audio powered by MPEG-H.”

Immersive audio has improved markedly over the last few years, enveloping viewers in hyper-realism, and it won’t be slowing down any time soon.

Smart TVs are beginning to incorporate immersive sound capabilities, and soundbars are becoming more readily available in the home. As more immersive content arrives and more broadcasts are mixed with spatial audio, the growing fascination will only intensify.

Do you think immersive audio should be more widely adopted? Comment below!

To learn more about immersive media, check out the June 2019 edition of the SMPTE Motion Imaging Journal.