Posted on Sat 4 Aug 2018
Virtually Useful: Step Five - Sound
Great sound design can elevate immersive experiences, whilst poorly handled audio can feel clumsy, uncomfortable and disjointed. But what is it that actually makes the difference when using audio in VR/XR?
It’s a Tuesday, I’m walking to work in a new green jumpsuit which may or may not suit me, and which is certainly too summery for the inky, gathering clouds on the horizon. I’ve just finished listening to a podcast about how you should have all of your finances in order by your 30th birthday. It is my 36th birthday and I’m feeling anxious.
In an attempt to quiet my brain, I pop on a random Spotify playlist and into my left ear drops the warm, bouncing twang of a bass guitar, whilst Bill Withers smoothes my right ear with the assurance that “just one look at you (dum dum dee dum) and the world’s all right with me”. My whole body unclenches, I’m smiling, I’m walking in time, it’s my birthday, I look ace and I know it’s going to be, a lovely daaaaay. I am reminded for the umpteenth time of the power that sound has to radically alter my mood.
Feel free to listen to a bit of Bill yourself for the next 4m15s. You will not regret it ☺
Sound has been on my mind a lot lately. When I first started this blog about virtual reality, I received a great provocation on Twitter from the brilliant Duncan Speakman who said:
It's a shame this piece doesn't mention audio based experiences as a fundamental form of AR
— duncan speakman (@_dspk) September 20, 2017
I replied that I was planning a post about sound, to which he added:
I just think it's important in a guide to say that unlike VR, the term AR is not related just visual technology.
— duncan speakman (@_dspk) September 20, 2017
To be honest, 'half-heartedly musing' might have been a better description than 'planning' at the time. However, his point was bang on. Sound in this medium deserves a lot more attention than it currently receives. Really accomplished sound design in immersive experiences can elevate and distinguish a piece, whilst poorly handled audio can make an otherwise well constructed world feel clumsy, uncomfortable and disjointed. But what is it that actually makes the difference when using audio in VR/XR? I asked some of the people whose opinions I really rate to share their pro tips:
Wait, where is that noise coming from?
Of course the first person I spoke to was Duncan Speakman, who shared that we are less exact in our spatial perception than we might like to think when it comes to sound, and identifying the source of a specific noise can be tricky. Duncan suggests that we “Try an experiment [...] lose your phone then call it… can you instantly pick the exact direction… or is it more of a feeling that it’s ‘around the back of the sofa somewhere’?”
So when it comes to sound design in 360 environments “It’s really not enough to just go ‘oh we have a 3D visual space, so we will just associate each sound with its appropriate physical position in the environment’, while that method will have suitable situations to be used, sticking to it will never let your sound truly speak to the audience.”
3D environments require a different approach
Duncan points out “We’ve had films with sound for almost 100 years now, and in that time there has been a rich development of understanding in how sound works with film.”
“Sound design for film/tv articulates the space we’re looking at on the screen, and it does this through artificial (but deeply designed) layering, highlighting, editing etc i.e. what we hear is often not what we would hear in ‘real life’ but what the creators want us to hear. Sometimes this is all about perspective, about changing our viewpoint on a scene or narrative by changing our position through the way we hear it.”
“When we think about VR/360 we are suddenly giving the viewer control of the visual frame, and in doing so we have to entirely rethink how sound positions us in the work. One example consideration might be what sounds should be headlocked (e.g. not spatialised, but staying in the listeners ears wherever they move), the answer is not always as obvious as it might seem.”
It’s about the orchestra not the trumpet
In a recent lunchtime talk at the Pervasive Media Studio, Olie Kay from All Seeing Eye spoke about the level of detail and design that went into creating the audio for Immersive Histories: Dam Busters. In this experience, you are in the seat of the radio operator, inside a WW2 Lancaster bomber during ‘Operation Chastise’, better known as the Dam Busters raids.
Immersive Histories: Dam Busters by All Seeing Eye
In order to create the most authentic audio experience possible, the team meticulously researched and layered up spatially positioned elements such as the engine noise of the aircraft, the acoustics of the cabin, the weather outside, air whipping over the fuselage, the sound of incoming anti-aircraft fire and myriad other factors, all of which appear to move dynamically as the aircraft changes course. A Subpac™ backpack is built into the aviator’s life vest that you wear throughout, and a speaker under your seat translates some of the lower frequency sounds directly into your body.
One of the most compelling aspects of the experience for me is the voice of your fellow crew members, peppered through the experience over an intercom. This aspect is ‘headlocked’, as Duncan describes it above. No matter where you move, the grainy, static distorted voice of the navigator comes with you, giving the impression that you are wearing the sort of headphones, and listening to the kind of communication, that a radio operative would have experienced in the 1940s.
In several VR experiences, such as Notes on Blindness, and Zero Days VR, practitioners use this same technique of ‘headlocking’ certain tracks, whilst spatially positioning others, to indicate an omniscient narrator or to reveal an inner monologue. The effect suggests a voice in your head, something that is not present in the observable world around you. Film studies buffs might recognise that as ‘extra-diegetic’, rather than ‘intra-diegetic’ sound.
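The distinction between headlocked and spatialised tracks can be sketched in a few lines. This is a simplified, illustrative 2D model of my own (not taken from any of the engines or pieces mentioned here): a spatialised source is counter-rotated by the listener's head rotation so it stays put in the world, while a headlocked source skips that step and rides along with the head, like the intercom voice above.

```python
import math

def to_head_space(source_xy, listener_yaw_rad, headlocked=False):
    """Return a source position relative to the listener's head.

    Spatialised sources are counter-rotated by the listener's yaw, so they
    stay fixed in the world as the head turns. Headlocked sources skip the
    rotation and move with the head. (Illustrative 2D sketch only -- real
    audio engines do this with full 3D rotations and per-ear filtering.)
    """
    if headlocked:
        return source_xy  # stays in the same ear wherever you look
    x, y = source_xy
    c, s = math.cos(-listener_yaw_rad), math.sin(-listener_yaw_rad)
    return (c * x - s * y, s * x + c * y)

# A source directly ahead at (0, 1). Turn the head 90 degrees:
spatial = to_head_space((0.0, 1.0), math.radians(90))
locked = to_head_space((0.0, 1.0), math.radians(90), headlocked=True)
# The spatialised source swings round to the side of the head;
# the headlocked one is still dead ahead, just as the navigator's
# voice stays in your ears in Dam Busters.
```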
Sound can help you to move through and explore virtual worlds
Sometimes, VR affords you the opportunity to move through an environment, either real or fictional, and the way that sound operates as you move around directly impacts your experience of a space.
Rachel Godfrey from GoVirtually recently developed a 360 degree tour of the we the curious science centre in Bristol, and has been thinking in particular about the role that sound can play in making a building feel accessible or inaccessible to different people. She says:
“Using audio for this tour was pretty crucial as the goal is to make an unfamiliar place familiar. Many people might not think the sound of a place would make a difference, but to people with autism it can prepare them for what they might hear when visiting. It meant there will be no surprises for any sudden sounds they might hear, such as the woman in labour in the pregnancy area”
Virtual tour of 'we the curious' by GoVirtually
“High frequency sounds are easier for our ears to locate compared to low ones”
In ‘Open Space’, a recent piece created by Liam Taylor-West, Emma Hughes and All Seeing Eye, audiences are encouraged to seek and discover an invisible orchestra, using only audio to navigate through a virtual space. Liam explains that he “tried thinking about the quality of the musical sound being created by the players, and asked them to include more ‘noise’ than they might normally do in their playing. That way the audience is helped out without the actual notes being all high pitched (which would have been incredibly annoying I think), or having to boost the high frequencies in post production.” This included “asking the Marimba player to use harder mallets for more ‘click’” and “picking up on all the breath sounds from the clarinetist.”
'Balance' extract from the soundtrack to Open Space by Roomsize
“For me, the beauty of immersive experiences is their ability to represent sound in a way that cannot usually be appreciated. By combining 6DOF (6 degrees of freedom) experiences and spatial sound, I have been able to 'hear' music with a new perspective. I think for those of us that are not naturally musically talented, immersive audio can allow you to understand the creation and complexity of music.”
Sound needs to have a strong connection to the narrative:
I asked Catherine Allen, founder of Limina Immersive and VR pioneer to pick out just one sonic element in a VR piece that she had created that caused her to think – ‘yep, that really works for this medium’. She shared that, in a pioneering piece that she developed for the BBC:
“the sound of the Irish flag flapping in Easter Rising: Voice of a Rebel really worked because it highlighted the importance of that flag for Willie and the romance he felt towards it.”
Easter Rising Voice of a Rebel by BBC Learning
John Durrant, Creative Director and co-founder of BDH & BDH Immersive added that “The job of a binaural sound design is to seduce the viewer and to complement the content rather than swamp it. I’ve found that the tiniest moment of sync-sound can make the world breathe with life.”
"For Wonderful You VR we 'grey-boxed' the animations with guide visuals, guide music and guide script to help us deliver effective story beats. Samantha Morton (who did the final compelling voice-over in a semi dream state) had to deliver some quite technical information about the beginnings of our sight, hearing and sense of smell, while imagining she was a mother whispering to her child inside the womb. The trick was to allow the beats per minute (bpm) in the music to drive the way we built the scenes and allow space for her voice. The result was that Wonderful You had a dreamy quality that delivered a potentially heavy and factually accurate script, in a sympathetic way.”
Wonderful You VR by BDH
Hearing is a haptic experience
Our standard model for the five senses suggests that touch and hearing are quite different things. But of course what we interpret as sound is essentially vibration, passed through the eardrum to the bones of the middle ear, where thousands of tiny hairs jiggle about; those movements are translated into electrical signals that make sense to the brain and help us to interpret the sonic world around us. This feels/sounds like a profoundly tactile experience to me, and in VR, the parallels between sound and touch can become even more apparent.
Handled poorly, sound can negatively affect your sense of balance and induce nausea in VR. Conversely, well crafted sound in VR can make you feel more bodily present and involved in the scene than you might with film and TV. It can even feel euphoric.
Catherine Allen explains that:
“rousing, suspensey music just works really well in VR in general; e.g. N'To's Chez Noir in Fantasynth”, which won a recent award at the Limina VR Weekender here at Watershed. You can hear a bit of it here.
“‘Hovering’ by Shigeto from Within is a fascinating fly-through of an extra-terrestrial world. When coupled with the tactile bass seat SubStrike™, the background bass drone makes you really feel as if you were hovering.”
So, how do you make good sound for VR?
Many of the people I asked reported that:
"If there is a noticeable latency between seeing the visuals and hearing or feeling their related sounds, it can immediately damage the XR experience."
Nick Inoue, Substrike™
“Like in TV, losing the sync sound can ruin any piece and you can lose the impact on the viewer.”
John Durrant, BDH
Things that break immersion:
- hearing something that you can't see
- seeing something that you can't hear
- stereo panning just sounds 'wrong' when you turn your head in a VR experience, especially if sound drops off dramatically in one ear or the other
- poor quality sound
One of the best ways that I have found to understand the dynamics of spatial sound is to remember that you have a big, solid head! Sounds will reach each of your ears at slightly different times, thanks to the 21.5cm gap between them deemed necessary to store things like your spectacles and your brain. We are incredibly well attuned, at a sub-conscious level, to interpreting what something is, how far away it is, and whether it is approaching or receding. This is based largely on the tiny delays and variations between what one ear detects and what the other does. As most headset based VR is experienced through headphones, either built in or external, the timing (or phase delay) of how a sound reaches each of your ears is critical.
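You can put a rough number on that big-solid-head effect. The sketch below is a deliberately simplified estimate of my own (real models account for the head's curvature and per-frequency filtering): it uses the 21.5cm ear spacing from above and the speed of sound in air to approximate how much earlier a sound arrives at the nearer ear.

```python
import math

SPEED_OF_SOUND = 343.0   # metres per second, in air at roughly 20 C
EAR_SPACING = 0.215      # metres, the gap mentioned above

def itd_seconds(azimuth_deg):
    """Approximate interaural time difference for a source at a given angle.

    0 degrees = straight ahead (both ears hear it together),
    90 degrees = directly to one side (maximum gap). A crude sine model;
    real heads diffract sound around their curvature.
    """
    return (EAR_SPACING / SPEED_OF_SOUND) * math.sin(math.radians(azimuth_deg))

# A sound directly to one side arrives roughly 0.6 milliseconds earlier
# at the near ear -- a tiny delay that the brain reads effortlessly.
```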
Duncan Speakman advises that, when it comes to doing this yourself, or (better) working with a sound designer to create new work:
“While ambisonic and spatialised sound are technologies that have been around for a while the workflow for creating VR pieces is still not as fluid or established as in the broadcast/film industry.”
“It’s worth checking what the state of format compatibilities is for your intended project. For example a 2nd order Ambisonic audio file (which is what your sound designer might export for you) is at the time of writing not possible to import into Unity, (see here https://github.com/resonance-audio/resonance-audio-unity-sdk/issues/17 ). What does this mean to you? Well a second order file gives more definition and fidelity in position than a first order file, but luckily an ambisonic mix can be outputted as both, so if you’re using Unity you could update your project once it accepts 9-channel files (which is what a 2nd order mix is)."
"This limitation doesn’t apply to 360 video though, and the Facebook Spatial workstation is a pretty good system for embedding your 2nd order ambisonic file into a video. In short, check compatibilities of systems before you base your entire aesthetic on something that may not be practical.”
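Duncan's mention of first order versus second order files, and of 9-channel mixes, follows a simple rule: a full-sphere ambisonic mix of order N carries (N + 1)² channels. A one-line sketch makes the progression clear:

```python
def ambisonic_channels(order):
    """Channels in a full-sphere ambisonic mix of a given order:
    (order + 1) squared."""
    return (order + 1) ** 2

# 1st order = 4 channels, 2nd order = 9 channels (the 9-channel files
# Duncan mentions), 3rd order = 16, and so on. Higher orders give sharper
# positional definition, at the cost of more channels to store and decode.
```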
There is loads more to say about amplitude, phase, pitch etc that has me wading way out of my depth. If you’d like a deeper dive, Dan Page from Bristol VR Lab, Creative Director of VR World Congress, recommended this excellent video from Oculus’ sound teams from a couple of years ago:
One final thought, and this one is from me. Please, give serious, practical thought to how your audience will encounter the work. For this medium to find its place in contemporary culture, we cannot keep making work assuming that we will be standing next to it at a festival or tech expo when it is experienced. Where will your audience/participant/user be? What will they have with them? What do they need to feel confident and comfortable in giving their full attention to your experience? If there are likely to be ushers/assistants, what briefing will they need to ensure that your carefully crafted, delicately interwoven multi-sensory environment is experienced as you intended?
Amongst all of the invaluable technical and conceptual insight that Duncan has shared, he also has a gloriously simple and practical tip for makers:
“If you’re using a headset to present pieces with external headphones, it’s worth labeling the left and right sides VERY CLEARLY. Headset experiences shown at festivals are commonly managed by (often underpaid and overworked) staff who may have also been tasked with pushing as many people through an experience as possible, in these situations it’s easy for headphones to be given out the wrong way around (this has happened to me as an audience member on numerous occasions!). Make their life easier… use big labels!”
My congratulations to the team behind 'Aeons', a binaural sound walk along the River Tyne by composer Martin Green, currently playing as part of the Great Exhibition of the North. L and R clearly visible on the headphones, and the volunteers kindly checked that I had them on the right ears!
We would love to hear your top tips for creating compelling sound in VR/XR experiences, and your horror stories about where it has all gone awry. You can find me on Twitter @veritymcintosh and sign up to the Pervasive Media Studio’s newsletter for more news, opportunities and events along these sorts of lines.