Thoughts on Sound Design

Within user experience design, sound design is an area ripe for innovation. I've been thinking a lot about how sound might be used more effectively in interactive products, and I would like to present three considerations and an opportunity.

Consider Multiple Reproduction Methods & Contexts

I recently designed a custom alert tone for a client's iOS app. The tone was meant to inform users of events within the app. The first challenge I encountered will be familiar to audio engineers, who have long considered the multiple reproduction methods and listening contexts that frame their work. It will also feel familiar to anyone who has been practicing responsive design: just as web designers must account for a rapidly multiplying number of browsers and devices, audio engineers must account for fixed and mobile playback devices and the many contexts in which they are used.

Good sound design must respond to a multitude of contexts. Even a sound as specific as an iOS alert tone must work through both stereo headphones and the small integrated monaural speaker (among other possible means of reproduction). This tone would be played in many different environments, from a loud, echoing hallway to a small carpeted room. And because the tone would be played by a mobile device, we had to consider the user's shifting prioritization profile: our application (and its alerts) was more important to our users in some settings than in others.

What I'll Try Next:

I would like to experiment with using the technology available to me to better detect reproduction method and context. In my iOS example, instead of trying to design a sound that reproduces equally well through headphones and the speaker (which is admittedly the more responsive approach), I'd like to try using a different sound depending on the hardware route. This would free me to design specifically for the small iPhone speaker (which was particularly challenging).
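
As a rough sketch of how that route detection might work, AVAudioSession on iOS exposes the current output route. The asset names below are hypothetical, and the selection logic is only a starting point:

    import AVFoundation

    // Choose an alert sound based on the current hardware route.
    // "alert-full" and "alert-speaker" are hypothetical asset names.
    func alertSoundName() -> String {
        let outputs = AVAudioSession.sharedInstance().currentRoute.outputs
        let onBuiltInSpeaker = outputs.contains { $0.portType == .builtInSpeaker }
        // Use a version tuned for the small integrated speaker when needed;
        // otherwise play the full-range version designed for headphones.
        return onBuiltInSpeaker ? "alert-speaker" : "alert-full"
    }

    // Re-evaluate whenever the route changes (e.g. headphones unplugged).
    let routeObserver = NotificationCenter.default.addObserver(
        forName: AVAudioSession.routeChangeNotification,
        object: nil,
        queue: .main
    ) { _ in
        _ = alertSoundName() // re-select the appropriate sound here
    }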

I'm less sure how I might begin to detect and utilize context. It might be possible to detect whether you are indoors or out based on GPS signal strength; however, iOS at least has no public API for this. CoreLocation will return horizontal and vertical accuracy, which might serve as a proxy. GPS-derived speed could support a good guess that you are in an automobile or on a train, and Airplane Mode could suggest something about where you are.
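
To make that concrete, here is a minimal sketch of the kind of guessing I have in mind, using CoreLocation's speed and accuracy readings. The thresholds are illustrative assumptions, not calibrated values:

    import CoreLocation

    enum ListeningContext {
        case stationary, walking, vehicle, unknown
    }

    // Guess a context from a single location fix. CLLocation reports speed
    // in meters per second; negative values mean the reading is invalid.
    func guessContext(from location: CLLocation) -> ListeningContext {
        guard location.speed >= 0 else { return .unknown }
        switch location.speed {
        case ..<0.5: return .stationary
        case ..<2.5: return .walking
        default:     return .vehicle // plausibly an automobile or train
        }
    }

    // Poor horizontal accuracy can hint that the user is indoors, since GPS
    // degrades under a roof; iOS offers no direct indoor/outdoor API.
    func probablyIndoors(_ location: CLLocation) -> Bool {
        return location.horizontalAccuracy > 50 // meters; illustrative threshold
    }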

Consider Temporal Priority

Audition is a powerful sensory channel that often detects change long before our vision informs us. In most cases it is an extremely attention-demanding channel, and as designers we must ensure that the information it communicates is actually a priority for the user at that moment.

Consider this story, which is unfortunately still a bit too common. A few nights ago, with my children in bed, I retreated to my echoey office just a few feet away from their recently quiet bedrooms. I sat down at my computer and, for one reason or another, clicked on a link for a site selling music lessons. The site loaded quickly, and within seconds guitar riffs screamed from my computer.

I fumbled for the mute key. It took two or three seconds to silence the site. I realized later that I should have simply closed the tab, but my heart was racing and I was distracted and annoyed. Angry, even. It takes a special design to blow out the bottom of the Likert satisfaction scale.

An explosion of sound at the right time (in this case, once I had my headphones plugged in) would have been fine. However, a highly disruptive sound played without knowledge of either my context or my prioritization profile is a significant violation, and it led to an emotional response.

What I'll Try Next:

We've used user studies to establish a baseline for temporal priority. Depending on the application, it may also be possible to ask users for additional information to establish individual prioritization profiles. These profiles can consider a number of factors, including location, time of day, other simultaneous tasks, and more.
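
A prioritization profile could start as something as simple as a scored lookup. This sketch is purely illustrative; the factors and weights are assumptions a real application would tune through user studies:

    // A toy prioritization profile that scores how welcome a sound is right now.
    struct PrioritizationProfile {
        var quietStart = 21       // 9 PM; the quiet window wraps past midnight
        var quietEnd = 7          // 7 AM
        var headphonesIn = false
        var userIsMidTask = false

        func priorityScore(hour: Int) -> Double {
            var score = 1.0
            let quietHours = hour >= quietStart || hour < quietEnd
            if quietHours && !headphonesIn { score *= 0.2 } // e.g. children asleep nearby
            if userIsMidTask { score *= 0.5 }               // focused on another task
            return score
        }
    }

    // Play the alert audibly only when it clears a threshold; otherwise
    // fall back to a visual or haptic cue.
    let profile = PrioritizationProfile(headphonesIn: false, userIsMidTask: true)
    let shouldPlaySound = profile.priorityScore(hour: 22) > 0.5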

Consider Brand & Emotion

I'll start with another story. Today I arrived at work and began my day as I often do—responding to email. Recently I've been using a Mac application called Sparrow for email. I read, I typed, and then I clicked the bright blue send button. Sparrow drew a breath and my email was sent.

Sparrow's sound design is whimsical, clean, and fully in keeping with the overall aesthetic of the application. But every time Sparrow draws its breath I find myself holding my own for a split second. I feel like the application is subtly reminding me of the tiny bit of anxiety each email represents (Did I spell everything correctly? Will I be understood? Will she say yes? Will the contract be accepted? Did that joke work?).

Karen Schrock wrote an amazing article on the emotional impact of music for Scientific American titled "Why Music Moves Us" (behind a paywall). In it she describes the work of Steven Pinker, a Harvard University psychologist, who theorized that music hijacks brain systems that evolved for other purposes, such as language, emotion, and movement.

Music, she proposes, provides a method of communication rooted in emotion rather than meaning. She goes on to reveal the growing evidence that there is indeed a universal (potentially even culturally agnostic) language of music. One experiment she describes was conducted by neuroscientist Tom Fritz of the Max Planck Institute for Human Cognitive and Brain Sciences in Leipzig, Germany. She writes,

"[Fritz] and his colleagues exposed members of the Mafa ethnic group in Cameroon who had never heard Western music to excerpts of classical piano music. The researchers found that the adults who listened to the excerpts consistently identified them as happy, sad or scary just as Western listeners would. Thus, the ability of a song to elicit a particular emotion does not necessarily depend on cultural background."

What I'll Try Next:

A deeper, richer understanding of sound design in our interactive products promises more universal experiences. I will seek out partners with a more thorough education in music theory to help me better apply what is already understood about melody, timbre, harmony, rhythm, tempo, and more in these new contexts.

Moving Beyond Alerts

Audition as a sensory channel provides information asynchronously with the user's focus [1]. Think of a movie soundtrack. In a well-crafted movie your focus will be on the story, but even before the story takes a fateful turn you will already feel fearful, excited, happy, or hopeful. The music will swell or the foley will gain intensity, and your emotion will take the lead before your brain has registered the plot twist.

Sound within interaction design has often been informed by the fact that a user can control where they look but not what they hear. This, coupled with the asynchronous way we process auditory information, leads designers to create "alerts" more than "landscapes." While alerts are very effective, this focus misses a lot of possibility [2].

What I'll Try Next:

Of all the things I've learned from this project, this is the area that excites me the most. I believe this is largely unexplored territory outside of video games, and that there are many promising areas for exploration. I've read a bit of Daniel Hug's research, but my next step will probably be to make it through the rest of his work.


  1. I gained a number of useful insights on this topic from the paper "Microsounds: An Experimental Investigation of Sound Cues for Interaction" by David Theil, Mary Czerwinski, and Barry Peterson.

  2. Daniel Hug is doing fascinating research in this area which he has documented well including in his 2009 paper "Investigating Narrative and Performative Sound Design Strategies for Interactive Commodities".