A short audio clip of a computer-generated voice has become the most divisive subject on the internet since the gold/blue dress controversy of 2015.

The audio “illusion”, which first appeared on Reddit, seems to be saying one word – but whether that word is “Yanny” or “Laurel” is the source of furious disagreement.

Professor David Alais from the University of Sydney’s school of psychology says the Yanny/Laurel sound is an example of a “perceptually ambiguous stimulus” such as the Necker cube or the face/vase illusion.

“They can be seen in two ways, and often the mind flips back and forth between the two interpretations. This happens because the brain can’t decide on a definitive interpretation,” Alais says.

“If there is little ambiguity, the brain locks on to a single perceptual interpretation. Here, the Yanny/Laurel sound is meant to be ambiguous because each sound has a similar timing and energy content – so in principle it’s confusable.

“All of this goes to highlight just how much the brain is an active interpreter of sensory input, and thus that the external world is less objective than we like to believe.”

Alais says that for him, and presumably many others, it’s “100% Yanny” without any ambiguity.

That lack of ambiguity, he says, is probably down to two reasons. The first is his age: at 52, his ears lack high-frequency sensitivity, a natural result of ageing. The second is a difference in pronunciation between the North American-accented, computer-generated “Yanny” and “Laurel” and how the words would naturally be spoken in Australian or British English.

This argument is further supported by the assistant professor of audition and cognitive neuroscience Lars Riecke at Maastricht University. Speaking to the Verge, Riecke suggests the “secret is frequency … but some of it is also the mechanics of your ears, and what you’re expecting to hear”.

“Most sounds – including L and Y, which are among the ones at issue here – are made up of several frequencies at once … frequencies of the Y might have been made artificially higher, and the frequencies that make the L sound might have been dropped.”

Prof Hugh McDermott from Melbourne’s Bionics Institute suggests that while the frequency range of the device you are listening on does have an impact, there are “a lot of different factors playing into it”.

“When the brain is uncertain of something, it uses surrounding cues to help you make the right decision,” he said.

“If you heard a conversation happening around you regarding ‘Laurel’ you wouldn’t have heard ‘Yanny’.

“Personal history can also give an unconscious preference for one or another. You could know many people named ‘Laurel’ and none called ‘Yanny’.”

McDermott also thinks visual cues may have played a part. “You would have noticed it had both the names appearing on the screen with no other context or information. This forces the brain to make a choice between those two alternatives.

“It is a compelling illusion and you can hear both those sounds either way.”

In National Geographic, Brad Story, from the University of Arizona’s speech acoustics and physiology lab, claimed the original recording was “Laurel”, but because the audio clip isn’t clear, it leaves room for confusion and varying interpretations.

Story has experimented by recording his own voice pronouncing both words and found similar sound patterns for “Yanny” and “Laurel”.

Online commentators have added their own theories as to why people are hearing different words in the clip – and pointed out that what listeners hear varies with the frequency balance, the volume and the type of speakers used to play it back.
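For readers who want to see how playback equipment could tilt the balance, here is a minimal Python sketch. It is an illustration under stated assumptions, not an analysis of the actual clip: the 300 Hz and 3,000 Hz tones are hypothetical stand-ins for the lower and higher parts of a sound, and a simple one-pole low-pass filter with an assumed 1,000 Hz cutoff stands in for a speaker that reproduces high frequencies poorly.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)


def sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Generate a pure tone at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def low_pass(samples, cutoff_hz, sample_rate=SAMPLE_RATE):
    """One-pole low-pass filter: a crude model of a speaker that
    passes low frequencies and weakens high ones."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def rms(samples):
    """Root-mean-square level, a rough measure of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Hypothetical stand-ins, not measurements of the real clip.
low_tone = sine(300, SAMPLE_RATE)    # one second of a low tone
high_tone = sine(3000, SAMPLE_RATE)  # one second of a high tone

# Fraction of each tone's level that survives the simulated speaker.
low_kept = rms(low_pass(low_tone, 1000)) / rms(low_tone)
high_kept = rms(low_pass(high_tone, 1000)) / rms(high_tone)

print(f"low tone kept:  {low_kept:.2f}")
print(f"high tone kept: {high_kept:.2f}")
```

In this toy model the low tone passes through almost untouched while the high tone is strongly weakened, which is one way a listener on different equipment could receive a different mix of the same recording. It says nothing about the other factors the researchers mention, such as expectation and context.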

This article originally appeared on The Guardian. Read the original here.
