By Matthew Hutson
Somehow, even in a room full of loud conversations, our brains can focus on a single voice, a phenomenon known as the cocktail party effect. But the noisier the room gets – or the older you get – the harder that becomes. Now, researchers may have figured out how to replicate the trick – with a machine learning method called the "cone of silence."
Computer scientists trained a neural network – a machine learning architecture loosely modeled on the brain's wiring – to detect and separate the voices of several people speaking in a room. The network did so in part by measuring how long sounds took to reach several microphones placed in the center of the room.
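The underlying physical cue is simple: a sound from off to one side reaches one microphone slightly before another, and that delay reveals the direction it came from. The sketch below is not the researchers' neural network, just a minimal illustration of that principle, using cross-correlation between two hypothetical microphone signals and an assumed microphone spacing and sample rate.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # meters per second, roughly, at room temperature
MIC_SPACING = 0.1       # assumed distance between two microphones, meters
SAMPLE_RATE = 16000     # assumed samples per second

def estimate_angle(sig_left, sig_right):
    """Estimate a source's direction from the arrival-time delay
    between two microphone signals, via plain cross-correlation."""
    corr = np.correlate(sig_left, sig_right, mode="full")
    lag = np.argmax(corr) - (len(sig_right) - 1)  # delay in samples
    delay = lag / SAMPLE_RATE                     # delay in seconds
    # Invert the geometry: delay = spacing * sin(angle) / speed of sound.
    # Clamp to the physically possible range before taking arcsin.
    sin_theta = np.clip(delay * SPEED_OF_SOUND / MIC_SPACING, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

# Toy check: a source straight ahead hits both mics at the same time,
# so the estimated angle is zero.
t = np.arange(1600) / SAMPLE_RATE
tone = np.sin(2 * np.pi * 440 * t)
print(round(estimate_angle(tone, tone), 1))  # → 0.0
```

A real system faces echoes, noise, and overlapping talkers, which is why the researchers use a learned network rather than this bare geometric calculation.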
When the researchers tested their setup amid loud background noise, they found that the cone of silence located two voices to within 3.7° of their true positions, they reported this month at the online-only Conference on Neural Information Processing Systems. That compares with an accuracy of just 11.5° for the previous state-of-the-art technology. When the researchers trained their system on additional voices, it pulled off the same trick with eight voices – to within 6.3° – even though it had never heard more than four at a time.
Such a system could one day be used in hearing aids, surveillance systems, speakerphones, or laptops. The new technology, which can also track moving speakers, could even make your Zoom calls less chaotic by separating out and muting background noise, from vacuum cleaners to children.