Microsoft has been constantly updating its Skype voice and video calling platform with new features that make it easier to connect with others. The platform has also been very important for organizations and individuals trying to stay connected during last year’s lockdown. But the company is not done updating the software yet. This time, it has brought noise suppression to the desktop version.
In a blog post, the Skype team says that this tech was originally made for Microsoft Teams. “We are delighted to announce the launch of our latest background noise suppression feature in the Skype desktop app. Originally developed for Microsoft Teams, this new feature is designed to silence everything except your voice when you are meeting on Skype,” it adds.
The feature is found under the ‘Settings’ option and can be set to one of several levels: Auto, Low and High.

Microsoft Skype noise suppression feature. (Skype)
It is worth noting that the noise suppression feature is not yet available on the web or mobile versions of Skype. For now, you can find it only in the desktop app for Windows and Mac.
Talking more about the tech behind this feature, the blog post adds that the new noise suppression technology works by analyzing your audio feed and using specially trained neural networks to remove the noise without affecting the speaker’s voice. This matters because traditional noise suppression algorithms can address simple, consistent sounds such as a fan, but they make little difference when it comes to more complex sounds like typing on a keyboard, the crumpling of a food wrapper or your pet dog barking.
“This technology relies on machine learning (ML) to learn the difference between clean speech and noise and is often referred to as artificial intelligence (AI). A representative dataset is used to train an ML model to work in most of the situations our Skype users experience. There needs to be enough diversity in the dataset in terms of clean speech, types of noise, and the environments from which our users join online calls,” the post says.
To achieve this, the team used approximately 760 hours of clean speech data and 180 hours of noise data in its training datasets. For the noise data, it included 150 noise types to cover diverse situations, including keyboard typing, running water, snoring and more.
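The blog post does not spell out how the clean speech and noise recordings are combined during training. A common approach in speech enhancement work is to mix a clean clip with a noise clip at a chosen signal-to-noise ratio (SNR), then train the model to recover the clean clip from the mix. The Python sketch below illustrates that idea only; it is an assumption, not Microsoft's actual pipeline, and the helper names (`rms`, `mix_at_snr`) and the toy signals are hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mix has the requested SNR in dB.

    Hypothetical helper for illustration; both inputs are equal-length
    lists of float samples.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        return list(clean)  # silent noise clip: nothing to add
    # SNR(dB) = 20 * log10(rms(clean) / rms(scaled_noise))
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(clean, noise)]


# Toy stand-ins for real recordings (assumed, not from the blog post):
# a 440 Hz tone as "speech" and random values as "noise", one second at 16 kHz.
rate = 16000
rng = random.Random(0)
clean = [0.5 * math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(rate)]

noisy = mix_at_snr(clean, noise, snr_db=10)  # what the model would hear
target = clean                               # what the model should output
```

Pairing many speech clips with many noise types and SNR levels in this way is one plausible route to the kind of dataset diversity the post describes.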