See and hear: AI headphones block out background noise and focus on one speaker at a glance



A University of Washington team has developed an artificial intelligence system called Target Speech Hearing that lets a user wearing noise-canceling headphones listen to a specific speaker simply by looking at them for a few seconds. The system then cancels out all other sound in the environment and plays only the enrolled speaker’s voice, even as the listener moves around in noisy places. It builds on the same team’s earlier semantic hearing system, which allowed users to selectively filter sounds from their environment.

The new system uses off-the-shelf headphones fitted with binaural microphones, one on each side of the head, that capture the speaker’s voice. To enroll a speaker, the user taps a button while facing that person. Machine learning software running on an embedded computer learns the speaker’s vocal patterns and then plays back only that voice in real time, even as the listener and speaker move around. The system’s ability to focus on the enrolled voice improves as the speaker keeps talking, because each additional stretch of speech gives the AI more training data.
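The summary above does not say how the software decides which voice the wearer is facing. One cue that a binaural microphone pair makes available is arrival time: sound from a source straight ahead reaches both ears at nearly the same moment, while sound from the side reaches one ear first. The Python sketch below illustrates that cue only, using a brute-force cross-correlation to estimate the delay between channels. It is not the team's published method, and the names `interaural_lag`, `is_facing`, `max_lag`, and `tolerance` are hypothetical.

```python
import numpy as np


def interaural_lag(left, right, max_lag):
    """Estimate how many samples the right channel lags the left channel.

    Tries every lag in [-max_lag, max_lag] and returns the one with the
    largest cross-correlation. A positive result means the sound reached
    the left microphone first.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = min(len(left), len(right))
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = left[: n - lag], right[lag:n]
        else:
            a, b = left[-lag:n], right[: n + lag]
        score = float(np.dot(a, b))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def is_facing(left, right, max_lag=20, tolerance=1):
    """Hypothetical enrollment check: treat the source as straight ahead
    when the two channels are aligned to within `tolerance` samples."""
    return abs(interaural_lag(left, right, max_lag)) <= tolerance


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    voice = rng.standard_normal(2000)  # stand-in for a speech segment
    # Source straight ahead: both microphones receive the same signal.
    print("front:", is_facing(voice, voice))
    # Source off to the left: the right channel arrives 5 samples late.
    delayed = np.concatenate([np.zeros(5), voice[:-5]])
    print("side: ", is_facing(voice, delayed))
```

A real system would have to cope with reverberation, overlapping talkers, and head movement, so a check like this could at most serve as one input to the learned model described above.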

While noise-canceling headphones like Apple’s AirPods Pro can automatically adjust sound levels during a conversation, the UW team’s prototype takes it a step further by allowing users to control whom they want to listen to and when. This technology could be especially helpful in crowded places such as restaurants or cafeterias where background noise makes it difficult to hear the person you’re speaking with. With a simple button press and a glance at the speaker, the system enhances the clarity of the conversation.

Currently, the system can enroll only one speaker at a time, and it cannot enroll a speaker if another loud voice is coming from the same direction. If the result is not clear enough, users can run another enrollment on the same speaker to improve clarity. The team presented its findings at the ACM CHI Conference on Human Factors in Computing Systems, and the code for the proof-of-concept device is available for others to build on. The system is not yet commercially available, but it points to a practical way of isolating a single voice in noisy environments.

In conclusion, Target Speech Hearing moves noise-canceling headphones beyond suppressing all sound: the wearer chooses one speaker and hears only that voice in real time, even while moving through a crowded space. The system is still a prototype, but because its code is available for others to build on, further development and eventual commercialization are possible.

