Kickstarter for Buddie: open source, AI-enabled earbuds
How many times have you forgotten a name or fact that someone recently told you, or missed jotting down an important note in a meeting? Imagine getting an immediate, accurate response from a virtual assistant when you ask aloud, “What is my new acquaintance’s name?” or, “What are my action items for the project?”
For a virtual assistant to answer a question like this accurately and immediately, it needs to be aware of the context behind the question. Context awareness requires the assistant to have been listening to the conversations that preceded your question or request for help.
This is the premise of Buddie. Buddie consists of earbuds and a smartphone app that provide a context-aware voice interface for artificial intelligence (AI) agents. It was developed by Electrical and Computer Engineering Professor Robert Dick with an international group of collaborators including Li Shang and Fan Yang at Fudan University in Shanghai, China. These researchers launched a Kickstarter campaign on December 23 to release the technology for everyday users to try and software developers to experiment with.
Steve Jobs once revolutionized mobile phones by defining the touch screen as their primary interface. Dick, Shang, and Yang believe that in the AI era, context-aware voice is the next transformative interface, and earbuds are the ideal form factor to enable effortless access to AI services that are hands-free and available anywhere, anytime.
To realize this vision, Buddie earbuds are always “listening” to gather context about the user’s conversations and interactions. To safeguard privacy, Buddie was designed to put users in control of where their private data are sent. The earbuds record speech, use an energy-efficient method to transmit the audio to the user’s smartphone, convert the spoken words into a written transcript, and immediately delete the audio recording. The transcripts are saved to the user’s phone, where they can view the files, delete them, and ask a question for a third-party large language model (LLM) like ChatGPT to answer using the recorded context. Any answers from the LLM are read back aloud.
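The pipeline described above can be sketched in a few lines of Python. This is an illustrative outline only, assuming hypothetical function and class names; Buddie's actual software is not shown here, and the transcription and LLM calls are stand-in placeholders for real speech-to-text and language-model services.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TranscriptStore:
    """Transcripts kept on the user's phone; the user can view or delete them."""
    entries: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.entries.append(text)

    def delete(self, index: int) -> None:
        del self.entries[index]

def transcribe(audio: bytes) -> str:
    # Placeholder for a real speech-to-text model running on the phone.
    return audio.decode("utf-8")

def handle_audio_chunk(audio: bytes, store: TranscriptStore) -> None:
    # Convert speech to text, then keep only the transcript: the raw audio
    # is never stored, matching the "immediately delete the recording" step.
    store.add(transcribe(audio))

def answer_with_context(question: str, store: TranscriptStore) -> str:
    # Placeholder for a call to a third-party LLM; only the text
    # transcripts the user has retained are sent as context.
    context = "\n".join(store.entries)
    return f"Answer to {question!r} using context: {context}"
```

The key design point is that the audio itself never persists past transcription; everything downstream, including the LLM query, operates on text the user can inspect and delete.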
“The spoken word is the primary communication interface for conversations among humans,” explained Dick, “and context awareness makes verbal communication more efficient and precise. Imagine walking into a room where others are in the middle of a conversation. You wouldn’t know what was going on unless they stopped and restated the context for you. That’s a situation AI assistants frequently face. If they had context, they could be more helpful and stop burdening you with repeated explanations. Without context, you can ask an AI assistant about an encyclopedia. With context, you can ask it about your life.”
Despite its utility for AI applications, constant listening introduces technical challenges: its increased power consumption can quickly drain the batteries of both the earbuds and the smartphone. Buddie uses carefully designed, energy-efficient, compression-based approaches to address the challenges of continuous communication.
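The article does not describe Buddie's codec, but the general idea behind compression-based energy savings can be illustrated with a generic delta-encoding sketch: adjacent audio samples are strongly correlated, so transmitting small differences instead of full sample values lets them be packed into fewer bits, reducing radio time and therefore energy. The function names below are illustrative assumptions, not part of the project.

```python
from typing import List

def delta_encode(samples: List[int]) -> List[int]:
    """Replace each sample with its difference from the previous one.

    For smooth audio the differences are small, so a subsequent
    variable-length bit packer can store them compactly.
    """
    deltas = []
    prev = 0
    for s in samples:
        deltas.append(s - prev)
        prev = s
    return deltas

def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode by running a cumulative sum."""
    samples = []
    total = 0
    for d in deltas:
        total += d
        samples.append(total)
    return samples
```

Real earbud codecs combine transforms like this with quantization and entropy coding; the point of the sketch is only that shrinking the payload shrinks the transmission cost, which is where continuous listening spends much of its energy budget.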
The open source nature of the Buddie project is inspired by Arduino, a successful open source electronics platform that allows users of any skill level to create and share their own interactive projects. Dick hopes that Buddie users and researchers who purchase the device from the Kickstarter will craft and share their own uses, software modifications, and ideas for improvement. For this purpose, Buddie will be offered at cost—$40. The team would eventually like to have millions of people use it and share their experiences.
“The ideas behind Buddie were partially inspired by Vannevar Bush’s 1945 article, ‘As We May Think’ in The Atlantic, about a lifelogging system that essentially enables unbounded memory of one’s experiences and documents,” said Dick.
Related and future work on concepts of “lifelogging” and context-aware AI includes the development of MemX: attention-aware smart glasses, named after Bush’s imagined “Memex” (memory extender) system. Dick and his collaborators also envision smart glasses providing some of the advantages of one-on-one education by tracking the subject of the student’s attention, relating this to the meaning of what they are viewing, and inferring emotional state (e.g., confusion, frustration, focus) through facial expression.
For now, the research team has chosen to focus on audio through Buddie, due to its potential to enable widely available, context-aware verbal communication with AI assistants. They are also working on ways to further enhance privacy. Future versions will make it easy for users to select AI assistants with the strongest privacy policies, offer approaches to keep data under user control with on-board intelligence, and use methods of protecting user privacy during machine learning and inference processes.