What if AI assistants like Siri, Cortana, or Alexa had a human face? What would they look like? In an episode of The Big Bang Theory, Rajesh daydreams about what it would be like to date Siri in person.

Rajesh’s example, while exaggerated, reveals that humans are wired to look for a physical, visual character “behind” a voice or text during conversation. This is because, among all our senses (sight, touch, hearing, smell, and taste), humans rely most heavily on sight. One neurological finding states that roughly 30% of the neurons in our brain are devoted to sight, compared with only 8% to touch and 2% to hearing (Grady, 1993). Moreover, more than 60% of the brain is involved with vision in some way, including neurons devoted to vision + touch, vision + motor, vision + attention, and vision + spatial navigation (Goldstein, 2013).

Consciously or not, we use the visual cues available to us to interact with others more smoothly and enjoyably. This holds true for human-to-human as well as human-to-robot interactions. In an IEEE 2013 study, a fluffy dog tail was attached to a Roomba vacuum robot to signal its working status through different motion patterns. For instance, if the robot was cleaning a room smoothly, the tail would wag to indicate happiness. People who participated in the study found it easier to understand what the robot was “feeling” and were amused by the robot.

Similarly, an AI assistant with a physical body would be able to convey much more information than the current voice-only agents. For instance, the AI assistant’s facial expressions could show emotions, gestures like a simple nod could provide feedback, and movements like pointing in a direction could suggest what to do next. We also become more engaged in a conversation when we can see the other party: people spend more than 30% of the time trying to make eye contact during face-to-face interactions (Schwind & Jäger, 2016).

Adding visual cues can aid communication with robots: a Roomba with a fluffy tail (Image Link)

ObEN’s artificial intelligence technology quickly and easily creates lively personal avatars that look and talk like you from one selfie and a brief voice recording. Imagine listening to songs recommended by Adele’s avatar, or catching up on today’s NBA news reported by LeBron James’ avatar. Advances in AI technology are making this possible. As depicted in an episode of Black Mirror, you could even have yourself as the AI assistant who knows and understands you best.

In the not-so-distant future, our personalized avatars will be able to interact with each other in mobile, virtual reality, or augmented reality worlds, and say the famous line from the blockbuster movie Avatar (2009) to loved ones: “I see you”.

About the Author: Jackie is a researcher on ObEN’s computer vision team

Having a digital self who knows you best as your AI assistant. From Black Mirror (Image Link)

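ObEN’s proprietary artificial intelligence technology quickly combines a person’s 2D image and voice to create a 3D virtual avatar. Bring your personal 3D avatar into any virtual reality or augmented reality environment and enjoy deeper, more memorable social experiences. Founded in 2014, ObEN is one of the companies in HTC’s VIVE X accelerator program and is currently based at Idealab, a leading technology incubator in Pasadena, California.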