
Symbiotic Artificial Intelligence

An evolving catalog of what I’ve learned in the fields of artificial intelligence and human-computer interaction.

22 Jun 2019



We are witnessing the convergence of multiple scientific trends that will let us humans take control of our bodies like never before: CRISPR allows us to modify our own DNA, the internet extends human memory to the entire sum of human knowledge, AI will allow humans to process inordinate amounts of information (thus increasing the scope of our intuition), and AR will increase the information bandwidth of our senses.

It’s only a matter of time before humans gain full control over evolution: we will soon be able to control our genetics and augment the brain that allowed us to become the dominant creatures on Earth.

Symbiotic Artificial Intelligence is a term that encompasses a broad set of fields in computing. It expresses the principle that humans and machines can learn from each other in ways that improve how humans live and work in the real world. Symbiotic AI is a fundamentally optimistic view of the future of human-computer interaction. It subsumes synonymous terms like “human-machine symbiosis” and (less commonly) “human-machine co-evolution”. For our purposes, the word “machine” can be used interchangeably with “computer.”

Symbiotic AI can take many forms:

  • a 12-year-old spending the entire day with her smartphone, able to research any fact from all of human knowledge after just a few taps, or
  • a student using a Remembrance Agent [Rhodes 1997] to automatically pull up meeting notes for every person he’s ever met, or
  • a “Borg”-like creature (see Star Trek: Voyager) merging human and literal metal into one being.

In this document, I will investigate non-invasive forms of human augmentation, as opposed to integrations that require implants in, insertions into, or piercings of the body. Such integrated devices can take more accurate measurements of our physiology and can communicate more efficiently with organs like the brain and the tissue beneath our skin. However, implanted devices are unlikely to be adopted in the near future due to fears of personal injury and the idea of ‘having a computer inside your body.’

The focus of this document will be on non-invasive wearable technologies like head-worn displays and wrist-worn displays (i.e. smartwatches). This category of devices offers many benefits (explored in depth later), including the option of “turning off” the system and removing wearable augmentations with ease. Wearable computers are also much more socially acceptable than implanted devices, for the time being.

Human-Machine Symbiosis

Closing the loop between intention and action

The first, and perhaps most important, principle of Symbiotic AI is reducing the time (and effort) between intention and action. Once the time between intent and action drops below two seconds, we are able to start forming habits.

Reducing this delta is the premise behind a multi-billion-dollar online shopping industry: Amazon’s patented 1-Click [Hartman et al. 1999] system eliminates the time needed to check out through a shopping cart. Amazon’s Dash Buttons further reduce this delta by removing the screen from the purchasing interaction: a single tap of a tangible button is all that’s needed to order common consumables. Once a habit is formed, it’s ingrained, it’s useful, and it’s profitable (for those of that persuasion).

The distance between intent and action is a key concept for the melding of humans and machines, humans and AI. Reduce the overhead of a communication bridge, or the time to act on a thought, and habits form: we get closer to achieving our goals, and our machines get closer to us.
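We can make this principle concrete with a toy model: count the interaction steps between forming an intent and completing the action, and sum their latencies. The flows and per-step timings below are hypothetical illustrations, not measurements of Amazon’s actual systems.

```python
# Toy model of the intent-to-action delta for two checkout flows.
# All step names and per-step timings are hypothetical illustrations.

CART_FLOW = [
    ("open cart", 2.0),
    ("review items", 5.0),
    ("enter shipping address", 10.0),
    ("enter payment details", 10.0),
    ("confirm order", 2.0),
]

ONE_CLICK_FLOW = [
    ("tap the buy button", 1.0),  # stored shipping/payment reused automatically
]

HABIT_THRESHOLD_SECONDS = 2.0  # the ~two-second bound discussed above


def intent_to_action_delta(flow):
    """Total seconds between forming an intent and completing the action."""
    return sum(seconds for _, seconds in flow)


def supports_habit(flow):
    """A flow can start becoming habitual once its delta is within the threshold."""
    return intent_to_action_delta(flow) <= HABIT_THRESHOLD_SECONDS


print(intent_to_action_delta(CART_FLOW))  # 29.0
print(supports_habit(CART_FLOW))          # False
print(supports_habit(ONE_CLICK_FLOW))     # True
```

Under this framing, every step you remove from a flow is a step toward habit formation.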

Bridging the biological and the digital

[Maes 1997] observes that we’ve experienced a revolution in computing in the past 30 years: our bodies have been augmented in a significant and permanent way. The two “halves” of our brains are no longer the “left” and the “right” but instead the biological half we are given and the digital half we use to voluntarily augment our memory and communication. Maes provides a compelling way to think about the relationship between our biological selves and our digital tools.

The interface between our biological and digital worlds is currently mediated by smartphones and laptops. These interfaces are marked by three fundamental problems:

  • I/O bandwidth is limited,
  • they require our complete attention, disengaging us from the physical world, and
  • they often serve to distract us, as they aren’t necessarily designed with people’s goals in mind.

Many smartphone apps are designed to distract you as much as possible, making you inefficient and ineffective in accomplishing your goals. This leads to increased “multitasking,” which contributes to shorter attention spans and worse memory recall [Stanford University 2018].

[Maes 1997] argued that we will become one with the computer, but that we must redesign how we interact with our digital devices. In this direction, Maes argues for a new approach built upon three pillars:

  • A system aware of context and a user’s internal state. It pays attention to what the user is paying attention to, to brain activity, heart rate, breathing rate, etc.
  • An always-on augmented interface integrated in a seamless and non-disruptive way.
  • A system that has its roots in proactive and personalized interaction that offers relevant information and intervention, given what the user is currently trying to do, the user’s state, and the user’s real/human goals.
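One way to picture these three pillars in software is a sketch like the one below. Every class name, field, threshold, and intervention rule here is a hypothetical illustration, not an implementation of Maes’s actual systems.

```python
# A minimal sketch of the three pillars as a software loop.
# All names, thresholds, and rules are hypothetical illustrations.
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserState:
    # Pillar 1: context and internal state the system stays aware of.
    attention_target: str  # what the user is currently paying attention to
    heart_rate: int        # beats per minute
    breathing_rate: int    # breaths per minute
    current_goal: str      # the user's real/human goal


def intervene(state: UserState) -> Optional[str]:
    # Pillar 3: proactive, personalized intervention, gated on the user's
    # state and goals.
    if state.heart_rate > 100 and state.breathing_rate > 20:
        return "You seem stressed; consider a short break."
    if state.attention_target != state.current_goal:
        return f"Refocusing on '{state.current_goal}' may help right now."
    # Pillar 2: by default, stay seamless and non-disruptive.
    return None


state = UserState(attention_target="social media feed",
                  heart_rate=72, breathing_rate=14,
                  current_goal="studying")
print(intervene(state))
```

The design choice worth noting is the default `None`: a symbiotic system earns the right to interrupt only when state and goals diverge, which is exactly the “seamless and non-disruptive” pillar.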

Such a symbiotic relationship between users and machines will aid self-actualization itself: it will help us grow and develop into the people we want to become. We all want to change and grow in different ways, and we will rely more and more on symbiotic forms of personal devices to help us change with time.

Maes outlines four broad categories in which symbiotic systems can support humans:

  • Supporting Decision Making
  • Supporting Learning
  • Supporting Memory
  • Supporting Mood and Well-being

We will discuss each of these categories in the following chapters.

Augmented Reality

Augmented reality refers to any augmentation of our natural senses [1]. Augmented reality encompasses traditional experiences like those provided by smartphones today [2]. It also includes experiences delivered in fully immersive virtual reality. We can represent AR on a spectrum from the real world to the virtual:

Figure: [Milgram et al. 1995]

On this particular spectrum exist two primary stages of manipulation:

  • Augmented reality
  • Virtual reality

A second, more useful spectrum exists on the scale of VR to AR. “True AR” encompasses a system “[fused coherently] with the user’s real environment.” In traditional formulations, AR forms a proper superset of VR, where in VR “we just need to block the user’s experience of the real world.” [Sandor et al. 2015] Thus, as VR substitutes the entire visual experience, it demands a higher level of realism than AR. [Sandor et al. 2015] segment their approaches to AR into four categories ranging from “manipulating atoms” to “manipulating perception”:

  • Controlled Matter
  • Surround AR
  • Personalized AR
  • Implanted AR

The Reality-Virtuality Spectrum

Quite a bit of marketing language from head-worn display manufacturers obscures a simple understanding of this subject.

Real environment

Simple enough to grasp, this is the real world in which we all reside and perceive.

Augmented reality (AR)

This form of augmentation encompasses a broad scope of marketing terms and can be divided into two broad categories:

  • those involving registered graphics and
  • those involving static graphics.

It’s important to note that both static and registered graphics can still make use of contextual information [3] in similar ways; only the mechanism of conveying such information differs between these forms of AR.

Registered graphics

Registered graphics encompass technologies developed by Microsoft (“Mixed Reality”) as well as augmented reality solutions developed in the research literature (e.g. Starner, Feiner, et al.), known as registered AR or spatial AR.

This form of AR involves placing virtual computer artifacts into the real world by either:

  • displaying such graphics through a traditional display (e.g. an iPhone running ARKit), or
  • displaying such graphics through a holographic display (e.g. a HoloLens).

A holographic display placing registered graphics into the real world. Figure: [Buntz]

Static graphics

Depending on the use case, static graphics may be more applicable. This form involves displaying static information generated by a computer that does not include an image of the real environment.

Virtual Reality

Virtual reality (VR) involves complete immersion in a virtual environment with little or none of the real environment available to the user.


Augmented Reality Spectrum

This spectrum exists on a different axis from the Reality-Virtuality Spectrum and discusses the theoretical mechanisms with which reality can be manipulated.

Controlled Matter

Figure: [Zambetta 2017]

Think of the Holodeck from Star Trek: matter is created and destroyed, or at minimum photons are controlled by ‘force fields,’ giving the impression of physical objects and interactions.

Another, more tangible [4], example is the inFORM project from the MIT Media Lab. Figure: [inFORM 2013]

[Sandor et al. 2015] notes Sutherland’s suggestion that physically creating and destroying atoms forms an ideal for True AR because it would “create physical objects consistent over all interaction modalities and all users.”

Surround AR

The next level closer to the user works by “manipulating [the] photons” that reach them. Surround AR lacks the physical interactivity enabled by Controlled Matter but can create an augmented reality indistinguishable from physical reality.

Think of an empty room panelled with high-fidelity displays. As the user walks around, her entire perception of the room changes. You can simulate what it’s like to stand in Wrigley Field: as your head moves, the room tracks you and updates the display accordingly to provide the illusion of movement.

However, unlike at the real Wrigley Field, you couldn’t sit down on the bleachers and enjoy a hot dog. Matter itself isn’t manipulated, only light.

Personalized AR

HWDs like Google Glass or Magic Leap provide us with this level of augmentation. Our visual field is augmented with new objects placed indistinguishably from the physical world. Each user augments their own visual perception. This doesn’t preclude shared experiences, but it requires some level of high-speed networking.

Implanted AR

Completely invisible to the outside world, this level of AR involves directly manipulating where visual perception is interpreted: in our brains. An implant is placed in or near the visual cortex and directly manipulates our perception of the physical world.

Think controlled LSD trip.

The Case for Non-Visual AR

Many primary tasks in our lives (such as driving, studying, or playing with friends) require a significant amount of visual attention and engagement in the natural, physical, non-visually-augmented world. In these situations, it is safer to manipulate a different sense than the visual one.

We have many different senses in our body. In common parlance, we have five senses: taste, touch, smell, sight, and hearing. We also have an extended sensory system that plays a less visible role in our lives: sensorimotor/proprioception (bodily awareness) and equilibrioception (balance), among others [Sensory Trust 2003]. Many of these senses can alternate between an active or background role depending on the situation.

Among these eight or more senses, we can see how many are engaged by each of the daily tasks below:

| Daily situation | Primary task | Secondary task (usually optional) | Primary senses engaged | Secondary senses engaged |
| --- | --- | --- | --- | --- |
| Brushing teeth | Moving arm to brush teeth | N/A | Touch (active), Proprioception | N/A |
| Driving | Driving automobile | Listen to audio (music, podcast, audiobook), converse with passengers | Sight, Touch, Proprioception, Hearing (background, for emergencies) | Hearing (active) |
| Cooking | Cooking food in a kitchen | Listen to audio (music, podcast, audiobook), converse with family or friends | Touch, Proprioception, Sight, Taste, Smell, Hearing (background, for alarms) | Hearing |
| Household chores | Cleaning, laundry | Listen to audio (music, podcast, audiobook) | Touch, Proprioception, Sight, Smell | Hearing |
| Studying | Reading, watching lectures | N/A | Sight, Hearing | N/A |
| Watching TV or YouTube | Watching a screen | Converse with friends or family, draw | Sight, Hearing | Hearing (background) |
| Visiting an art museum | Walking around an art museum | (Usually quiet) | Sight, Proprioception (for walking) | N/A |

Opportunities for creating truly immersive and useful augmented reality systems arise in the following places:

  • Look for situations in which senses are not used for a primary task and there is no secondary task (it is N/A). Explore how these unused senses could support a secondary task.
  • Look for situations where a sense is not used in a primary or secondary task. Explore how these unused senses could support a different secondary or tertiary task.
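The two heuristics above can be mechanized over the table: for each situation, compute which senses are engaged by neither the primary nor the secondary task. The sketch below transcribes a few rows from the table; the dictionary structure and names are my own illustration.

```python
# Mechanizing the heuristics above: find the senses engaged by neither the
# primary nor the secondary task of a daily situation (per the table above).
ALL_SENSES = {"sight", "hearing", "touch", "taste", "smell", "proprioception"}

# situation -> senses engaged by its primary and secondary tasks
# (a few rows transcribed from the table)
ENGAGED = {
    "studying": {"sight", "hearing"},
    "visiting an art museum": {"sight", "proprioception"},
    "driving": {"sight", "touch", "proprioception", "hearing"},
}


def unused_senses(situation):
    """Senses free to carry a new (secondary or tertiary) task."""
    return ALL_SENSES - ENGAGED[situation]


for situation in ENGAGED:
    print(situation, "->", sorted(unused_senses(situation)))
```

Running this surfaces exactly the opportunities discussed next: touch is free while studying, and hearing and touch are free in an art museum.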

For example, when studying (and not taking notes), we do not use our sense of touch. We can create a system for augmenting our learning situation through the sense of touch using haptic interactions.

As another example, given that sight is the only major sense used during a walk in an art museum, we could provide an augmented reality experience through our auditory or touch systems. (Imagine walking around a museum as a wearable system, like AirPods, automatically detects what you are looking at and answers natural-language questions you have about the artwork.)

We keep coming back to three main senses: touch, sight, and hearing. These form the basis for many of our interactions with the world around us. Integrating these three components into an augmented reality system has the most potential for creating an immersive and meaningful experience. By taking full advantage of the ‘sensor fusion’ in our minds to learn behaviors and actions more effectively, we can increase the comfort, productivity, and flow we feel in various daily tasks.

Glossary of terms

| Acronym | Term | Definition |
| --- | --- | --- |
| AI | Artificial intelligence | The field of creating computer agents comparable to humans in a variety of discrete tasks ranging from image classification to automobile driving. |
| API | Application programming interface | The set of operations through which programs interact with a software component or service. |
| AR | Augmented reality | Any augmentation of our natural senses with computer-generated information. |
| HCD | Human-centered design | A design approach grounded in the needs and goals of the people who use a product. |
| HCI | Human-computer interaction | A field of computer science that focuses on the medium of interaction between humans and their computers. |
| HWD | Head-worn display | Any display mounted on the head, with information presented to the eyes and ears through a display and speakers, respectively. |
| HUD | Heads-up display | A subset of HWDs that presents information within the user’s field of view without requiring them to look away from their usual viewpoint. |
| VR | Virtual reality | A fully immersive virtual environment with little or none of the real environment available to the user. |


References

Buntz, B. Augmented Reality Technology Gaining Ground for Industrial Uses. https://www.iotworldtoday.com/2019/06/14/augmented-reality-technology-heating-up-in-industrial-space/.

Fleenor, S.E. 2019. The Seven of Nine binge guide. SYFY WIRE. https://www.syfy.com/syfywire/the-seven-of-nine-binge-guide.

Hartman, P., Bezos, J.P., Kaphan, S., and Spiegel, J. 1999. Method and system for placing a purchase order via a communications network. US Patent. https://patentimages.storage.googleapis.com/37/e6/81/3ebb1f33c41b4a/US5960411.pdf.

inFORM. 2013. MIT Media Lab.

Maes, P. 1997. Pattie Maes On Software Agents: Humanizing The Global Computer. IEEE Internet Computing 1, 10–19. http://dx.doi.org/10.1109/mic.1997.612209.

Milgram, P., Takemura, H., Utsumi, A., and Kishino, F. 1995. Augmented reality: a class of displays on the reality-virtuality continuum. Telemanipulator and Telepresence Technologies, International Society for Optics and Photonics, 282–292.

Rhodes, B.J. 1997. The wearable remembrance agent: A system for augmented memory. Personal Technologies 1, 4, 218–224.

Sandor, C., Fuchs, M., Cassinelli, A., et al. 2015. Breaking the barriers to true augmented reality. arXiv preprint arXiv:1512.05471.

Sensory Trust. 2003. 5, 9, 21, 53… how many senses? https://www.sensorytrust.org.uk/information/articles/senses.html.

Stanford University. 2018. Heavy multitaskers have reduced memory. Stanford News. https://news.stanford.edu/2018/10/25/decade-data-reveals-heavy-multitaskers-reduced-memory-psychologist-says/.

Virtual Room | #1 Virtual Reality Sydney | Multiplayer VR Escape Room. Sydney. https://sydney.virtual-room.com/.

Zambetta, F. 2017. Star Trek’s Holodeck: from science fiction to a new reality. The Conversation. http://theconversation.com/star-treks-holodeck-from-science-fiction-to-a-new-reality-74839.

  1. e.g. sight, vestibular, auditory, touch (with haptics), etc. 

  2. e.g. an ARKit iOS application for playing darts: https://www.youtube.com/watch?v=Dg9kcm_Li08 

  3. information such as the location, time, people involved, retailers nearby, users nearby, etc. 

  4. because it was created by the Tangible Media Group :) 

© Pramod Kotipalli 2019
