Augmented Reality for Visually Impaired People (AR for VIPs)

In 2019, as the capstone research project for my UC Berkeley Master of Information Management and Systems (MIMS) program, I led research on a project using augmented reality to help blind and visually impaired people navigate unfamiliar surroundings.

This project is still under construction; please check back soon! In the meantime, you can view our award-winning video, watch our presentation, or see the website.

 

Augmented Reality for Visually Impaired People (AR for VIPs) is a research project from UC Berkeley’s School of Information that seeks to put augmented reality to use as an accessibility tool for people with low vision or blindness.

Descriptive audio available.

Presentation of Augmented Reality for Visually Impaired People (AR for VIPs) at UC Berkeley's School of Information MIMS capstone presentations 2019.

 
 
Our research poster explaining the basics of AR for VIPs.

 
 

Pardon our dust. The sections below are currently under construction. Please check back soon for the finished version.




 

For the capstone research project of my Berkeley master’s degree, I chose to examine how augmented reality could be used to aid blind and visually impaired people in indoor navigation. I worked alongside three other graduate students and four undergraduates to conduct user research, design, prototyping, and user testing of the AR for VIPs system, and I created an award-winning video demonstrating the system and its necessity.

 

Project at a Glance


Timeline: 9 months, September 2018 - May 2019

Project Type: Capstone Research

Team: myself, three other graduate students, and four undergraduate students

My Role:

  • Spearheaded the initial research proposal
  • Conducted user research and benchmarking of existing solutions alongside the team
  • Designed the initial system with another designer
  • Served as project manager and development lead for prototyping
  • Assisted with user testing
  • Produced the award-winning demonstration video

Tools: Unity, C#, Adobe After Effects, Microsoft HoloLens


 

Context

 

The Challenge

For the roughly 36 million blind people around the world, navigating unfamiliar spaces can be a frustrating or dangerous experience. Though people can learn to navigate cautiously through sound and touch, many places assume that their visitors can quickly see layouts and features, and offer no substitute for printed signs and other purely visual navigational necessities.


Augmented reality holds the potential to alleviate these problems. It is no silver bullet, but we set out to show that modern AR headsets, with their ability to construct 3D maps of surrounding spaces and access machine learning tools such as text recognition, could be a powerful addition to classic tools like the white cane and guide dog.

 

Problem Definition

Perkins School for the Blind Bus Stop Challenge.
Audio descriptions available at
https://youtu.be/NOjUsunoVbk

Of course, this is a huge problem space. Ultimately, we were inspired by the Perkins School for the Blind Bus Stop Challenge to focus on the “5 meter problem”: how blind users can bridge the final ~5 meters between the approximate location their GPS guides them to and the actual location of their goal.

We defined our success criteria as the user being able to:

  1. Identify the location of simple obstacles, such as bus stop poles, from a distance

  2. Read semantic information written on these obstacles in order to identify their goal

  3. Navigate to their goal

Note that due to the constraints of the generation 1 HoloLens (see “Constraints” below), we knew from the start that any prototype we made would be effective in controlled environments only, and would not be viable as an actual product for at least several years. Thus, our goal was to conduct research that could inform future accessible design for AR, not release a consumer product.

 

Users

We aimed for our application to be useful to both totally blind and low-vision people by implementing a purely audio interface. (Visual interface elements seen in screenshots and videos were for testing and illustration purposes only.) In doing so, we hoped that our interface would be compatible with sight-based vision aids such as contrast enhancement. Unfortunately, it also meant that our app would only be useful to blind people with some amount of hearing ability.

 

Team and Role

Core Team

Our core team consisted of myself and three other master’s students in the School of Information. We ran the project from start to finish.

  • Myself, XR designer, lead developer and project manager

  • Alyssa Li, XR designer with a focus on sound design

  • Anu Pandey, user researcher and healthcare professional

  • Rohan Kar, XR developer and machine learning expert

Secondary Team

Our secondary team consisted of four undergraduate students from Virtual Reality at Berkeley, Berkeley’s undergrad XR association, who assisted in development.

Advisors

In addition to those actively working on the project, we received advice and guidance from many sources.

  • Berkeley professor Kimiko Ryokai served as our academic advisor, offering advice on many aspects of the project throughout its development. Professors John Chuang and Coye Cheshire also gave invaluable help. Professor Emily Cooper of the School of Optometry, whose sight-enhancement HoloLens research was a valuable reference throughout the project, was kind enough to lend us a HoloLens for testing purposes.

  • Microsoft Principal Software Engineer Robert “Robin” Held and Senior Audio Director Joe Kelly helped immensely with their detailed knowledge of HoloLens systems.

  • Lighthouse Labs Technology Director Erin Lauridsen gave us great feedback and connected us with many people in the blind & low vision tech community to speak and test with.

  • National Federation of the Blind’s Lou Anne Blake, Curtis Chong, and Arielle Silverman gave us great advice on which use cases to focus on.

  • Berkeley Web Access’ Lucy Greco was a huge help, offering us advice on several occasions and testing our prototype as well.

 

My Role

Overall, I took on a leadership role in the project.

  • I originated the project as a generalized push for XR accessibility, searched out an advisor, and recruited the other three graduate members to the team.

  • I served as project manager, leading meetings, setting long-term and short-term goals, assigning tasks, and ensuring we were on track to meet them.

  • Together with Alyssa, I designed the overall interface and functionality.

  • I coordinated the development of individual components, wrote the code that combined them, and developed the interface itself.

  • I edited our project video, which won the School of Information Capstone Video Award.

  • I introduced the project at the public I School Capstone Presentations.


 

Constraints

 

Time & Funding

As a student research project, this work had no funding to speak of, and our grant applications were rejected. Fortunately, we had access to development space and hardware through Virtual Reality at Berkeley, but any additional supplies were limited to what we could afford out of pocket. Similarly, though the capstone research was our main priority during the last semester, we all had multiple other time commitments and had to juggle other classes and graduation preparations while working on it.

 

Blind User Access

As none of us on the team were blind or visually impaired, we knew we had to involve blind and low vision people early and often to make sure that we were designing with them, not at them. However, this meant that feedback and review from blind users had to be arranged well in advance and typically required traveling to their offices. With a blind person on the team, we could have had significantly tighter feedback cycles.

 

HoloLens Limitations

HoloLens Development Edition

We were using the original HoloLens Development Edition for most of our work, which had the following key limitations:

  • Slow scanning speed - building an accurate mesh required slow, deliberate back-and-forth scanning, effectively requiring areas to be pre-scanned to be useful.

  • Poor outdoor performance - due to the sun’s infrared rays interfering with the HoloLens’ scanning capabilities, we were limited to indoor scanning. This meant that scanning actual bus stops would not be effective.

  • Heavy weight/poor ergonomics - every single blind user we tested with commented that the HoloLens would be uncomfortable to wear for more than a few minutes.

  • Blind-inaccessible setup - the HoloLens interface was not usable by non-expert users without vision, meaning we would have to launch and monitor the app for blind users.

 

Put together, these limitations meant that we knew our project would only be usable in strictly controlled conditions. A consumer-ready product would likely not be possible on this hardware. However, the project would still have utility as research, paving the way for products to come.


 

Design Process

 

Prologue: Exploring XR Accessibility

One of the cardinal rules of UX design is to start with user needs and find technologies that fit them, not vice versa. We broke that rule with this project; here’s why.

 

Initial Exploration

The School of Information’s prompt for capstone research is pretty liberal: it simply asks for “a challenging piece of work that integrates the skills and concepts students have learned during their tenure.” At the start of my final year, I knew only that I wanted to work on something related to XR accessibility. I wasn’t sure whether it would shape up to be a set of guidelines for developers like AbleGamers’ Includification Guidelines, an accessibility add-on to Unity or the Virtual Reality Toolkit, or something else entirely, but I dove headfirst into exploration and soon recruited Anu and Alyssa to the cause.

 

Insight: Information Asymmetry in AR

While we were researching, an idea caught our attention. AR devices constantly scan the environment in order to build up a 3D representation in which to place imagery. For sighted users, these scans hold little new information; but for blind users, the AR device knows many things about the environment that they do not. If we could figure out which 3D mapping data were useful, and when, we could exploit this information asymmetry to have the AR device communicate useful information to a blind user in real time.

 

Idea: AR device environmental scans could hold useful information for blind people

Hackathon Experimentation

We soon had a chance to prototype this theory. Alyssa and I, along with classmates Soravis “Sun” Prakkamakul, Neha Mittal, and Ankit Bansal, entered the Magic Leap AT&T Hackathon in early November 2018. The organizers wanted to show that AR can be for more than simple entertainment purposes. Our submission, Eyes for the Blind, was a rudimentary exploration of sound feedback based on spatial mapping; with practice we were able to successfully wander around the exhibition hall with our eyes closed without bumping into anything, and nearly managed to integrate text recognition. We knew this idea had potential, and we had an inkling of how we could go about it.

Our Magic Leap Hack team testing our initial theory.

Between the hackathon and our initial research trajectory, we ended up focusing on figuring out how AR technology - specifically headsets with advanced spatial mapping capabilities like the HoloLens and Magic Leap - could act as accessibility devices for blind and low-vision users. Although this was a tech-first approach, we felt that showcasing these devices’ potential for accessibility was worthwhile and could lead to better access to these kinds of tools for blind users in the future.


 

Research: People, Papers, and Products

Though we had a prototype in hand after the frenzy of the hackathon, we wanted to be sure that our project was on solid ground. To do so, we stowed our prototype and went back to basics, looking to three key sources to learn the lay of the land when it comes to low-vision accessibility: people, papers, and products.

 

Blind People & Organizations

Straightaway we made contact with the organizations mentioned in the “advisors” section above. We wanted to understand which technologies were commonly used and what user needs they left unfilled. Ultimately, we had five interview sessions with 1-4 interviewees per session. Our key takeaways:

  1. The guide dog and white cane are the gold standards of accessibility technology, relied on by millions of people. New tools should attempt to supplement, not replace, these traditional tools.

  2. While gathering physical information about nearby objects can be useful, identifying and gathering semantic information such as text in their immediate environment is a frequent unmet need for blind users.

  3. For blind people, sound is a valuable resource. Interfaces that carelessly make noise and cover up important environmental sounds can be annoying or even dangerous for blind users.

Some of these takeaways were genuinely revealing. For example, one interviewee told us:

Bumping into walls or tables may seem like a problem to you, but for us it’s just how we learn where things are.
— Blind interviewee

We had thought that helping users navigate more smoothly without having to touch everything would be a major plus, but it became clear that we still had to overcome our bias as sighted people.

 

Academic Papers

Academia is often a test bed for applying new technologies to accessibility challenges, especially blindness; we found a plethora of papers examining various approaches. We chose about 20 papers to examine closely, split them between us, and dove in, recording major takeaways in a Google Drive spreadsheet and ranking the papers by applicability.

We organized our literature review sheet to ensure easy information recall and citation for the ~20 academic papers we examined.

 

Some of the most interesting papers included:

In addition to the subject matter itself, these papers often included open-source code we could use in our application, as well as guidelines for how to structure our user testing.

 

Products for Blind Users

The competitive landscape for aids to blind navigation held myriad products. Broadly, these fell into a few categories.

  1. Obstacle Avoidance - Tools that help a user detect nearby obstacles and improve close range awareness. Includes the two classic tools, the white cane & guide dog, as well as various “sonic canes” like the SmartCane.

  2. Navigation - Smartphone applications like Microsoft Soundscape or Google Maps that clue users in to their surroundings and guide them to their objective.

  3. Sight Enhancement - Tools that let users make better use of limited vision, including classic tools like magnifying glasses as well as more advanced ones like OxSight.

  4. Machine Vision - Tools that leverage AI to read text, discern colors, or identify other visual semantic information. May include object or facial recognition as well. Includes headsets like OrCam and apps like Textgrabber.

  5. Human Assistance - Services like Aira & Be My Eyes that connect a blind user to a sighted human assistant. Leveraging humans is powerful but adds additional costs and privacy concerns.

We theorized that there might be a lot of value in a generalized head-worn device that could fulfill multiple roles. Just as a smartphone gains a lot of utility from being a phone AND a calculator AND a mapping device, etc., we thought that a generalized AR platform like the HoloLens could replace multiple single-use devices, and that the combination of these features might be greater than the sum of its parts.


 

Ideation: Probing Possibilities

 

Goals, Contexts, & Assumptions

As we started brainstorming in earnest, we found it helpful to lay out more clearly the possible design spaces we could focus on. For the HoloLens developer edition, that meant the following.

Goal Types

  • Exploration - trying to gain a general sense of what’s around them

  • Destination - trying to seek out a particular place or object

Constraints for Current Headsets

  • Due to cost, menu inoperability for blind users, and battery life concerns, HoloLenses would have to be set up by sighted helpers and loaned to blind users.

  • Due to poor scanning in sunlight and slow scanning overall, spaces would have to be indoor and likely pre-scanned or equipped with beacons.

Plausible Locations

  • Relatively controlled, indoor spaces.

  • Large enough to warrant navigational aids.

  • Staffed with employees who could help manage the devices.

Stores, airports, malls, and museums topped the list.

Present or Future?

In determining the design space, we had to figure something out: were we designing for the HoloLens’ present capabilities, or for some more robust yet-to-be-revealed AR headset? We knew based on the HoloLens limitations (cost, scanning speed, etc. - see “Constraints” above) that it wasn’t something blind users would be able to buy and use themselves in uncontrolled conditions. We could have designed for specific implementations, such as finding a particular item in a grocery store, that fit those constraints.

 
Were we designing for the HoloLens’ present capabilities, or for some more robust yet-to-be-revealed AR headset?
 

Ultimately, in looking for challenges to focus on, we veered towards the Perkins School for the Blind Bus Stop Challenge: helping users locate and identify bus stops. Given that the HoloLens couldn’t operate in sunlight, and that a user navigating public transport with a HoloLens on their own didn’t fit the indoor, loaned-device scenarios more likely in the short term, this may seem an incongruous choice. However, it had the benefits of being a concrete challenge identified by the blind community, being relatively easy to simulate, and having broadly transferable applicability.

 

The Singing Mesh

Going into the initial hackathon, our central idea was sonification of the 3D mesh that AR headsets construct of the environment - aka “the singing mesh.” Much in the way bats, dolphins, and even some humans can use echolocation to determine the locations of objects in their environment, we thought that by producing spatial sound at the location of scanned objects, we could increase blind user awareness. The overall effect would be of the whole mesh “singing” to the user to inform them where things are, giving even typically silent objects like tables and chairs a voice blind users could hear.

The HoloLens, Magic Leap, and other advanced headsets construct environmental meshes as the user scans.

By sonifying objects detected in the scans, we hoped to increase the user’s awareness of their environment.
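
To make the singing mesh concrete, here is a minimal Unity C# sketch (an illustration, not our production code): it samples a handful of vertices from a scanned spatial-mapping mesh and places looping, spatialized audio sources at those points, so nearby surfaces emit sound the user can localize. The scannedSurface reference and the hum clip are assumptions about how the spatial-mapping system and sound design would plug in.

    using System.Collections.Generic;
    using UnityEngine;

    // Minimal sketch: give a scanned spatial-mapping mesh a "voice" by placing
    // looping, spatialized audio sources at a sparse sample of its vertices.
    public class SingingMesh : MonoBehaviour
    {
        public MeshFilter scannedSurface;  // a chunk of the scanned environment mesh (assumed assigned elsewhere)
        public AudioClip hum;              // the sound each sampled point emits
        public int maxVoices = 20;         // cap the number of simultaneous sounds
        public float maxDistance = 3f;     // only audible within a few meters

        private readonly List<AudioSource> voices = new List<AudioSource>();

        void Start()
        {
            Vector3[] vertices = scannedSurface.sharedMesh.vertices;
            int step = Mathf.Max(1, vertices.Length / maxVoices);

            // Sample every Nth vertex so the whole surface is roughly covered.
            for (int i = 0; i < vertices.Length; i += step)
            {
                Vector3 worldPos = scannedSurface.transform.TransformPoint(vertices[i]);
                var voiceObject = new GameObject("MeshVoice");
                voiceObject.transform.position = worldPos;

                var source = voiceObject.AddComponent<AudioSource>();
                source.clip = hum;
                source.loop = true;
                source.spatialBlend = 1f;                     // fully 3D sound
                source.rolloffMode = AudioRolloffMode.Linear; // silent beyond maxDistance
                source.maxDistance = maxDistance;
                source.Play();

                voices.Add(source);
            }
        }
    }

Capping the number of simultaneous voices follows directly from the “sound is a valuable resource” finding in our user research: too many overlapping sounds become noise rather than information.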

 

Text Recognition

From our discussions with blind people, we knew that gathering semantic information would be important, and one type clearly stood out from the rest: text. Braille is vanishingly rare and hard to find compared to the absolute abundance of written text, with text often being the only way to differentiate, say, the office you want to get to from the dozen identical ones next to it.

Even then, it wasn’t obvious how text recognition should ideally behave. Simply reading out all text around the user as soon as it’s spotted could have downsides - imagine a user hearing a billboard’s text instead of an oncoming car when crossing the street, or being flooded nonstop with product labels in a grocery store.

In keeping with our “sound is a valuable resource” user research finding, we decided that text recognition should either be manually triggered by the user, or toggled to look for a particular word. Due to development constraints, we only managed to implement the former.
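
To illustrate the manual-trigger behavior, the sketch below shows roughly how the flow could be wired up in Unity C#. It is a simplified outline, not our exact implementation: CapturePhotoAsync, RecognizeTextAsync, and SpeakAloud are hypothetical stand-ins for the headset camera capture, whichever OCR backend is used, and the text-to-speech output.

    using System.Threading.Tasks;
    using UnityEngine;

    // Sketch of user-triggered text recognition: nothing is read aloud until the
    // user explicitly asks, keeping the audio channel free the rest of the time.
    public class ReadTextOnDemand : MonoBehaviour
    {
        private bool busy;

        // Called by the input system when the user says "read" or presses the clicker.
        public async void OnReadCommand()
        {
            if (busy) return;  // ignore repeated triggers while a request is in flight
            busy = true;

            Texture2D photo = await CapturePhotoAsync();    // hypothetical camera-capture helper
            string text = await RecognizeTextAsync(photo);  // hypothetical OCR call (e.g., a cloud vision service)

            SpeakAloud(string.IsNullOrWhiteSpace(text) ? "No text found." : text);

            busy = false;
        }

        // These helpers would wrap the actual camera, OCR, and text-to-speech systems.
        private Task<Texture2D> CapturePhotoAsync() { return Task.FromResult<Texture2D>(null); }
        private Task<string> RecognizeTextAsync(Texture2D photo) { return Task.FromResult(string.Empty); }
        private void SpeakAloud(string message) { /* route to the headset's TTS output */ }
    }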

 

Other Considered Features

Some other features we considered developing but ultimately left on the shelf:

  • Guidance sound - a tone that users could assign to a location or object, then let them home in on it; similar to a feature in Soundscape. Could alternately be implemented as an “anchor sound” to let users mark a place they’ve been, or a “compass sound” to let users know which way is north. Based on user stories of disorientation in rooms with symmetrical layouts or places with few landmarks like parking lots.

  • Obstacle warning - an alert sound that plays when the user is about to run into a head-height obstacle, based on user stories of finding obstacles with their face that were too high for their cane (a common occurrence, and why many blind people wear baseball caps). A minimal sketch of this idea follows the list.

  • Perimeter scan - a sound that would trace the edge of the room the user is currently in; a quick way to build spatial understanding. Deprioritized after several users told us they could understand room boundaries simply through sound quality.

  • Voice notes - users or administrators could leave voice notes for themselves or other users at specific locations; an asynchronous means of leveraging crowdsourcing and human intelligence.

  • Object recognition & other machine learning - using machine vision to recognize things besides text, such as objects, faces, or colors.

  • Be My Eyes integration - integrating with Be My Eyes or another human assistance program to leverage human optical systems.
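
As an example of how simple some of these could be, below is a minimal Unity C# sketch of the obstacle warning idea: each frame, a ray is cast forward from the headset at head height against the spatial-mapping mesh, and a spatialized alert plays when a surface is closer than a threshold. The spatialMeshLayer and alert fields are assumptions about how the scanned mesh and warning sound would be set up.

    using UnityEngine;

    // Sketch of a head-height obstacle warning: raycast forward from the headset
    // against the spatial-mapping mesh and play a spatialized alert when something is close.
    public class HeadHeightObstacleWarning : MonoBehaviour
    {
        public LayerMask spatialMeshLayer;   // layer the scanned environment mesh lives on (assumed)
        public AudioSource alert;            // short warning sound, assigned in the editor
        public float warningDistance = 1.5f; // start warning about 1.5 m out
        public float cooldown = 1f;          // don't repeat the alert every frame

        private float lastAlertTime = -999f;

        void Update()
        {
            // On an AR headset, the main camera transform tracks the user's head.
            Transform head = Camera.main.transform;

            if (Physics.Raycast(head.position, head.forward, out RaycastHit hit,
                                warningDistance, spatialMeshLayer)
                && Time.time - lastAlertTime > cooldown)
            {
                // Play the alert at the obstacle's position so the warning is
                // spatialized toward the thing the user is about to walk into.
                alert.transform.position = hit.point;
                alert.Play();
                lastAlertTime = Time.time;
            }
        }
    }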

In the end, we had to cut scope aggressively to match our technical capacity and timeline, and while all of these offered interesting supplements, we felt that obstacle sonification and text recognition delivered the most core value.

 

Greater Than the Sum of its Parts

Alone, obstacle detection and text recognition are each useful. However, we hoped that together they could be greater than the sum of their parts. Consider the bus stop challenge:

  • Obstacle detection alone could find tall, narrow objects, but not distinguish between trees, parking limit signs, and the bus stop signs the user is actually looking for.

  • Text recognition could read signs to verify that the user is in the right place, but without assistance, users may have a hard time finding said signs.

By pairing the two together, we felt confident we were onto something good.
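
As a purely hypothetical sketch of that pairing for the bus stop scenario: when the sonified mesh reveals a pole-like obstacle, the system prompts the user, and text recognition on that object then confirms whether it is the bus stop sign. The helpers here stand in for the sonification and recognition pieces sketched above.

    using System.Threading.Tasks;
    using UnityEngine;

    // Hypothetical glue between obstacle sonification and text recognition:
    // a pole-like obstacle is worth announcing, and reading its sign tells the
    // user whether it is actually the bus stop they are looking for.
    public class BusStopFinder : MonoBehaviour
    {
        public async Task CheckCandidate(Bounds obstacleBounds)
        {
            // Heuristic for "tall and narrow": much taller than it is wide.
            bool poleLike = obstacleBounds.size.y > 1.5f &&
                            obstacleBounds.size.x < 0.5f &&
                            obstacleBounds.size.z < 0.5f;
            if (!poleLike) return;

            SpeakAloud("Pole-like object ahead. Say 'read' to check its sign.");

            // ...once the user triggers recognition on the candidate...
            string signText = await RecognizeTextAsync(obstacleBounds.center);  // hypothetical OCR helper

            if (signText.ToLower().Contains("bus"))
                SpeakAloud("This appears to be the bus stop: " + signText);
            else
                SpeakAloud("Sign reads: " + signText);
        }

        private Task<string> RecognizeTextAsync(Vector3 worldPoint) { return Task.FromResult(string.Empty); }
        private void SpeakAloud(string message) { /* route to the headset's TTS output */ }
    }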


 

Development: Mastering the Mesh

  • Project management

    • Acquiring resources, setting schedules

    • Asana use

  • Splitting into 3 teams

    • Sound design

    • Obstacle sonification

    • Text recognition

  • Handling integration

  • Lessons learned during prototyping


 

User Testing: Putting the Prototype through its Paces


 

Future Directions


 

Retrospective


Icons via the Noun Project.
AR for VIPs logo — Lloyd Humphreys
Blind man silhouette — Scott de Jonge
AR icon — Meikamoo
Stopwatch — Tezar Tantular
Bankrupt — Phạm Thanh Lộc
Blind user access — Corpus delicti