Lots of people have been looking into the usefulness of Amazon Echo as a piece of assistive technology. I decided it was time for me to try it out myself. Echo is pretty good at picking up your voice, even at a distance and even with other noise around, sending it to Amazon for processing to figure out what you said, and then responding to it.
Echo is compatible with a few common home automation products out there including Philips Hue lights and Belkin WeMo power outlets. Using Echo together with these products, out of the box, with very little setup, you can turn simple appliances on and off with your voice. By simple appliances I mean anything that you could fully turn on or off by pulling the power plug out or plugging it back in, e.g. lamps, fans, some ACs, etc. More advanced electronics (and even some models of the above) will not simply turn back on when plugged back in and/or do not handle being unplugged while still on, so they would not work in that setup.
However I was curious if you could push Echo’s utility a little further, and it turns out there are many different methods that allow you to do that. Of course they all have different tradeoffs in terms of setup time, financial cost, ease of use, etc. Here’s the result of using one such method to control a power wheelchair.
NOTE: this is NOT what Amazon Echo was designed for. This is NOT its intended use. However I found the result somewhat interesting and illustrative of some of Echo’s limitations and limitations of voice control overall in this method. These demos lack CRITICAL safeties that need to be in place for anyone (but especially someone with significant disability) to use safely.
This post gives some background and method but does not attempt to be a full how-to, nor is it actually showing off a fully implemented solution. However parts of it do work pretty well and may spark some ideas. There are many other voice controlled wheelchair solutions available, both professional and DIY. This project was done specifically to gain experience and learn more about using Amazon Echo.
Now, even though a lot of work was done for me by others before me, there were still a lot of decisions to make about how to combine and apply things I found out there. One decision that profoundly affects the outcome is exactly how voice would specify the user’s intent. Here’s one spectrum of that decision:
At one end that I’ll call micromanagement: actions begin as you say them (forward, left, etc) and continue until you give a command to do otherwise.
And at the other end that I’ll call destination-specification: you say a destination, such as ‘bedroom’, and the wheelchair’s computer plots a course there (given the map of the space it has loaded) and follows it, detecting new obstacles along the way and going around them (or at least stopping).
There’s been an immense amount of work, long before Amazon Echo, as relates to speech recognition and even specifically to voice controlled wheelchairs. I did not draw much at all from that body of work as I was just trying to run a quick test. A lot can be gained by reviewing some of the previous research done, for example:
For my Echo based implementation, I had the constraint of using Echo’s “Alexa, turn on X” and “Alexa, turn off X” commands. Having this relatively long phrase makes control more cumbersome than simple moment to moment LEFT’s and RIGHT’s or just naming a destination. That, combined with Echo not responding instantly (voice is processed on Amazon’s servers) and sometimes not recognizing speech correctly, makes a real-time micromanaged system impractical. However, I also wanted to do something ‘relatively’ simple, so that for me ruled out destination specification and all the mapping, pathfinding, and obstacle avoiding work that involves. That pushed me towards something somewhere in between. I opted for the user specifying an action (direction) and a duration in seconds. This did mostly work as shown in the video, though it can get tedious, and a little frustrating if you over- or undershoot what you really wanted to do.
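To make the action-plus-duration idea concrete, here is a minimal Python sketch of a timed drive command. It is not the code from the project: `drive_for` and the `send` callback are hypothetical names standing in for whatever actually forwards a command to the chair. The one design point it illustrates is that the stop is always sent once the time is up, even if something goes wrong in between.

```python
import time

def drive_for(direction, seconds, send, sleep=time.sleep):
    """Send a direction, hold it for `seconds`, then always send "stop".

    `send` is a hypothetical callback that forwards a command string
    to the wheelchair; `sleep` is injectable so the logic can be tested
    without actually waiting.
    """
    send(direction)
    try:
        sleep(seconds)
    finally:
        # The stop is sent even if the wait is interrupted.
        send("stop")
```

With a real link to the chair you might call something like `drive_for("left", 3, send_to_chair)`, where `send_to_chair` is again a placeholder.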
So what led to the “Alexa, turn on X” constraint I was working under? Well, unless you use the official Amazon Echo SDK (Alexa Skills Kit, linked below), and host your server code for Amazon’s servers to talk to, you’re limited to types of interactions Echo already can do. Further it seemed to me that going down the Echo SDK road didn’t change the paradigm in any useful way. All voice commands would still have to follow the format of “Alexa, [action word] [target]” or very similar, so not really faster to say, and not any faster for Echo to process and respond.
So what types of interactions could be leveraged that Echo already has? Well, there’s turning on WeMo outlets and Hue bulbs as discussed, and there’s adding things to your Amazon “To Do” list, “Alexa, add X to my To Do list”. IFTTT (if this then that), which is a cool service for connecting various other services together, takes advantage of the To Do list. In this strange example you would say “Alexa, add ‘left three’ to my To Do list”. IFTTT sees the item added to the list and then triggers an HTTP request of your choosing. For example maybe http://220.127.116.11/drive?cmd=left or however else you’ve set up your device to be web controlled. Setting up your device to be web controlled is where most of the work would be; the IFTTT part is really quick to set up and test. And there are already countless tutorials online about setting up devices to be web controlled with Raspberry Pi’s, Arduinos, and many other affordable boards. The information is out there; it’s just about the decisions and tradeoffs.
Going back to the Hue bulbs, there are Hue Bridge Emulators that will run on a Raspberry Pi. In other words, you can set up your Raspberry Pi to appear to Echo as a set of Hue light bulbs. By commanding Echo to turn different lights on and off, you’re really just commanding your Raspberry Pi to do whatever you set it up to do. This was the road I went down, and in my case the Raspberry Pi forwards the given command on to the power wheelchair. One benefit of this approach is that the Echo communicates directly with the Raspberry Pi, without needing to go through other outside servers (that need to be set up) first. The Hue Emulator I used is linked to below, but note that there exist MANY Hue and WeMo emulators out there. A few emulators won’t be detected as real devices by Echo, but many good ones will. Lots of people are tinkering with this stuff.
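As a rough idea of what "web controlled" could look like on the receiving end, here is a small Python sketch using only the standard library. It is not the setup from this project: the `/drive` path and `cmd` parameter simply mirror the example URL above, and the allowed command set, the handler name, and port 8080 are all assumptions for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical set of commands the device is willing to accept.
ALLOWED = {"forward", "back", "left", "right", "stop"}

def parse_drive_request(path):
    """Return the command from a path like /drive?cmd=left, or None."""
    url = urlparse(path)
    if url.path != "/drive":
        return None
    cmd = parse_qs(url.query).get("cmd", [""])[0].lower()
    return cmd if cmd in ALLOWED else None

class DriveHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        cmd = parse_drive_request(self.path)
        if cmd is None:
            self.send_error(400, "unknown command")
            return
        # A real setup would forward cmd to the motor controller here.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(cmd.encode())

def serve(port=8080):
    """Start the server; blocks until interrupted."""
    HTTPServer(("", port), DriveHandler).serve_forever()
```

Calling `serve()` starts listening. Rejecting anything outside a small whitelist is a deliberate choice here, since the request could come from anywhere on the network.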
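Since Echo only sees a list of named lights, one way to picture this approach is as a lookup table from light names to drive commands. The Python sketch below is not taken from the emulator I used. The direction and duration vocabularies, the function names, and the choice to ignore "turn off" are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical vocabularies; each combination becomes one virtual light.
DIRECTIONS = ["forward", "back", "left", "right"]
SECONDS = {"one": 1, "two": 2, "three": 3}

def build_device_table():
    """Map each virtual light name to a (direction, seconds) command."""
    return {
        f"{direction} {word}": (direction, secs)
        for direction, (word, secs) in product(DIRECTIONS, SECONDS.items())
    }

def on_light_switched(name, is_on, table):
    """Return the command for a light turned on.

    "Turn off" requests and unknown names are ignored in this sketch.
    """
    if not is_on:
        return None
    return table.get(name.lower())
```

So "Alexa, turn on left three" would look up the virtual light named "left three" and come back as `("left", 3)`.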
Now, I also knew that you could do speech recognition directly on the Raspberry Pi. So I figured it would be worth doing a video with Echo (and Amazon) taken out of the loop entirely.
I found that I got acceptable but worse results this way. First, given the quality and configuration of the mics I had on hand, I had to give up on speaking to them over any real distance. Too many background sounds were being picked up and not filtered out, leading to false positives on words, and words I did say were often not picked up at all unless I spoke quite loudly. This led to me using a wireless headset with a mic. This worked better, but was still not as consistent as Echo. And although I was running everything on my end, and could change the command format however I wanted, I ended up with a very similar structure, “Jazzy Forward Two”. Having a ‘wake word’ or keyword that starts every command is very useful to filter out false positives. Also, with the processing delay and accuracy issues, putting a time limit on drive related commands proved to be a useful safety.
For speech recognition I used PocketSphinx (linked below), part of the CMU Sphinx project from Carnegie Mellon. The video is showing version 0.8. I also tested the most recent version: 5 pre alpha. This later version gave me more accurate results but also seemed to run slower, adding even more delay to the response time. Given that, I found version 0.8 more usable for this type of test.
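The wake word and time limit can be sketched in a few lines of Python. This is not the code I ran. It only assumes the recognizer hands back a plain text transcript, and the direction list, number words, and 3 second cap are made-up values for illustration.

```python
WAKE_WORD = "jazzy"
# Hypothetical vocabularies and limit, chosen only for this example.
DIRECTIONS = {"forward", "back", "left", "right"}
NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
MAX_SECONDS = 3

def parse_command(transcript):
    """Turn "Jazzy Forward Two" into ("forward", 2), or None if invalid."""
    words = transcript.lower().split()
    # Anything that is not exactly wake word + direction + number is dropped.
    if len(words) != 3 or words[0] != WAKE_WORD:
        return None
    _, direction, number = words
    if direction not in DIRECTIONS or number not in NUMBERS:
        return None
    # Clamp the duration so a misheard number cannot drive for too long.
    return direction, min(NUMBERS[number], MAX_SECONDS)
```

Requiring the exact three word shape is strict on purpose: a stray "forward" picked up from background conversation returns `None` instead of moving the chair.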
NOTE: again, these demos were just to have something to frame a discussion around. For voice control to be used in a real wheelchair control scenario, MANY changes and additions would be necessary, not the least of which would be proximity and impact sensors around the chair to stop the motors as a safety. OR, if the user could activate at least one switch control, then a combination of switch and voice control may work very well. Also as previously noted, many voice controlled wheelchair solutions already exist.
For this project the main added electronics were:
-Raspberry Pi to process the voice commands
-ESP8266 which acted mostly as a WiFi receiver so the wheelchair could wirelessly get commands from the Raspberry Pi
-Arduino Uno to communicate with the wheelchair base and the RF remote for the fan
Final code I tested with:
The actual low level control of the wheelchair base uses the same method I employed in this project:
ACKNOWLEDGEMENTS: Some of the links below provide info and code from others that I adapted to do these tests. Pulling together this test in short order would not have been possible starting from scratch so I’m extremely grateful for their work.
Hue Bridge Emulator that I used:
Amazon Echo SDK (Alexa Skills Kit, ASK):