Using Amazon Alexa to drive a radio-controlled car – Part 1

Introduction

Over the Easter holidays, while watching my son play with his radio-controlled toy car, I had a strange thought pop into my head. Instead of using the sticks on the remote control, wouldn’t it be cool to control the car using just your voice? You could tell the car to move forward, backward, left or right. What if you could save all the moves you have asked the car to make so far and then, at a later time, get the car to replay all those moves?

Now, that would be a car I would love to play with!

In this blog, I will introduce the high-level design for accomplishing the above-mentioned goal. Then over the next few blogs I will take you through the steps to transform the high-level design into a working prototype.

Hardware Requirements

For this prototype, I settled on using the following hardware devices:

  • Amazon Echo Dot – this will be used to process my voice commands
  • Raspberry Pi 3 with a GPIO expansion Breadboard
  • A set of four 5V relay board modules
  • A radio-controlled race car
  • A soldering iron, solder wire and a digital multimeter

Design considerations

To make the prototype work, I decided to create an Amazon Alexa Skill called race car. This will be used to process my voice commands.

Challenge #1: How would I control the radio-controlled car?

I found two options for this:

1. Completely bypass the remote control and send the radio frequency instructions directly to the race car

2. Emulate the button presses on the remote control so that it “thinks” someone is pressing those buttons and then it sends the appropriate radio frequency instructions to the race car

Option Chosen: I chose option 2 because it required the least amount of work. For this option, the only thing I needed to figure out was what happened when a button was pressed. After some experimentation, I found the contacts on the printed circuit board (PCB) of the remote control that a button press closes. By closing and opening those contacts myself, I could emulate the button presses.
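To make the idea concrete, here is a minimal sketch of emulating a button press by closing a relay for a moment and then opening it again. The pin numbers, the `press_button` helper and the `write_pin` callback are my own illustrative assumptions, not the wiring or code used in the prototype. On a real Raspberry Pi, `write_pin` would wrap a GPIO library call such as `RPi.GPIO.output`.

```python
import time
from typing import Callable, Dict

# Hypothetical mapping of remote-control buttons to the GPIO pins driving the relays.
BUTTON_PINS: Dict[str, int] = {"forward": 17, "backward": 27, "left": 22, "right": 23}


def press_button(button: str, write_pin: Callable[[int, bool], None],
                 hold_seconds: float = 0.5) -> None:
    """Close the relay wired across a button's contacts, wait, then open it."""
    pin = BUTTON_PINS[button]      # raises KeyError for an unknown button
    write_pin(pin, True)           # relay closed: the remote "sees" a press
    try:
        time.sleep(hold_seconds)   # keep the button held down
    finally:
        write_pin(pin, False)      # relay open: the button is released


if __name__ == "__main__":
    # Stand-in for real GPIO output so the sketch runs on any machine.
    press_button("forward", lambda pin, state: print(pin, state), hold_seconds=0.1)
```

Some relay boards are active-low, in which case the two states passed to `write_pin` would need to be swapped.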

Challenge #2: I will use a python script running on a Raspberry Pi 3 within my home network to emulate the button presses on the remote control. How will I get the Amazon Alexa Skill to connect to my Raspberry Pi 3 which is running on my internal home network?

Solution: I found a neat trick at https://developer.amazon.com/blogs/post/Tx14R0IYYGH3SKT/flask-ask-a-new-python-framework-for-rapid-alexa-skills-kit-development . To expose my internal Raspberry Pi 3 python script to the Amazon Alexa Skill, I will use ngrok (https://ngrok.com) to create a secure tunnel between my Raspberry Pi 3 and the ngrok service. This provides me with an HTTPS endpoint within ngrok’s domain, which forwards any requests directed at the ngrok endpoint to the python script running on my internal Raspberry Pi 3 using the secure tunnel.
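Whatever arrives through the tunnel is an HTTPS POST with a JSON body, and the script has to answer with JSON as well. The sketch below uses only the Python standard library to show roughly what that handling involves. In the prototype this work is done by Flask-Ask, so treat this as an illustration rather than the actual code. The intent name `MoveIntent`, the slot name `direction` and the spoken replies are hypothetical.

```python
import json
from typing import Any, Dict

VALID_DIRECTIONS = {"forward", "backward", "left", "right"}


def build_response(text: str, end_session: bool = False) -> Dict[str, Any]:
    """Wrap plain text in the response envelope Alexa expects."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_request(body: str) -> Dict[str, Any]:
    """Turn the JSON body of an Alexa request into a JSON-serialisable reply."""
    request = json.loads(body)["request"]
    if request["type"] == "LaunchRequest":
        return build_response("Race car is ready. Which way should I go?")
    if request["type"] == "IntentRequest" and request["intent"]["name"] == "MoveIntent":
        slots = request["intent"].get("slots", {})
        direction = slots.get("direction", {}).get("value", "")
        if direction in VALID_DIRECTIONS:
            # The real script would emulate the button press at this point.
            return build_response(f"Moving {direction}.")
        return build_response("Sorry, I did not catch the direction.")
    return build_response("Sorry, I cannot do that.", end_session=True)
```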

High Level Design for the prototype

Using the above-mentioned design considerations, the below schematic was developed to create the prototype.

Let’s go through each of the steps (denoted by the numbers) to better understand the design.

1. The user will invoke the race car Amazon Alexa Skill and ask to either move the car in a certain direction, save all the movements that have been requested so far, or run a previously saved set of movements.

2. The Alexa device (Amazon Echo Dot) will record the audio from the user and send it to the Alexa Cloud for processing. Alexa Cloud converts the audio into a JSON request using Natural Language Processing (NLP). Based on the invocation name, it will pass the JSON to the race car Amazon Alexa Skill.

3. The race car Amazon Alexa Skill will check to ensure that the intent supplied by the user is valid. Once confirmed, the race car Amazon Alexa Skill will pass the JSON to the endpoint defined for the skill. In our case, this is an endpoint that is hosted at ngrok (https://ngrok.com).

4. The ngrok endpoint will receive the JSON from the race car Amazon Alexa Skill and then forward it using the secure tunnel to the python script running on the Raspberry Pi 3 within the home network. The python script will use the Flask-Ask framework to process the intents from the Alexa Skills Kit (more information on Flask-Ask can be obtained from https://flask-ask.readthedocs.io/en/latest/).

5. If the user requested to save all the car movements carried out so far, then the python script will write the movements to a table within Amazon DynamoDB.

6. If the user requested to load a previously saved set of movements, then the python script will read the movements from the table within Amazon DynamoDB.

7. If the user requested to either load a previously saved set of movements or to move the car in a certain direction, the python script will emulate the appropriate button presses on the remote control.

8. The remote control will translate the emulated button presses into radio frequency instructions and send them to the car. The car will receive these instructions and move accordingly.
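Steps 5 to 7 can be sketched as a small movement log. This is a simplified illustration under my own assumptions: the `MovementLog` class and its method names are hypothetical, and a plain dictionary stands in for the Amazon DynamoDB table so that the example runs without an AWS account.

```python
from typing import Callable, Dict, List


class MovementLog:
    """Remembers moves so they can be saved under a name and replayed later."""

    def __init__(self, store: Dict[str, List[str]]) -> None:
        self.store = store            # stand-in for the DynamoDB table
        self.moves: List[str] = []    # moves requested so far

    def move(self, direction: str, press: Callable[[str], None]) -> None:
        press(direction)              # step 7: emulate the button press
        self.moves.append(direction)

    def save(self, name: str) -> None:
        self.store[name] = list(self.moves)   # step 5: write a copy of the moves

    def replay(self, name: str, press: Callable[[str], None]) -> int:
        saved = self.store.get(name, [])      # step 6: read the saved moves
        for direction in saved:
            press(direction)                  # step 7 again, once per saved move
        return len(saved)                     # number of moves replayed
```

Passing the button-press function in as `press` keeps the log independent of the hardware, so the same logic can be tried out with a simple `print` before any relays are connected.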

To give you a sneak peek of the prototype, check out the video at https://youtu.be/4SMYDhuri0Q (there are some minor bugs with the car movement which I intend to fix as soon as possible).

In the next blog in this series, we will go through the process of “hacking” the remote control and also setting up the Raspberry Pi 3 ancillary hardware.

I hope to see you then.

Till then, enjoy!

Building a Breakfast Ordering Skill for Amazon Alexa – Part 1

Introduction

At the AWS Summit Sydney this year, Telstra decided to host a breakfast session for some of their VIP clients. This was more of a networking session, to get to know the clients much better. However, instead of having a “normal” breakfast session, we decided to take it up one level 😉

Breakfast ordering is quite “boring” if you ask me 😉 The waitress comes to the table, gives you a menu and asks what you would like to order. She then takes the order and after some time your meal is with you.

As it was AWS Summit, we decided to sprinkle a bit of technical fairy dust on the ordering process. Instead of having the waitress take the breakfast orders, we contemplated the idea of using Amazon Alexa instead 😉

I decided to give the Alexa skill development a go. However, not having any prior Alexa skill development experience, I anticipated an uphill battle, having to first learn the product and then develop for it. To my amazement, the learning curve wasn’t too steep and over a weekend, spending just 12 hours in total, I had a working proof of concept breakfast ordering skill ready!

Here is a link to the proof of concept skill: https://youtu.be/Z5Prr31ya10

I then spent a week polishing the Alexa skill, giving it more “personality” and adding a more “human” experience.

All the work paid off when I got told that my Alexa skill would be used at the Telstra breakfast session! I was over the moon!

For the final product, to make things even more interesting, I created a business intelligence chart using Amazon QuickSight, showing the popularity of each of the food and drink items on the menu. The popularity was based on the orders that were being received.

[Image: both visuals side by side]

Using a laptop, I displayed the chart near the Amazon Echo Dot to help people choose what food or drink they wanted to order (a neat marketing trick 😉). If you would like to know more about Amazon QuickSight, you can read about it in the post “Amazon QuickSight – An elegant and easy to use business analytics tool”.

Just as a teaser, you can watch one of the ordering scenarios for the finished breakfast ordering skill at https://youtu.be/T5PU9Q8g8ys

In this blog, I will introduce the architecture behind Amazon Alexa and prepare you for creating an Amazon Alexa Skill. In the next blog, we will get our hands dirty with creating the breakfast ordering Alexa skill.

How does Amazon Alexa actually work?

I have heard a lot of people use the name “Alexa” interchangeably with the Amazon Echo devices. As good as that is for Amazon’s marketing team, I have to set the record straight. Amazon Echo is the family of physical devices that Amazon sells as an interface to the Alexa Cloud. You can see the whole range at https://www.amazon.com/Amazon-Echo-And-Alexa-Devices/b?ie=UTF8&node=9818047011. These devices don’t have any smarts in them. They sit in the background listening for the “wake” word, and then they start streaming the audio to Alexa Cloud.

Alexa Cloud is where all the smarts are located. Using speech recognition, machine learning and natural language processing, Alexa Cloud converts the audio to text. It then identifies the skill that the user requested, the intent and any slot values it finds (these will be explained further in the next blog). The intent and slot values (if any) are passed to the identified skill.

The skill processes the input using some form of compute (AWS Lambda in my case) and passes its output, as plain text or Speech Synthesis Markup Language (SSML), back to Alexa Cloud. Alexa Cloud converts that output to speech and sends the audio to the Amazon Echo device, which plays it to the user.

Below is an overview of the process.

[Image: Alexa Skills Kit diagram]

Diagram is from https://developer.amazon.com/blogs/alexa/post/1c9f0651-6f67-415d-baa2-542ebc0a84cc/build-engaging-skills-what-s-inside-the-alexa-json-request
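To give a feel for the skill's side of this exchange, here is a minimal sketch of an AWS Lambda handler written in Python. It reads the intent and slot value from the request and returns speech as SSML. The intent name `OrderIntent`, the slot name `item` and the spoken replies are hypothetical placeholders. The real breakfast ordering skill is built in the following blogs and will differ.

```python
from typing import Any, Dict
from xml.sax.saxutils import escape


def ssml_response(text: str, end_session: bool = True) -> Dict[str, Any]:
    """Wrap text in <speak> tags inside the response envelope Alexa expects."""
    return {
        "version": "1.0",
        "response": {
            # escape() keeps characters such as "&" from breaking the SSML markup
            "outputSpeech": {"type": "SSML", "ssml": f"<speak>{escape(text)}</speak>"},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Entry point that AWS Lambda calls with the JSON request from Alexa."""
    request = event["request"]
    if request["type"] != "IntentRequest":
        # For example a LaunchRequest: greet the user and keep listening.
        return ssml_response("Welcome. What would you like for breakfast?", end_session=False)
    intent = request["intent"]
    if intent["name"] == "OrderIntent":
        item = intent.get("slots", {}).get("item", {}).get("value")
        if item:
            return ssml_response(f"One {item} coming up.")
        return ssml_response("What would you like to order?", end_session=False)
    return ssml_response("Sorry, I did not understand that.")
```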

Getting things ready

Getting an Alexa enabled device

The first thing to get is an Alexa enabled device. Amazon has released quite a few different varieties of Alexa enabled devices. You can check out the whole family on the Amazon devices page linked above.

If you are keen to try a side project, you can build your own Alexa device using a Raspberry Pi. A good guide can be found at https://www.lifehacker.com.au/2016/10/how-to-build-your-own-amazon-echo-with-a-raspberry-pi/

You can also try out EchoSim (Amazon Echo Simulator). This is a browser-based interface to Amazon Alexa. Please ensure you read the limits of EchoSim on their website. For instance, it cannot stream music.

For developing the breakfast ordering skill, I decided to purchase an Amazon Echo Dot. It’s a nice compact device, which doesn’t cost much and can run off any USB power source. For the Telstra Breakfast session, I actually ran it off my portable battery pack 😉

Create an Amazon Account

Now that you have got yourself an Alexa enabled device, you will need an Amazon account to register it with. You can use one that you already have or create a new one. If you don’t have an Amazon account, you can either create one beforehand by going to https://www.amazon.com or you can create it straight from the Alexa app (the Alexa app is used to register the Amazon Echo device).

Set up your Amazon Echo Device

Use the Alexa app to set up your Amazon Echo device. When you log in to the app, you will be asked for the Amazon Account credentials. As stated above, if you don’t have an Amazon account, you can create it from within the app.

Create an Alexa Developer Account

To create skills for Alexa, you need a developer account. If you don’t have one already, you can create one by going to https://developer.amazon.com/alexa. There are no costs associated with creating an Alexa developer account.

Just make sure that the username you choose for your Alexa developer account matches the username of the Amazon account to which your Amazon Echo is registered. This will enable you to test your Alexa skills on your Amazon Echo device without having to publish them on the Alexa Skills Store (the skills will show under Your Skills in the Alexa app).

Create an AWS Free Tier Account

To process the requests sent to the breakfast ordering Alexa skill, we will make use of AWS Lambda. AWS Lambda provides a cost-effective way to run code because you are only charged for the time that the code runs. There are no costs for any idle time.

If you already have an AWS account, you can use that. Otherwise, you can sign up for an AWS Free Tier account by going to https://aws.amazon.com . AWS provides a lot of services for free for the first 12 months under the Free Tier, with some services continuing the free tier allowance even beyond the 12 months (AWS Lambda is one such). For a full list of Free Tier services, visit https://aws.amazon.com/free/

High Level Architecture for the Breakfast Ordering Skill

Below is the architectural overview for the Breakfast Ordering Skill that I built. I will introduce you to the various components over the next few blogs.

[Image: Breakfast Ordering System high-level architecture]

In the next blog, I will take you through the Alexa Developer console, where we will use the Alexa Skills Kit (ASK) to start creating our breakfast ordering skill. We will define the invocation name, intents and slot names for our Alexa Skill. Not familiar with these terms? Don’t worry, I will explain them in the next blog. I hope to see you there.

See you soon.