AI Assistants with WhatsApp
How to get started with building a WhatsApp AI Assistant.
Please note: Since this article was written, OpenAI have introduced threads, which are a far better way of handling message streams.
By now, you cannot have avoided hearing about the hype train that is ChatGPT. AI is here, and it's likely to take over a lot of the tasks we used to do manually.
One of the things ChatGPT is good at is adopting a persona and holding a conversation. In fact, it's pretty easy to build a simple wrapper to do just that and chat to an AI on your favourite platform. Let's take a look at how we can do this with ChatGPT on WhatsApp using Twilio.
How does it work?
First, let's take a look at what we need to build:
DIAGRAM
So essentially we will build some kind of middleware that sits between WhatsApp and ChatGPT and supplies the right context (the conversation so far) with each request. In a production system, you would likely keep that conversation history in a database, but for the purposes of today's example, we will keep it in memory.
To run the AI Assistant, you'll need to set up a Twilio trial account. They have a WhatsApp sandbox service you can use until you are ready to use a real phone number.
Building the app
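The app will run on Node.js and Express, use dotenv for environment configuration, and call the OpenAI and Twilio APIs. The code in this article uses the Configuration and OpenAIApi classes from version 3 of the openai package, which later major versions no longer export, so pin that version when installing:
npm install dotenv express openai@3 twilio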
We will have a single POST endpoint. I'm not going to detail how to use Express here - there are plenty of tutorials for that - but the base of your code will look something like this:
const express = require("express");
const app = express();
const port = 3000;
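// Twilio posts webhook data as application/x-www-form-urlencoded, not JSON
app.use(express.urlencoded({ extended: false }));
app.use(express.json());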
app.post('/message', async (req, res) => {
// code will go here
});
app.listen(port, () => {
console.log(`Starting app on port ${port}`)
})
We'll need to grab some API keys and config to make this work:
- An OpenAI key
- A Twilio Account ID
- A Twilio Auth Token
- A phone number on Twilio
These should live in a .env file like the one below:
TWILIO_ACCOUNT_SID=
TWILIO_AUTH_TOKEN=
TWILIO_NUMBER=
OPENAI_API_KEY=
We can then refer to them at the top of our file:
const { Configuration, OpenAIApi } = require("openai");
require("dotenv").config();
const configuration = new Configuration({
apiKey: process.env.OPENAI_API_KEY
});
const client = require('twilio')(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN)
const twilioNumber = process.env.TWILIO_NUMBER;
const openai = new OpenAIApi(configuration);
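A missing key in .env tends to surface only when the first API call fails, so it can be worth checking at startup. Here is a minimal sketch of such a check; the helper name missingEnvVars is my own and not part of dotenv or any other library, and it only warns rather than stopping the app.

```javascript
// Names of the variables the .env file above is expected to define.
const REQUIRED_ENV_VARS = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_NUMBER",
  "OPENAI_API_KEY",
];

// Hypothetical helper: returns the names that are unset or blank in `env`.
function missingEnvVars(env) {
  return REQUIRED_ENV_VARS.filter(
    (name) => !env[name] || env[name].trim() === ""
  );
}

// Usage: call once at startup, after require("dotenv").config() has run.
const missing = missingEnvVars(process.env);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```

Passing the environment in as an argument, rather than reading process.env inside the helper, keeps it easy to try out with a plain object.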
We also probably want to include our prompt toward the top of the file (or in a secondary file):
const prompt = `Add prompt here`;
Our POST endpoint will be responsible for handling:
- receiving a message from the user
- sending the message stream to OpenAI
- and then sending the response to the user.
For this simple demo, I've created an in-memory object (messages) that is keyed by the user's phone number and stores their message stream - but in a production environment you will want to store this in an encrypted database.
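// in-memory store of conversations, keyed by the sender's WhatsApp address
const messages = {};

app.post('/message', async (req, res) => {
  const { From, Body, ProfileName } = req.body;
  // create a new user if the number doesn't exist
  if (!messages[From]) {
    messages[From] = {
      name: ProfileName,
      chat: [{ role: "system", content: prompt }] // add the base prompt as a system message
    };
  }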
The message is received by the POST endpoint. We then determine if we already have this user in our object. If we don't, we create them and add the initial prompt to our chat message stream. If we do, we retrieve the chat history.
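The create-or-retrieve step can also be pulled out into a small function, which makes it easier to try on its own. This is only a sketch: getOrCreateConversation is a name I've made up, the phone number and prompt are placeholder values, and the base prompt is stored as a system-role message object because that is the shape the chat completion endpoint expects for each entry.

```javascript
// In-memory store keyed by the sender's WhatsApp address (the `From` field).
const messages = {};

// Hypothetical helper: returns the stored conversation for `from`, creating
// it with the base prompt as the first (system) message if it doesn't exist.
function getOrCreateConversation(store, from, profileName, basePrompt) {
  if (!store[from]) {
    store[from] = {
      name: profileName,
      chat: [{ role: "system", content: basePrompt }],
    };
  }
  return store[from];
}

// Usage with made-up values.
const sender = "whatsapp:+15550001111";
const basePrompt = "You are a helpful assistant.";
const first = getOrCreateConversation(messages, sender, "Sam", basePrompt);
first.chat.push({ role: "user", content: "Hello!" });

// A second call for the same sender returns the same conversation.
const again = getOrCreateConversation(messages, sender, "Sam", basePrompt);
console.log(again.chat.length); // 2: the system prompt plus the user's message
```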
const currentMessageStream = messages[From].chat;
// add the user message to our message stream
currentMessageStream.push({"role": "user", "content": Body});
Now we have the message stream, we can push the user's new message to the array.
const response = await openai.createChatCompletion({
model: "gpt-4",
messages: currentMessageStream,
max_tokens: 120,
temperature: 0.5
});
const reply = response.data.choices[0].message.content;
// replies from the model are stored with the "assistant" role
currentMessageStream.push({"role": "assistant", "content": reply});
Next, we send the whole message stream to OpenAI and wait for a response, before recording the reply in our message stream.
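Because the whole stream is sent on every request, it grows with each exchange. One possible way to keep it bounded is to send only the base prompt plus the most recent messages. The sketch below assumes that dropping older turns is acceptable for your assistant; trimHistory and the cut-off of 2 are illustrative choices of mine, not something the OpenAI library provides.

```javascript
// Hypothetical helper: keeps the first message (the base prompt) plus the
// most recent `maxRecent` messages. Returns a new array; the input is untouched.
function trimHistory(chat, maxRecent) {
  if (chat.length <= maxRecent + 1) {
    return chat.slice();
  }
  return [chat[0], ...chat.slice(chat.length - maxRecent)];
}

// Usage with a made-up conversation: one system prompt and five later messages.
const chat = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "one" },
  { role: "assistant", content: "two" },
  { role: "user", content: "three" },
  { role: "assistant", content: "four" },
  { role: "user", content: "five" },
];
const trimmed = trimHistory(chat, 2);
console.log(trimmed.map((m) => m.content)); // the prompt, then "four" and "five"
```

The trimmed array would be what gets passed as messages to the completion call, while the full history stays in the store.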
await client.messages.create({
  from: `whatsapp:${twilioNumber}`,
  body: reply, // the text returned by OpenAI
  to: From // the sender, already in the form whatsapp:+<number>
});
res.send('');
});
Finally, we send the message back via WhatsApp to the user, and tell Express we are done with the request.
Running our app
With that complete, it's now time to run our Node app with node start (this assumes the code lives in a file named start.js). If you've been following along, the app should now be listening on port 3000.
With the app up and running, we need to configure Twilio's WhatsApp to use the service. The easiest way to do this locally is to use ngrok (brew install ngrok/ngrok/ngrok). Once ngrok is installed, you can run ngrok http 3000 and copy the forwarding address.
Log in to Twilio and paste the forwarding address, followed by /message (the route our endpoint listens on), into Twilio's WhatsApp webhook setting.
Within the same page, use the 'Sandbox participants' details at the bottom of the page to add the WhatsApp number, and then type and send the code as join [code].
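If the bot stays silent, it helps to know what actually arrives at the webhook. Twilio posts the message as a form-encoded body rather than JSON, and for WhatsApp the From value carries a whatsapp: prefix. The snippet below decodes a made-up example body using Node's built-in URLSearchParams; the phone number, name and text are placeholders, and a real request contains more fields than the three the endpoint reads.

```javascript
// A made-up example of the form-encoded body posted to the webhook.
const rawBody =
  "From=whatsapp%3A%2B15550001111&ProfileName=Sam&Body=Hello+there";

// URLSearchParams is built into Node.js and decodes form-encoded strings.
const params = new URLSearchParams(rawBody);

// The three fields the /message endpoint destructures from req.body.
const incoming = {
  From: params.get("From"),
  ProfileName: params.get("ProfileName"),
  Body: params.get("Body"),
};
console.log(incoming);
```

In the Express app this decoding is done by the body-parsing middleware, so a handler that sees undefined for these fields usually means the form-encoded parser is not enabled.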
With these steps complete, you should now be able to chat to your very own WhatsApp chat bot :)