A chatterbot is a computer program that, given some input in a natural language (English, French, ...), responds with something meaningful in that same language. The strength of a chatterbot can therefore be measured directly by the quality of the output that the bot selects in response to the user.
From this description, it follows that a very basic chatterbot can be written in a few lines of code in almost any programming language.
Let's write our first chatterbot. (All the code in this tutorial is written in Pascal, so the reader is assumed to be familiar with that language.)
{ Program Name: chatterbot1 }
{ Description: this is a very basic example of a chatterbot program }
{ Author: Gonzales Cenelia }
program Chatterbot1;
{$APPTYPE CONSOLE}

type
  stringx = array[1..5] of string;

const
  responses: stringx = ('CONTINUE, I''M LISTENING.', 'VERY INTERESTING CONVERSATION!',
    'I HEARD YOU.', 'SO, YOU ARE TALKING TO ME', 'TELL ME MORE...');

var
  sInput, response: string;   { named sInput so that it does not shadow the standard Input file }
  bQuit: boolean;
  index: integer;

begin
  bQuit := false;
  randomize();
  repeat
    write('>');
    readln(sInput);
    if sInput = 'bye' then
    begin
      writeln('Bye user, see you next time!');
      bQuit := true
    end
    else
    begin
      { pick one of the five canned responses at random }
      index := random(5) + 1;
      response := responses[index];
      writeln(response);
    end;
  until bQuit;
  readln;
end.
Today, we have chatbots such as Alice, which has won the bronze medal in the Loebner contest three times and was also judged to be the most human chatbot in the contest.
Alice uses AIML (Artificial Intelligence Markup Language) to represent her database.
Because AIML gives her a standard way to represent her knowledge base, Alice remains one of the most popular chatbots on the internet. Alice was created by Dr. Richard Wallace.
Chatbots in general are considered to belong to the field of weak A.I. (weak artificial intelligence), as opposed to strong A.I., whose goal is to create programs that are as intelligent as humans or more intelligent. But this does not mean that chatbots have no true potential. Being able to create a program that could communicate the same way humans do would be a great advance for the A.I. field. Chatbots are the part of artificial intelligence that is most accessible to hobbyists (it only takes average programming skill to be a chatbot programmer). So, for programmers out there who want to create true A.I. or some kind of artificial intelligence, writing intelligent chatbots is a great place to start!
So, now we know what to do to improve "our first chatterbot" and make it more intelligent.
Let's proceed to write "our second bot"; we will call it chatterbot2.
(* Program Name: chatterbot2
   Description: this is an improved version of the previous chatterbot program "chatterbot1";
   this one tries a little harder to understand what the user is trying to say
   Author: Gonzales Cenelia
   Date: 2 July 2009
*)
program Chatterbot2;

const
  maxResp = 3;       { number of responses stored for each keyword }
  NumOfRecords = 6;  { number of records in the knowledge base }

type
  sList = array[1 .. maxResp] of string;       { the responses for one keyword }
  sList2 = array[1 .. maxResp + 1] of string;  { one record: a keyword followed by its responses }
  { each row is declared as sList2 so that a row can be passed to CopyArray }
  StringArray = array[1 .. NumOfRecords] of sList2;

var
  sInput, sResponse : string;
  nSelection : integer;
  responses : sList;

const
  KnowledgeBase : StringArray = (
    ('WHAT IS YOUR NAME',
     'MY NAME IS CHATTERBOT2.',
     'YOU CAN CALL ME CHATTERBOT2.',
     'WHY DO YOU WANT TO KNOW MY NAME?'
    ),
    ('HI',
     'HI THERE!',
     'HOW ARE YOU?',
     'HI!'
    ),
    ('HOW ARE YOU',
     'I''M DOING FINE!',
     'I''M DOING WELL AND YOU?',
     'WHY DO YOU WANT TO KNOW HOW I AM DOING?'
    ),
    ('WHO ARE YOU',
     'I''M AN A.I PROGRAM.',
     'I THINK THAT YOU KNOW WHO I AM.',
     'WHY ARE YOU ASKING?'
    ),
    ('ARE YOU INTELLIGENT',
     'YES, OF COURSE.',
     'WHAT DO YOU THINK?',
     'ACTUALLY, I''M VERY INTELLIGENT!'
    ),
    ('ARE YOU REAL',
     'DOES THAT QUESTION REALLY MATTER TO YOU?',
     'WHAT DO YOU MEAN BY THAT?',
     'I''M AS REAL AS I CAN BE.'
    )
  );
(* copy the responses of a record into Array2, leaving out the keyword
   (called with startPos = 2, the position of the first response) *)
procedure CopyArray(const Array1 : sList2; var Array2 : sList; startPos : integer);
var
  index : integer;
begin
  for index := startPos to maxResp + 1 do
    Array2[index - 1] := Array1[index];
end;

(* search for the user's input
   inside the database of the program *)
function FindMatch(userInput : string) : sList;
var
  match : sList;
  i : integer;
begin
  { start with empty responses so that the caller can detect "no match" }
  for i := 1 to maxResp do
    match[i] := '';
  for i := 1 to NumOfRecords do
  begin
    if KnowledgeBase[i][1] = userInput then
    begin
      CopyArray(KnowledgeBase[i], match, 2);
      break;
    end;
  end;
  FindMatch := match;
end;
(* Main Procedure *)
begin
  randomize();
  while true do
  begin
    write('>');
    readln(sInput);
    responses := FindMatch(sInput);
    if sInput = 'BYE' then
    begin
      writeln('IT WAS NICE TALKING TO YOU USER, SEE YOU NEXT TIME!');
      break;
    end
    else if length(responses[1]) = 0 then
    begin
      { no record matched the input }
      writeln('I''M NOT SURE IF I UNDERSTAND WHAT YOU ARE TALKING ABOUT');
    end
    else
    begin
      { pick one of the matching responses at random }
      nSelection := random(maxResp) + 1;
      sResponse := responses[nSelection];
      writeln(sResponse);
    end;
  end;
  readln;
end.
Now the program can understand some sentences, such as "what is your name", "are you intelligent", etc. It can also choose an appropriate response from its list of responses for the given sentence and display it on the screen. Unlike the previous version of the program (chatterbot1), Chatterbot2 is capable of choosing a suitable response to the user's input instead of picking random responses that do not take into account what the user is actually trying to say.
We have also added a new technique to this program: when the program is unable to find a keyword matching the current user input, it simply answers that it does not understand, which is quite human-like.
There are quite a few things that we can improve. The first is that, since the chatterbot tends to be very repetitive, we might create a mechanism to control these repetitions. We could simply store the previous response of the chatbot in a string "sPrevResponse" and, when selecting the next response, check that it is not equal to the previous one. If it is, we then select a new response from the available responses.
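As a minimal sketch of this idea (not the tutorial's actual code), the function below picks a random response and, when that pick is equal to the previous response, moves on to the next entry in the list. The name SelectResponse and the sample list are assumptions made for illustration; sPrevResponse is the string described above. The sketch assumes Free Pascal or Delphi and a non-empty list of responses.

```pascal
program RepetitionSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

{ Hypothetical helper: returns a response that differs from sPrevResponse
  whenever the list offers an alternative. }
function SelectResponse(const responses: array of string;
                        const sPrevResponse: string): string;
var
  index: integer;
begin
  index := Random(Length(responses));   { open arrays are indexed from 0 }
  if (responses[index] = sPrevResponse) and (Length(responses) > 1) then
    index := (index + 1) mod Length(responses);   { skip the repeated response }
  Result := responses[index];
end;

const
  Greetings: array[1 .. 3] of string = ('HI THERE!', 'HOW ARE YOU?', 'HI!');

var
  sPrevResponse, sResponse: string;
  turn: integer;

begin
  Randomize;
  sPrevResponse := '';
  for turn := 1 to 5 do
  begin
    sResponse := SelectResponse(Greetings, sPrevResponse);
    writeln(sResponse);
    sPrevResponse := sResponse;   { remember it for the next turn }
  end;
end.
```

Moving to the next entry instead of drawing again keeps the function free of loops, at the cost of a slightly uneven distribution; redrawing until the response differs would be an equally valid choice.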
Another thing that we could improve is the way the chatbot handles the user's input. Currently, if you enter an input in lower case, the chatbot does not understand it, even if the bot's database contains a match for that input. Extra spaces or punctuation characters (!;,.) in the input also prevent the chatbot from understanding it. For that reason, we will introduce a mechanism to preprocess the user's input before it is searched for in the chatbot's database. We could have a function that converts the user's input to upper case, since the keywords inside the database are in upper case, and another procedure that removes all the punctuation and extra spaces found in the input. That said, we now have enough material to write our next chatterbot: "Chatterbot3".
View the code for Chatterbot3
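The preprocessing step described above could look roughly like this. It is a sketch, not the code of Chatterbot3: the name PreprocessInput is an assumption, and the punctuation set contains only the characters mentioned in the text plus the question mark. Free Pascal or Delphi is assumed.

```pascal
program PreprocessSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

{ Hypothetical helper: upper-cases the input, drops punctuation and
  reduces every run of spaces to a single space. }
function PreprocessInput(const sInput: string): string;
const
  Punctuation = ['!', '?', ';', ',', '.'];
var
  i: integer;
  c: char;
  prevWasSpace: boolean;
begin
  Result := '';
  prevWasSpace := true;   { true at the start so that leading spaces are dropped }
  for i := 1 to Length(sInput) do
  begin
    c := UpCase(sInput[i]);
    if c in Punctuation then
      continue;           { punctuation is removed, not replaced by a space }
    if c = ' ' then
    begin
      if not prevWasSpace then
        Result := Result + ' ';
      prevWasSpace := true;
    end
    else
    begin
      Result := Result + c;
      prevWasSpace := false;
    end;
  end;
  { at most one trailing space can remain; remove it }
  if (Length(Result) > 0) and (Result[Length(Result)] = ' ') then
    Delete(Result, Length(Result), 1);
end;

begin
  writeln(PreprocessInput('  what is   your name?! '));   { prints WHAT IS YOUR NAME }
end.
```

With this step in place, the result can be compared directly with the upper-case keywords of the knowledge base.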
The other possibility is much more complex: it uses the concept of fuzzy string search. To apply this method, it is useful first to break the input and the current keyword into separate words and to store them in two vectors, one holding the words of the input and the other holding the words of the current keyword. Once we have done this, we can use the Levenshtein distance to measure the distance between the two word vectors. (Notice that for this method to be effective, we would also need an extra keyword that represents the subject of the current keyword.)
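One possible reading of this method is sketched below: both strings are split into word vectors, and the Levenshtein distance is computed over whole words instead of characters, so that each inserted, deleted or replaced word costs 1. The names SplitWords and WordDistance are assumptions, and treating the word as the unit of comparison is an interpretation of the text, not necessarily what the later chatterbots do. Free Pascal or Delphi is assumed.

```pascal
program FuzzyMatchSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

type
  TWordList = array of string;

{ Splits a string on spaces; empty words are skipped. }
function SplitWords(const s: string): TWordList;
var
  i, count: integer;
  current: string;
begin
  Result := nil;
  count := 0;
  current := '';
  for i := 1 to Length(s) + 1 do
    if (i > Length(s)) or (s[i] = ' ') then
    begin
      if current <> '' then
      begin
        Inc(count);
        SetLength(Result, count);
        Result[count - 1] := current;
        current := '';
      end;
    end
    else
      current := current + s[i];
end;

function Min3(a, b, c: integer): integer;
begin
  Result := a;
  if b < Result then Result := b;
  if c < Result then Result := c;
end;

{ Levenshtein distance between two word vectors (dynamic programming). }
function WordDistance(const a, b: TWordList): integer;
var
  d: array of array of integer;
  i, j, cost: integer;
begin
  SetLength(d, Length(a) + 1, Length(b) + 1);
  for i := 0 to Length(a) do d[i][0] := i;   { delete every word of a }
  for j := 0 to Length(b) do d[0][j] := j;   { insert every word of b }
  for i := 1 to Length(a) do
    for j := 1 to Length(b) do
    begin
      if a[i - 1] = b[j - 1] then cost := 0 else cost := 1;
      d[i][j] := Min3(d[i - 1][j] + 1,          { deletion }
                      d[i][j - 1] + 1,          { insertion }
                      d[i - 1][j - 1] + cost);  { substitution }
    end;
  Result := d[Length(a)][Length(b)];
end;

begin
  { one extra word in the input: distance 1 }
  writeln(WordDistance(SplitWords('SO WHAT IS YOUR NAME'),
                       SplitWords('WHAT IS YOUR NAME')));
end.
```

A keyword could then be accepted when its distance to the input stays below some threshold; the right threshold is a tuning decision that this sketch does not try to answer.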
So, there you have it: two different methods for improving the chatterbot.
Actually, we could combine both methods and select which one to use in each situation.
Finally, there is still another problem that you may have noticed with the previous chatterbot: you could repeat the same sentence over and over, and the program would not react to it. We also need to correct this problem.
So, we are now ready to write our fourth chatterbot; we will simply call it chatterbot4.
View the code for Chatterbot4
As you have probably seen, the code for "chatterbot4" is very similar to the one for "chatterbot3", but there are also some key changes in it.
In particular, the function for searching for keywords inside the database is now a little bit more flexible.
So, what next?
Don't worry, there are still a lot of things to be covered.
Before proceeding to the next part of this tutorial, you are encouraged to compile and run the code for "chatterbot5" so that you can understand how it works and verify the changes that have been made in it.
As you may have seen, the implementation of the current chatterbot is now encapsulated in a class, and some new functions have been added to this version of the program.
procedure select_response()
This function selects a response from a list of responses. It uses the "shuffle" function from the "Collections class" so that the list of responses is shuffled before the final selection.
procedure save_prev_input()
This function simply saves the current user input into a variable (sPrevInput) before getting new input from the user.
procedure save_prev_response()
The function "save_prev_response()" saves the current response of the chatterbot before the bot starts searching for responses to the current input. The current response is saved in the variable (sPrevResponse).
procedure save_prev_event()
This function simply saves the current event (sEvent) into the variable (sPrevEvent).
An event occurs, for example, when the program detects a "null input" from the user, when the user repeats himself, or when the chatterbot itself starts making repetitions.
procedure set_event(str1 : string)
Sets the current event (sEvent)
procedure save_input()
Makes a backup of the current input (sInput) into the variable sInputBackup.
procedure set_input(str1 : string)
Sets the current input (sInput)
procedure restore_input()
Restores the value of the current input (sInput) that has been saved previously into the variable sInputBackup.
procedure print_response()
Prints the response that has been selected by the "chat robot" on the screen.
procedure preprocess_input()
This function does some preprocessing on the input: it removes punctuation and redundant space characters, and it converts the input to upper case.
function bot_repeat() : boolean
Verifies if the chatterbot has started to repeat himself.
function user_repeat() : boolean
Verifies if the user has repeated himself.
function bot_understand() : boolean
Verifies that the bot understands the current user input (sInput).
function null_input() : boolean
Verifies if the current user input (sInput) is null.
function null_input_repetition() : boolean
Verifies if the user has repeated some null inputs.
function user_want_to_quit() : boolean
Checks to see if the user wants to quit the current session with the chatterbot.
function same_event() : boolean
Verifies if the current event (sEvent) is the same as the previous one (sPrevEvent).
function no_response() : boolean
Checks to see if the program has no response for the current input.
function same_input() : boolean
Verifies if the current input (sInput) is the same as the previous one (sPrevInput).
function similar_input() : boolean
Checks to see if the current and previous inputs are similar. Two inputs are considered similar if one of them is a substring of the other
(Ex: "how are you" and "how are you doing" would be considered similar because "how are you" is a substring of "how are you doing").
procedure get_input()
Gets inputs from the user.
procedure respond()
Handles all responses of the "chat robot", whether they are for events or simply for the current user input. So, basically, this function controls the behavior of the program.
find_match()
Finds responses for the current input.
procedure handle_repetition()
Handles repetitions made by the program.
procedure handle_user_repetition()
Handles repetitions made by the user.
procedure handle_event(str1 : string)
This function handles events in general.
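To make one of these descriptions concrete, here is a sketch of the substring test behind "similar_input()". It is written as a standalone function that takes both inputs as parameters (an assumption made to keep the example self-contained; in the class, the values would come from sInput and sPrevInput), and it assumes both strings have already been preprocessed. Free Pascal or Delphi is assumed.

```pascal
program SimilarInputSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

{ Two non-empty inputs are similar when one is a substring of the other. }
function SimilarInput(const sInput, sPrevInput: string): boolean;
begin
  Result := (sInput <> '') and (sPrevInput <> '') and
            ((Pos(sInput, sPrevInput) > 0) or (Pos(sPrevInput, sInput) > 0));
end;

begin
  writeln(SimilarInput('HOW ARE YOU DOING', 'HOW ARE YOU'));   { prints TRUE }
  writeln(SimilarInput('WHO ARE YOU', 'HOW ARE YOU'));         { prints FALSE }
end.
```

Note that two identical inputs also pass this test, so a caller that wants to tell the two cases apart would check "same_input()" first.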
You can clearly see that "chatterbot5" has much more functionality than "chatterbot4", and that each piece of functionality is encapsulated in a method (function) of the class "CBot". Still, there are a lot more improvements to be made to it.
Chatterbot5 introduces the concept of "state". In this new version of the chatterbot, we associate a different "state" with some of the events that can occur during a conversation. Ex: when the user enters a null input, the chatterbot sets itself into the "NULL INPUT**" state; when the user repeats the same sentence, it goes into the "REPETITION T1**" state, etc.
Also, this new chatterbot uses a bigger database than the previous chatbots that we have seen so far (chatterbot1, chatterbot2, chatterbot3 ...). Still, it is quite insignificant, since most chatterbots in use today (the very popular ones) have a database of 10000 lines or more. So, this will definitely be one of the major goals to achieve in the next versions of the chatterbot.
For now, however, we will concentrate on a small problem concerning the current chatterbot.
How did we arrive at this transformation? We can do it in two steps:
We make sure that the chatterbot has a list of response templates that is linked to the corresponding keywords. Response templates are a sort of skeleton for building new responses for the chatterbot. Usually, we use wildcards in a response to indicate that it is a template. In the previous example, we used the template (SO, YOU THINK THAT*) to construct our response. During the reassembly process, we simply replace the wildcard with some part of the original input. In that same example, we used YOU ARE A MACHINE, which is actually the complete original input from the user. After replacing the wildcard with the user's input, we have the following sentence: SO, YOU THINK THAT YOU ARE A MACHINE. But we cannot use this sentence as it is; before that, we need to apply some pronoun reversal to it.
Notice that it is not a good idea to use transposition too much during a conversation: the mechanism would become too obvious, and it could create some repetition.
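The reassembly process described above can be sketched as follows. This is an illustration, not the code of the chatterbot: the names TransposeTable, TransposeWord, TransposeText and Reassemble are assumptions, the transposition table is a tiny word-by-word list invented for the example, and the pronoun reversal is applied to the user's fragment before it is placed in the template (a design choice made here so that the words of the template itself are left untouched). Free Pascal or Delphi is assumed.

```pascal
program TemplateSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

const
  NumPairs = 7;
  { illustrative word-level transposition table: (word, replacement) }
  TransposeTable: array[1 .. NumPairs, 1 .. 2] of string = (
    ('I', 'YOU'), ('AM', 'ARE'), ('MY', 'YOUR'), ('ME', 'YOU'),
    ('YOU', 'I'), ('ARE', 'AM'), ('YOUR', 'MY'));

{ Returns the replacement for a single word, or the word itself. }
function TransposeWord(const w: string): string;
var
  k: integer;
begin
  Result := w;
  for k := 1 to NumPairs do
    if TransposeTable[k][1] = w then
    begin
      Result := TransposeTable[k][2];
      exit;
    end;
end;

{ Applies TransposeWord to every space-separated word of s. }
function TransposeText(const s: string): string;
var
  i: integer;
  current: string;
begin
  Result := '';
  current := '';
  for i := 1 to Length(s) + 1 do
    if (i > Length(s)) or (s[i] = ' ') then
    begin
      if current <> '' then
      begin
        if Result <> '' then Result := Result + ' ';
        Result := Result + TransposeWord(current);
        current := '';
      end;
    end
    else
      current := current + s[i];
end;

{ Replaces the wildcard of the template with the transposed fragment. }
function Reassemble(const template, fragment: string): string;
var
  p: integer;
begin
  Result := template;
  p := Pos('*', template);
  if p > 0 then
    Result := Copy(template, 1, p - 1) + ' ' + TransposeText(fragment) +
              Copy(template, p + 1, Length(template));
end;

begin
  { prints SO, YOU THINK THAT I AM A MACHINE }
  writeln(Reassemble('SO, YOU THINK THAT*', 'YOU ARE A MACHINE'));
end.
```

A word-by-word table is the simplest possible approach; it cannot decide, for instance, whether YOU should become I or ME, so a real table would also need multi-word entries.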
View the code for Chatterbot9
Some examples of sentences using "WHO ARE YOU" would be:
1) WHO ARE YOU?
2) BY THE WAY, WHO ARE YOU?
3) SO TELL ME, WHO ARE YOU EXACTLY?
But a keyword such as "WHO IS" can only be found at the beginning or in the middle of a sentence; it cannot be found at the end of the sentence or alone.
Examples of sentences using the keyword "WHO IS":
1) WHO IS YOUR FAVORITE SINGER?
2) DO YOU KNOW WHO IS THE GREATEST MATHEMATICIAN OF ALL TIME?
3) TELL ME, DO YOU KNOW WHO IS? (This one clearly doesn't make any sense)
How do we make sure that the chatterbot is able to distinguish such keywords and the specific places where they are allowed to appear in a sentence? We simply introduce some new notation for keywords:
1) Keywords that can only be found at the beginning or in the middle of a sentence will be represented by: _KEYWORD (Ex: _WHO IS)
2) Keywords that can only be found at the end or in the middle of a sentence will be denoted by: KEYWORD_
3) Keywords that should only be found alone in a sentence will be represented by: _KEYWORD_ (Ex: _WHAT_)
4) And finally, keywords that can be found anywhere in a sentence, or even alone, will simply be represented by: KEYWORD (Ex: I UNDERSTAND)
A keyword can have different meanings depending on its position in a given sentence.
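The four notations above can be checked with a small function like the one below. It is only a sketch: the name KeywordMatches is an assumption, the input is assumed to be preprocessed already (upper case, no punctuation, no extra spaces), and only the first occurrence of the keyword is examined, without checking word boundaries. Free Pascal or Delphi is assumed.

```pascal
program KeywordPositionSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

{ Returns true when the keyword occurs in sInput at a position that the
  notation (_KEYWORD, KEYWORD_, _KEYWORD_ or KEYWORD) allows. }
function KeywordMatches(const pattern, sInput: string): boolean;
var
  keyword: string;
  leading, trailing, atStart, atEnd: boolean;
  p: integer;
begin
  Result := false;

  { strip the underscores and remember which ones were present }
  keyword := pattern;
  leading := (keyword <> '') and (keyword[1] = '_');
  if leading then Delete(keyword, 1, 1);
  trailing := (keyword <> '') and (keyword[Length(keyword)] = '_');
  if trailing then Delete(keyword, Length(keyword), 1);
  if keyword = '' then exit;

  p := Pos(keyword, sInput);
  if p = 0 then exit;   { the keyword does not occur at all }

  atStart := (p = 1);
  atEnd := (p + Length(keyword) - 1 = Length(sInput));

  if leading and trailing then
    Result := atStart and atEnd   { _KEYWORD_: alone in the sentence }
  else if leading then
    Result := not atEnd           { _KEYWORD: beginning or middle }
  else if trailing then
    Result := not atStart         { KEYWORD_: middle or end }
  else
    Result := true;               { KEYWORD: anywhere }
end;

begin
  writeln(KeywordMatches('_WHO IS', 'WHO IS YOUR FAVORITE SINGER'));   { prints TRUE }
  writeln(KeywordMatches('_WHO IS', 'TELL ME DO YOU KNOW WHO IS'));    { prints FALSE }
  writeln(KeywordMatches('_WHAT_', 'WHAT'));                           { prints TRUE }
end.
```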
View the code for Chatterbot10
To illustrate this concept, we are going to look at the following conversation log:
USER: What is your favorite movie?
CHATTERBOT: IT IS TERMINATOR II.
USER: Why do you like this movie? (Now, how are we supposed to answer that question if we know nothing about the previous response of the chatbot?)
So clearly, some inputs require the use of "context" in order to formulate a correct answer. In the previous example, the context would simply be: IT IS TERMINATOR II.
Now that the bot knows what it was talking about previously, it can more easily formulate a good answer to the user's input.
We can now continue the previous conversation log:
(Context: IT IS TERMINATOR II)
CHATTERBOT: BECAUSE IT IS A SCIENCE-FICTION MOVIE AND I LOVE SCIENCE-FICTION.
Context also allows us to control improper reactions from the chatbot. For example, suppose the user enters the sentence "Why do you like this movie?" during a conversation in which the chatterbot has not even talked about this subject.
The chatterbot could simply respond by saying: WHAT ARE YOU TALKING ABOUT?
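The behaviour described in this section can be sketched with a knowledge base in which each record carries an optional context, that is, the previous bot response that must have been given for the record to apply. The record layout (TContextRecord) and the function name Respond are assumptions made for illustration and are not taken from Chatterbot11; the inputs are assumed to be preprocessed already. Free Pascal or Delphi is assumed.

```pascal
program ContextSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

type
  TContextRecord = record
    Keyword: string;    { preprocessed user input to match }
    Context: string;    { required previous bot response, '' if none }
    Response: string;
  end;

const
  NumRecords = 2;
  KnowledgeBase: array[1 .. NumRecords] of TContextRecord = (
    (Keyword: 'WHAT IS YOUR FAVORITE MOVIE';
     Context: '';
     Response: 'IT IS TERMINATOR II.'),
    (Keyword: 'WHY DO YOU LIKE THIS MOVIE';
     Context: 'IT IS TERMINATOR II.';
     Response: 'BECAUSE IT IS A SCIENCE-FICTION MOVIE AND I LOVE SCIENCE-FICTION.'));

{ Answers sInput, taking the previous bot response into account. }
function Respond(const sInput, sPrevResponse: string): string;
var
  i: integer;
begin
  Result := 'I''M NOT SURE IF I UNDERSTAND WHAT YOU ARE TALKING ABOUT';
  for i := 1 to NumRecords do
    if KnowledgeBase[i].Keyword = sInput then
    begin
      if (KnowledgeBase[i].Context = '') or
         (KnowledgeBase[i].Context = sPrevResponse) then
        Result := KnowledgeBase[i].Response
      else
        Result := 'WHAT ARE YOU TALKING ABOUT?';   { keyword known, context missing }
      exit;
    end;
end;

var
  sPrevResponse: string;

begin
  sPrevResponse := '';
  { asked out of context: prints WHAT ARE YOU TALKING ABOUT? }
  writeln(Respond('WHY DO YOU LIKE THIS MOVIE', sPrevResponse));

  sPrevResponse := Respond('WHAT IS YOUR FAVORITE MOVIE', sPrevResponse);
  writeln(sPrevResponse);
  { the context is now set, so the question can be answered }
  writeln(Respond('WHY DO YOU LIKE THIS MOVIE', sPrevResponse));
end.
```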
The context feature has been implemented in Chatterbot11.
View the code for Chatterbot11
Another feature that would be very interesting to implement in a chatterbot is the ability to anticipate the user's next response; this would make the chatbot look even smarter during a conversation.
We now have a complete architecture for the database; we just need to implement these features in the next version of the chatbot (Chatterbot13).
View the code for Chatterbot12