Machine learning is a subfield of artificial intelligence (AI) whose goal is to learn the structure of data and fit that data into models that humans can understand and use. Two widely used approaches are:
- Supervised learning
- Unsupervised learning
In supervised learning, the algorithm is trained on human-labeled pairs of inputs and outputs, while in unsupervised learning it is trained on unlabeled data and must find patterns in the data on its own in order to group the inputs. Chatbots are an application of both supervised and unsupervised learning.
Chatbots are conversational computer systems programmed to mimic human communication and provide automated online guidance and assistance. As their benefits have become clearer, sectors such as e-commerce, healthcare, and entertainment have adopted them to provide virtual assistance to clients.
They have been deployed in the private sector, including voice-powered virtual assistants (e.g., Siri, Alexa, Google Now, Cortana), and in the public sector, as well as in gaming, telecommunications, banking, retail, the stock market, and other areas.
Supervised learning-based chatbots are also called information retrieval models. These models contain a predefined set of possible answers; the chatbot analyzes the user's question and selects the answer in its collection that best matches it.
A database of question-answer pairs is commonly used as the knowledge foundation for this type of model. This database is used to create a chat index, which lists all of the possible responses depending on the message that prompted them.
When a user sends input to the chatbot, it is treated as a query. An information retrieval model similar to those used for web search matches the user's input to similar messages in the chat index, and the response stored with the best match is returned to the user.
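The lookup described above can be sketched in a few lines. This is only an illustration under stated assumptions: the question-answer pairs, the `reply` function, and the fallback message are hypothetical, and plain word-count cosine similarity stands in for whatever matching a production retrieval model would use.

```python
import math
import re
from collections import Counter

# Hypothetical question-answer pairs standing in for the knowledge base.
QA_PAIRS = [
    ("what are your opening hours", "We are open from 9 am to 5 pm, Monday to Friday."),
    ("how can i reset my password", "Use the 'Forgot password' link on the login page."),
    ("where is my order", "You can track your order from the 'My orders' page."),
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# The "chat index": each stored question as a word-count vector, paired with its answer.
CHAT_INDEX = [(Counter(tokenize(q)), a) for q, a in QA_PAIRS]

def reply(user_input, fallback="Sorry, I do not know that yet."):
    """Treat the input as a query and return the answer of the closest stored question."""
    query = Counter(tokenize(user_input))
    score, answer = max(
        ((cosine(query, vec), ans) for vec, ans in CHAT_INDEX),
        key=lambda pair: pair[0],
    )
    # If no stored question shares a single word with the query, use the fallback.
    return answer if score > 0 else fallback

print(reply("I forgot my password, how do I reset it?"))
```

Because every answer comes from the stored collection, the chatbot can never say anything outside it, which is the trade-off this family of models makes.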
These models are built in two steps:
- Topic modeling for parallel text
- Determining the hierarchical design
The first stage looks for meaningful word co-occurrence patterns. The second stage models how those co-occurrences are structured across topics. Commonly used datasets for supervised learning approaches are OpenSubtitles, the Cornell Movie-Dialogs Corpus, and DailyDialog.
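As a rough illustration of the first stage only, the sketch below counts how often a word in a message appears together with a word in its response. The message-response pairs and the `cooccurrence` function are hypothetical, and simple counting is an assumption here; an actual topic model would build on statistics like these rather than stop at them.

```python
from collections import Counter
from itertools import product

# Hypothetical message-response pairs standing in for a parallel corpus.
PAIRS = [
    ("my order is late", "sorry your order is delayed"),
    ("order not delivered", "we will check the delivery of your order"),
    ("reset my password", "click the password reset link"),
]

def cooccurrence(pairs):
    """Count how often a message word and a response word appear in the same pair."""
    counts = Counter()
    for message, response in pairs:
        # Sets ensure each word pair is counted once per message-response pair.
        for m_word, r_word in product(set(message.split()), set(response.split())):
            counts[(m_word, r_word)] += 1
    return counts

counts = cooccurrence(PAIRS)
print(counts[("order", "order")], counts[("password", "reset")])
```

Word pairs with high counts hint at which response vocabulary belongs with which kind of message.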
As discussed above, supervised learning methods require labeled data to learn. While labeling may appear simple at first, it becomes increasingly difficult as the amount of data grows. It also requires a supervisor, a subject matter expert who continually labels conversation data for the chatbot.
As a result, training chatbots with this approach becomes costly. Unsupervised learning addresses this problem: you have only input data (x) and no output labels.
Deep Learning Integration
Unsupervised chatbots generate new responses word by word, based on the user's input. These models can therefore create entirely new sentences in reply to users' queries; however, they must go through a difficult training process to learn sentence structure and syntax. Because the model generates its outputs rather than selecting them, the outputs can lack quality or consistency.
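To make "word by word" concrete, here is a toy sketch of greedy generation. It is deliberately not a neural model: a bigram count table stands in for the trained decoder, generation starts from a start token instead of an encoding of the user's input, and the corpus, `train_bigrams`, and `generate` are all hypothetical names chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical tiny corpus of responses; <s> and </s> mark sentence start and end.
CORPUS = [
    "<s> thank you for your message </s>",
    "<s> thank you for your patience </s>",
    "<s> we received your message </s>",
]

def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    table = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            table[prev][nxt] += 1
    return table

def generate(table, max_words=10):
    """Greedy decoding: repeatedly emit the most frequent next word."""
    word, output = "<s>", []
    for _ in range(max_words):
        if not table[word]:
            break  # no known continuation
        word = table[word].most_common(1)[0][0]
        if word == "</s>":
            break  # end-of-sentence token reached
        output.append(word)
    return " ".join(output)

table = train_bigrams(CORPUS)
print(generate(table))
```

With this corpus the loop produces "thank you for your message". The same loop structure applies when the next-word choice comes from a neural decoder, which is also where lapses in quality or consistency can creep in.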
This approach is typically built on a deep learning architecture: an encoder-decoder neural network with Long Short-Term Memory (LSTM) units, which counteract the vanishing gradient problem found in vanilla recurrent neural networks. Huge corpora such as the Wikipedia Corpus and Common Crawl are used to train these models.
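A single LSTM step can be sketched with scalars to show where the gating comes in. The weights passed in `w` are hypothetical placeholders; a real model learns them and works with vectors and matrices rather than single numbers.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell.

    w maps each gate name to a (w_x, w_h, bias) tuple of hypothetical weights.
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much of the new candidate to write
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value for the cell state
    c = f * c_prev + i * g           # additive cell-state update
    h = o * math.tanh(c)             # new hidden state
    return h, c

# Illustrative weights only: a forget gate biased towards "keep".
weights = {
    "forget": (0.0, 0.0, 4.0),
    "input": (0.5, 0.5, 0.0),
    "output": (0.5, 0.5, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0, 0.0]:
    h, c = lstm_step(x, h, c, weights)
```

The cell state is updated by addition, scaled by the forget gate, instead of being squashed through a nonlinearity at every step. That lets gradients survive across many time steps, which is what vanilla recurrent networks struggle with.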
Comparison of Results
Evaluating conversational systems remains a difficult task. The two most widely used techniques for evaluating chatbots are:
- Human evaluation
- Automated evaluation metrics
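As one example of an automated metric, a chatbot's response can be scored by its word overlap with a reference response. The sketch below computes a clipped unigram precision, the basic ingredient of BLEU-style scores. The function name and the example sentences are hypothetical, and full metrics add higher-order n-grams and length penalties that are omitted here.

```python
from collections import Counter

def unigram_precision(candidate, reference):
    """Fraction of candidate words that also occur in the reference.

    Counts are clipped, so repeating a word does not earn extra credit.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[word]) for word, count in cand.items())
    return overlap / total

score = unigram_precision("the order is on the way", "your order is on its way")
print(round(score, 2))
```

Overlap scores are cheap to compute but can penalize perfectly good replies that happen to use different words, which is one reason human evaluation is still used alongside them.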
One advantage of supervised chatbots is that response quality is assured, since the responses are not automatically generated. However, these models are less suitable as the underlying algorithm for conversational or chit-chat agents, the so-called social chatbots.
Because unsupervised chatbots generate entirely new output, their responses can lack quality or consistency. However, this approach offers several advantages. First, it is an end-to-end solution that can be trained on a variety of datasets, and hence on a variety of domains, rather than requiring domain-specific expertise. It can also be combined with other algorithms if further analysis of domain-specific knowledge is needed.
AI Is Ultimately Limited
Chatbots are applied in many fields, such as e-commerce, education, and healthcare. Despite breakthroughs in technology, AI chatbots are still unable to fully mimic human speech.
This may be due to flawed approaches to dialogue modeling and a scarcity of openly accessible domain-specific data. However, as chatbots gain recognition and computing continues to evolve, new and more advanced text and voice capabilities are likely to emerge.