Categorization system with artificial intelligence in Node.js
"Artificial intelligence is on it's way" by Hitesh Choudhary on Unsplash

I'll explain, in a very simple way, how to create an automated categorization system using artificial intelligence with Node.js.

Artificial intelligence (AI) is fashionable for just about everything these days, so let's see how to use it from Node.js. It may look complicated, like something only a few people can do, but in reality it is quite simple.

This very site already uses it. The feature is still in beta simply because the classifier has not been trained on enough data yet, but we will come back to that later.

The idea is quite basic: we give the system content along with the category each article belongs to, so that it learns to tell the categories apart. If we do not have data of our own, we can extract it from other websites to feed the MACHINE and let it start learning what each thing is.

First you will need a database to store this information. It can be whichever you prefer (MySQL, MongoDB, Cassandra, or any other); the point is simply to store the content so it can be processed later. Once the database holds enough content, with each item marked with the category it belongs to, we can write the script that will learn from this data.

We need to add the natural package to our project; it will help us throughout this process. You can install it by running the following command or by adding it to your package.json:

npm install natural

Naive Bayes (Bayesian classifier)

This is the name of the method we will use to classify the articles in this example. The natural package also supports logistic regression.

var natural = require('natural');
var classifier = new natural.BayesClassifier();
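To give some intuition for what a naive Bayes classifier does, here is a toy version written from scratch. It is only a sketch for illustration and is not how natural is implemented: the class name TinyNaiveBayes, the simple tokenizer, and the add-one smoothing are all assumptions made for this example. Each category gets a score built from how often the words of the text appeared in that category's training documents, and the highest score wins.

```javascript
// A toy multinomial naive Bayes, for intuition only (not natural's code).
class TinyNaiveBayes {
  constructor() {
    this.docCounts = new Map();  // label -> number of training documents
    this.wordCounts = new Map(); // label -> Map(word -> occurrences)
    this.totalWords = new Map(); // label -> total word occurrences
    this.vocabulary = new Set(); // every distinct word seen in training
    this.totalDocs = 0;
  }

  // Lowercase the text and split it into alphanumeric words.
  static tokenize(text) {
    return text.toLowerCase().match(/[a-z0-9]+/g) || [];
  }

  // Record the words of one labeled document.
  addDocument(text, label) {
    if (!this.wordCounts.has(label)) {
      this.docCounts.set(label, 0);
      this.wordCounts.set(label, new Map());
      this.totalWords.set(label, 0);
    }
    this.docCounts.set(label, this.docCounts.get(label) + 1);
    this.totalDocs += 1;
    const counts = this.wordCounts.get(label);
    for (const word of TinyNaiveBayes.tokenize(text)) {
      counts.set(word, (counts.get(word) || 0) + 1);
      this.totalWords.set(label, this.totalWords.get(label) + 1);
      this.vocabulary.add(word);
    }
  }

  // log P(label) + sum of log P(word | label), with add-one smoothing
  // so that unseen words do not force the probability to zero.
  logScore(text, label) {
    let score = Math.log(this.docCounts.get(label) / this.totalDocs);
    const counts = this.wordCounts.get(label);
    const denominator = this.totalWords.get(label) + this.vocabulary.size;
    for (const word of TinyNaiveBayes.tokenize(text)) {
      score += Math.log(((counts.get(word) || 0) + 1) / denominator);
    }
    return score;
  }

  // Return the label with the highest score, or null if nothing was added.
  classify(text) {
    let best = null;
    let bestScore = -Infinity;
    for (const label of this.docCounts.keys()) {
      const score = this.logScore(text, label);
      if (score > bestScore) {
        best = label;
        bestScore = score;
      }
    }
    return best;
  }
}

const toy = new TinyNaiveBayes();
toy.addDocument('COPD is a disease that affects the lungs', 'Health');
toy.addDocument('Cities to visit in Brazil', 'Travel');
console.log(toy.classify('a disease of the lungs')); // Health
console.log(toy.classify('visit cities in Brazil')); // Travel
```

The "naive" part is the assumption that words are independent of each other given the category, which is why the score is a simple sum of per-word terms.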

Now let's train the algorithm so that it begins to understand the categories.

classifier.addDocument('What is COPD? chronic Obstructive Pulmonary Disease or COPD is a pathology that seriously affects the correct functioning of the respiratory system. It is a condition, where a person has limitations to pass air to the lungs.', 'Health');

classifier.addDocument('Cities to visit in Brazil. Brazil is one of the best countries to visit in America. Its tropical climate arouses the attention of tourists worldwide, as well as, from neighboring countries', 'Travel');

classifier.train();

In this way we tell the algorithm that these texts belong to the Health and Travel categories, respectively.
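In a real project the training texts would come from the database mentioned earlier rather than being typed by hand. The following is a minimal sketch of that loop. The function name trainFromRows and the field names content and category are hypothetical, and the rows array stands in for whatever your database query returns.

```javascript
// Hypothetical rows, standing in for the result of a database query.
const rows = [
  { content: 'What is COPD? It is a disease of the respiratory system.', category: 'Health' },
  { content: 'Cities to visit in Brazil.', category: 'Travel' },
  { content: '', category: 'Travel' }, // incomplete row, will be skipped
];

// Feeds every labeled row to a classifier and trains it once at the end.
// `classifier` is any object exposing addDocument(text, label) and train(),
// such as the natural.BayesClassifier instance created earlier.
function trainFromRows(classifier, rows) {
  let added = 0;
  for (const row of rows) {
    // Skip rows that are missing the text or the label.
    if (!row.content || !row.category) continue;
    classifier.addDocument(row.content, row.category);
    added += 1;
  }
  // Train only if there is something to learn from.
  if (added > 0) classifier.train();
  return added; // number of documents actually used
}
```

With the classifier from this article, the call would be trainFromRows(classifier, rows). Calling train() once after all documents have been added avoids retraining for every row.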

Once the classifier is trained, we can classify new content or articles as follows.

console.log(classifier.classify('This is a health article that talks about COPD'));

The output in the console will be "Health".

console.log(classifier.classify('Whenever I go to Brazil, the cities I like to visit the most are ...'));

The output in this case will be "Travel".

In this simple way we have a system that we can automate so that it decides for us which category each article we write belongs to.

If we want to assign more than one category to each article, we can obtain the full list of classifications instead. The list is ordered from the most likely category to the least likely, and each entry carries a numerical value indicating how well the given text matches that category: the higher the number, the more likely it is that the article belongs in that category, and the lower the number, the less likely.

console.log(classifier.getClassifications('Whenever I go to Brazil, the cities I like to visit the most are ...'));
[
  {label: 'Travel', value: 0.49999999999999997},
  {label: 'Health', value: 0.09999999999999998}
]
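A list like the one above still has to be turned into a decision about which categories to keep. Here is one possible sketch: the function name pickCategories and both thresholds (minRatio and maxLabels) are assumptions chosen for illustration, not part of natural. It keeps every label whose value is at least a given fraction of the best value.

```javascript
// Keeps the labels whose value is at least `minRatio` times the best value,
// up to `maxLabels` labels. Both thresholds are illustrative assumptions.
function pickCategories(classifications, minRatio = 0.5, maxLabels = 3) {
  if (classifications.length === 0) return [];
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...classifications].sort((a, b) => b.value - a.value);
  const best = sorted[0].value;
  return sorted
    .filter((entry) => entry.value >= best * minRatio)
    .slice(0, maxLabels)
    .map((entry) => entry.label);
}

// The values shown above for the Brazil sentence.
const scores = [
  { label: 'Travel', value: 0.49999999999999997 },
  { label: 'Health', value: 0.09999999999999998 },
];

console.log(pickCategories(scores));      // [ 'Travel' ]
console.log(pickCategories(scores, 0.1)); // [ 'Travel', 'Health' ]
```

A ratio relative to the best value is used instead of a fixed cutoff because the absolute values depend on the length of the text and on how much training data there is.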

Once we have finished training our little ≪MONSTER≫, we will save everything it has learned so that we can reuse it later and do not have to repeat the same process over and over, since training consumes a lot of resources.

classifier.save('classifier.json', function(err, classifier) {
  // The classifier is saved in the classifier.json file!
});

To get back what the classifier has already learned, we load it in our code with the following statement.

natural.BayesClassifier.load('classifier.json', null, function(err, classifier) {
  console.log(classifier.classify('Text to classify'));
  console.log(classifier.classify('Other text to classify'));
});

Now our classifier recovers everything it had learned each time our system starts up or has to be restarted.

As you can see, it is quite easy to classify our articles automatically with artificial intelligence in Node.js.

You have surely seen Facebook ask you many times whether a video or a specific page belongs to some sector or category. Without a doubt, this is Facebook feeding its ≪MACHINE≫ so that it understands better and better. Google does something similar with its captcha, and WebMediums will soon use this technique too: first internally for testing, then publicly, so that users can also classify content or help improve the AI if they want to.

PS: As an example of how it works, if you go to the JavaScript category of WebMediums you will see that this post has been categorized there :D