Train, Test, Validation

Dataset

To begin, we need a dataset, since data is the foundation of any ML/AI model.

Data Splitting

The first step in our classification task is to randomly split our data into 3 independent sets:

Training Set: The dataset that we feed our model to learn underlying patterns and relationships.

Validation Set: The dataset that we use to understand our model's performance and tune it accordingly.

Test Set: The dataset that we use to assess the model's performance on data it has never seen, as a stand-in for the real world.
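To make the split concrete, here is a minimal sketch using only the Python standard library. The 60/20/20 proportions, the `split_dataset` helper, and the toy `animals` list are assumptions made for illustration; the right proportions depend on how much data you have.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows, then cut them into train, validation, and test lists."""
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # everything that remains
    return train, val, test

# Hypothetical dataset: 100 labeled animals as (id, label) pairs.
animals = [(i, "cat" if i % 2 == 0 else "dog") for i in range(100)]
train, val, test = split_dataset(animals)
print(len(train), len(val), len(test))
```

Because every example lands in exactly one of the three lists, nothing the model sees during training can leak into validation or testing.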

Training the Model

Now, let’s go ahead and train our model on the training dataset.
Here, we teach the model to recognize patterns in the data so that it can make predictions. But wait, there is a plethora of classification algorithms available:

Logistic Regression
Support Vector Machines (SVM)
Random Forest
Naive Bayes

Let’s use the Logistic Regression model for today!

Building the Model

What you are performing here is supervised learning, where the model learns from labeled examples to make predictions.

Drag each animal in the training set to a new position to see how the model updates the decision boundary!
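For readers who want to see the idea in code, the sketch below fits a logistic regression with plain batch gradient descent. The two features, the six training animals, the learning rate, and the number of epochs are all made up for illustration; a real project would more likely use a library implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi                      # gradient of the log loss w.r.t. z
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the average gradient.
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, x):
    """Return 1 (dog) if the predicted probability is at least 0.5, else 0 (cat)."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Hypothetical training set: two features scaled to [0, 1]; 0 = cat, 1 = dog.
X_train = [[0.2, 0.3], [0.3, 0.2], [0.1, 0.4], [0.8, 0.7], [0.7, 0.9], [0.9, 0.6]]
y_train = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(X_train, y_train)
print(predict(w, b, [0.25, 0.25]), predict(w, b, [0.85, 0.80]))
```

The decision boundary is the set of points where `w[0] * x1 + w[1] * x2 + b` equals zero. Moving a training example changes the gradients, and therefore the learned `w` and `b`, which is why the boundary shifts.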

Validating the Model

Now that we have trained the model, we will assess its performance using a validation set.

Based on this assessment, you can tweak the model's parameters to try to reach the desired performance.
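As a small illustration of tuning against the validation set, the sketch below tries several decision thresholds and keeps the one with the highest validation accuracy. The threshold is just one possible knob, chosen here for simplicity; the predicted probabilities and labels are hypothetical values, not outputs of the model above.

```python
def accuracy(probs, labels, threshold):
    """Fraction of examples classified correctly at the given threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    correct = sum(int(pred == y) for pred, y in zip(preds, labels))
    return correct / len(labels)

# Hypothetical validation set: predicted probability of "dog" and true label (1 = dog).
val_probs = [0.10, 0.35, 0.40, 0.45, 0.55, 0.62, 0.70, 0.90]
val_labels = [0, 0, 0, 1, 0, 1, 1, 1]

# Candidate settings to compare; only validation data is used to choose among them.
candidates = [0.3, 0.4, 0.5, 0.6, 0.7]
best_threshold = max(candidates, key=lambda t: accuracy(val_probs, val_labels, t))

for t in candidates:
    print(t, accuracy(val_probs, val_labels, t))
print("best threshold:", best_threshold)
```

Other settings, such as regularization strength, can be compared in the same way: train or configure one candidate per setting, score each on the validation set, and keep the best. The test set stays untouched throughout.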

Testing the Model

Great! We have evaluated our model on the test set and reached an accuracy of 75%.
This means that we can expect the model to classify cats and dogs correctly about 3 out of 4 times.

We can now assess the model performance using a confusion matrix:
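As a rough sketch of how such a matrix is built, the code below counts each (actual, predicted) pair. The eight test labels and predictions are made up, chosen only so that the accuracy works out to the 75% quoted above.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Made-up test set of 8 animals (not the article's data).
actual    = ["cat", "cat", "cat", "cat", "dog", "dog", "dog", "dog"]
predicted = ["cat", "cat", "cat", "dog", "dog", "dog", "dog", "cat"]
labels = ["cat", "dog"]

matrix = confusion_matrix(actual, predicted, labels)

# Correct predictions sit on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)

for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy)
```

Unlike a single accuracy number, the off-diagonal cells show which kind of mistake the model makes: cats predicted as dogs versus dogs predicted as cats.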