📅Confusion Matrix and its role in cybersecurity

AYUSH BAJPAI
Jun 6, 2021

📙Welcome to another article, this time on the confusion matrix in machine learning and its use in the cybersecurity world. Let's start the discussion of this interesting machine learning topic:

🧐What is the confusion Matrix?

🤷‍♂️So let's first see what is meant by a confusion matrix.

The confusion matrix is a very common term in the machine learning world. Its roots go back to 1904, when Karl Pearson introduced the term contingency table; a confusion matrix is a special kind of contingency table.

A confusion matrix is a performance measurement technique for machine learning classification problems. It is a simple table that shows how a classification model performs on test data for which the true values are known. In classification problems, it is essential to specify how performance is assessed. Classification accuracy is one such measure: it shows how often the classifier identifies objects correctly.

The confusion matrix shows the ways in which your classification model is confused when it makes predictions.

📕 In simple words, a confusion matrix shows where a machine learning classifier is right and where it is wrong, and from it we can work out the model's accuracy.

Suppose we create a machine learning model to predict whether a given image is of chocolate or not, and the model makes a total of 100 predictions. Each prediction falls into one of four cases:

🎯 True Positive:

Interpretation: You predicted positive and it’s true.

📌 True Negative:

Interpretation: You predicted negative and it’s true.

🧐 False Positive: (Type 1 Error)

Interpretation: You predicted positive and it’s false.

🎯 False Negative: (Type 2 Error)

Interpretation: You predicted negative and it’s false.

This should give an idea of what the four boxes in the confusion matrix represent.
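
The four counts can be tallied directly from a list of actual labels and a list of predictions. The sketch below is only an illustration, not a library implementation: the function name `confusion_counts` and the ten sample labels are made up for the chocolate example (1 = chocolate, 0 = not chocolate).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary classifier."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive and it's true
        elif predicted != positive and actual != positive:
            tn += 1  # predicted negative and it's true
        elif predicted == positive and actual != positive:
            fp += 1  # predicted positive and it's false (Type 1 error)
        else:
            fn += 1  # predicted negative and it's false (Type 2 error)
    return tp, tn, fp, fn


# Hypothetical labels: 1 = chocolate, 0 = not chocolate
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)

# Accuracy is the share of all predictions that were correct.
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(tp, tn, fp, fn, accuracy)  # 3 5 1 1 0.8
```

Here 8 of the 10 predictions are correct, so the accuracy is 0.8, but the matrix also tells us that one chocolate image was missed and one non-chocolate image was wrongly flagged.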

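NOTE: which error is more dangerous depends on the problem. A false positive is dangerous when acting on a wrong alarm is costly, because the model gave the answer we were looking for but it does not match the actual value. In intrusion detection, a false negative is usually worse, because a real attack is treated as normal traffic.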

🤷‍♀️🤷‍♀️Confusion Matrix’s implementation in monitoring Cyber Attacks:

The data set discussed here was used for The Third International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-99, The Fifth International Conference on Knowledge Discovery and Data Mining. The aim was to build a network intrusion detector, a predictive model capable of distinguishing between "bad" connections, called intrusions or attacks, and "good" normal connections. The database contains a standard set of data to be audited, which includes a wide variety of intrusions simulated in a military network environment.
In the KDD99 dataset, the four attack categories (DoS, U2R, R2L, and probe) are divided into 22 different attack types, which are tabulated below:

In the KDD Cup 99, the criterion used to evaluate participant entries is the Cost Per Test (CPT), computed using the confusion matrix and a given cost matrix.
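
As a rough sketch of how such a score can be computed: each cell of the confusion matrix is multiplied by the cost of that kind of mistake, the products are summed, and the total is divided by the number of test records. The function name `cost_per_test`, the two-class matrices, and the cost values below are invented for illustration; they are not the official KDD Cup 99 figures, which cover five classes.

```python
def cost_per_test(confusion, cost):
    """Average misclassification cost per test record.

    confusion[i][j]: number of records of actual class i predicted as class j
    cost[i][j]:      cost of predicting class j when the actual class is i
    """
    total_records = sum(sum(row) for row in confusion)
    weighted_cost = sum(
        confusion[i][j] * cost[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return weighted_cost / total_records


# Hypothetical example: rows = actual (normal, attack),
# columns = predicted (normal, attack).
confusion = [[90, 10],
             [5, 95]]

# Hypothetical costs: a missed attack (2) costs more than a false alarm (1),
# and correct predictions cost nothing.
cost = [[0, 1],
        [2, 0]]

cpt = cost_per_test(confusion, cost)
print(cpt)  # (10*1 + 5*2) / 200 = 0.1
```

A lower CPT is better; unlike plain accuracy, it lets different mistakes carry different weights.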

  • True Positive (TP): The number of connections detected as attacks that are actually attacks.
  • True Negative (TN): The number of connections detected as normal that are actually normal.
  • False Positive (FP): The number of connections detected as attacks that are actually normal (false alarm).
  • False Negative (FN): The number of connections detected as normal that are actually attacks.
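
From these four counts, two rates commonly reported for intrusion detectors follow directly: the detection rate (the share of real attacks that were caught) and the false alarm rate (the share of normal connections flagged as attacks). A minimal sketch, with a hypothetical function name `ids_rates` and made-up counts:

```python
def ids_rates(tp, tn, fp, fn):
    """Return (detection_rate, false_alarm_rate) from confusion matrix counts."""
    # Detection rate: attacks caught out of all actual attacks.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # False alarm rate: normal connections flagged out of all normal connections.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical counts for 100 attack and 100 normal connections
detection_rate, false_alarm_rate = ids_rates(tp=95, tn=90, fp=10, fn=5)
print(detection_rate, false_alarm_rate)  # 0.95 0.1
```

A good detector pushes the detection rate up while keeping the false alarm rate low; looking at only one of the two can be misleading.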

So, after seeing this use case of the confusion matrix in an intrusion detection system, we can say that the confusion matrix is very useful for judging the performance of a classification model.

To conclude this article, I want to leave you with a hilarious example of a confusion matrix, shown in the image below:

That’s all.

👉Thanks for reading this article.
