Confusion Matrix is no more a confusion!

Nikita Sahoo
Jun 4, 2021

When we get the data, after data cleaning, pre-processing and wrangling, the first step is to feed it to a model and, of course, get output as probabilities. But hold on! How on earth do we measure the effectiveness of our model? The better the effectiveness, the better the performance, and that’s exactly what we want. This is where the confusion matrix comes into the limelight: it is a performance measurement for machine learning classification.

What is a Confusion Matrix and why do you need it?

Well, it is a performance measurement for machine learning classification problems where the output can be two or more classes. For a two-class problem, it is a table with the 4 different combinations of predicted and actual values; with N classes it grows to an N × N table.

Understanding TP, FP, FN, TN with an example

True Positive:

Interpretation: You predicted positive and it’s true.

You predicted that an animal is a cat and it actually is.

True Negative:

Interpretation: You predicted negative and it’s true.

You predicted that the animal is not a cat and it actually is not (it’s a dog).

False Positive: (Type 1 Error)

Interpretation: You predicted positive and it’s false.

You predicted that the animal is a cat but it actually is not (it’s a dog).

False Negative: (Type 2 Error)

Interpretation: You predicted negative and it’s false.

You predicted that the animal is not a cat but it actually is a cat.
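To make the four cases concrete, here is a minimal sketch in plain Python that counts them for the cat-vs-dog example. The label lists and the helper name `confusion_counts` are made up for illustration, with "cat" treated as the positive class.

```python
# Toy labels invented for illustration; "cat" is the positive class.
actual    = ["cat", "dog", "cat", "dog", "cat", "dog", "dog", "cat"]
predicted = ["cat", "dog", "dog", "cat", "cat", "dog", "dog", "cat"]


def confusion_counts(actual, predicted, positive="cat"):
    """Count TP, TN, FP and FN for a two-class problem (hypothetical helper)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["TP"] += 1   # predicted cat, really a cat
        elif p != positive and a != positive:
            counts["TN"] += 1   # predicted not a cat, really a dog
        elif p == positive and a != positive:
            counts["FP"] += 1   # predicted cat, really a dog (Type 1 error)
        else:
            counts["FN"] += 1   # predicted not a cat, really a cat (Type 2 error)
    return counts


counts = confusion_counts(actual, predicted)
print(counts)
```

Arranging these four numbers in a 2 × 2 grid, actual values on one axis and predicted values on the other, gives you the confusion matrix.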

CYBER CRIME CASES AND CONFUSION MATRIX:

In the present world, cybercrime offenses are happening at an alarming rate and increasing steadily. As the use of the Internet grows, many offenders use it as a means of communication in order to commit crimes. According to a 2020 Cybersecurity Ventures report, cybercrime was projected to cost nearly $6 trillion per annum by 2021. For illegal activities, cybercriminals use networked computing devices as the primary means of communicating with victims’ devices, and by exploiting vulnerabilities in those systems they profit in terms of finance, publicity and more.

Evaluating cybercrime attacks and providing protective measures through manual methods, existing technical approaches and investigations has often failed to control these attacks. Existing literature on cybercrime offenses lacks computational methods to predict cybercrime, especially on unstructured data. Therefore, this study proposes a flexible computational tool that uses machine learning techniques to analyze cybercrime rates state by state within a country, which helps to classify cybercrimes.

Security analytics, combined with data analytic approaches, helps us analyze and classify offenses from India-based integrated data that may be either structured or unstructured. The main strength of this work is its testing analysis reports, which classify the offenses with 99 percent accuracy.
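An accuracy figure like this is read straight off the confusion matrix: it is the share of correct predictions, (TP + TN), among all predictions. The sketch below shows the arithmetic. The counts are hypothetical placeholders, not numbers from the study, and `accuracy` is just an illustrative helper name.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = correct predictions / all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Hypothetical counts for illustration only (not taken from the study).
acc = accuracy(tp=45, tn=50, fp=3, fn=2)
print(f"accuracy = {acc:.2%}")
```

Because accuracy lumps both kinds of error together, it is worth looking at the FP and FN cells themselves as well, especially when one class is much rarer than the other.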

Conclusion

A confusion matrix is a remarkable approach for evaluating a classification model. For the data it was fed, it gives clear insight into how many instances of each class the model classified correctly and how the rest were misclassified.

Thank you for reading
