Naive Bayes Classification Using Python Programming
Name: Tejas Sahoo
Roll No: K057
Branch: BTech Cyber Security
Aim:
To implement supervised classification using Python programming with Naive Bayes.
Introduction:
Naive Bayes is a probabilistic classifier based on Bayes’ theorem. It assumes that the features are conditionally independent of one another given the class label, a simplifying ("naive") assumption that rarely holds exactly in real data.
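To make the independence assumption concrete, the sketch below scores each class c for a feature vector x as log P(c) plus the sum of log P(x_i | c), modelling every feature with a Gaussian. It is a minimal standard-library illustration, not the code used in this experiment; the function names (fit_gaussian_nb, predict_gaussian_nb), the var_floor parameter, and the toy data are all assumptions made for the example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a prior plus per-feature mean and variance for each class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    model = {}
    for label, rows in groups.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor (hypothetical parameter) avoids division by zero
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def log_gaussian(x, mean, var):
    """Log of the Gaussian density N(x; mean, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict_gaussian_nb(model, row):
    """Pick the class with the highest log prior + summed log likelihoods."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        # Summing per-feature terms is exactly the "naive" independence step.
        scores[label] = math.log(prior) + sum(
            log_gaussian(x, m, v) for x, m, v in zip(row, means, variances)
        )
    return max(scores, key=scores.get)


# Toy data (made up for illustration): two well-separated classes.
X_toy = [[1.0, 2.1], [1.2, 1.9], [0.9, 2.0], [3.0, 0.5], [3.2, 0.4], [2.9, 0.6]]
y_toy = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X_toy, y_toy)
```

Calling predict_gaussian_nb(model, [1.1, 2.0]) then returns the label of the nearer cluster. Working in log space avoids numerical underflow when many small probabilities are multiplied.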
Advantages:
- Extremely fast for both training and prediction.
- Very interpretable and requires few parameters.
Applications:
- Text classification and spam filtering.
- Predictive modeling in various domains.
Output
Description:
The pandas head(n) function returns the first n rows of an object, based on position. It is useful for quickly checking that the object holds the right type of data.
For negative values of n, it returns all rows except the last |n| rows, equivalent to df[:n].
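The positional behaviour described here matches ordinary Python slicing, so it can be illustrated without pandas. The head helper below is a hypothetical list-based stand-in, not the pandas implementation; the default of 5 mirrors the pandas default.

```python
def head(rows, n=5):
    """Hypothetical list-based stand-in for DataFrame.head(n)."""
    # rows[:n] gives the first n items for n >= 0,
    # and everything except the last |n| items for n < 0.
    return rows[:n]


rows = list(range(10))
first_three = head(rows, 3)        # first 3 rows
all_but_last_two = head(rows, -2)  # same as rows[:-2]
```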
Accuracy :
- print(accuracy_score(y_test,y_pred)*100)
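The expression above assumes accuracy_score from sklearn.metrics, which returns the fraction of predictions that match the true labels; multiplying by 100 turns it into a percentage. A minimal standard-library sketch of the same calculation follows. The function name accuracy_percent and the demo labels are hypothetical and are not the data from this experiment.

```python
def accuracy_percent(y_true, y_pred):
    """Percentage of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty label list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100 * correct / len(y_true)


# Hypothetical labels: 4 of the 5 predictions are correct.
y_test_demo = [1, 0, 1, 1, 0]
y_pred_demo = [1, 0, 0, 1, 0]
demo_accuracy = accuracy_percent(y_test_demo, y_pred_demo)
```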
b) What is the significance of data preprocessing in classification?
Data preprocessing cleans raw data and puts it into a form that a machine learning model can use, which typically improves the model's accuracy and efficiency. It is a data mining technique that transforms raw data into an understandable format. Real-world data is often incomplete, noisy, or inconsistent, and passing it to a model unprocessed can cause errors or unreliable predictions.
It involves the following steps:
● Getting the dataset
● Importing libraries
● Importing datasets
● Finding Missing Data
● Encoding Categorical Data
● Splitting dataset into training and test set
● Feature scaling
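The last four steps above can be sketched in standard-library Python as follows. This is an illustration under stated assumptions, not the pipeline used in this experiment: the records, column names, and helper functions (impute_mean, label_encode, fit_scaler, apply_scaler) are all hypothetical. In scikit-learn the corresponding tools would be SimpleImputer, LabelEncoder or OneHotEncoder, train_test_split, and StandardScaler.

```python
import random
import statistics

# Hypothetical records: (age, salary, country, purchased); None = missing value.
raw = [
    (44, 72000, "France", "No"),
    (27, 48000, "Spain", "Yes"),
    (30, None, "Germany", "No"),
    (38, 61000, "Spain", "No"),
    (None, 52000, "France", "Yes"),
    (35, 58000, "Germany", "Yes"),
    (48, 79000, "France", "No"),
    (50, 83000, "Spain", "Yes"),
]


def impute_mean(column):
    """Replace missing values with the mean of the observed values."""
    mean = statistics.mean(v for v in column if v is not None)
    return [mean if v is None else v for v in column]


def label_encode(column):
    """Map each distinct category to an integer code (sorted order)."""
    mapping = {value: i for i, value in enumerate(sorted(set(column)))}
    return [mapping[v] for v in column], mapping


def fit_scaler(rows, cols):
    """Compute mean and standard deviation for the given column indices."""
    stats = {}
    for c in cols:
        values = [row[c] for row in rows]
        stats[c] = (statistics.mean(values), statistics.pstdev(values) or 1.0)
    return stats


def apply_scaler(rows, stats):
    """Standardize the fitted columns; leave other columns unchanged."""
    return [
        [(v - stats[c][0]) / stats[c][1] if c in stats else v for c, v in enumerate(row)]
        for row in rows
    ]


# Finding (and filling) missing data, then encoding categorical data.
ages, salaries, countries, labels = (list(col) for col in zip(*raw))
ages = impute_mean(ages)
salaries = impute_mean(salaries)
country_codes, country_map = label_encode(countries)
y, label_map = label_encode(labels)
X = [list(row) for row in zip(ages, salaries, country_codes)]

# Splitting into training and test sets (75% / 25%, fixed seed).
indices = list(range(len(X)))
random.Random(0).shuffle(indices)
cut = int(0.75 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]
X_train = [X[i] for i in train_idx]
X_test = [X[i] for i in test_idx]
y_train = [y[i] for i in train_idx]
y_test = [y[i] for i in test_idx]

# Feature scaling: fit on the training set only, then apply to both sets.
scaler = fit_scaler(X_train, cols=[0, 1])
X_train_scaled = apply_scaler(X_train, scaler)
X_test_scaled = apply_scaler(X_test, scaler)
```

Fitting the scaler on the training rows only keeps information from the test set out of the model. Integer codes imply an ordering between categories, so for nominal columns such as country a one-hot encoding is often preferred.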
Conclusion:
We successfully applied supervised classification and obtained a model with an accuracy of 87.69%. Using the predictions of the Naive Bayes classifier, we also generated a list of actual versus predicted status values.