Saturday, 3 October 2020

Python

Regression vs Classification - The most significant difference between regression and classification is that regression predicts a continuous quantity, while classification predicts discrete class labels.

  • Regression
    • Regression is a process of finding the correlations between dependent and independent variables. It helps in predicting continuous variables, such as market trends or house prices.
    • The task of the Regression algorithm is to find the mapping function to map the input variable(x) to the continuous output variable(y).
    • Types of Regression
      • Simple Linear Regression
      • Multiple Linear Regression
      • Polynomial Regression
      • Support Vector Regression
      • Decision Tree Regression
      • Random Forest Regression 
    • How to check accuracy
      • R Squared (measures the goodness of fit of the best-fit line)
      • Adjusted R Squared -> penalizes the addition of attributes that are not correlated with the target, i.e. that do not improve the fit (see the R Squared / Adjusted R Squared sections below)
  • Classification
    • Classification is the process of finding a function that divides the dataset into classes based on different parameters. A computer program is trained on the training dataset and, based on that training, categorizes data into the different classes.
    • In other words, it finds a model or function that separates the data into multiple categorical classes, i.e. discrete values. Data is labelled according to the parameters given in input, and those labels are then predicted for new data.
    • The derived mapping function can be expressed in the form of “IF-THEN” rules. Classification deals with problems where the data can be divided into binary or multiple discrete labels.
    • Types of ML Classification Algorithms:
      • Logistic Regression
      • K-Nearest Neighbours
      • Support Vector Machines
      • Kernel SVM
      • Naïve Bayes
      • Decision Tree Classification
      • Random Forest Classification
    • How to check accuracy (a short sketch follows this list)
      • confusion matrix
      • accuracy score
      • recall value (true positive rate)
      • precision value
      • F1 score
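A minimal sketch of both kinds of task, using scikit-learn with synthetic data; the dataset sizes and model choices below are illustrative, not from the original notes:

  from sklearn.datasets import make_regression, make_classification
  from sklearn.model_selection import train_test_split
  from sklearn.linear_model import LinearRegression, LogisticRegression
  from sklearn.metrics import (r2_score, confusion_matrix, accuracy_score,
                               precision_score, recall_score, f1_score)

  # Regression: continuous target, checked with R squared
  X, y = make_regression(n_samples=200, n_features=3, noise=10, random_state=0)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
  reg = LinearRegression().fit(X_train, y_train)
  print("R squared:", r2_score(y_test, reg.predict(X_test)))

  # Classification: discrete target, checked with the confusion matrix and related scores
  X, y = make_classification(n_samples=200, n_features=4, random_state=0)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
  clf = LogisticRegression().fit(X_train, y_train)
  y_pred = clf.predict(X_test)
  print(confusion_matrix(y_test, y_pred))                # rows: actual, columns: predicted
  print("accuracy :", accuracy_score(y_test, y_pred))    # (TP + TN) / total
  print("precision:", precision_score(y_test, y_pred))   # TP / (TP + FP)
  print("recall   :", recall_score(y_test, y_pred))      # TP / (TP + FN) = true positive rate
  print("F1 score :", f1_score(y_test, y_pred))          # harmonic mean of precision and recall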

Difference between Regression and Classification

  • Output: in Regression, the output variable must be continuous or a real value; in Classification, the output variable must be a discrete value.
  • Task: the regression algorithm maps the input value (x) to a continuous output variable (y); the classification algorithm maps the input value (x) to a discrete output variable (y).
  • Data: Regression algorithms are used with continuous data; Classification algorithms are used with discrete data.
  • Fit: in Regression, we try to find the best-fit line, which can predict the output more accurately; in Classification, we try to find the decision boundary, which divides the dataset into different classes.
  • Examples: Regression algorithms solve problems such as weather prediction and house price prediction; Classification algorithms solve problems such as spam email identification, speech recognition, and identification of cancer cells.
  • Subtypes: Regression can be further divided into Linear and Non-linear Regression; Classification can be divided into Binary and Multi-class classifiers.

 

Lambda Functions(Anonymous functions)

  • addition = lambda a, b: a + b
  • addition(12, 14)
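Beyond calling it directly as above, a lambda is most often passed inline to another function; a small sketch with a made-up list:

  # Sort a list of (name, score) pairs by the score, using a lambda as the key function
  scores = [("anu", 72), ("ben", 91), ("cara", 85)]
  print(sorted(scores, key=lambda pair: pair[1]))
  # [('anu', 72), ('cara', 85), ('ben', 91)]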

pyforest

  • lazily imports the common Python data science libraries (pandas, numpy, matplotlib, seaborn, etc.), so the actual import happens only when a library is first used
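pyforest is a third-party package; a minimal sketch of the idea, assuming it is installed (pip install pyforest) and that the active_imports() helper is available in the installed version:

  from pyforest import *                  # exposes lazy names such as pd, np, plt, sns

  df = pd.DataFrame({"x": [1, 2, 3]})     # pandas is actually imported only at this point
  print(active_imports())                 # shows which imports have been triggered so far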

Univariate, Bivariate & Multivariate

  • Univariate - analysis based on a single feature/variable at a time
  • Bivariate - analysis of exactly two features, usually to study the relationship between them
  • Multivariate - analysis involving more than two features at once (see the sketch below)
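A loose sketch of what each level of analysis can look like with pandas; the DataFrame and column names are made up:

  import pandas as pd

  df = pd.DataFrame({
      "age":    [23, 31, 45, 52],
      "income": [30000, 45000, 61000, 72000],
      "spend":  [1200, 1800, 2500, 2600],
  })

  print(df["age"].describe())           # univariate: one feature at a time
  print(df["age"].corr(df["income"]))   # bivariate: relationship between two features
  print(df.corr())                      # multivariate: all pairwise relationships at once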

Bias means error on the training data set.

Variance means error on the test data set.

Overfitting - low error on the training data set but high error on the test data set; low bias & high variance.

Underfitting - high error on both the training and test data sets; high bias & low variance.
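A rough sketch of how this shows up in practice, using synthetic data and scikit-learn; the polynomial degrees and noise level are illustrative choices, not from the original notes:

  import numpy as np
  from sklearn.preprocessing import PolynomialFeatures
  from sklearn.linear_model import LinearRegression
  from sklearn.pipeline import make_pipeline
  from sklearn.model_selection import train_test_split
  from sklearn.metrics import mean_squared_error

  rng = np.random.RandomState(0)
  X = rng.uniform(0, 1, size=(30, 1))
  y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=30)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

  for degree in (1, 15):   # degree 1 tends to underfit this data, degree 15 tends to overfit
      model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
      model.fit(X_train, y_train)
      train_err = mean_squared_error(y_train, model.predict(X_train))
      test_err = mean_squared_error(y_test, model.predict(X_test))
      print(degree, round(train_err, 3), round(test_err, 3))   # compare train vs test error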

Multicollinearity - occurs when two or more independent variables are strongly correlated with each other, which makes the individual regression coefficients unstable and hard to interpret.
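One simple way to spot it is a correlation matrix of the independent variables; a sketch with made-up column names:

  import pandas as pd

  # Hypothetical feature matrix; 'size_sqft' and 'size_sqm' are almost perfectly correlated
  df = pd.DataFrame({
      "size_sqft": [500, 750, 1000, 1250],
      "size_sqm":  [46.5, 69.7, 92.9, 116.1],
      "age_years": [10, 3, 25, 7],
  })
  print(df.corr())   # correlations close to +/-1 between features signal multicollinearity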

R Squared

  • measures how much of the variance in the dependent variable is explained by the model, i.e. the goodness of fit of the best-fit line

Adjusted R Squared
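A small sketch of both quantities, assuming n observations and p predictors; the helper function and sample values below are illustrative:

  from sklearn.metrics import r2_score

  def adjusted_r2(y_true, y_pred, n_features):
      # Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
      r2 = r2_score(y_true, y_pred)
      n = len(y_true)
      return 1 - (1 - r2) * (n - 1) / (n - n_features - 1)

  y_true = [3.0, 5.0, 7.0, 9.0]
  y_pred = [2.8, 5.3, 6.9, 9.2]
  print(r2_score(y_true, y_pred), adjusted_r2(y_true, y_pred, n_features=1))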

 Hypothesis Testing

  • Statistics is about data, often huge amounts of it; it becomes useful only when one analyzes it and draws conclusions from it.
  • Hypothesis testing is how we arrive at such interpretations and conclusions.
  • It evaluates two mutually exclusive statements about a population using sample data.
  • Steps of hypothesis testing
    • Make an initial assumption (H0); this is called the NULL hypothesis.
    • Collect data (the evidence) to REJECT or FAIL TO REJECT the null hypothesis, as in the sketch below.
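A small sketch of those two steps with a one-sample t-test from scipy; the sample values, the assumed population mean of 50, and the 0.05 significance level are all illustrative:

  from scipy import stats

  # H0 (null hypothesis): the population mean is 50
  sample = [51.2, 49.8, 50.6, 52.1, 48.9, 51.5, 50.3, 49.7]

  t_stat, p_value = stats.ttest_1samp(sample, popmean=50)

  alpha = 0.05
  if p_value < alpha:
      print("Reject H0")          # the evidence is strong enough to reject the null hypothesis
  else:
      print("Fail to reject H0")  # the evidence is not strong enough to reject it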



 
