Regression vs Classification - The most significant difference between regression and classification is that regression predicts a continuous quantity, while classification predicts discrete class labels.
- Regression
- Regression is the process of modelling the relationship between a dependent variable and one or more independent variables. It is used to predict continuous values, such as market trends or house prices.
- The task of a regression algorithm is to find the mapping function that maps the input variable (x) to the continuous output variable (y).
- Types of Regression
- Simple Linear Regression
- Multiple Linear Regression
- Polynomial Regression
- Support Vector Regression
- Decision Tree Regression
- Random Forest Regression
- How to check accuracy
- R Squared (checks the goodness of fit of the best-fit line)
- Adjusted R Squared (penalizes added attributes that are not correlated with the output)
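As a rough illustration of the mapping function and the two scores above, here is a minimal sketch in plain Python. The toy data (house size against price) and the helper names `fit_simple_linear`, `r_squared` and `adjusted_r_squared` are made up for this example; in practice a library such as scikit-learn would normally do this work.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(ys, preds):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Shrinks R squared as more features are used for the same sample size."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical toy data: house size (x) and price (y), arbitrary units.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_linear(sizes, prices)
preds = [intercept + slope * x for x in sizes]
r2 = r_squared(prices, preds)
adj_r2 = adjusted_r_squared(r2, n_samples=len(sizes), n_features=1)

print(f"y = {intercept:.2f} + {slope:.2f} * x")
print(f"R squared = {r2:.4f}, adjusted R squared = {adj_r2:.4f}")
```

Adjusted R Squared is never larger than R Squared, and the gap widens as features are added without improving the fit.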
- Classification
- Classification is the process of finding a function that divides a dataset into classes based on different parameters. A program is trained on a training dataset and, based on that training, it assigns new data to one of the classes.
- In other words, it finds a model that separates the data into categorical classes, i.e. discrete values. The data is labelled according to the input parameters, and the model then predicts labels for new data.
- The derived mapping function can be expressed in the form of “IF-THEN” rules. Classification deals with problems where the data can be divided into two (binary) or more discrete labels.
- Types of ML Classification Algorithms:
- Logistic Regression
- K-Nearest Neighbours
- Support Vector Machines
- Kernel SVM
- Naïve Bayes
- Decision Tree Classification
- Random Forest Classification
- How to check accuracy
- confusion matrix
- accuracy score
- recall value (also called the true positive rate)
- precision value
- F1 score
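To make the “IF-THEN” idea and the metrics above concrete, here is a small sketch in plain Python. The spam rule, its thresholds and the eight labelled emails are all invented for illustration; a real classifier would learn its rules from training data.

```python
def classify_email(num_links, has_offer_word):
    """Hypothetical IF-THEN rule: return 1 for spam, 0 for not spam."""
    if num_links > 3 or has_offer_word:
        return 1
    return 0


def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall (true positive rate) and F1 score."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up emails as (number of links, contains the word "offer").
emails = [(5, False), (0, True), (1, False), (4, False),
          (0, False), (2, True), (0, False), (1, False)]
actual = [1, 1, 0, 0, 0, 1, 1, 0]  # made-up ground truth labels

predicted = [classify_email(links, offer) for links, offer in emails]
tp, tn, fp, fn = confusion_counts(actual, predicted)
metrics = classification_metrics(tp, tn, fp, fn)

print("confusion matrix (tp, tn, fp, fn):", (tp, tn, fp, fn))
print(metrics)
```

Precision asks how many of the predicted positives were right, while recall asks how many of the actual positives were found; F1 is their harmonic mean.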
Difference between Regression and Classification
| Regression Algorithm | Classification Algorithm |
|---|---|
| In regression, the output variable must be continuous (a real value). | In classification, the output variable must be a discrete value. |
| The task of a regression algorithm is to map the input value (x) to the continuous output variable (y). | The task of a classification algorithm is to map the input value (x) to the discrete output variable (y). |
| Regression algorithms are used when the target is continuous. | Classification algorithms are used when the target is discrete. |
| In regression, we try to find the best-fit line, the one that predicts the output most accurately. | In classification, we try to find the decision boundary that divides the dataset into classes. |
| Regression algorithms solve problems such as weather prediction and house price prediction. | Classification algorithms solve problems such as identifying spam emails, speech recognition, and identifying cancer cells. |
| Regression algorithms can be further divided into linear and non-linear regression. | Classification algorithms can be divided into binary and multi-class classifiers. |
Lambda Functions (Anonymous functions)
- addition = lambda a, b: a + b
- addition(12, 14)
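The two lines above can be run as-is. The sketch below repeats them and adds the more typical use of a lambda, as a short throwaway function passed to `sorted` or `map`; the `houses` list is made up for illustration.

```python
# A lambda is an expression that evaluates to a function object.
addition = lambda a, b: a + b
total = addition(12, 14)

# Typical use: pass a small one-off function as an argument.
houses = [("A", 250_000), ("B", 180_000), ("C", 320_000)]  # made-up data
by_price = sorted(houses, key=lambda house: house[1])  # sort by the price field
squares = list(map(lambda n: n * n, [1, 2, 3]))

print(total)
print(by_price)
print(squares)
```

When the function needs a name and is reused, a regular `def` is usually the clearer choice.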
pyforest
- Lazily imports the popular Python data science libraries: each library is imported only when it is first used
Univariate, Bivariate & Multivariate
- Univariate - analysis based on only one feature
- Bivariate - analysis based on two features
- Multivariate - analysis based on more than two features
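Bias is the error from a model that is too simple; it shows up as high error on the training data.
Variance is the error from a model that is too sensitive to its training sample; it shows up as much higher error on the test data than on the training data.
Overfitting - low error on the training data set but high error on the test data set. Low bias & high variance.
Underfitting - high error on both the training & test data sets. High bias & low variance.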
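The pattern above can be seen in a small experiment. This is only a sketch under assumptions: the data is synthetic (a straight line plus noise), the underfit model always predicts the training mean, and the overfit model is a 1-nearest-neighbour lookup that memorizes the training points. None of these choices come from the notes themselves.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Synthetic data: y = 2x plus Gaussian noise (an assumption for this demo)."""
    data = []
    for _ in range(n):
        x = random.uniform(0, 10)
        y = 2.0 * x + random.gauss(0, 2.0)
        data.append((x, y))
    return data


train = make_data(30)
test = make_data(30)


def mse(model, data):
    """Mean squared error of a model (a function x -> prediction) on a data set."""
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)


# Underfit (high bias): ignore x and always predict the training mean.
mean_y = sum(y for _, y in train) / len(train)


def underfit_model(x):
    return mean_y


# Overfit (high variance): return the y of the closest memorized training point.
def overfit_model(x):
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


for name, model in [("underfit", underfit_model), ("overfit", overfit_model)]:
    print(f"{name}: train MSE = {mse(model, train):.2f}, "
          f"test MSE = {mse(model, test):.2f}")
```

The memorizing model scores a training error of zero yet still makes errors on the test set, while the mean predictor is poor on both.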
Multicollinearity
R Squared
Adjusted R Squared
Hypothesis Testing
- Statistics deals with data, often huge amounts of it. The data becomes useful once it is analyzed and conclusions are drawn from it.
- To interpret the data and reach conclusions, we use hypothesis testing
- It evaluates two mutually exclusive statements about a population using sample data
- Steps of hypothesis testing
- Make an initial assumption (H0). This is called the null hypothesis
- Collect data (called evidence) to reject or fail to reject the null hypothesis
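The two steps above can be sketched with a coin-flip example. Everything specific here is an assumption for illustration: H0 is "the coin is fair", the evidence is 16 heads in 20 flips, the test is an exact two-sided binomial test, and the 0.05 significance level is a common convention rather than something stated in these notes.

```python
from math import comb


def binomial_two_sided_p_value(heads, flips, p_null=0.5):
    """Exact two-sided p-value under H0.

    Adds up the probability of every outcome that is at most as likely
    as the observed one, assuming the true heads probability is p_null.
    """
    def prob(k):
        return comb(flips, k) * p_null ** k * (1 - p_null) ** (flips - k)

    observed = prob(heads)
    # Small tolerance so equally likely outcomes are not lost to rounding.
    return sum(prob(k) for k in range(flips + 1)
               if prob(k) <= observed * (1 + 1e-9))


alpha = 0.05  # assumed significance level

# Step 1: H0 = "the coin is fair" (p = 0.5).
# Step 2: collect evidence, here a hypothetical 16 heads in 20 flips.
p_value = binomial_two_sided_p_value(heads=16, flips=20)
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(f"p-value = {p_value:.4f} -> {decision}")
```

A small p-value means the evidence would be unlikely if H0 were true; failing to reject H0 does not prove that H0 is true.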