Linear Regression (Machine Learning)
Regression allows us to mathematically model relation between two or more variables using equations. Simple linear regression, we predict scores on one variable from the scores on other variable. The variable we are predicting is called the dependent variable. The variable we are basing our predictions on is called the independent variable. When there is only independent variable, the prediction method is called simple linear regression. In case of multiple independent variables the prediction method is Multiple linear regression.
For this prediction we use Python programming language. Python has rich ML libraries
Linear Regression Using Python
First we import required libraries to process data and develop linear regression algorithm
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
numpy: NumPy contains a multi-dimensional array and matrix data structures
pandas: Pandas is a Python library used for working with data sets and has functions for analyzing, cleaning, exploring, and manipulating data.
matplotlib: Matplotlib is a cross-platform, data visualization and graphical plotting library
seaborn: Seaborn is data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics
df = pd.read_csv('Path')
Read the data set (CSV, Json, …)
from sklearn.linear_model import LinearRegression
model = LinearRegression()
sklearn is a efficient tools for machine learning and statistical modeling including classification, regression, clustering, etc..
After importing LinearRegression we train our model based on train data and predict values of test data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
Here test size indicates test data (30 %) and train data (70%)
LinearRegression().fit(X_train,y_train)
Then we fit our test data in Linear Regression algorithm and train our model
predict(X_test)
Now we can predict test data dependent variable by providing independent variables