Regression is the basic concept of predictive analytics. When we say predictive analytics, it’s all about to predict the future or forecast the future. There are many types of regression exist in the world of data science or statistics like linear regression, logistic regression, ridge regression and many more. As a basic data analytics tool, excel has the capability to perform regression on a data set. You can perform regression in excel by many ways like using functions or simply click on the regression option. In this article, we’ll discuss how to use regression in excel and how to get regression equation in excel quickly.
Table of Contents
What is regression
Regression is a method to find relationship between dependent and independent variables. The relation is basically statistical relationship or strength of statistical relationship. As an example, we have two variables, ice cream sales and weather temperature. Here weather temperature is independent variable and ice cream sale is dependent variable. If we need to find the relationship between ice cream sales and weather condition, need to take help of regression equation.
In any situation, number dependent variable should be one and independent variables can be one or more. If number of independent variables are more than one, it is considered as multiple linear regression. As an example, instead of only temperature if we consider humidity also for ice cream sales, that will be type of multiple linear regression as more than one independent variables are existing.
There are two important parameters in regression apart from dependent and independent variables, those are slope and intercept.
Four Components of regression equation
In this article, we’ll focus on simple linear regression to describe the concept.
Simple Linear Regression equation: Y = a + bx
There are four parts in the regression equation, independent variable (x), dependent variable (y), intercept (a), slope (b).
To perform regression in excel or any other statistical tool, dependent and independent variable values can be directly picked up from the data. On the other hand, intercept and slope need to derive with equation or excel function.
These two components value, intercept and slope can be found from a data set through excel functions which we’ll discuss in the next part.
Excel functions for regression
As discussed earlier that intercept and slope can be found in excel by using functions.
To find slope in excel, SLOPE function can be used. On the other hand, in built INTERCEPT function in excel gives the output of intercept. Also, Microsoft has created a combined function called LINEST to get bot slope, intercept and other statistical parameters related to regression in one go.
To check the fit or strength of association between dependent and independent variables, R2 is being checked and separate RSQ function designed.
Excel also has capability to do forecast based on the historical data which methodology is based on the regression. Excel has 6 powerful forecast functions to perform forecasting.
SLOPE function:
To get the slope (numeric) from known X and Y value of regression line.
Syntax:
=SLOPE (known_ys, known_xs)
Arguments:
Known ys- Range of y values as dependent variable.
Known xs- Range of x values as independent variable.
INTERCEPT function:
To get intercept from regression line where it is intercepting with y axis.
Syntax:
=INTERCEPT (known_ys, known_xs)
Arguments:
Known ys- Range of known dependent variable (y) values.
Known xs- Range of known independent variable (x) values.
RSQ function:
R2 is an important parameter to check the fit between dependent and independent variables. RSQ function has been introduced to quickly create R2 output. If the output is near to zero, it is considered as good fit and dependent and independent variables are associated with each other.
Syntax:
=RSQ (known_ys, known_xs)
Arguments:
Known ys- Range of known dependent variable (y) values.
Known xs- Range of known independent variable (x) values.
LINEST function:
It is a dynamic and multipurpose regression function to find out Slope & Intercept, set slope and get additional statistical output. With one function more output of regression parameters can be found.
Syntax:
=LINEST (known_ys, [known_xs], [const], [stats])
Arguments:
Known ys- Range of known dependent variable (y) values.
Known xs [optional but recommended for input]- Range of known independent variable (x) values.
Const [optional input]- Two options are available, TRUE and FALSE to set intercept. In case of TRUE, normal intercept will be considered and in case of FALSE, it will be zero.
Stats [optional input]- To find additional statistics output related to regression. Additional out put will be added if argument mentioned as TRUE. If it is FALSE, only slope and intercept will be visible. Location of the statistical output shown in the image.
How to use regression in excel
Regression equation can be found by three ways in excel.
Using functions- By using functions like SLOPE, INTERCEPT and LINEST.
Using scatterplot option- insert regression equation in scatterplot.
Using data analysis tool- By using regression option in data analytics tool.
Using Functions:
As discussed in “Excel functions for regression” part that by using functions like SLOPE, INTERCEPT or LINEST functions, desired components of regression equation can be found.
Using scatterplot:
Scatterplot is a powerful data visualization graph by which you can understand the relation between variables. Not only the relational visibility but also regression equation and R2 can be added into it. Let’s look into the steps to add regression equation and R2 in scatterplot.
Step 1- Select the data set, go to ‘insert’ tab and insert scatterplot.
Step 2- Go to chart elements/ “+” icon of the scatterplot and add trendline.
Step 3- Double click on the trendline or click on ‘format trendline’ option which will be available by right click on the trendline. After that, from the right-side window, select ‘display equation on chart’ to add regression equation on the scatterplot. To add R2 value on the scatterplot, select ‘display R-squared value on chart’.
Liner regression is used as example to show how to add regression equation in the scatterplot. If you want to use other types of regression equation/ trendline like logarithmic, exponential, power or others, select the trendline options as shown in the screenshot. Regression equation will be visible as per the trendline option selected.
Using data analysis tool:
Most dynamic option in excel to get regression details is to use data analysis tool. Data analysis tool is placed within data tab in excel.
Details are available with this option like residual, standard error, R square, adjusted R square, t-stat, p-value will be available by using the said option.
If ‘Data Analysis’ option is not visible in your excel data tab, add it with the steps shown below.
Step 1- Go to ‘File’ and click on ‘Options’
Step 2- From the pop-up window, click on ‘Add-ins’. After that, click on ‘Go…’ button placed with Manage option. Remember to keep ‘Excel Add-ins’ in the manage drop down.
Step 3- Select ‘Analysis ToolPak’ from the add-ins options, available in the pop-up window.
To get regression details by using data analysis tool, follow below steps.
Step 1- Click on ‘Data Analysis’ option and select ‘Regression’ from the long list of analysis options.
Step 2- In the pop-up regression window, input x and y value range. Select output options and click on ‘OK’. Regression details will be available like shown below.
So, to add regression equation or get regression details, excel functions are frequently used option. If you want details value of all regression parameters, use ‘Data Analysis’ tool.