Whenever we are performing descriptive statistics on historical data, it is required to know the spread of the data. Creating a Bell curve is such methodology where spread of data can be visualized. So, let us understand that what is Bell Curve. Also, we’ll discuss that how to create Bell Curve in excel.
Table of Contents
What is Bell Curve?
Bell Curve is a methodology to understand normal distribution of data. In other words, Bell Curve is the graphical form of normal distribution. In Bell Curve, X axis represents the values of distribution and Y axis represents the frequency of each value.
Middle axis peak of a normally distributed Bell Curve denotes Mean, Median and Mode. If we go towards left or right side from middle, we’ll found the intervals of standard deviation (σ) denoted by 1s, 2σ & 3σ.
Why Bell Curve?
After discussing what is Bell Curve, next question comes into the mind that why bell curve. As discussed in the beginning that Bell Curve is used to visualize the distribution of data. Also, the skewness of the data can be found through Bell Curve. If the data is skewed and bell curve is not symmetrical, in that case Mean and Median will be different and Median can be considered instead of Mean for further analysis.
What is Empirical Rule or Three-Sigma Rule of Bell Curve?
In Bell Curve, if we consider 1s (-1σ to 1σ), 68% data falls under 1 standard deviation (σ). Same way, 95% data falls under 2σ and 99.7% data falls under 3σ. Also, this is called 68-95-99 rule.
What is Skewness and Kurtosis?
In earlier paragraph (what is bell curve), it is discussed that bell curve is only possible if data is normally distributed. Many times, we are creating Bell Curve and it is not showing the symmetrical shape, either it is skewed towards left or right. On the other hand, the tail of the Bell Curve can be heavy or light. To explain the asymmetrical Bell Curve, Skewness and Kurtosis came into the picture.
If the Bell Curve is skewed towards left or right and it has a long tail, it is called skewed Bell Curve. There are two types of Skewness, Right or positive Skewed and Left or negative Skewed.
Kurtosis tells about the outliers present in a data or it can be considered as a measuring technique of outlier.
How to measure Skewness and Kurtosis with Excel functions?
To find out SKEWNESS of a data set, “SKEW” function is being used where positive or negative data output can be found. Based on negative or positive output, type of skewness can be decided.
To find out KURTOSIS of a data set, “KURT” function is being used. Output interpretation is given below.
If output value is “Zero”, it is Mesokurtic
If output value is “Negative” (-), it is Platykurtic
If output value is “Positive” (+), it is Leptokurtic
What is the difference between Normal Distribution (Gaussian Distribution) & Non-Normal Distribution?
In case of normal distribution, Bell Curve is looks symmetrical where maximum data points are located at the center and smoothly distributed towards left and right side by creating tail with diminishing order.
In case of non-normal distribution, Bell Curve is skewed toward left or right and form positive or negatively skewed Bell Curve.
What is Standard Deviation?
Standard Deviation is the measure of dispersion or spread of a data. In other words, Standard Deviation talks about the deviation of data from mean value. Standard Deviation denoted by Sigma (s).
To calculate Standard Deviation in Excel, “STDEV” function is being used.
What is Z Score?
Z score tells us that how many data points are falling under number of standard deviation from mean. As an example, if Z score is +1, that tell us that the data points are above mean value and 1 standard deviation.
How to create Bell Curve in Excel?
Excel is a basic data analytics tool used by all the industry experts every day. In excel 2016, insert statistical chart option is available but readymade bell curve option is not available. Hence, creating bell curve is not so hard and it requires the use of specific formula and arrange the data. To create Bell Curve in Excel, we need to use three functions, “AVERAGE” to calculate average or mean, “STDEV” to calculate standard deviation and “NORM.DIST” to calculate normal distribution.
We have below data set of different age by which we would like to create a Bell Curve to check distribution of data.
1. First arrange the data as ascending order.
2. Calculate average (Mean) by using “AVERAGE” function and calculate Standard Deviation by using “STDEV” function.
3. Create another column to add Normal distribution with “NORM.DIST” function. In NORM.DIST function, it is required to include average and standard distribution which is already calculated. Also use “FALSE” while applying NORM.DIST function for “Probability Mass Function”.
4. Finally create “Scatter Diagram” with smooth line by using Data (Age) and Normal Distribution (Norm Dist) column.
Bell Curve vs Histogram:
Both histogram and bell curve are being used while performing descriptive analytics. We can say that histogram is another way to find hidden story in a data set. Histogram is such graph which gives the clarity regrading distribution of data. But in histogram, data frequency is also visible. Histogram is also called frequency distribution graph. It is based on bar graph and frequency of each numerical value is represented through bar graph. In one hand, bell curve is based on the probability distribution methodology. On the other hand, histogram is based on the frequency distribution methodology.
Like bell curve, histogram can show the skewness and the data. Histogram shape can be classified as bell shaped, binomial shaped, left and right skewed, uniform shape and random shaped.
So, we have discussed that what is bell curve and other related points on bell curve. Bell Curve is one of the key concepts of descriptive statistics that’s why in any of the statistical tool (R, Python) it can be created to understand distribution of historical data.