Hypothesis Testing in Linear Regression

If you need an introduction to linear regression, go here!

You must be familiar with the output of a linear regression model.

We get a p-value for each variable.

What is this p-value?

Hypothesis Testing:

For a simple linear regression, as you know, the equation will look like this:

y = Beta_0 + Beta_1 * X + error

If there is no linear relationship between X and y, then Beta_1 will be zero. Conversely, if Beta_1 is 0, the expected value of y does not change even if X changes.

Our claim here is that X has an effect on y in the population. Unless we test this, we can’t be sure.

Step 1: Let's set up the hypotheses:

H0 : Beta_1 = 0 # Null hypothesis: the coefficient is 0

H1 : Beta_1 ≠ 0 # Alternative hypothesis: this is our claim

Step 2: Define alpha (the significance level). We are taking it as 0.05.

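Step 3: Find the test statistic. In our case, the test statistic follows a t distribution (with n - 2 degrees of freedom for a simple linear regression on n observations), and it looks like this:

t = (estimated Beta_1 - 0) / SE(estimated Beta_1)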
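To make Step 3 concrete, here is a minimal sketch that computes the slope, its standard error and the t statistic by hand, using only the Python standard library. The data values and the variable names (x, y, beta_1, se_beta_1, t_stat) are made up for illustration and are not from any real data set.

```python
import math

# Hypothetical toy data, made up for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares estimates of the slope (Beta_1) and intercept (Beta_0).
s_xx = sum((xi - x_mean) ** 2 for xi in x)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
beta_1 = s_xy / s_xx
beta_0 = y_mean - beta_1 * x_mean

# Residual variance, using n - 2 degrees of freedom
# (two parameters, Beta_0 and Beta_1, were estimated).
residuals = [yi - (beta_0 + beta_1 * xi) for xi, yi in zip(x, y)]
df = n - 2
residual_variance = sum(r ** 2 for r in residuals) / df

# Standard error of the slope and the t statistic for H0: Beta_1 = 0.
se_beta_1 = math.sqrt(residual_variance / s_xx)
t_stat = (beta_1 - 0) / se_beta_1

print(f"beta_1 = {beta_1:.4f}, SE = {se_beta_1:.4f}, t = {t_stat:.2f}, df = {df}")
```

The further the t statistic is from 0, the less compatible the data are with the null hypothesis that Beta_1 is 0.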

Step 4: Find the p-value of the test statistic under the t distribution.

If the p-value is > 0.05, we fail to reject the null hypothesis. This does not prove that there is no relationship; it only means that our data set does not give enough evidence of a relationship between Xi (in the case of multiple variables) and y.

If the p-value is < 0.05, then the null hypothesis is rejected, meaning there is a statistically significant relationship between our Xi and y.
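Step 4 and the decision rule can be sketched as follows. This is a rough illustration that uses only the standard library: the two-sided p-value is obtained by numerically integrating the t density with Simpson's rule. The function names (t_pdf, two_sided_p_value) and the example values t_stat = 2.5 and df = 18 are hypothetical; in practice a statistics library such as SciPy would usually do this for you.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=10_000):
    """P(|T| >= |t_stat|) under H0, via Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t_stat|
    return max(0.0, 1.0 - 2.0 * area)

alpha = 0.05
t_stat, df = 2.5, 18  # hypothetical values, for illustration only

p_value = two_sided_p_value(t_stat, df)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f} -> {decision}")
```

Note that the outcome is phrased as "fail to reject H0" rather than "accept H0": a large p-value means the evidence is insufficient, not that the coefficient is proven to be 0.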

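If there are multiple independent variables, the same test is run for each coefficient: for variable Xi, the null hypothesis is Beta_i = 0, with the other variables kept in the model. (The null hypothesis that all coefficients are 0 at once is a separate test, the F-test.)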

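If, for a variable in the data set, the p-value comes out to be > 0.05, it means there is no statistically significant evidence of a relationship between that variable and the target, given the other variables in the model. Such a variable is a candidate for dropping, ideally one variable at a time, re-fitting the model after each removal.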

Image taken from medium.com

If you look at the p-values above, you can drop the variables with very high p-values.

Happy Learning!

🙂
