Before stationarity, let's understand white noise. What is white noise? A white noise series is a sequence of random numbers that cannot be predicted. More formally, a series is white noise if its values are independent and identically distributed, with a mean of zero and constant variance. Each value has zero correlation with the other values in... Continue Reading →
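The full post isn't shown here, but the defining properties above (zero mean, constant variance, zero autocorrelation) can be checked on a simulated series. A minimal sketch with hypothetical Gaussian noise:

```python
import random

random.seed(42)
# A white-noise series: independent, identically distributed draws
# with mean 0 and constant variance.
n = 5000
series = [random.gauss(0.0, 1.0) for _ in range(n)]

mean = sum(series) / n
var = sum((x - mean) ** 2 for x in series) / n

def autocorr(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    m = sum(xs) / len(xs)
    num = sum((xs[t] - m) * (xs[t - lag] - m) for t in range(lag, len(xs)))
    den = sum((x - m) ** 2 for x in xs)
    return num / den

# For white noise, the autocorrelation at any nonzero lag should sit
# near zero (roughly within ±2/sqrt(n) ≈ ±0.028 for n = 5000).
acf1 = autocorr(series, 1)
```

Because past values carry no information about future ones (near-zero autocorrelation), a white noise series is, by construction, unpredictable.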
Hypothesis Testing in Linear Regression
If you need an introduction to Linear Regression, go here! https://vipanchiks.wordpress.com/2022/05/28/linear-regression-a-detailed-introduction/ You must be familiar with the output of a linear regression model: we get a p-value for each variable. What is this p-value? Hypothesis Testing: For a simple linear regression, as you know, the equation will look like this: If... Continue Reading →
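The p-value for a slope comes from a t-test of the null hypothesis that the true slope is zero. A minimal sketch on a hypothetical dataset, computing the t-statistic behind that p-value by hand:

```python
import math

# Hypothetical data where x clearly drives y (roughly y = 2x).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

beta1 = sxy / sxx        # least-squares slope estimate
beta0 = my - beta1 * mx  # intercept estimate

# Residual standard error, then the standard error of the slope.
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
rse = math.sqrt(sum(r * r for r in residuals) / (n - 2))
se_beta1 = rse / math.sqrt(sxx)

# t-statistic for H0: slope = 0. A large |t| gives a tiny p-value,
# so we reject H0 and treat x as a significant predictor.
t_stat = beta1 / se_beta1
```

Statistical software converts `t_stat` into the p-value you see in the regression output by looking it up against a t-distribution with n - 2 degrees of freedom.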
Linear Regression – A detailed introduction
What is Regression? We use this term very often in Machine Learning and Statistics. What does it mean? It is a statistical method used to determine the relation between one variable (dependent) and another variable (independent). The literal meaning of regression is stepping back towards the average. So, where from... Continue Reading →
Multicollinearity – A detailed Understanding
Multicollinearity: It is the existence of correlation among the predictor variables. "Multicollinearity is the occurrence of high intercorrelations among two or more independent variables in a multiple regression model." (Investopedia) Why is it a problem? Multicollinearity among independent variables results in less reliable statistical inferences. It increases the variability of the coefficients, making the estimates sensitive to... Continue Reading →
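A standard way to detect multicollinearity is the Variance Inflation Factor: regress one predictor on the other(s) and compute VIF = 1 / (1 - R²). A minimal sketch with two hypothetical, deliberately near-duplicate predictors:

```python
import random

random.seed(0)
# Hypothetical predictors: x2 is nearly a linear function of x1,
# so the two are highly intercorrelated (multicollinear).
x1 = [random.gauss(0, 1) for _ in range(200)]
x2 = [2 * v + random.gauss(0, 0.1) for v in x1]

def r_squared(x, y):
    """R^2 of a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# VIF = 1 / (1 - R^2). A common rule of thumb flags VIF above 10
# as serious multicollinearity.
vif = 1 / (1 - r_squared(x1, x2))
```

In practice, libraries such as statsmodels provide a ready-made `variance_inflation_factor`, but the calculation is exactly this regress-one-predictor-on-the-rest idea.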
R-squared in Linear Regression – Explained
Linear Regression tries to fit a straight line that represents all the data points with minimum error. In general, a line is good if the difference between the predicted value and the actual value is small. "R-squared is the percentage of variation explained by the relationship between two variables." (towardsdatascience.com) Understanding R-Squared: Scatter... Continue Reading →
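The "percentage of variation explained" reading falls straight out of the formula R² = 1 - SS_res / SS_tot. A minimal sketch on hypothetical data, which also checks that for simple linear regression R² equals the squared correlation:

```python
import math

# Hypothetical data with a clear linear trend plus small noise.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.2, 2.1, 2.8, 4.3, 4.9, 6.2, 6.8, 8.1, 9.2, 9.9]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Fit the least-squares line.
sxx = sum((a - mx) ** 2 for a in x)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
slope = sxy / sxx
intercept = my - slope * mx

# R^2 = 1 - SS_res / SS_tot: the fraction of the variation in y
# that the fitted line explains.
ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
ss_tot = sum((b - my) ** 2 for b in y)
r2 = 1 - ss_res / ss_tot

# For simple linear regression, R^2 is exactly the squared
# Pearson correlation between x and y.
corr = sxy / math.sqrt(sxx * ss_tot)
```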
Gradient Descent – Linear regression
It is a commonly used optimisation algorithm for training machine learning models and neural networks. Data helps these models learn over time. The Gradient Descent algorithm minimises the error and finds an optimal function (linear equation) at that minimum error. This way our predictions carry the least possible error. It has many applications... Continue Reading →
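As a minimal sketch of the idea, here is gradient descent fitting a straight line to hypothetical data generated from y = 3x + 1, by repeatedly stepping the parameters opposite the gradient of the mean squared error:

```python
# Hypothetical data generated from y = 3x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 4.0, 7.0, 10.0, 13.0]

w, b = 0.0, 0.0  # initial slope and intercept
lr = 0.02        # learning rate (step size)
n = len(xs)

for _ in range(5000):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Step opposite the gradient to reduce the error.
    w -= lr * grad_w
    b -= lr * grad_b
```

After enough iterations `w` and `b` converge to the true slope 3 and intercept 1; a learning rate that is too large would make the updates diverge instead.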
Why should we Standardize Data in a KNN model?
In a KNN model, we have to standardize the data before sending it to the model. Why do we have to do that? Consider this data: In the data above, Age is in years and Income is in lakhs of rupees. Now, if we calculate the Euclidean distance between 2 data points, it will be... Continue Reading →
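The original table isn't reproduced here, but the effect is easy to demonstrate on hypothetical numbers: in raw units the income gap swamps the age gap, and z-score standardization puts both features on a comparable scale. A minimal sketch:

```python
import math

# Hypothetical people: (age in years, income in rupees).
# Ages differ by tens; incomes differ by thousands, so income
# dominates any raw Euclidean distance.
a = (25, 500000)
b = (60, 510000)

raw_dist = math.dist(a, b)
# The 35-year age gap contributes almost nothing next to the
# 10,000-rupee income gap.

def standardize(col):
    """z-score a column: subtract the mean, divide by the std deviation."""
    m = sum(col) / len(col)
    sd = math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
    return [(v - m) / sd for v in col]

ages = [25, 60, 30, 45]
incomes = [500000, 510000, 900000, 300000]
z_age, z_income = standardize(ages), standardize(incomes)

# Distance between the same two people after standardization:
# both features now contribute on a comparable scale.
std_dist = math.dist((z_age[0], z_income[0]), (z_age[1], z_income[1]))
```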
Confusion Matrix
To measure the performance of a prediction model, we use the Confusion Matrix. Let's say we have a two-class classification problem (yes/no). Confusion matrix: Look at the picture above. If a value is actually positive and your model predicted it as positive, then it is a True Positive. The count of TPs... Continue Reading →
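The four cells of the matrix are just counts over (actual, predicted) pairs. A minimal sketch on hypothetical yes/no labels, including the accuracy, precision, and recall that are read off the matrix:

```python
# Hypothetical actual vs. predicted labels for a yes/no classifier.
actual    = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
predicted = ["yes", "no",  "no", "yes", "yes", "no", "yes", "no"]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == "yes" and p == "yes")  # true positives
fn = sum(1 for a, p in pairs if a == "yes" and p == "no")   # false negatives
fp = sum(1 for a, p in pairs if a == "no" and p == "yes")   # false positives
tn = sum(1 for a, p in pairs if a == "no" and p == "no")    # true negatives

accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)  # of the predicted positives, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found
```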
K Nearest Neighbor(KNN)-1NN Algorithm-Explained
Let's say we have information about cars in a data set. The information is captured in variables like Weight, MPG, Horsepower, Acceleration, etc. Look at the data: each row represents a car with its attributes. Now, if you see a new car with a certain MPG, HP, and number of Cylinders, can you... Continue Reading →
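The cars data set from the post isn't reproduced here, but the 1-NN idea is simple to sketch with a few hypothetical cars: label a new car with the label of its single closest neighbor under Euclidean distance.

```python
import math

# Hypothetical cars: (MPG, horsepower, cylinders) with a known class label.
cars = [
    ((30.0, 80.0, 4), "economy"),
    ((28.0, 90.0, 4), "economy"),
    ((15.0, 200.0, 8), "muscle"),
    ((14.0, 220.0, 8), "muscle"),
]

def predict_1nn(query):
    """1-NN: return the label of the single nearest car to the query."""
    nearest = min(cars, key=lambda car: math.dist(car[0], query))
    return nearest[1]

# Classify a new car by whichever known car it sits closest to.
label = predict_1nn((29.0, 85.0, 4))
```

Note that, as the standardization post above argues, real features on very different scales should be standardized before computing these distances.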