In previous videos, we took this bivariate data and we calculated the correlation coefficient, and just as a bit of a review, we have the formula here, and it looks a bit intimidating, but in that video we saw that all it is, is an average of the product of the z-scores for each of those pairs. And as we said, if r is equal to one, you have a perfect positive correlation. If r is equal to negative one, you have a perfect negative correlation, and if r is equal to zero, you don't have a correlation. But for this particular bivariate dataset, we got an r of 0.946, which means we have a fairly strong positive correlation. What we're going to do in this video is build on this notion and actually come up with the equation for the least squares line.

So before I do that, let's just visualize some of the statistics that we have here for these data points. We clearly have the four data points plotted, but let's plot the statistics for x. The sample mean and the sample standard deviation for x are here in red, and actually let me box these off in red so that you know that's what is going on here. The sample mean for x is easy to calculate: one plus two plus two plus three divided by four is eight divided by four, which is two, so we have x equals two right over here. This is one sample standard deviation above the mean, this is one sample standard deviation below the mean, and then we could do the same thing for the y variables. So the mean is three, and this is one sample standard deviation for y above the mean and this is one standard deviation for y below the mean. And visualizing these means, especially their intersection and also their standard deviations, will help us build an intuition for the equation of the least squares line.

The equation for any line is going to be y is equal to mx plus b, where m is the slope and b is the y-intercept. For the regression line we write y hat, so this, you would literally say y hat, and this tells you that this is a regression line that we're trying to fit to these points. The slope is going to be r times the ratio between the sample standard deviation in the y direction over the sample standard deviation in the x direction. This might not seem intuitive at first, but we'll talk about it in a few seconds and hopefully it'll make a lot more sense. The next thing we need to know is, alright, if we can calculate our slope, how do we calculate our y-intercept? Well, like you first learned in Algebra one, you can calculate the y-intercept if you already know the slope by saying, well, what point is definitely going to be on my line? And for a least squares regression line, you're definitely going to have the point sample mean of x comma sample mean of y.

So before I even calculate for this particular example, where in previous videos we calculated the r to be 0.946 or roughly equal to that, let's just think about what's going on. Our line is definitely going to go through that point. If we had a perfect positive correlation, then our slope would be the standard deviation of y over the standard deviation of x. If you were to start at this point, and if you were to run your standard deviation of x and rise your standard deviation of y, well, with a perfect positive correlation, your line would look like this. That makes sense because you're looking at your spread of y over your spread of x. If r were equal to one, this would be your slope: standard deviation of y over standard deviation of x. That has parallels to when you first learn about slope. Here, you could say you're seeing the average spread in y over the average spread in x. And this would be the case when r is one, so let me write that down.

What if r were equal to negative one? It would look like this. That would be our line if we had a perfect negative correlation. Now what if r were zero? Then your slope would be zero, and then your line would just be this line, y is equal to the mean of y.

Notes on the video:

The goal is to find the regression line that best fits the data points. He shows the formula for the correlation coefficient, but they have already done all the calculation to get it. They have also provided the mean and standard deviation for x and y. First they use Xmean and Ymean as a reference, and then he draws lines one standard deviation away from the mean on the x and y axes. Then he shows that with a perfect positive correlation the rise over run, which is the slope, is equal to Sy/Sx. But r also factors into this calculation: the slope is r times Sy/Sx. We know for a fact that the regression line passes through the point (Xmean, Ymean), where the two mean lines intersect. So we substitute m, Xmean, and Ymean into the line equation and then get the y-intercept.

I wouldn't have thought about it this way and was going to skip this video. FYI: this has applications in machine learning and AI.
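The slope and intercept steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the x values (1, 2, 2, 3) come from the text, but the y values are not listed anywhere in it, so the ones below are hypothetical values chosen to have a mean of three. The function name `least_squares_line` is also made up for this sketch.

```python
import statistics

# x values are taken from the text. The y values are an ASSUMPTION:
# the text only says their mean is three.
xs = [1, 2, 2, 3]
ys = [1, 2, 3, 6]


def least_squares_line(xs, ys):
    """Return (r, slope, intercept) using m = r * s_y / s_x and b = y_bar - m * x_bar."""
    n = len(xs)
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    s_x = statistics.stdev(xs)  # sample standard deviation (divides by n - 1)
    s_y = statistics.stdev(ys)

    # r is the sum of the products of the z-scores, divided by n - 1.
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)

    slope = r * s_y / s_x              # r scales the "spread of y over spread of x"
    intercept = y_bar - slope * x_bar  # the line must pass through (x_bar, y_bar)
    return r, slope, intercept


r, m, b = least_squares_line(xs, ys)
print(f"r = {r:.3f}, slope = {m:.2f}, intercept = {b:.2f}")
```

With these assumed y values, r comes out close to the 0.946 quoted above (small differences come from rounding), the slope is about 2.5, and the intercept is about -2. Setting r to 1, -1, or 0 in the slope formula reproduces the three special cases discussed: a slope of Sy/Sx, a slope of -Sy/Sx, and a flat line at the mean of y.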