Today's class:
(1) Quick overview of the Shapiro-Wilk test. When we test claims about the actual (population) correlation between two variables (ρ, or "rho"), we use the sample correlation, r. These tests assume that both variables used to generate r follow a normal distribution. For small samples, this assumption must hold for tests on ρ using r to be valid. What about larger samples? In such situations, we can fall back on the central limit theorem. One way to check the data when the sample is small is to run a goodness-of-fit test called the Shapiro-Wilk test. Click here to see a brief video on how to generate Shapiro-Wilk results using JMP. Click here to see how we use SAS or JMP to generate the Shapiro-Wilk statistic and its p-value. For an online version of Shapiro-Wilk, click here.
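For anyone who wants to try the same check outside of SAS or JMP, here is a minimal sketch in Python. It assumes the third-party SciPy package is installed and uses its scipy.stats.shapiro function; the function name looks_normal and the sample numbers are made up for illustration and are not from the course materials.

```python
try:
    from scipy import stats  # third-party package; assumed to be installed
except ImportError:
    stats = None  # the sketch below cannot run without SciPy


def looks_normal(sample, alpha=0.05):
    """Run Shapiro-Wilk and return (W, p_value, ok).

    H0: the data come from a normal distribution.
    ok is True when we do NOT reject H0 at the given alpha.
    """
    if stats is None:
        raise RuntimeError("SciPy is required for this sketch")
    w_stat, p_value = stats.shapiro(sample)
    return float(w_stat), float(p_value), bool(p_value >= alpha)


# Hypothetical small sample (made-up numbers, for illustration only).
training_times = [12.1, 13.4, 11.8, 12.9, 13.1, 12.5, 12.2, 13.8, 12.7, 12.4]

if stats is not None:
    w_stat, p_value, ok = looks_normal(training_times)
    print(f"W = {w_stat:.3f}, p-value = {p_value:.3f}, normality plausible: {ok}")
```

A small p-value here is evidence against normality, so for a small sample it would be a warning sign before running a test on ρ. Both variables should be checked, one at a time.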
(2) P-values and tests on rho. (Look at the correlation matrix on the SAS outputs.)
The p-value is defined as the lowest level of significance you could use in a given example and still reject the null. This is, to say the least, a little confusing. It's like saying you get an 84% on the first quiz and that therefore the cutoff for an A would have to be as low as an 84% for you to get an A for the quiz. Basically, if the p-value is less than alpha, then you reject the null. If the p-value is greater than alpha, then you do not reject the null.
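The decision rule can be sketched in a few lines of Python. The p-value of 0.03 below is a made-up number standing in for whatever appears in the SAS correlation matrix output, and reject_null is a hypothetical helper name.

```python
def reject_null(p_value, alpha=0.05):
    """Reject H0 exactly when the p-value is below the chosen alpha."""
    return p_value < alpha


# Hypothetical p-value, as might be read off a correlation matrix output.
p_value = 0.03

# Try several common significance levels against the same p-value.
for alpha in (0.10, 0.05, 0.01):
    decision = "reject H0" if reject_null(p_value, alpha) else "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

With a p-value of 0.03, the null is rejected at alpha = 0.10 and 0.05 but not at 0.01. The p-value marks the boundary: any alpha above 0.03 leads to rejection, and any alpha below it does not.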
(3) Intro to regression (assuming that you have read the introduction to chapter 11 as well as sections 1-3 for today!):
When we reject the null in a test of significance on rho, this indicates that there is a significant linear (line-like) relationship between the two variables (as in the training time and performance time example).
What sort of information does a straight-line equation yield? (1) slope, (2) y-intercept, and (3) predictions of y based on x.
The technique: OLS (ordinary least squares), or LS (least squares) for short.
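As a rough illustration of what least squares produces, here is a minimal sketch in plain Python using the standard textbook formulas for a one-predictor line (slope = Sxy / Sxx, intercept = y-bar minus slope times x-bar). The function names and the data are hypothetical; the numbers are not from the training time example in the text.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line for y on x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxy: sum of cross-products of deviations; Sxx: sum of squared x deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x


# Hypothetical data (made up for illustration): x = hours, y = score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

slope, intercept = least_squares_line(hours, score)
print(f"slope = {slope:.2f}, y-intercept = {intercept:.2f}")
print(f"predicted y at x = 6: {predict(slope, intercept, 6):.2f}")
```

The three printed quantities match the three things a straight-line equation gives us: the slope, the y-intercept, and a prediction of y for a chosen x.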
Happy Thanksgiving. See you Monday, November 28th.