- Statistical significance -- centers vs spread
- Linear methods
- shapes required
- of the scatterplot (straight enough, equal spread, no curves, etc)
- of the residual plot (cloudlike, equal spread, no curves, etc)

- *R*² and its relationship to variance
- residuals definition and meaning
- outliers, influence, and leverage
- What to do with outliers and when

- extrapolating
- lurking variables
- Finer methods
- splitting data into groups -- cat-quant-quant comparison
- straightening data with reexpression
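
To make the linear-methods items concrete, here is a minimal sketch of fitting a least-squares line, computing residuals, and getting *R*² as the fraction of the variation in *y* that the line accounts for. The data values and the function names (`least_squares`, `residuals`, `r_squared`) are made up for illustration; they are not from the course materials.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line for y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def residuals(xs, ys):
    """Residual = observed y minus the y predicted by the line."""
    slope, intercept = least_squares(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def r_squared(xs, ys):
    """Fraction of the total variation in y accounted for by the line."""
    y_bar = mean(ys)
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    ss_resid = sum(e ** 2 for e in residuals(xs, ys))
    return 1 - ss_resid / ss_total

# Hypothetical data, roughly linear (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares(xs, ys)
print("slope:", slope, "intercept:", intercept)
print("residuals:", residuals(xs, ys))
print("R^2:", r_squared(xs, ys))
```

The residuals from a least-squares line always sum to (essentially) zero, which is why the residual plot is judged on its shape and spread rather than on its center.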

- Reexpressing data
- Methods:
- 'ladder' of powers -- common reexpressions
- log-log for scatterplots when all else fails
- common reexpression situations
- sqrt for counts
- log for things that grow exponentially
- reciprocal for ratios

- with multiple variables, try picking a reexpression that makes sense for each separately

- Goals:
- scatterplot -- quant-quant -- straighten, even up spread
- (can't straighten if it goes up and then down)

- single variable -- quant -- make normal-shaped, e.g. symmetric
- (can't if bimodal, split into subgroups instead)

- side-by-side boxplots -- cat-quant -- make spreads similar, so we can compare differences fairly
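
As a rough illustration of "log for things that grow exponentially", the sketch below compares the correlation of *x* with *y* against the correlation of *x* with log(*y*) for made-up exponential data. The data and the helper name `correlation` are assumptions for this example only.

```python
import math
from statistics import mean

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y grows by 50% for each unit increase in x.
xs = list(range(1, 11))
ys = [3 * 1.5 ** x for x in xs]

r_raw = correlation(xs, ys)                         # curved scatterplot
r_log = correlation(xs, [math.log(y) for y in ys])  # after reexpressing y

print("r before reexpression:", r_raw)
print("r after taking log(y):", r_log)
```

A higher *r* after reexpression is only a hint. The real check is still the scatterplot and residual plot of the reexpressed data.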

- Randomness and probability
- how to get random numbers
- random simulations (and the TI)
- Probability axioms
- 0 <= *P*(*A*) <= 1
- *P*(sample space) = 1
- *P*(not *A*) = 1 - *P*(*A*)
- mutually exclusive *A* and *B*: *P*(*A* or *B*) = *P*(*A*) + *P*(*B*)
- independent *A* and *B*: *P*(*A* and *B*) = *P*(*A*) × *P*(*B*)

- conditional probability
- definition: *P*(*A*|*B*) = *P*(*A* and *B*) / *P*(*B*)
- idea
- relationship with independence

- center, spread, and relatedness
- expected value *E*(*X*) -- center
- variance Var(*X*) -- spread
- (standard deviation)

- Cov(*X*, *Y*) and correlation -- relatedness (independent or not?)
- rules for computing with *E* and Var
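
The probability items above can be checked by brute force on a small sample space. This sketch assumes two fair dice and two made-up events, *A* (first die shows 6) and *B* (total is at least 10). It computes a conditional probability, tests independence with the product rule, and finds the expected value and variance of the total. The helper names `prob` and `expected` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two fair dice (an assumed example).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) = (number of outcomes in the event) / (number of outcomes)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def expected(f):
    """E(f) when every outcome is equally likely."""
    return sum(Fraction(f(o)) for o in outcomes) / len(outcomes)

def A(o):
    return o[0] == 6            # first die shows 6

def B(o):
    return o[0] + o[1] >= 10    # total is at least 10

def total(o):
    return o[0] + o[1]

p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(lambda o: A(o) and B(o))

# Conditional probability: P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

# A and B are independent exactly when P(A and B) = P(A) * P(B),
# which is the same as saying P(A|B) = P(A).
independent = (p_a_and_b == p_a * p_b)

e_total = expected(total)                                  # center
var_total = expected(lambda o: (total(o) - e_total) ** 2)  # spread

print("P(A|B) =", p_a_given_b)        # 1/2
print("independent?", independent)    # False
print("E(total) =", e_total)          # 7
print("Var(total) =", var_total)      # 35/6
```

Here *P*(*A*|*B*) = 1/2 is much larger than *P*(*A*) = 1/6, so knowing *B* changes the probability of *A* and the events are not independent.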

- Sampling
- as randomly as possible / avoid bias
- population parameters vs sample statistics
- Methods:
- simple random
- stratified
- clustered
- systematic
- multistage -- one then another
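
The sampling methods listed above might be sketched as follows, using Python's standard `random` module on a made-up population of 200 students. The population, the strata (class year), the cluster size, and the sample sizes are all assumptions chosen for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 200 students, 50 in each class year.
years = ["first", "second", "third", "fourth"]
population = [{"id": i, "year": years[i % 4]} for i in range(200)]

# Simple random sample: every set of 20 students is equally likely.
simple_random = rng.sample(population, 20)

# Stratified: split into strata (class years), then take an SRS in each.
strata = {}
for person in population:
    strata.setdefault(person["year"], []).append(person)
stratified = [p for group in strata.values() for p in rng.sample(group, 5)]

# Systematic: pick a random starting point, then take every k-th person.
k = len(population) // 20
start = rng.randrange(k)
systematic = population[start::k]

# Cluster: split into clusters of 10 (say, classrooms), choose 2 clusters
# at random, and take everyone in the chosen clusters.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
clustered = [p for cluster in rng.sample(clusters, 2) for p in cluster]

for name, sample in [("simple random", simple_random),
                     ("stratified", stratified),
                     ("systematic", systematic),
                     ("clustered", clustered)]:
    print(name, sorted(p["id"] for p in sample))
```

A multistage sample would chain two of these steps, for example choosing clusters at random and then taking a simple random sample within each chosen cluster.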

- Experiments
- randomize, and control (incl. blocking)
- blinding, double-blinding, and placebos
- confounding variables

First: you should do the suggested problems from the schedule! About half of the exam will be taken directly from the suggested problems, so working these is the best thing you can do to study.

Next: you may want to look at the old exams. Do this less because they're especially good practice problems, and more to get a sense of what a multiple-choice statistics exam might look like, or at least what one has looked like in the past.

If you still want more, then:

If you're using the book, it has a lot of problems in it that were not assigned, and almost all of them are appropriate and worthwhile to think about.

If you're using MyStatLab, then there is a link on the left side of the main page: "STUDY PLAN". This will take you to additional practice problems. As usual, MyStatLab will tell you if you've done them correctly.

(Please note that you can also browse the textbook online in MyStatLab by following the "Chapter Contents" link from the main page, and look at the problems in the textbook, etc.)

I'll have my usual office hours Tuesday 12 - 2 PM.

I'll have some limited availability for appointments at other times on Monday and Tuesday afternoons. Akshay will unfortunately not be available this time around, although he will answer emailed questions about the homework or other topics. Email either of us (or stop by my office) to make an appointment and come talk.