## Reviewing for Math 2200 Exam 2

This informal document is intended to help you study and to point you to additional resources. I don't guarantee that it is complete -- please let me know if you feel something important is missing.

### Main topics

• Statistical significance -- centers vs spread
• Linear methods
• shapes required of the scatterplot (straight enough, equal spread, no curves, etc.)
• shapes required of the residual plot (cloudlike, equal spread, no curves, etc.)
• R² and its relationship to variance
• residuals definition and meaning
• outliers, influence, and leverage
• What to do with outliers and when
• extrapolating
• lurking variables
• Finer methods
• splitting data into groups -- cat-quant-quant comparison
• straightening data with reexpression
• Reexpressing data
• Methods:
• 'ladder' of powers -- common reexpressions
• log-log for scatterplots when all else fails
• common reexpression situations
• sqrt for counts
• log for things that grow exponentially
• reciprocal for ratios
• with multiple variables, try picking a reexpression that makes sense for each separately
• Goals:
• scatterplot -- quant-quant -- straighten, even up spread
• (can't straighten if it goes up and then down)
• single variable -- quant -- make normal-shaped, e.g., symmetric
• (can't if bimodal; split into subgroups instead)
• side-by-side boxplots -- cat-quant -- make spreads similar, so we can compare differences fairly
• Randomness and probability
• how to get random numbers
• random simulations (and the TI)
• Probability axioms
• 0 <= P(A) <= 1
• P(sample space) = 1
• P(not A) = 1 - P(A)
• mutually exclusive A and B: P(A or B) = P(A) + P(B)
• independent A and B: P(A and B) = P(A) * P(B)
• conditional probability
• definition: P(A | B) = P(A and B) / P(B)
• idea
• relationship with independence
• expected value E(X) -- center
• variance Var(X) and standard deviation SD(X) -- spread
• Cov(X, Y) and correlation -- relatedness (independent or not?)
• rules for computing with E and Var
• Sampling
• as randomly as possible, to avoid bias
• population parameters vs sample statistics
• Methods:
• simple random
• stratified
• clustered
• systematic
• multistage -- one then another
• Experiments
• randomize, and control (incl. blocking)
• blinding, double-blinding, and placebos
• confounding variables
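
The conditional probability definition and its relationship with independence can be checked on a small example. The following is a minimal sketch, not part of the course materials: the two-dice setup, the events, and every function name are made up for illustration. It computes P(A | B) = P(A and B) / P(B) exactly by counting outcomes, tests independence with P(A and B) = P(A) * P(B), and then estimates the same conditional probability with a random simulation.

```python
import random
from fractions import Fraction
from itertools import product

# Sample space (illustrative assumption): the 36 equally likely
# outcomes of rolling two fair dice.
OUTCOMES = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event, given as a true/false function of an outcome."""
    favorable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favorable, len(OUTCOMES))


def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B). Only defined when P(B) > 0."""
    return prob(lambda o: a(o) and b(o)) / prob(b)


def is_independent(a, b):
    """A and B are independent exactly when P(A and B) = P(A) * P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)


# Hypothetical events chosen for the example.
def sum_is_8(o):
    return o[0] + o[1] == 8


def first_is_even(o):
    return o[0] % 2 == 0


def second_is_3(o):
    return o[1] == 3


def simulate_cond_prob(a, b, trials=100_000, seed=2200):
    """Estimate P(A | B): among simulated rolls where B happened, the fraction where A also happened."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    b_count = 0
    a_and_b_count = 0
    for _ in range(trials):
        outcome = (rng.randint(1, 6), rng.randint(1, 6))
        if b(outcome):
            b_count += 1
            if a(outcome):
                a_and_b_count += 1
    return a_and_b_count / b_count


print("P(sum is 8) =", prob(sum_is_8))
print("P(sum is 8 | first is even) =", cond_prob(sum_is_8, first_is_even))
print("independent?", is_independent(sum_is_8, first_is_even))
print("first even vs. second is 3 independent?", is_independent(first_is_even, second_is_3))
print("simulated P(sum is 8 | first is even) ~", simulate_cond_prob(sum_is_8, first_is_even))
```

Knowing that the first die is even changes the probability that the sum is 8 (from 5/36 to 1/6), so those two events are not independent. The first die being even tells you nothing about the second die, so that pair is independent.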
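
For the reexpression items ("log for things that grow exponentially") and the R² item, a small numerical sketch may help. It is only an illustration under stated assumptions: the data are invented (exact exponential growth with no noise), and the helper names `fit_line` and `r_squared` are hypothetical. It fits a least-squares line, computes R² as 1 minus the residual variation divided by the total variation in y, and compares the fit before and after taking logs.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """R^2 = 1 - (variation left in the residuals) / (total variation in y)."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    # residual = observed y minus the y predicted by the line
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


# Made-up data: a quantity that starts at 3 and grows 50% per step.
xs = list(range(10))
ys = [3 * 1.5 ** x for x in xs]

# Reexpress y with a log; exponential growth becomes a straight line.
log_ys = [math.log10(y) for y in ys]

r2_raw = r_squared(xs, ys)
r2_log = r_squared(xs, log_ys)

print("R^2 for y vs x:     ", round(r2_raw, 3))
print("R^2 for log(y) vs x:", round(r2_log, 3))
```

Real data will have scatter, so R² after reexpression will not be exactly 1. A high R² also does not replace looking at the scatterplot and the residual plot.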

### Finding practice problems

First: you should do the suggested problems from the schedule! About half of the exam will be taken directly from the suggested problems, so working these is the best thing you can do to study.

Next: you may want to look at the old exams. Do this less because they make such good practice problems and more to get a sense of what a multiple-choice statistics exam looks like, or at least what one has looked like in the past.

If you still want more, then:

• If you're using the book, it has a lot of problems that were not assigned, and almost all of them are appropriate and worthwhile to think about.
• If you're using MyStatLab, there is a "STUDY PLAN" link at the left of the main page. This will take you to additional practice problems. As usual, MyStatLab will tell you if you've done them correctly.
• (Please note that you can also browse the textbook online in MyStatLab by following the "Chapter Contents" link from the main page, and look at the problems in the textbook that way.)

### Getting help

I will (personally) be holding a review session at a time and place TBA (but most likely Monday 8-10pm someplace). Come with thoughtful questions, or come to listen to the thoughtful questions of your peers.
I'll have my usual office hours Tuesday 12 - 2 PM.
I'll have some limited availability for appointments at other times on Monday and Tuesday afternoons. Akshay will unfortunately not be available this time around, although he will answer emailed questions about the homework or other topics. Email either of us (or stop by my office) to make an appointment and come talk.