# Data Analysis and Probability, Grades 9-12

## Concepts and Outcomes from

SFAA - Science for All Americans

NAEP - National Assessment of Educational Progress

### Facts, Concepts, and Generalizations

- SFAA - Select a sampling technique to gather data, analyze the resulting data and make inferences.
- SFAA - A set of data can be represented by a few summary characteristics that may reveal or conceal important aspects of it.
- SFAA - To get an idea of what a set of data is like, for example, we can plot each case on a number line, and then inspect the plot to see where cases are piled up, where some are separate from the others, where the highest and lowest are, and so on.
- SFAA - Alternatively, the data set can be characterized in a summary fashion by describing where its middle is and how much variation there is around that middle.
- SFAA - Common sources of bias in drawing samples include convenience (for example, interviewing only one's friends or picking up only surface rocks), self-selection (for example, studying only people who volunteer or who return questionnaires), failure to include those who have dropped out along the way (for example, testing only students who stay in school or only patients who stick with a course of therapy), and deciding to use only the data that support our preconceptions.

- SFAA - If sampling is done without bias in the method, then the larger the sample is, the more likely it is to represent the whole accurately.
- SFAA - On the other hand, the actual size of the total population from which a sample is drawn has little effect on the accuracy of sample results. A random sample of 1,000 would have about the same margin of error whether it were drawn from a population of 10,000 or from a similar population of 100 million.
- SFAA - Two quantities are positively correlated if having more of one is associated with having more of the other. (A negative correlation means that having more of one is associated with having less of the other.) But even a strong correlation between two quantities does not mean that one is necessarily a cause of the other. Either one could possibly cause the other, or both could be the common result of some third factor. For example, life expectancy in a community is positively correlated with the average number of telephones per household. One could look for an explanation for how having more telephones improves one's health or why healthier people buy more telephones. More likely, however, both health and number of telephones are the consequence of the community's general level of wealth, which affects the overall quality of nutrition and medical care, as well as the people's inclination to buy telephones.
- SFAA - Statistics is a form of mathematics that develops useful ways for organizing and analyzing large amounts of data.
- SFAA - We are often presented with summary data that purport to demonstrate a relationship between two variables but lack essential information. For example, the claim that "more than 50 percent of married couples who have different religions eventually get divorced" would not tell us anything about the relationship between religion and divorce unless we also knew the percentage of couples with the same religion who get divorced. Only the comparison of the two percentages could tell us whether there may be a real relationship. Even then, caution is necessary because of possible bias in how the samples were selected and because differences in percentage could occur just by chance in selecting the sample. Proper reports of such information should include a description of possible sources of bias and an estimate of the statistical uncertainty in the comparison.
- SFAA - The most familiar statistic for summarizing a data distribution is the mean, or common average; but care must be taken in using or interpreting it. When data are discrete (such as number of children per family), the mean may not even be a possible value (for example, 2.2 children). When data are highly skewed toward one extreme, the mean may not even be close to a typical value. For example, a small fraction of people who have very large personal incomes can raise the mean considerably higher than the bulk of people piled at the lower end can lower it. The median, which divides the lower half of the data from the upper half, is more meaningful for many purposes. When there are only a few discrete values of a quantity, the most informative kind of average may be the mode, which is the most common single value—for example, the most common number of cars per U.S. family is 1.
- SFAA - More generally, averages by themselves neglect variation in the data and may imply more uniformity than exists. For example, the average temperature on the planet Mercury of about 15°F does not sound too bad, until one considers that it swings from 300°F above to almost 300°F below zero. The neglect of variation can be particularly misleading when averages are compared. For example, the fact that the average height of men is distinctly greater than that of women could be reported as "men are taller than women," whereas many women are taller than many men. To interpret averages, therefore, it is important to have information about the variation within groups, such as the total range of data or the range covered by the middle 50 percent. A plot of all the data along a number line makes it possible to see how the data are spread out.
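
Two of the points above lend themselves to a quick numerical check: how a few extreme values pull the mean away from the median, and how little the population size matters for the margin of error of a random sample. The following is a minimal sketch, not a prescribed classroom method. The income figures are made up for illustration, the function names `summarize` and `margin_of_error` are hypothetical, and the margin-of-error formula assumes a simple random sample, a proportion near 50 percent, a 95 percent confidence level (z = 1.96), and the usual finite population correction.

```python
import math
import statistics


def summarize(values):
    """Return the mean, median, mode, and the range covered by the middle 50 percent."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "middle_50": (q1, q3),
    }


def margin_of_error(sample_size, population_size, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion (assumed formula).

    Uses the normal approximation with a finite population correction.
    """
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    correction = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * correction


# Hypothetical household incomes in thousands of dollars; one very large value.
incomes = [22, 25, 25, 28, 30, 31, 35, 40, 45, 900]
summary = summarize(incomes)
print(summary)

# Same sample size, very different population sizes.
small_population = margin_of_error(1_000, 10_000)
large_population = margin_of_error(1_000, 100_000_000)
print(f"population 10,000:      {small_population:.3f}")
print(f"population 100,000,000: {large_population:.3f}")
```

With these made-up incomes the single large value lifts the mean far above the median, while the median and the middle 50 percent stay close to what most households earn. The two margins of error differ by only a fraction of a percentage point, which is consistent with the claim that a sample of 1,000 is about as accurate for a population of 10,000 as for one of 100 million.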

### Outcomes

- NAEP - Read, interpret, and make predictions using tables and graphs (Read and interpret data, solve problems by estimating and computing with data, interpolate or extrapolate from data).
- NAEP - Organize and display data and make inferences (Use tables, histograms (bar graphs), pictograms, and line graphs; use circle graphs and scattergrams; use stem-and-leaf plots and box-and-whisker plots; make decisions about outliers).
- NAEP - Understand and apply sampling, randomness, and bias in data collection (Given a situation, identify sources of sampling error; describe a procedure for selecting an unbiased sample; make generalizations based on sample results).
- NAEP - Describe measures of central tendency and dispersion in real-world situations.
- NAEP - Use measures of central tendency, correlation, dispersion, and shapes of distributions to describe statistical relationships (Use standard deviation and variance; use the standard normal distribution; make predictions and decisions involving correlation).
- NAEP - Use formulas and more formal terminology to describe various situations.
- NAEP - Have a basic understanding of the use of mathematical equations and graphs to interpret data, including the use of curve fitting to match a set of data with an appropriate mathematical model.
- NAEP - Use a variety of statistical techniques to model situations and solve problems.
- NAEP - Apply concepts of probability to explore dependent and independent events, and have some knowledge of conditional probability.
- NAEP - Understand and reason about the use and misuse of statistics in our society (Given certain situations and reported results, identify faulty arguments or misleading presentations of the data; appropriately apply statistics to real-world situations; fit a line or curve to a set of data and use this line or curve to make predictions about the data, using frequency distributions where appropriate).
- NAEP - Design a statistical experiment to study a problem and communicate the outcomes.
- NAEP - Use basic concepts, trees, and formulas for combinations, permutations, and other counting techniques to determine the number of ways an event can occur.
- NAEP - Determine the probability of a simple event (Estimate probabilities by use of simulations; use sample spaces and the definition of probability to describe events; describe and make predictions about expected outcomes).
- NAEP - Apply the basic concept of probability to real-world situations (Use probabilistic thinking informally; use probability related to independent and dependent events; use probability related to simple and compound events; use conditional probability).
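
Several of the probability outcomes above (counting techniques, sample spaces, simulation, and dependent versus independent events) can be illustrated with a pair of dice. The sketch below is one possible illustration, not part of the NAEP framework. The two-dice scenario and the helper names `probability`, `conditional_probability`, and `estimate_by_simulation` are assumptions chosen for the example.

```python
import math
import random
from fractions import Fraction
from itertools import product

# Counting techniques: choosing 2 items from 5.
combinations = math.comb(5, 2)   # order does not matter
permutations = math.perm(5, 2)   # order matters

# Sample space for rolling two fair dice: 36 equally likely outcomes.
sample_space = list(product(range(1, 7), repeat=2))


def probability(event, space):
    """Probability of an event as favorable outcomes over total outcomes."""
    favorable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favorable), len(space))


def conditional_probability(event, given, space):
    """P(event | given): restrict the sample space to outcomes where `given` holds."""
    reduced_space = [outcome for outcome in space if given(outcome)]
    return probability(event, reduced_space)


def estimate_by_simulation(event, trials=100_000, seed=0):
    """Estimate a probability by simulating many rolls of two dice."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(
        event((rng.randint(1, 6), rng.randint(1, 6))) for _ in range(trials)
    )
    return hits / trials


def sum_is_7(roll):
    return sum(roll) == 7


def sum_is_8(roll):
    return sum(roll) == 8


def first_is_even(roll):
    return roll[0] % 2 == 0


print("C(5, 2) =", combinations, " P(5, 2) =", permutations)
print("P(sum = 8) =", probability(sum_is_8, sample_space))
print("P(sum = 8 | first even) =",
      conditional_probability(sum_is_8, first_is_even, sample_space))
print("P(sum = 7) =", probability(sum_is_7, sample_space))
print("P(sum = 7 | first even) =",
      conditional_probability(sum_is_7, first_is_even, sample_space))
print("simulated P(sum = 8) ~", estimate_by_simulation(sum_is_8))
```

In this example, knowing that the first die is even changes the probability of a sum of 8 (from 5/36 to 1/6), so those two events are dependent. It leaves the probability of a sum of 7 unchanged at 1/6, so those two events are independent. The simulated estimate should land close to the exact value of 5/36 and gets closer as the number of trials grows.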