2. WHAT ARE STATISTICS?
• Descriptive statistics are the focus here. They are simply
numbers (for example, percentages, numerals, fractions,
and decimals) that are used to describe or summarize a
larger body of numbers.
• A GPA is an example.
3. WHY USE STATISTICS
• Exposure to statistics will not go away, and the ability to
understand statistical concepts can help in a number of areas
(professional & personal). With increasing calls for
accountability, it will become all the more important that
classroom teachers understand the statistics reported to
them and the statistics reported to others.
4. TABULATING FREQUENCY DATA
• The first method is simply to list the scores in ascending or
descending numerical order.
• The List – scores are listed in descending order, which makes
it easier to identify trends, patterns, and individual scores
(if the number of scores is small). p. 267
• The Simple Frequency Distribution – summarizes data
effectively only if the spread of scores is small. Otherwise it
tends to be so lengthy that it is difficult to make sense of the
data.
5. TABULATING FREQUENCY DATA
• The Grouped Frequency Distribution – similar to the
simple frequency distribution, except that ranges or
intervals of scores are used for categories rather than
considering each possible score as a category (pp. 268 &
269).
6. GRAPHING DATA
• A graph will almost always clarify or simplify the
information presented by groups of numbers.
1. Bar Graphs, or Histograms – the type of graph used most
frequently to convey statistical data. They are best used for
graphically representing discrete or noncontinuous data. See
pages 274 & 275 for examples.
2. The Frequency Polygon – best used to graphically represent
what is called continuous data, such as test scores. See pages
274 & 275.
7. GRAPHING DATA
• Symmetrical Distribution – each half or side of the
distribution is a mirror image of the other side.
• Asymmetrical Distribution – on the other hand, has
nonmatching sides or halves (p. 279).
• Positively Skewed Distribution – results from an
asymmetrical distribution of scores. The majority of the
scores fall below the middle of the score distribution:
many low scores, but few high scores (p. 280).
• Negatively Skewed Distribution – also results from an
asymmetrical score distribution. The majority of the
scores fall above the middle of the score distribution:
many high scores, but few low scores.
8. Tabulating Frequency Data
• Start with Data:
87 72 91 69 89 95 65 98 81
85 80 88 81 85 90 81 83 84
76 81 82 70 84 77 76 70 76
• Just by looking at these scores, what, if anything, can you
tell about how the class did?
• On average, how did the students do?
• Did most of the students perform well on this test?
9. In Excel
• Enter all your data into one column in Excel
• Click the Data tab at the top
• The first button under the Data tab is Sort
• Click Sort and choose descending order
10. Frequency
• A simple list summarizes data conveniently if N, the
number of scores, is small
• If N is large, lists become difficult to interpret
• Trends are not always clear, numbers tend to repeat
themselves, and there are usually many missing scores
• A simple frequency distribution considers all scores,
including those that are missing.
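As a rough sketch of the same two tabulations outside of Excel, the Python below sorts the 27 scores from the "Start with Data" slide in descending order and then builds a simple frequency distribution. It assumes that "missing" scores are possible scores between the highest and lowest that no student earned, so they are listed with a frequency of 0. The variable names are illustrative only.

```python
from collections import Counter

# The 27 test scores from the "Start with Data" slide.
scores = [
    87, 72, 91, 69, 89, 95, 65, 98, 81,
    85, 80, 88, 81, 85, 90, 81, 83, 84,
    76, 81, 82, 70, 84, 77, 76, 70, 76,
]

# The list: scores in descending order.
ordered = sorted(scores, reverse=True)

# The simple frequency distribution: every possible score from the
# highest to the lowest, with its frequency f (0 if nobody earned it).
counts = Counter(scores)
simple_distribution = [
    (x, counts[x]) for x in range(max(scores), min(scores) - 1, -1)
]

print("X   f")
for x, f in simple_distribution:
    print(f"{x:<3} {f}")
```

With a spread of 33 points, the table runs to 34 rows for only 27 scores, which illustrates why a simple frequency distribution becomes hard to read when the spread is large.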
11. Grouped Frequency Distribution
• Ranges or intervals of scores are used for categories
rather than considering each possible score as a
category.
• Constructing a grouped frequency distribution:
• Step 1: Determine the range of scores (symbolized by R). The
range (or spread) of scores is determined by subtracting the lowest
score (L) from the highest score (H).
Formula: R = H - L
Application: R = 98 – 65
• The range of scores is 33.
12. Continued
• Step 2: Determine the appropriate number of intervals.
The number of intervals or categories used in a grouped
frequency distribution is somewhat flexible or arbitrary.
• In making this decision, though, be sure to use as many
categories or intervals as are necessary to demonstrate
variations in the frequencies of scores.
13. Continued
• Step 3: Divide the range by the number of intervals you
decide to use and round to the nearest odd number. This
will give you i, the interval width:
Formula: i = R / number of intervals
Application: i = 33 / 10
= 3.3, rounded to the nearest odd number, 3
• You can see there is an inverse relationship between the number of intervals
and the width of each interval.
• That is, as fewer intervals are used, the width of each interval increases; as more intervals are
used, the interval width decreases.
• Keep in mind that as i, the interval width, increases, we lose more and more information about
individual scores.
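Steps 1 and 3 can be sketched in Python as follows. The helper name nearest_odd is hypothetical, and the way it treats an exact tie (an even whole number rounds up) is an assumption, since the slides do not say how to break ties.

```python
import math

def nearest_odd(x):
    """Round x to the nearest odd whole number.

    Assumption: an even whole number (an exact tie) rounds up.
    """
    return 2 * math.floor(x / 2) + 1

highest, lowest = 98, 65
score_range = highest - lowest  # Step 1: R = H - L

# Step 3: i = R / number of intervals, rounded to the nearest odd number.
for number_of_intervals in (3, 5, 10):
    raw_width = score_range / number_of_intervals
    print(number_of_intervals, round(raw_width, 1), nearest_odd(raw_width))
```

Trying 3, 5, and 10 intervals gives widths of 11, 7, and 3, which shows the inverse relationship between the number of intervals and the interval width.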
15. Continued
• Step 4: Construct the interval column making sure that
the lowest score in each interval, called the lower limit
(LL), is a multiple of the interval width (i). The upper limit
of each interval (UL) is one point less than the lower limit
of the next interval.
• With an interval width of 7, the LL of each interval could be 7, 14,
21, etc. (7x1, 7x2, 7x3, etc.). However, we eliminate those intervals
below and above the intervals that include or “capture” the lowest
and highest scores.
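A minimal sketch of Step 4, assuming the 27 scores from the "Start with Data" slide and the interval width of 3 from Step 3. The function name grouped_distribution is hypothetical. Each lower limit is a multiple of the width, each upper limit is one point below the next lower limit, and only the intervals from the one capturing the lowest score to the one capturing the highest score are kept.

```python
scores = [
    87, 72, 91, 69, 89, 95, 65, 98, 81,
    85, 80, 88, 81, 85, 90, 81, 83, 84,
    76, 81, 82, 70, 84, 77, 76, 70, 76,
]

def grouped_distribution(scores, width):
    """Return (LL, UL, f) rows from the highest interval to the lowest."""
    # Lower limits are multiples of the width; keep only the intervals
    # that capture the lowest and highest scores and those in between.
    lowest_ll = (min(scores) // width) * width
    highest_ll = (max(scores) // width) * width
    rows = []
    for ll in range(highest_ll, lowest_ll - 1, -width):
        ul = ll + width - 1  # one point less than the next interval's LL
        f = sum(1 for s in scores if ll <= s <= ul)
        rows.append((ll, ul, f))
    return rows

rows = grouped_distribution(scores, 3)
print("Interval  f")
for ll, ul, f in rows:
    print(f"{ll}-{ul}     {f}")
```

Because the width was rounded down from 3.3 to 3 and the limits must be multiples of 3, this sketch produces 12 intervals (96-98 down to 63-65) rather than exactly 10.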
17. To make a Frequency Polygon (optional reading; you are not
expected to create one on your own)
• Enter the MP values in one column, A1 to A10
• Enter the f values in column B, B1 to B10
• Click Insert, then choose a line chart
• Once the line chart appears, right-click the bottom line and
choose "Select Data"
• A new window called Select Data Source will open
• Under Horizontal, click the Edit button
• Highlight A1 to A10 and click OK
• This changes the x-axis labels to MP
• Right-click the other line and delete it.
20. Median
• The median is the score that splits a distribution in half:
50% of the scores lie above the median, and 50% of the
scores lie below the median.
• Known as the 50th percentile
• Example: Determine the median for the following set of
scores: 90, 105, 95, 100, and 110.
• Steps:
• 1. arrange the scores in ascending or descending numerical order (don’t
just take the middle score from the original distribution.)
• 2. circle the score that has equal numbers of scores above and below it;
this score is the median.
Application: 110, 105, 100, 95, 90; the median is 100.
21. Example 2: Even number of data
• Determine the median for the following set of scores: 90,
105, 95, 100, 110, 95.
• Steps:
• 1. arrange the scores in numerical order
• 2. circle the two middle scores that have equal numbers of scores above
and below them.
• 3. compute the average of those two scores to determine the median.
• Application: 110, 105, 100, 95, 95, 90
• Two middle scores: (95 + 100) / 2 = 195 / 2 = 97.5 = MDN
• In this example the two middle scores are different, and the median is
a decimal rather than a whole number (integer). This can be
confusing unless you remember that the median is a value, not necessarily a
score.
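The median steps for both the odd and the even example can be sketched in a few lines of Python. The function below is illustrative only; Python's standard library also provides statistics.median, which returns the same values for these two sets of scores.

```python
def median(scores):
    """Median following the slide's steps: order the scores, then take the middle."""
    ordered = sorted(scores, reverse=True)  # step 1: numerical order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd N: one score has equal numbers of scores above and below it.
        return ordered[middle]
    # Even N: average the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([90, 105, 95, 100, 110]))      # odd number of scores
print(median([90, 105, 95, 100, 110, 95]))  # even number of scores
```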
22. Median
• Since the median is not affected by extreme scores, it
represents central tendency better than the mean when
distributions are skewed.
• In skewed distributions, the mean is pulled toward the extremes, so
that in some cases it may give a falsely high or falsely low estimate
of central tendency.
23. Positively Skewed Distribution
• In the positively skewed distribution the few scores of 100
or above pull M toward them. The mean presents the
impression that the typical student scored about 80 and
passed the test.
• However, the MDN shows that 50% of the students scored 60 or
below.
• In other words, not only did the typical student fail the test (if we
consider the middle student typical), but the majority of students
failed the test (assuming a score of 60 is failing, of course).
24. Negatively Skewed Distribution
• In the negatively skewed distribution the few scores of 40
or below pull the mean down toward them.
• Thus the mean score gives the impression that the typical student
scored about 60 and failed the test.
• Again, the median contradicts this interpretation. It shows that 50%
of the students scored 80 or above on the test and that actually the
majority of students passed the test.
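To see the pull of extreme scores numerically, here is a small sketch using two hypothetical sets of ten scores. These numbers are invented for illustration and are not the distributions pictured on the slides, so the means differ from the 80 and 60 quoted above.

```python
from statistics import mean, median

# Hypothetical scores, invented for illustration only.
positively_skewed = [50, 55, 55, 60, 60, 60, 65, 100, 100, 100]
negatively_skewed = [40, 40, 40, 75, 80, 80, 80, 85, 85, 90]

for label, data in (
    ("positive skew", positively_skewed),
    ("negative skew", negatively_skewed),
):
    print(label, "M =", mean(data), "MDN =", median(data))
```

In the first set the three scores of 100 pull the mean (70.5) above the median (60); in the second the three scores of 40 pull the mean (69.5) below the median (80).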
25. Percentiles (this is an important slide)
• A percentile is a score below which a certain percentage
of the scores lie.
• Percentiles divide a frequency distribution into 100 equal
parts.
• Percentiles are symbolized P1, P2,…P99.
• P1 represents that score in a frequency distribution below which 1% of
the scores lie.
• P2 represents that score in a frequency distribution below which 2% of
the scores lie.
• P99 represents that score in a frequency distribution below which 99%
of the scores lie.
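The definition can be checked against the 27 scores from the "Start with Data" slide. The helper percent_below is a hypothetical name, and it uses the strict "below" wording of the definition above; some textbooks and software use slightly different conventions for percentile ranks.

```python
scores = [
    87, 72, 91, 69, 89, 95, 65, 98, 81,
    85, 80, 88, 81, 85, 90, 81, 83, 84,
    76, 81, 82, 70, 84, 77, 76, 70, 76,
]

def percent_below(scores, value):
    """Percentage of scores that lie strictly below the given value."""
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)

for value in (65, 81, 98):
    print(value, round(percent_below(scores, value), 1))
```

Under this definition, 10 of the 27 scores (about 37%) lie below 81, and 26 of the 27 (about 96%) lie below the top score of 98.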
27. Mode and Median
Mode:
• The mode is the least reported measure of central
tendency.
• The mode, or modal score, in a distribution is the score
that occurs most frequently.
• The mode is the least stable measure of central tendency.
A few scores can influence the mode considerably.
Median:
• Gives useful information in addition to the mean.
• Discounts (relatively speaking) any outliers. For example,
one student who was absent and did really poorly would not
affect the median the way it affects the mean.
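A short sketch of how unstable the mode can be, using the 27 scores from the "Start with Data" slide. The two extra scores of 76 are hypothetical and are added only to show the effect.

```python
from statistics import multimode

scores = [
    87, 72, 91, 69, 89, 95, 65, 98, 81,
    85, 80, 88, 81, 85, 90, 81, 83, 84,
    76, 81, 82, 70, 84, 77, 76, 70, 76,
]

# The mode is the score that occurs most frequently.
original_mode = multimode(scores)

# Hypothetical: two more students score 76.
with_two_more = scores + [76, 76]
new_mode = multimode(with_two_more)

print(original_mode, new_mode)
```

Two additional scores are enough to move the mode from 81 to 76, even though the rest of the distribution is unchanged.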