Statistics 2

[Table: biology grades for boys and girls]

The best choices to display the biology grades

Here are the grades again so you can refer to them.

It is easy to understand why most students will immediately think of the big three: Line Graphs, Bar Graphs and Pie Charts.

Did you choose a Line Graph? Line graphs display data that changes over time or with some other variable (temperature, speed, distance). I’m afraid that nothing here changed.

Did you choose a Bar Graph? That’s not a bad idea. But exactly how would you do it? Would you display each student’s grade separately and have 40 skinny bars? Would you find the average for the boys and the average for the girls and have two fat bars? Neither sounds very helpful. You would understand the individual grades better using the table we began with, and if we wanted to compare the average grade for girls and boys, we could simply write the two averages. There is no need for a graph.

Did you choose a Pie Chart? Another interesting chart. Again, I would ask how you would organize it. Would you divide the circle in two parts to show the average grade for boys and the average for girls? Obviously not too helpful. Did you plan to somehow show a tiny slice for each student? I can’t imagine how you would do that. The numbers would need to be percentages of some whole thing. I guess you could add up all the grades. Together the class made 2,979 points. Will it be helpful in any way to know that George made 2.14% of that? I don’t think so.
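To see what a pie slice per student would actually measure, here is a minimal Python sketch of the arithmetic. The class total of 2,979 comes from the text above; the function name and the example grade of 80 are made up for illustration and are not any particular student's grade.

```python
def percent_of_total(grade, total):
    """Return the share of the class total that one grade represents, in percent."""
    return 100 * grade / total


class_total = 2979   # total points for the class, from the text
example_grade = 80   # hypothetical grade, not taken from the table

share = percent_of_total(example_grade, class_total)
print(f"A grade of {example_grade} is {share:.2f}% of {class_total} points")
```

Every slice comes out at two or three percent, which is why a pie chart of individual grades tells us so little.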

Now the question: Did you select another kind of graph? Something other than the big three? Below you will see my choices.

We will begin with a Histogram. Perhaps some of you pictured this but called it a bar graph.
That’s quite understandable. There are several differences.

Instead of comparing individual grades, a histogram uses intervals (always the same size) to group the data.

The other difference is that in a histogram, the bars are touching each other. In a bar graph, there are spaces between them.

[Figure: a histogram showing one way to display the data]

In a histogram, there is an order to the bars. The intervals are arranged from the lowest numbers to the highest. In a bar graph, the arrangement is arbitrary. You can arrange the 40 students alphabetically, or from highest score to lowest, or in completely random order.
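The grouping step behind a histogram can be sketched in a few lines of Python. This is only an illustration: the grades below are a made-up sample, not the class data from the table, and the interval boundaries (0-10, 11-20, and so on up to 91-100) are an assumption based on the intervals mentioned in this article.

```python
from collections import Counter

# Hypothetical sample grades, NOT the class data from the table.
grades = [0, 45, 52, 58, 58, 63, 71, 75, 79, 83, 88, 90, 92, 97, 100]


def interval_for(grade, width=10):
    """Return the (low, high) interval that holds a grade, e.g. 58 -> (51, 60).

    Assumption: intervals run 11-20, 21-30, ..., 91-100, and the lowest
    interval is 0-10 so that a grade of 0 has somewhere to go.
    """
    low = ((max(grade, 1) - 1) // width) * width + 1
    high = low + width - 1
    if low == 1:
        low = 0  # the lowest interval also holds a grade of 0
    return (low, high)


def histogram_counts(grades, width=10, top=100):
    """Count the grades in every interval, lowest first, keeping empty intervals."""
    counts = Counter(interval_for(g, width) for g in grades)
    intervals = [interval_for(high, width) for high in range(width, top + 1, width)]
    return {interval: counts.get(interval, 0) for interval in intervals}


# Print a rough text histogram: one '#' per student in each interval.
for (low, high), count in histogram_counts(grades).items():
    print(f"{low:>3}-{high:<3} {'#' * count}")
```

Because the intervals are produced in numeric order and empty ones are kept, the output has the fixed left-to-right order that separates a histogram from a bar graph.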

Now, let’s consider what the histogram tells us that we didn’t know before. It tells us there were two students who made terrible grades. There were more with grades in the range from 41-60, also pretty poor. At the other end there are five students who made grades between 91 and 100.

It doesn’t compare the girls’ grades with the boys’ grades but we could do that with two separate histograms. Since these are not bar graphs, we would not have double bars or stacked bars.

What else doesn’t it show? We lose some interesting data. We don’t know the actual range. We don’t realize that one student actually made a 0 and another made 100. We also don’t know the average (the mean) or the median (the grade in the middle of the ordered list of grades). We do, however, get a good picture of how the class is doing. Most students seem to be making A, B, or C grades. The large number at the 51-60 interval is interesting and would be worth checking to see what problems they had.
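The numbers a histogram hides (the range, the average and the middle grade) are quick to compute directly. Here is a small sketch using Python's standard statistics module; the seven grades are invented for the example and are not the class data.

```python
from statistics import mean, median


def summarize(grades):
    """Return the lowest grade, highest grade, mean and median of a list of grades."""
    return {
        "lowest": min(grades),
        "highest": max(grades),
        "mean": mean(grades),      # the average: total points / number of students
        "median": median(grades),  # the middle grade of the ordered list
    }


# Hypothetical sample, NOT the class data from the table.
sample = [0, 55, 60, 70, 80, 90, 100]
print(summarize(sample))
```

In this made-up sample the single 0 pulls the mean below the median, which is the kind of detail worth writing next to a histogram.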

I had never heard of Stem and Leaf graphs until, while teaching eighth grade math, a new book included these. It took me a little while to see how valuable these simple graphs can be.

Here, the stem is the list of numbers up the center. It resembles the intervals on the histogram. On the top row, there is one student listed. The grade is the 10 on the stem plus the 0 on the girls’ side. This tells you it’s a grade of 100.

On the next line, with a stem of 9, the boys’ grades were 90, 90, 92 and 92. The girls’ grades were 90 and 97. This gives you a list of all the grades, and at the same time they are placed in intervals.

It provides the same overview of the class grades plus you can see the single 0 and the single grade of 100. You can also look further to see every grade. Are the grades in the fifties close to sixty? Yes, they are for the girls but not for the boys.
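Building a stem and leaf graph is mostly a matter of splitting each grade into its tens (the stem) and its ones (the leaf). The sketch below is one way to do that in Python. The two short lists are made up for illustration; only their grades of 90 and above echo the rows described above. Putting the boys on the left and the girls on the right is an assumption about the layout.

```python
def stem_and_leaf(grades):
    """Map each stem (the tens) to its sorted leaves (the ones digits)."""
    rows = {}
    for grade in sorted(grades):
        rows.setdefault(grade // 10, []).append(grade % 10)
    return rows


def back_to_back(boys, girls):
    """Return text rows with boys' leaves on the left and girls' on the right."""
    left, right = stem_and_leaf(boys), stem_and_leaf(girls)
    stems = set(left) | set(right)
    lines = []
    # Walk from the highest stem down, keeping empty stems so gaps stay visible.
    for stem in range(max(stems), min(stems) - 1, -1):
        # Boys' leaves are reversed so they grow outward from the stem.
        boy_leaves = " ".join(str(d) for d in reversed(left.get(stem, [])))
        girl_leaves = " ".join(str(d) for d in right.get(stem, []))
        lines.append(f"{boy_leaves:>10} | {stem:>2} | {girl_leaves}")
    return lines


# Hypothetical short lists, NOT the full class data.
boys = [0, 58, 79, 90, 90, 92, 92]
girls = [75, 83, 90, 97, 100]

for line in back_to_back(boys, girls):
    print(line)
```

Every grade can be read back from the rows (stem times ten plus leaf), which is the advantage this graph has over the histogram.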

What data are we missing? It wouldn’t be difficult to count down halfway on each side to find the median grades. We have the entire range but we don’t have the average. That would be easy enough to add. There is still one more interesting possibility.

[Figure: box plots showing another way to display the data]

The same year I was introduced to the Stem and Leaf Graph in an eighth grade math book, I also saw my first box plots, also called box graphs. In the eighth grade book they were called Box and Whisker Graphs. That’s a very descriptive name. Box plots were created by John Tukey and described in his book, Exploratory Data Analysis (1977).

At first glance you can tell that we get information to compare girls and boys but no other actual grades. These graphs are designed to show five points of data: the maximum, the minimum, the median, and the two quartile boundaries that form the ends of the box (usually called the first and third quartiles).

1. The full range of grades is shown by the vertical line. For the girls the range is 100 down to 75. This means you see the maximum and minimum.

2. You actually see the range for each of the four quartiles. A quartile is one fourth of the group. For the boys, the top quartile is very small, from somewhere around 88 to 92. Their second quartile is a little larger, from 79 to about 88. The third quartile is unusually long, from about 58 to 79. And the fourth quartile is very long, from 0 to about 58.

3. You see the median grade for each group. That is 79 for the boys and 83 for the girls, not that much difference.
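The five numbers a box plot draws can be computed with Python's standard statistics module. This is a sketch under stated assumptions: the nine grades are invented (chosen so the results land near the boys' values described above, but they are not the boys' actual grades), and the "inclusive" quartile method is only one of several conventions, so another tool or textbook may give slightly different box edges.

```python
from statistics import quantiles


def five_number_summary(grades):
    """Return the minimum, lower box edge, median, upper box edge and maximum."""
    # quantiles(..., n=4) returns the three cut points that split the
    # ordered grades into four quarters.
    q1, med, q3 = quantiles(grades, n=4, method="inclusive")
    return {
        "min": min(grades),
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": max(grades),
    }


# Hypothetical grades, NOT the boys' actual data from the table.
sample = [0, 40, 58, 70, 79, 85, 88, 90, 92]
print(five_number_summary(sample))
```

The whiskers run from the minimum and maximum to the box, the box runs between the two quartile cut points, and the median is the line inside it.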

So what is the answer to the question: what kind of graph is best?

The answer is “Best for What?” What is the purpose of the graph? If the teacher wants a quick picture of how his students are doing, any of these is more helpful than a simple list of grades. It’s hard to be sure, but a good guess would be that the professor would choose the histogram because it’s more familiar. My choice would be the Stem and Leaf graph because it includes a great deal more information. It would be nice to add the average (mean), the range and the median. I probably would not use the box plot, though they are interesting. It might be a good choice to compare a large number of classes.

There are obviously many kinds of graphs beyond the big three that you can use to organize data for different purposes.


 
