Stats Hacks #7 – Blogschmog

From Statistics Hacks … Hack #7: Measure Up.

Two revelations for me today. First, this book has already been a big help. The big bugaboo for me with my previous quantitative research was the Why of the What. I got plenty of advice on what buttons in SPSS I was supposed to click, but not as much explanation as to why those buttons were so special. Flipping ahead a few dozen hacks, I’m optimistic that I’ll emerge from this slow reading armed with the kind of foundation knowledge that will let me confidently do things with the numbers. To me, it is the difference between someone telling me how to assemble something — oh, say, a playset — and having the skills to design it.

Second insight … I don’t have time for this. Ironically, it may be easier to carve out the 30-60 minutes a day to process a hack into blog form once school begins. Right now, I’m behind several eight balls (the whole pool table is nothing but black) trying to get through the rest of the summer. I don’t want my get-this-done-by-Archie’s-birthday goal to become an artificial reason to cram hacks to catch up. One of the lesser but important points of this exercise is to train myself to do work consistently and accept what I can’t do.

This hack exemplifies both of those insights. I’m dying to spend the day reading and writing about statistics, but I’m not going to do that. I’m just going to write about this one, regarding levels of measurement used in scoring samples. It is the most illuminating hack for me so far.

Statistics is about collecting data in a practical way to find out about a population. Statistical analysis will either describe what has already happened or predict what is expected to happen. For that data (the scores from a sample) to become mathematically useful, it is important to understand what information is embedded in it. Not all data is the same, so not every analytical technique is going to be possible with every data set. It helps to pick the level of measurement intentionally, to meet the goals of the study and the limits of the population.

There are four kinds of data: nominal, ordinal, interval and ratio. Each level carries more information than the one before it.

Nominal data has no order; it is just a label, a way of dividing the population into smaller groups. Ordinal data has a sequence to it. One label has a specific orientation with regard to any other label. Interval data adds the notion of how big the gaps are between each ordered label (ordinal data just indicates one group is more than another, without saying by how much). Ratio data is interval data with a true zero: the lowest possible value is zero, and it means a complete absence of whatever is being measured. With all measurement levels, it is an easy matter to break a population up into distinguishable groups. But the ways the numbers can be crunched are limited by the level of the data.
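To keep the hierarchy straight in my own head, here is a minimal Python sketch of the idea that each level inherits everything allowed at the levels below it. The names `Level`, `NEW_SUMMARIES` and `allowed_summaries` are my own inventions for illustration, and the list of summaries per level is the conventional textbook pairing rather than anything quoted from the hack.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four levels of measurement, ordered from least to most information."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Summaries that first become meaningful at each level.
# This pairing is an assumption based on the usual textbook convention.
NEW_SUMMARIES = {
    Level.NOMINAL: ["counts", "proportions", "mode"],
    Level.ORDINAL: ["median", "percentiles"],
    Level.INTERVAL: ["mean", "standard deviation"],
    Level.RATIO: ["ratios", "percent change"],
}


def allowed_summaries(level):
    """Return every summary that is meaningful at the given level."""
    allowed = []
    for lower in Level:  # members iterate in definition order
        if lower <= level:
            allowed.extend(NEW_SUMMARIES[lower])
    return allowed


print(allowed_summaries(Level.ORDINAL))
```

Asking for the ordinal level returns the nominal summaries plus the median and percentiles, but no mean, which is the whole point: the level of the data caps what can be done with it.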

For example, a study might ask whether individuals are male or female. This is nominal data. You can look at the scores and say that half of the sample is male, but it is meaningless to use the mean to describe the population. Fifty men and fifty women cannot be described as being transsexual on average. Likewise, NASCAR (blech!) is a good example of ordinal data. Danica takes 1st place and Junior comes in 3rd. That is an indication that, at least on that day, Danica was a better driver than Junior. However, it is meaningless to say that together they averaged second place. It also doesn’t say anything about the amazing 5-lap lead Danica had over Junior in the 200 required to finish the race.
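Both examples can be sketched in a few lines of Python. Everything here is made up for illustration: the sample of fifty men and fifty women mirrors the example above, while the seconds-behind-the-leader figures and the third driver are hypothetical. The `places` helper is my own, not something from the book.

```python
from collections import Counter

# Nominal: labels can be counted and turned into proportions, nothing more.
sexes = ["male"] * 50 + ["female"] * 50
counts = Counter(sexes)
proportions = {label: n / len(sexes) for label, n in counts.items()}
print(proportions)


def places(seconds_behind):
    """Turn each driver's gap to the leader into a finishing place (1 = winner)."""
    ordered = sorted(seconds_behind, key=seconds_behind.get)
    return {driver: rank for rank, driver in enumerate(ordered, start=1)}


# Ordinal: two hypothetical races with very different gaps...
blowout = {"Danica": 0.0, "Other driver": 90.0, "Junior": 240.0}
close = {"Danica": 0.0, "Other driver": 0.4, "Junior": 0.9}

# ...collapse to exactly the same finishing order.
print(places(blowout))
print(places(close))
```

Once the results are recorded only as places, there is no way to tell the blowout from the photo finish. That lost gap information is exactly what separates ordinal data from interval data.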

At the interval level, data has more mathematical properties. A temperature drop from 90 to 89 degrees F in July is the same size as a drop from 34 to 33 in January. On the other hand, one can’t say that a cold front that turned an 80-degree afternoon into a 40-degree evening made the day half as warm. To make that kind of claim you would need the Kelvin scale, which has an absolute zero, and there the same change is a drop from about 300 to about 278 (so, roughly a 7.4% drop in the amount of warmth, not 50%). Ratio data, such as traditional school test scores, has a true zero value. Zero degrees Fahrenheit does not indicate the absence of warmth, whereas a zero on a school history test is a complete absence of whatever knowledge the test was trying to measure*.
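The temperature arithmetic is easy to check. This is a small Python sketch using the standard Fahrenheit-to-Kelvin conversion; the function and variable names are mine.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvins."""
    return (deg_f - 32) * 5 / 9 + 273.15


# Interval property: a one-degree drop is the same size anywhere on the scale.
july_drop = 90 - 89
january_drop = 34 - 33
print(july_drop == january_drop)

# The cold front: 80 F in the afternoon, 40 F in the evening.
afternoon_f, evening_f = 80, 40

# Treating Fahrenheit as if it had a true zero gives the misleading "half as warm".
naive_drop = (afternoon_f - evening_f) / afternoon_f

# On the Kelvin scale, which does have a true zero, the drop is far smaller.
afternoon_k = fahrenheit_to_kelvin(afternoon_f)
evening_k = fahrenheit_to_kelvin(evening_f)
kelvin_drop = (afternoon_k - evening_k) / afternoon_k

print(f"Fahrenheit 'ratio': {naive_drop:.0%}")
print(f"Kelvin ratio: {kelvin_drop:.1%}")
```

The Fahrenheit version comes out at 50%, while the Kelvin version comes out a little over 7%, in line with the rough figure quoted above. Same cold front, very different story, and only one of the two ratios means anything.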

The take-home point here: It is vital to know the limitations of the data being collected, preferably before you set up the survey of the sample population. If your desired conclusion requires the ability to calculate a ratio, don’t ask questions that turn out to be ordinal. Measuring up means you use the highest level of measurement you can to collect the most embedded information possible.

Bruce Frey includes a reference to an article on testing by Frederic Lord, published in American Psychologist in 1953. In it, Lord describes a statistician who watched a football game and ran the data through all sorts of math calisthenics to get some interesting findings. Unfortunately, the data he recorded came from the numbers on the backs of the jerseys. Paraphrasing: The numbers don’t know where they come from. Stats are stats. It is up to the researcher to make sure they are imbued with meaning.
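Software is just as indifferent as the numbers are. In this Python sketch the jersey numbers are invented, and `guarded_mean` is a hypothetical helper of my own that shows one way a researcher might build the check in, assuming the level of each variable is recorded alongside the data.

```python
from statistics import mean

# Hypothetical jersey numbers: nominal labels that happen to look like quantities.
jersey_numbers = [12, 80, 87, 11, 54]

# The library computes an "average jersey" without complaint.
print(mean(jersey_numbers))


def guarded_mean(values, level):
    """Compute a mean only when the stated level of measurement supports it."""
    if level not in ("interval", "ratio"):
        raise ValueError(f"a mean is not meaningful for {level} data")
    return mean(values)


try:
    guarded_mean(jersey_numbers, level="nominal")
except ValueError as err:
    print(err)
```

The first call produces a perfectly tidy number that describes nothing. The second refuses, but only because a person told it what the numbers were.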

* I’ll leave the inadequacy of this kind of approach to measuring knowledge to Alfie Kohn.

Also:

Some definitions:

score: numbers with meaning, collected by statisticians from a sample population

measurement: assigning numbers to objects or concepts in a meaningful way

level of measurement: determines which kinds of analyses can be used on a given set of scores

nominal: categorical data lacking any indication of quantity or relation to other categories (ex. gender)

ordinal: categorical data arranged in a specific sequence relative to the other categories (ex. high school ranking)

interval: ordinal data with the same distance between each category (ex. temperature)

ratio: interval data with a lowest value of zero, indicating a complete absence of the characteristic being measured (ex. history test)

By Kevin Makice