These are correlations of various college football and men's basketball rankings specifically comparing early weeks' individual rankings with Kenneth Massey's comparison pages' most recent average (mean) rankings. The intent is to show which individual rankings most accurately predict later consensus as represented by the average rankings.

Links to the results of these correlation calculations are at the bottom of this page. The input data is taken from Kenneth Massey's comparison pages.

This explains the pages linked below. You need to be familiar with Kenneth's comparison pages to make sense of them.

Example lines in my pages:

DEN Week10 36% 921

DEN Week11 48% 938

**DEN** designates the ranking, using the same abbreviation
as on Kenneth's comparison page.

**Week10** is the week within the season, the number
matching those in Kenneth's historical comparison-page URLs.

**921** is the correlation between "DEN" for week 10 and the current
"consensus", i.e. the ranking listed on Kenneth's current comparison
page derived from the average (mean) ranking.

**36%** is a percentile, indicating "DEN" had a higher correlation than
36 percent of the rankings included on the week-10 comparison page.
The number actually represents the percent of rankings this one beat, so
the highest number is typically in the 95%-98% range (since a ranking doesn't
beat itself) and the lowest is 0%.
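
As a minimal sketch of the percentile described above, assuming the denominator counts every ranking on that week's page including the one being scored (which is why the top ranking falls short of 100%). The function name and the sample abbreviations and values are made up for illustration:

```python
def percentile_beaten(correlations, name):
    """Percent of the rankings on the page that `name` beat.

    correlations: {ranking abbreviation: correlation with the consensus}
    Assumption: the ranking itself counts in the denominator but never
    in the numerator, since it can't beat itself.
    """
    own = correlations[name]
    beaten = sum(
        1
        for other, value in correlations.items()
        if other != name and value < own
    )
    return round(100 * beaten / len(correlations))


# Hypothetical week with only four rankings on the page.
week10 = {"AAA": 950, "BBB": 921, "CCC": 900, "DDD": 880}
print(percentile_beaten(week10, "AAA"))  # 75: beat 3 of the 4 rankings
print(percentile_beaten(week10, "DDD"))  # 0: beat none
```

With 20 to 50 rankings on a page, the same arithmetic puts the best ranking at 95% to 98%.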

Besides comparing all the individual rankings from Kenneth's page, I also threw in some other rankings. For lines with the first column containing:

**Consensus** correlates earlier weeks' consensus with the current consensus.
(By "Consensus", I mean the ranking as the teams are listed on the comparison
page, derived from the mean of the individual rankings.)
You can see how various individual rankings' predictive ability compares
with the predictive ability of the
consensus of all rankings. In a typical week it beats almost
all individual rankings.
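
A rough sketch of how a consensus of this kind can be derived: average each team's rank over the individual rankings, then number the teams 1, 2, 3, ... in order of that average. How the comparison page treats teams missing from some rankings, and how it breaks ties, is not described here, so the choices below (average only over the rankings that list the team, break ties by name) are assumptions, as are all the names:

```python
from statistics import mean


def consensus_order(rankings):
    """Return (position, mean_rank) for every team.

    rankings: {ranking abbreviation: {team: rank}}
    Assumption: a team's mean is taken over the rankings that list it.
    Assumption: ties in the mean are broken alphabetically by team name.
    """
    teams = set()
    for ranks in rankings.values():
        teams.update(ranks)
    mean_rank = {
        team: mean(r[team] for r in rankings.values() if team in r)
        for team in teams
    }
    ordered = sorted(teams, key=lambda t: (mean_rank[t], t))
    position = {team: i for i, team in enumerate(ordered, start=1)}
    return position, mean_rank


# Hypothetical page with three rankings of three teams.
sample = {
    "AAA": {"X": 1, "Y": 2, "Z": 3},
    "BBB": {"X": 2, "Y": 1, "Z": 3},
    "CCC": {"X": 1, "Y": 3, "Z": 2},
}
position, mean_rank = consensus_order(sample)
print(position)
print(mean_rank)
```

The two return values correspond to the two kinds of target used on these pages: the natural-number order of the teams, and the fractional mean ranks behind it.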

**Con2001** (or whatever year)
correlates the previous year's final ranking with this year's current
ranking. It does **NOT** use each week from the previous
year: only the previous year's final ranking.
However, it is listed against each week to show what percentile
it achieves as compared to that week's individual rankings.
I originally had this thought: "I'll bet the
same teams are typically on top every year and last year's consensus
ranking might stack up very well against the rankings we all produce,
especially in the early weeks". That proved to be false, since
it shows a low percentile even in the first week.
I don't know what folks use to initialize their data but it
appears to be better than simply taking the previous year's rankings.

For each sport, there are four pages that are calculated with slight differences.

- **xx-corr.txt**: The "ordinary" calculation, nothing special. Each individual ranking is correlated with the order in which the teams are listed on the comparison page, i.e. the team at the top is considered 1, the next one 2, etc.
- **xx-corr-25.txt**: "Top 25 only". Ignores all ranks beyond 25: any unranked team, or any team ranked worse than 25, is given a rank of 26 under every ranking. This allows a limited-but-level comparison between rankings that only cover 25 teams, such as AP and USA, and the other rankings.
- **xx-corr-average.txt**: Correlates against the "Average" or "Mean" fractional ranks rather than against the natural-number ranks derived from them.
- **xx-corr-miss.txt**: Fills in missing ranks. Any individual ranking that doesn't have a number for all the teams is extended to have a number for every team, as if the unranked teams were in a multi-way tie. For example, the "WAJL10" football ranking ranks 50 teams, so in this calculation we produce the correlation for a "modified WAJL10" with each non-ranked team assigned a rank of 51. This was my final attempt at a reasonable method and I consider it the "best", except specifically for evaluating the "top-25-only" rankings like AP and USA.
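
The two rank adjustments above can be sketched as follows. The page does not state which correlation formula is used, so the `pearson` helper here (an ordinary Pearson correlation applied to the rank values) is only a stand-in, and every function name is hypothetical:

```python
from math import sqrt


def cap_top25(ranks, teams, limit=25):
    """'Top 25 only': unranked teams and teams ranked beyond `limit` get limit + 1."""
    return {
        t: ranks[t] if t in ranks and ranks[t] <= limit else limit + 1
        for t in teams
    }


def fill_missing(ranks, teams):
    """'Missing ranks filled in': unranked teams share one rank past the ranked ones.

    A ranking that covers 50 teams gives every other team a rank of 51.
    """
    tie_rank = len(ranks) + 1
    return {t: ranks.get(t, tie_rank) for t in teams}


def pearson(xs, ys):
    """Stand-in correlation (assumption). Fails if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def correlate(ranks, consensus):
    """Correlate one fully populated ranking with the consensus, team by team."""
    teams = sorted(consensus)
    return pearson([ranks[t] for t in teams], [consensus[t] for t in teams])


# Hypothetical six-team consensus and a ranking that covers only three teams.
consensus = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
partial = {"A": 1, "B": 2, "C": 3}

filled = fill_missing(partial, consensus)
print(filled)
print(correlate(filled, consensus))

# For a "top 2 only" comparison, both sides are capped the same way.
capped = cap_top25(consensus, consensus, limit=2)
print(capped)
```

Either adjustment gives every ranking a number for every team, so the same correlation routine can then be applied to all of them.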

Within each page there are four sections.

- The current week's correlations. This is an attempt to duplicate the numbers across the bottom of Kenneth's comparison page. These don't match Kenneth's though I've checked and rechecked my formulas and programming. The numbers I produce look generally plausible, e.g. generally the same rankings have high and low correlations, but there are occasions when on Kenneth's page, "A" has a higher correlation than "B", whereas on my page "B" has the higher correlation, etc.
- Week by week listings of the earlier weeks.
- Same data in a single table, ordered from highest correlation to lowest. This will sometimes show you that one individual ranking's predictive ability is weeks behind another's, e.g. one's Week8 correlates less than another's Week7.
- Same data grouped by individual ranking, e.g. all the AP rankings are together. This will show you how a particular rank's correlation has risen over time and how its percentile has changed over time.

As I said above, when I apply my correlation formula in the same manner as on Kenneth Massey's comparison page, I can't reproduce his numbers, though mine are generally close to his. Another issue is that the procedure/formula I use does not handle unranked teams in a reasonable manner. Thus you will see negative numbers for some of the correlations when the ranking is only partial, such as AP and USA. I suppose this is an artifact of the particular formula I use and how I apply it to incomplete rankings.

This doesn't explain why I can't reproduce Kenneth's numbers, because differences also show up for rankings for which unranked teams are not a factor.

Also, I list a "concordance" for each week. This formula also assumes a ranking from 1 to N but has been applied to data that doesn't meet this requirement, so it is a little off.
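
The concordance formula is not named on this page. For illustration only, here is a sketch of Kendall's W, one common concordance measure that likewise assumes each ranking is a full set of ranks 1 to N. It should not be read as the calculation actually used for these pages, and the function name is made up:

```python
def kendalls_w(rank_lists):
    """Kendall's coefficient of concordance, W (no tie correction).

    rank_lists: m lists, each giving ranks for the same n teams in the
    same team order. Assumes every list is a permutation of 1..n.
    """
    m = len(rank_lists)
    n = len(rank_lists[0])
    # Sum of the ranks each team received across all m rankings.
    totals = [sum(ranks[i] for ranks in rank_lists) for i in range(n)]
    # Expected average total, valid only when each list uses 1..n exactly once.
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Three rankings in complete agreement, then two in complete disagreement.
print(kendalls_w([[1, 2, 3, 4]] * 3))              # 1.0
print(kendalls_w([[1, 2, 3, 4], [4, 3, 2, 1]]))    # 0.0
```

The `mean_total` line is where the 1-to-N assumption enters: a partial ranking, or one with tied filled-in ranks, no longer sums to the expected total, so a formula of this kind comes out somewhat off on such data.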

**From Kenneth Massey's Football Comparison 2003/2004**

fb-corr.html

fb-corr-25.html: top 25 only

fb-corr-average.html: against mean ranks

fb-corr-miss.html: with missing ranks filled in

**From Kenneth Massey's Basketball Comparison 2002/2003**

bb-corr.txt

bb-corr-25.txt: top 25 only

bb-corr-average.txt: against mean ranks

bb-corr-miss.txt: with missing ranks filled in

-John Wobus, 9/8/03