I give each team a numerical rating, then rank them accordingly. The ratings are done by formula; my own judgment entered the process only when I devised the formula.
These ratings avoid the disadvantages, but thereby pass up the advantages, of using game scores or scoring differentials. For example, if West State beats East College 40 to 30, then neither the fact that the score was 40 to 30 nor the fact that West State won by 10 enters the formula: it uses only the fact that West State won the game against East College. Thus a team that wishes to improve its rating cannot do so by running up scores; conversely, a team that forgoes running up scores does not hurt its rating. No doubt these ratings suffer by overlooking this valuable information, but fortunately others provide rating and ranking systems that make good use of it. What I offer is an alternate look.
This formula does take into account the ratings of each team's opponents, and thereby rewards quality wins. However, it does not take into account previous seasons, injuries, recruiting, or any other game or performance factors such as yardage gained in football, shooting percentage in basketball, etc.
I sometimes calculate the ratings from the very beginning of the season, but early-season ratings show very little relation to ratings calculated during the latter part of the season. This is true of any poll or rating system, but ratings based solely on wins and losses often show more of this tendency.
This rating uses roughly the same data as the RPI ("Ratings Percentage Index"), developed by the NCAA for use with a number of sports (not including Division I-A Football) to help choose and seed teams in the respective NCAA tournaments. The one difference in the data is that this rating also incorporates the results of games with lower-division teams. However, this rating is derived from that data in a different manner. For one thing, the RPI always counts its strength-of-schedule ("SOS") factors as exactly 3/4 of the rating, whereas this rating is not constructed in anything like the same manner. Instead, it takes the rating of each team you beat (and each team you lost to) and factors in what that result suggests about your own rating.
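To make the "3/4" figure concrete, here is a minimal sketch of the basic RPI as it is commonly published: 25% a team's own winning percentage, 50% its opponents' winning percentage, and 25% its opponents' opponents' winning percentage. The function name basic_rpi, the (winner, loser) input format, and the team "North Tech" are my own assumptions for illustration. The NCAA's actual calculation varies by sport and may include adjustments not shown here, and this is not the formula behind my own rating.

```python
from collections import defaultdict


def basic_rpi(games):
    """Return {team: rpi} for a list of (winner, loser) tuples.

    Illustrative only: the commonly published 25/50/25 weighting,
    with no sport-specific adjustments.
    """
    results = defaultdict(list)  # team -> list of (opponent, won?)
    for winner, loser in games:
        results[winner].append((loser, True))
        results[loser].append((winner, False))
    results = dict(results)

    def win_pct(team, exclude=None):
        # Winning percentage, optionally ignoring games against `exclude`.
        played = [won for opp, won in results[team] if opp != exclude]
        return sum(played) / len(played) if played else 0.0

    def opp_win_pct(team):
        # Average of each opponent's winning percentage, leaving out
        # that opponent's games against `team` itself.
        opps = [opp for opp, _ in results[team]]
        return sum(win_pct(o, exclude=team) for o in opps) / len(opps)

    def opp_opp_win_pct(team):
        # Average of each opponent's own opponents' winning percentage.
        opps = [opp for opp, _ in results[team]]
        return sum(opp_win_pct(o) for o in opps) / len(opps)

    return {
        team: 0.25 * win_pct(team)
        + 0.50 * opp_win_pct(team)
        + 0.25 * opp_opp_win_pct(team)
        for team in results
    }


# Hypothetical three-team round robin.
games = [
    ("West State", "East College"),
    ("West State", "North Tech"),
    ("East College", "North Tech"),
]
for team, rpi in sorted(basic_rpi(games).items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rpi:.3f}")
```

The last two terms together are the SOS portion, which is why it always amounts to exactly 3/4 of the result no matter how the games turned out.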
There is at least one potential problem regarding SOS that this rating avoids. If you treat SOS as a total or an average, as the RPI does, then one can imagine scenarios where the whole story is not told. Suppose a team schedules twenty opponents consisting of the top ten teams in the division (other than itself) and the bottom ten teams in the division. A totaled or averaged SOS for that schedule would be roughly average compared with all the teams in the division. If this team won all those games, though, the fact that it beat every one of the top ten teams tells you it is very good, so a reasonable rating would be quite high. The RPI's averaged SOS would nevertheless drag down such a team's RPI rating. This rating, on the other hand, credits you for each top team you beat, and it doesn't take many such wins (assuming no losses) before your rating is higher than theirs. The fact that you may also have easy games on your schedule is far less relevant unless you lose some of them.
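The averaging problem can be seen with a little arithmetic. The sketch below assumes a hypothetical 41-team division in which the 40 other teams have made-up "strength" values evenly spaced from 0 to 1; these numbers are invented purely for illustration and are not RPI values or ratings from my formula. It compares a schedule of the top ten plus the bottom ten against a schedule of the twenty middle teams.

```python
from fractions import Fraction

# Hypothetical strengths for the 40 other teams in the division,
# evenly spaced from 0 (weakest) to 1 (strongest). Invented data.
strengths = [Fraction(i, 39) for i in range(40)]


def average_sos(opponent_strengths):
    """Strength of schedule treated as a plain average."""
    return sum(opponent_strengths) / len(opponent_strengths)


mixed_schedule = strengths[:10] + strengths[-10:]  # bottom ten + top ten
middle_schedule = strengths[10:30]                 # the twenty middle teams

top_ten = set(strengths[-10:])
top_ten_games_mixed = sum(1 for s in mixed_schedule if s in top_ten)
top_ten_games_middle = sum(1 for s in middle_schedule if s in top_ten)

print(average_sos(mixed_schedule), top_ten_games_mixed)    # 1/2 10
print(average_sos(middle_schedule), top_ten_games_middle)  # 1/2 0
```

Both schedules average out to exactly the same SOS, yet an unbeaten record against the first includes ten wins over the division's best teams and an unbeaten record against the second includes none. An average cannot tell the two apart.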
While this exact scenario is unlikely to happen, no doubt this effect happens to some extent: some teams tend to have both very strong and very weak teams on their schedule, making their RPI SOS a bit lower than that of teams with more middle-to-upper opponents. I have heard that the RPI has undisclosed adjustments to reward quality wins, presumably to address this issue.
This rating treats wins and losses in exactly complementary fashion, and a team that loses to low-rated teams is charged with those losses in such a manner that a high "average SOS" doesn't help. This rating system applies to all the teams in the division. My understanding of the RPI is that its purpose is to determine the necessary teams to fill out the NCAA tournament, i.e. to provide a reasonable rating at the top of the division.
An advantage of the RPI (or at minimum, the disclosed portion of the RPI) over this system is that the RPI is a relatively simple and easy-to-describe formula. This rating is not so easily described and has been called "advanced" mathematics (Wilson's list of CF Ratings), though I freely admit that other folks employ techniques that would take me a long while to comprehend.
While my ratings and rankings are among those produced by formula, there are also folks who produce ratings and rankings based upon their own judgment, as well as rankings based on the results of polls of various interested and knowledgeable groups. For example, the news industry typically provides each sport/division with one ranking based upon the votes of a select group of coaches and another based upon the votes of a similar group of sports writers.
Humans have the wonderful advantage of basing their ratings on factors that formulas cannot capture. In basketball, for example, if you are deciding which team is better, you can watch the two teams play and take into account such factors as their athleticism, their outside shooting ability, their rebounding ability, their inside game, their teamwork, etc. You can watch individual players and make judgments about how they would fare against other teams. You can look at two teams and decide from these factors which is more likely to win a neutral-site game between the two and which is likely to win more games against a variety of other teams. Then you rate or rank teams based upon all these judgments.
And perhaps most important, you can look at games that are out of reach and make a judgment whether the teams are "trying". You can judge whether (say) a moderate scoring differential in the game result actually reflects the two teams' relative ability to win against other opponents.
Computer/formula ratings obviously cannot do all these things, and though a computer ranking could, in theory, incorporate player and team statistics, in general such rankings do not. What computer analysis of the season can do that humans cannot is factor in the results of every single game, and do so in an exacting and appropriate manner. If two teams that have been tied for 300th place finally play, then the formula can not only use that fact to differentiate those two teams' ratings, it can use this additional data to adjust the rating of every other team that has played them or has played a team that has played them, etc., perhaps to the point of breaking a first-place tie. Not only can the formula do this, but it can do this to an appropriate degree, crediting no team either too much or too little for this new factor in the ratings. It is this comprehensiveness and exactness that humans cannot possibly accomplish through knowledge of the game and judgment. Humans certainly did similar-scale calculations by hand in pre-computer days, but the equivalent of a single computer rating could require months of calculating.
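The reach of a single game result can be sketched as a walk through the schedule. The example below assumes a list of games given as pairs of team names and finds every team linked to the two that just played, along with how many steps away each one is (1 = played one of them, 2 = played a team that played one of them, and so on). The function name teams_linked_to and the team names are hypothetical. It shows only which teams are connected; how much any rating actually moves depends on the formula and is not modeled here.

```python
from collections import defaultdict, deque


def teams_linked_to(games, start_teams):
    """Return {team: steps} for every team reachable from start_teams
    by following a chain of played games (breadth-first search)."""
    opponents = defaultdict(set)
    for a, b in games:
        opponents[a].add(b)
        opponents[b].add(a)

    distance = {team: 0 for team in start_teams}
    queue = deque(start_teams)
    while queue:
        team = queue.popleft()
        for opp in opponents[team]:
            if opp not in distance:
                distance[opp] = distance[team] + 1
                queue.append(opp)
    return distance


# Hypothetical schedule: two low-ranked teams finally meet.
games = [
    ("Low A", "Low B"),          # the new result
    ("Low A", "Middle U"),
    ("Middle U", "Top One"),
    ("Top One", "Top Two"),      # the teams tied for first
    ("Island A", "Island B"),    # no link to the rest of the schedule
]
linked = teams_linked_to(games, ["Low A", "Low B"])
for team, steps in sorted(linked.items(), key=lambda kv: kv[1]):
    print(steps, team)
```

In this made-up schedule the new result reaches both first-place teams through a chain of three games, while the two "Island" teams, having played nobody connected to the others, are untouched.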
Thus the computer evaluation and the human evaluation aren't really giving you the same information even though each is expressed as a ranking. One advantage that humans have is that they can use the computer evaluations as one of the factors to incorporate into their judgments. For example, a human might not accept the computer's result that West State is ranked ten places higher than East College, but if the human's initial impression had been the opposite, then upon seeing the computer evaluation, the human might choose to take a harder look at the two teams to refine his/her evaluation. In the same way, a coach who does not treat player statistics as gospel may still judge them worth a look as a source for potential coaching points.
John Wobus, 10/11/05
Wobus Sports: www.vaporia.com/sports