During the past few years, I have used the data graciously and judiciously provided on Al's fastball website to conduct a statistical analysis of the performance of the ISC Ranking Committee over the years 2002-2012. The analysis was based on the final set of team rankings for each year and the order of elimination of those teams in the corresponding year's ISC World Tournament.

There were two specific questions that I considered. First, I addressed the "reliability" of the rankings, that is, the degree of consistency among the rankings of the various raters for a given year. For this purpose, I used Kendall's Coefficient of Concordance (W). The maximum value of this statistic is 1.00, the value that would be obtained if every rater assigned each team the same rank. The value of W for each year from 2002 through 2012 is as follows:

Reliability: Kendall's Coefficient of Concordance (W)

2002    .932
2003    .897
2004    .870
2005    .897
2006    .901
2007    .945
2008    .948
2009    .927
2010    .857
2011    .980
2012    .949

Mean    .918
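For readers who want to run this kind of check on their own data, Kendall's W can be computed directly from a matrix of ranks, one row per rater and one column per team. The Python sketch below uses invented ranks, not the committee's actual data; it simply illustrates the statistic:

```python
def kendalls_w(ranks):
    """ranks: list of m rankings, each assigning ranks 1..n to the same n teams."""
    m = len(ranks)             # number of raters
    n = len(ranks[0])          # number of teams
    # Total rank each team received across all raters.
    rank_sums = [sum(r[j] for r in ranks) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    # S: sum of squared deviations of the rank sums from their mean.
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Three raters in perfect agreement yield the maximum value:
print(kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]))  # 1.0
```

A value of .918, as the committee averaged, means the rank sums sit nearly as far from their mean as is mathematically possible, i.e. the raters' orderings almost coincide.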

These are robust numbers. They indicate a very high degree of inter-rater reliability. Thus, any shortcomings in the accuracy of the ranking system cannot be attributed to disagreement among the raters.

Although high reliability is important, it does not by itself guarantee that a set of rankings is useful for predicting performance at the ISC World Tournament. For example, the raters might agree perfectly in ranking teams by the distance of each team's home base from Kimberly, WI. However, these rankings would be useless, because there is likely no relationship between that measure and the teams' performance at the tournament.

The more important feature of a set of rankings is its "validity", that is, how accurately the rankings predict the final standings in the tournament. To assess the validity of the rankings, I used the Spearman Rank Order Correlation Coefficient (rho).

There's a minor difficulty in applying this statistic to the data at hand: it requires an equal number of ranks in the two measures being compared. The problem is twofold. First, not all of the ranked teams play in the ISC tournament. Second, there are usually a few "Stuffy" teams that participate in the tournament but have not been ranked.

Consequently, before rho could be calculated, any team that was ranked but did not play in the tournament, or that played but was not ranked, was discarded. The remaining teams' ranks were then reassigned on the basis of their relative standing on each of the two variables: the raters' rankings on the one hand, and the order of elimination in the tournament on the other.
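The discard-and-rerank procedure, followed by the rho calculation, can be sketched in a few lines of Python. The team names and ranks below are invented for illustration; only the procedure mirrors the one described above:

```python
def rerank(values):
    """Map values to ranks 1..n, preserving relative order (no ties assumed)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rho for two rankings of equal length (no ties)."""
    n = len(xs)
    d2 = sum((x - y) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

committee = {"A": 1, "B": 2, "C": 4, "D": 6}   # pre-tournament ranks (invented)
finish    = {"A": 2, "B": 1, "D": 3, "E": 5}   # elimination order; E was unranked

# Keep only teams that appear in both lists, then rerank the survivors.
common = sorted(committee.keys() & finish.keys())
x = rerank([committee[t] for t in common])
y = rerank([finish[t] for t in common])
print(spearman_rho(x, y))  # 0.5
```

Team C (ranked but absent) and team E (present but unranked) drop out, and the surviving teams' ranks compress to 1..n before the correlation is taken, exactly as the reassignment step requires.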

Spearman's rho also has a maximum value of 1.00. The value of rho for each year from 2002 through 2012 is shown below.

Validity: Spearman Rank Order Correlation (rho)

2002    .749
2003    .909
2004    .733
2005    .600
2006    .794
2007    .799
2008    .748
2009    .877
2010    .719
2011    .790
2012    .821

Mean    .776

As with the reliability data, these numbers are quite satisfactory. To put them into perspective, consider the Spearman Rank Order Correlation between the posted odds and the order of finish for the 20 entrants in the 2012 Kentucky Derby: the coefficient was a meager .301. A corresponding analysis of the 2012 Belmont Stakes, with a field of 11 horses, yielded a rho of .475.

Wouldn't you like to be able to predict the performance of the ponies with the same degree of accuracy shown by the ISC raters in predicting the outcome of the world tournament?

Jim Johnson
Vancouver