At a recent seminar at Ural Federal University in Ekaterinburg, the question was raised whether we could evaluate and rank rankings.
That's a tough one. Different rankings have different approaches, audiences and methodologies. The Shanghai rankings embody the concerns of the Chinese ruling class, convinced that salvation lies in science and technology and disdainful -- not entirely without reason -- of the social sciences and humanities. The Times Higher Education world rankings have claimed to be looking for quality, recently tempered by a somewhat unconvincing concern for inclusiveness.
But it is possible that there are some simple metrics that could be used to compare global rankings. Here are some suggestions.
Stability
Universities are big places, typically with thousands of students and hundreds of staff. In the absence of administrative reorganisation or methodological changes, we should not expect dramatic change from one year to the next. Apparently a change of four places over a year is normal for the US News America's Best Colleges rankings, so nobody should get excited about going up or down a couple of places.
It would seem reasonable then that rankings could be ordered according to the average change in position over a year. I have already done some calculations with previous years' rankings (see posts 09/07/14, 17/07/14, 04/11/14).
So we could rank these international rankings according to the mean number of position changes in the top 100 between 2013 and 2014: the smaller the mean change, the more stable the ranking. (A sketch of the calculation follows the list below.)
1. Quacquarelli Symonds World University Rankings 3.94
2. Times Higher Education World University Rankings 4.34
3. Shanghai ARWU 4.92
4. National Taiwan University Rankings 7.30
5. CWUR (Jeddah) 10.59
6. Webometrics 12.08
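For illustration only, here is a minimal Python sketch of how such a figure might be computed, assuming the two years' results are available as dictionaries mapping university names to positions; the function and the toy data are hypothetical, not any ranking's published format.

```python
def mean_position_change(ranks_year1, ranks_year2, top_n=100):
    """Mean absolute change in position for universities that were in the
    top_n in the first year and still appear in the second year."""
    changes = [abs(ranks_year2[u] - pos)
               for u, pos in ranks_year1.items()
               if pos <= top_n and u in ranks_year2]
    return sum(changes) / len(changes) if changes else 0.0

# Toy data: one university stays put, one drops two places, one rises one.
ranks_2013 = {"Alpha U": 1, "Beta U": 2, "Gamma U": 3}
ranks_2014 = {"Alpha U": 1, "Beta U": 4, "Gamma U": 2}
print(mean_position_change(ranks_2013, ranks_2014))  # (0 + 2 + 1) / 3 = 1.0
```

How one treats universities that drop out of the top 100 altogether is a judgement call that can shift the figure noticeably, so any comparison across rankings would need to apply the same rule to each.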
Consistency and Redundancy
It is reasonable that if the various ranking indicators are measuring quality or highly valued attributes there should be at least a modest correlation between them. Good students will attract good teachers who might also have the attributes, such as an interest in their field or reading comprehension skills, required to do research. Talented researchers will be drawn to places that are generously funded or highly reputed.
On the other hand, if there is a very high correlation between two indicators, perhaps above .850, then this probably means that they are measuring the same thing. One of them could be discarded.
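As a rough sketch of how redundancy might be spotted, one could compute pairwise correlations between indicator scores; the indicator names and numbers below are invented for the example.

```python
from itertools import combinations
from statistics import correlation  # Pearson's r; requires Python 3.10+

# Hypothetical indicator scores for the same five universities.
indicators = {
    "teaching":  [92, 85, 70, 66, 54],
    "research":  [95, 88, 72, 60, 50],
    "citations": [78, 91, 55, 81, 60],
}

for a, b in combinations(indicators, 2):
    r = correlation(indicators[a], indicators[b])
    note = "  <- possibly measuring the same thing" if r > 0.850 else ""
    print(f"{a} vs {b}: r = {r:.3f}{note}")
```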
Transparency
Some rankings have adopted the practice of putting universities into bands rather than giving them individual scores. This is, I suppose, a sensible way of discouraging people from getting excited about insignificant fluctuations, but it might also suggest a lack of confidence in the rankers' data or the intention of selling the data in some way, perhaps in the guise of benchmarking. Since 2010 THE have bundled indicator scores into clusters, making it very difficult to figure out exactly what is causing universities to rise or fall. Rankings could be ranked according to the number of universities for which overall scores and indicator scores are provided.
Inclusiveness
It would be very easy to rank rankings according to the number of universities that they include. This is something where they vary considerably. The Shanghai ARWU ranks 500 universities while Webometrics ranks close to 24,000.
Comprehensiveness
Some rankings, such as ARWU, the NTU rankings (Taiwan) and URAP (Middle East Technical University), measure only research output and impact. The THE and QS rankings attempt to include metrics related, perhaps distantly, to teaching quality and innovation. QS has an indicator, the faculty-student ratio, that purports to say something about teaching quality.
Balance
Some rankings award a disproportionate weighting to a single indicator: QS's academic survey (40%), THE's citations indicator (30%). Also, if a university or universities are getting disproportionately high scores for a specific indicator, this might mean that the rankings are being manipulated or are seriously flawed in some way.
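A crude way of flagging the second problem, sketched here with made-up scores on a 0-100 scale, is to look for indicators on which a university does far better than it does on its other indicators.

```python
def disproportionate_indicators(scores, threshold=30):
    """Return indicators on which a university's score exceeds its own
    mean across all indicators by more than `threshold` points."""
    mean = sum(scores.values()) / len(scores)
    return [name for name, s in scores.items() if s - mean > threshold]

# Hypothetical profile: a very high citations score, modest everything else.
profile = {"teaching": 35, "research": 40, "citations": 95, "income": 38}
print(disproportionate_indicators(profile))  # ['citations']
```

A flag like this proves nothing by itself, but it points to the cases worth a closer look.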
External Validation
How do we know that rankings measure what they are supposed to measure? It might be possible to measure the correlation between international rankings and national rankings which often include more data and embody local knowledge about the merits of universities.
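One hedged way of doing this would be a rank correlation between the positions a set of universities gets in an international ranking and in a national one; the numbers below are invented purely to show the calculation.

```python
from statistics import correlation  # Pearson's r; requires Python 3.10+

def spearman(xs, ys):
    """Spearman rank correlation (no tie handling): Pearson's r on ranks."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        out = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            out[i] = rank
        return out
    return correlation(ranks(xs), ranks(ys))

# Hypothetical positions of the same five universities (1 = best) in an
# international ranking and in a national ranking.
international = [12, 45, 78, 150, 300]
national = [1, 3, 2, 5, 4]
print(f"Spearman correlation: {spearman(international, national):.2f}")  # 0.80
```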
Replicability
How long would it take to check whether the rankings have given your university the correct indicator score? Try it for yourself with the Shanghai highly cited researchers indicator. Go here and find the number of highly cited researchers with Harvard as their primary affiliation and the number with your university. Find the square root of both numbers. Then give Harvard a score of 100 and adjust your university's score accordingly.
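For the arithmetic in that recipe, a few lines suffice; the counts below are made up for illustration and are not actual highly cited researcher figures.

```python
import math

def hici_score(own_count, top_count):
    """Shanghai-style indicator score: square root of the count, rescaled
    so that the top institution (Harvard for this indicator) gets 100."""
    return 100 * math.sqrt(own_count) / math.sqrt(top_count)

# Illustrative only: suppose Harvard has 160 highly cited researchers
# listed as primary affiliation and your university has 10.
print(round(hici_score(10, 160), 1))  # 25.0
```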
Now for the THE citations impact indicator. This is normalised by field and by year of citation, so that what matters is not the raw number of citations a publication gets but how that number compares with the world average for publications in the same field (there are 334 of them) and with citations counted in the first, second, third, fourth, fifth or sixth year after publication.
Dead simple, isn't it?
And don't forget the regional modification.
I hope the next post will be a ranking of rankings according to stability.
We did a ranking of rankings back in 2010, based on the Berlin Principles -- it was fun, and a good exercise to understand rankings (http://medarbetarportalen.gu.se/digitalAssets/1326/1326714_rapport2010-03_university-ranking-list.pdf). We did it in a rather subjective way, though, and if one could find decent indicators, that would be a definite improvement.
About your list of metrics 1: I do not see why balance is important. If one has found a good indicator for one aspect of university quality, then there is no reason not to give it a lot of weight.
About your list of metrics 2: External validation is important, but I do not agree that national rankings will be good enough for that -- they are rarely reliable. You'll need some kind of peer review/expert assessment.
Thanks for a good blog!
Thank you very much for an excellent article and sharing your ideas on that question.
I hope you'll continue to look into this interesting question.
On "Inclusiveness" metric: I agree that number of universities can be used to measure rankings. But to use that instrument effectively you will need to separate ranking in different type: Webometrics can measure more universities than others. National rankings are limited in the amount of universities they can include.
On "Consistency and Redundancy" metric: It's an interesting concept but it will require additional research to say how can it be used.
It can be affected by amount of universities presented in the ranking. Best universities will follow the logic you have shown: best professors attract students, etc. But the lower quality of the university is, there will be less correlation between indicators.