"But there is some serious wonkiness in the statistics behind this year’s rankings which bear some scrutiny. Oddly enough, they don’t come from the reputational survey, which is the most obvious source of data wonkiness. Twenty-two percent of institutional scores in this ranking come from the reputational ranking; and yet in the THE’s reputation rankings (which uses the same data) not a single one of the universities listed here had a reputational score high enough that the THE felt comfortable releasing the data. To put this another way: the THE seemingly does not believe that the differences in institutional scores among the Under-50 crowd are actually meaningful. Hmmm.
No, the real weirdness in this year’s rankings comes in citations, the one category which should be invulnerable to institutional gaming. These scores are based on field-normalized, 5-year citation averages; the resulting institutional scores are then themselves standardized (technically, they are what are known as z-scores). By design, they just shouldn’t move that much in a single year. So what to make of the fact that the University of Warwick’s citation score jumped 31% in a single year, Nanyang Polytechnic’s by 58%, or UT Dallas’ by a frankly insane 93%? For that last one to be true, Dallas would have needed to have had 5 times as many citations in 2011 as it did in 2005. I haven’t checked or anything, but unless the whole faculty is on stims, that probably didn’t happen. So there’s something funny going on here."
Here is my comment on his post.
Your comment at University Ranking Watch and your post at your blog raise a number of interesting issues about the citations indicator in the THE-TR World University Rankings and the various spin-offs.
You point out that the scores for the citations indicator rose at an unrealistic rate between 2011 and 2012 for some of the new universities in the 100 Under 50 Rankings and ask how this could possibly reflect an equivalent rise in the number of citations.
Part of the explanation is that the scores for all indicators and nearly all universities in the WUR, and not just for the citations indicator and a few institutions, rose between 2011 and 2012. The mean overall score of the top 402 universities in 2011 was 44.3 and for the top 400 universities in 2012 it was 49.5.
The mean scores for every single indicator or group of indicators in the top 400 (402 in 2011) have also risen, although not all at the same rate. Teaching rose from 37.9 to 41.7, International Outlook from 51.3 to 52.4, Industry Income from 47.1 to 50.7, Research from 36.2 to 40.8 and Citations from 57.2 to 65.2.
Notice that the scores for citations are higher than those for the other indicators in 2011 and that the gap further increases in 2012.
This means that the citations indicator had a disproportionate effect on the rankings in 2011, one that became more disproportionate in 2012.
It should be remembered that the scores for the indicators are z-scores: they measure not the absolute number of citations but the distance in standard deviations from the mean normalised citation score of all the universities analysed. That mean is not the mean of the 200 universities listed in the printed and online rankings, or of the 400 included in the iPad/iPhone app, but of the total number of universities that have asked to be ranked. That number seems to have increased by a few hundred between 2011 and 2012 and will no doubt go on increasing over the next few years, though probably at a steadily decreasing rate.
Most of the newcomers to the world rankings have overall scores and indicator scores that are lower than those of the universities in the top 200 or even the top 400. That means that the mean of the unprocessed scores on which the z-scores are based decreased between 2011 and 2012, so that the overall and indicator scores of the elite universities increased regardless of what happened to the underlying raw data.
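To make the mechanism concrete, here is a minimal sketch in Python with invented numbers (leaving aside the further rescaling onto the 0-100 scale that appears in the published tables): adding weaker newcomers to the analysed pool pulls the mean down, so the z-score of an unchanged elite university goes up.

```python
# Illustrative sketch with invented numbers: adding lower-scoring newcomers
# to the analysed pool lowers the mean, so the z-score of an elite university
# rises even though its own raw (normalised) citation impact is unchanged.
from statistics import mean, pstdev

def z_score(value, pool):
    """Standard deviations between `value` and the mean of the whole pool."""
    return (value - mean(pool)) / pstdev(pool)

elite_raw = 2.0                                  # elite university's raw score, same in both years
pool_2011 = [2.0, 1.8, 1.5, 1.2, 1.0, 0.9]       # hypothetical pool analysed in 2011
pool_2012 = pool_2011 + [0.6, 0.5, 0.4, 0.3]     # 2012: same pool plus weaker newcomers

print(round(z_score(elite_raw, pool_2011), 2))   # 1.48
print(round(z_score(elite_raw, pool_2012), 2))   # 1.73 -- higher, with identical raw data
```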
However, they did not increase at the same rate. The scores for the citations indicator, as noted, were much higher in both 2011 and 2012 than those for the other indicators. It is likely that this is because the difference between the top 200 or 400 universities and those just below the elite is greater for citations than it is for indicators like income, publications and internationalisation. After all, most people would probably accept that internationally recognised research is a major factor in distinguishing world-class universities from those that are merely good.
Another point about the citations indicator is that after the field- and year-normalised citation score for each university is calculated, it is adjusted according to a "regional modification". This means that the normalised score is divided by the square root of the average score for the country in which the university is located. So if University A has a score of 3.0 citations per paper and the average for the country is 3.0, the score will be divided by 1.73, the square root of 3, giving a result of 1.73. If a university in country B has the same score of 3.0 citations per paper but the national average is just 1.0 citation per paper, the final score will be 3.0 divided by the square root of 1, which is 1, giving a result of 3.0.
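A short sketch of that arithmetic, assuming the regional modification takes exactly the form described above (division by the square root of the country average):

```python
# Sketch of the regional modification as described above: the university's
# field-normalised citation impact is divided by the square root of the
# average impact for its country.
from math import sqrt

def regional_modification(university_impact, country_average):
    return university_impact / sqrt(country_average)

# University A: 3.0 citations per paper, country average 3.0
print(round(regional_modification(3.0, 3.0), 2))   # 1.73

# University B: the same 3.0 citations per paper, country average 1.0
print(round(regional_modification(3.0, 1.0), 2))   # 3.0
```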
University B therefore gets a much higher final score for citations even though its number of citations per paper is exactly the same as University A's. The reason for the apparently higher score is simply that the two universities are being compared with all the other universities in their own countries. The lower the score for a country's universities in general, the higher the regional modification for specific universities. The citations indicator is therefore not just measuring the citations produced by universities but also, in effect, the gap between the bulk of a country's universities and the elite that make it into the top 200 or 400.
It is possible, then, that a university might be helped into the top 200 or 400 by having a high score for citations that resulted from being better than other universities in a particular country that were performing badly.
It is also possible that if a country's research performance took a dive, perhaps because of budget cuts, with the overall number of citations per paper declining, this would lead to an improvement in the citations score of a university that managed to remain above the national average.
It is quite likely that, assuming the methodology remains unchanged, if countries like Italy, Portugal or Greece experience a fall in research output as a result of economic crises, their top universities will get a boost for citations because they are benchmarked against a lower national average.
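Continuing the sketch above with invented figures: if a national average were to fall from, say, 2.0 to 1.0 citations per paper while one of its universities stayed at 3.0, that university's modified score would rise with no change in its own performance.

```python
# Continuing the regional-modification sketch with invented figures:
# a falling national average inflates the modified score of a university
# whose own citation impact has not changed.
from math import sqrt

print(round(3.0 / sqrt(2.0), 2))   # 2.12 before the national decline
print(round(3.0 / sqrt(1.0), 2))   # 3.0 afterwards
```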
Looking at the specific places mentioned, it should be noted once again that Thomson Reuters do not simply count the number of citations per paper but compare them with the mean citations for papers in particular fields, published in particular years and cited in particular years.
Thus a paper in applied mathematics published in a journal in 2007 and cited in 2007, 2008, 2009, 2010, 2011 and 2012 will be compared with all papers in applied maths published in 2007 and cited in those years.
If it is usual for a paper in a specific field to receive few citations in the year of publication or the year after, then even a moderate number of citations can have a disproportionate effect on the citations score.
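A minimal sketch of that effect, with invented baseline figures (the real field and year baselines are Thomson Reuters' own):

```python
# Field- and year-normalisation sketch: a paper's citations are divided by
# the average citations received by papers in the same field, published in
# the same year and counted over the same window. Baselines here are invented.

def normalised_impact(citations, field_year_baseline):
    return citations / field_year_baseline

# A slow-citing field where papers average only 0.5 early citations:
print(normalised_impact(4, 0.5))   # 8.0 -- four citations count as 8x the average

# The same four citations in a fast-citing field averaging 10:
print(normalised_impact(4, 10))    # 0.4
```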
It is very likely that Warwick's increased score for citations in 2012 had a lot to do with participation in a number of large-scale astrophysical projects that involved many institutions and produced a larger than average number of citations in the years after publication. In June 2009, for example, the Astrophysical Journal Supplement Series published 'The seventh data release of the Sloan Digital Sky Survey', with contributions from 102 institutions, including Warwick. In 2009 it received 45 citations, while the average for the journal was 13. (The average for the field is known to Thomson Reuters, but it is unlikely that anyone else has the technical capability to work it out.) In 2010 the paper was cited 262 times; the average for the journal was 22. In 2011 it was cited 392 times; the average for the journal was 19.
This and similar publications have contributed to an improved performance for Warwick, one that was enhanced by the relatively modest total number of publications by which the normalised citations were divided.
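As a rough back-of-the-envelope check, using the journal averages quoted above as a stand-in for the field baselines that only Thomson Reuters can calculate exactly:

```python
# Citations to the Sloan Digital Sky Survey data-release paper relative to
# the journal averages quoted above (a crude proxy for the field baseline).
citations_and_journal_average = {
    2009: (45, 13),
    2010: (262, 22),
    2011: (392, 19),
}

for year, (cites, journal_avg) in citations_and_journal_average.items():
    print(year, round(cites / journal_avg, 1))
# 2009 3.5, 2010 11.9, 2011 20.6 -- a single paper cited at many times the
# average can noticeably lift a university whose total output is modest.
```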
With regard to Nanyang Technological University, it seems that a significant role was played by a few highly cited publications in Chemical Reviews in 2009 and in Nature in 2009 and 2010.
As for the University of Texas at Dallas, my suspicion was that publications by faculty at the University of Texas Southwestern Medical Center had been included, a claim that had been made about the QS rankings a few years ago. Thomson Reuters have, however, denied this and say they have observed unusual behaviour by UT Dallas, which they interpret as an improvement in the way that affiliations are recorded. I am not sure exactly what this means, but I assume that the improvement in the citations score is an artefact of changes in the way data is recorded rather than of any change in the number or quality of citations.
There will almost certainly be more of this in the 2013 and 2014 rankings.