Saturday, October 04, 2014

How to win citations and rise in the rankings

A large part of the academic world has either been congratulating itself on performing well in the latest Times Higher Education (THE) world rankings, the data for which is provided by Thomson Reuters (TR), or complaining that only large injections of public money will keep their universities from falling into the great pit of the unranked.

Some, however, have been baffled by some of the placings reported by THE this year. Federico Santa Maria Technical University in Chile is allegedly the fourth best university in Latin America, Scuola Normale Superiore di Pisa the best in Italy and Turkish universities are apparently the rising stars of the academic world.

When a university appears to be punching above its weight, the cause often turns out to be the citations indicator.

Scuola Normale Superiore di Pisa is 63rd in the world with an overall score of 61.9 but a citations score of 96.4.

Royal Holloway, University of London is 118th in the world with an overall score of 53 but a citations score of 98.9.

The University of California Santa Cruz is top of the world for citations, with a score of 100, but has an overall score of 53.7.

Bogazici University is 139th in the world with an overall score of 51.1 and a citations score of 96.8.

Federico Santa Maria Technical University in Valparaiso is in the 251-275 band, so its total score is not given, although it would be easy enough to work out. It has a score of 99.7 for citations.

So what is going on?

The problem lies with various aspects of Thomson Reuters' methodology.

First they use field normalisation. That means that they do not simply count the number of citations but compare the number of citations in 250 fields with the world average in each field. Not only that, but they compare each year in which the paper is cited with the world average of citations for that year.

The rationale for this is that the number of citations and the rapidity with which papers are cited vary from field to field. A paper reporting a cure for cancer or the discovery of a new particle will be cited hundreds of times within weeks. A paper in philosophy, economics or history may languish for years before anyone takes notice. John Muth's work on rational expectations was hardly noticed or cited for years before eventually starting a revolution in economic theory. So universities should be compared to the average for fields and years. Otherwise, those that are strong in the humanities and social sciences will be penalised.

Up to a point this is not a bad idea, but it does assume that all disciplines are equally valuable and demanding. If the world has decided that it will fund medical research or astrophysics, support journals and pay researchers to read and cite other researchers' papers, rather than do the same for media studies or education, then this is perhaps something rankers and data collectors should take account of.

In any case, by normalising for so many fields and then throwing normalisation by year into the mix, TR increase the likelihood of statistical anomalies. If someone can get a few dozen citations within a couple of years after publication in a field where citations, especially early ones, average below one a year then this could give an enormous boost to a university's citation score. That is precisely what happened with Alexandria University in 2010. Methodological tweaking has mitigated the risk to some extent but not completely. A university could also get a big boost by getting credit, no matter how undeserved, for a breakthrough paper or a review that is widely cited.
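The arithmetic behind such an anomaly can be sketched in a few lines of Python. This is only an illustration, not TR's actual calculation: the field names, the world averages and the assumption that a university's score is a simple mean of per-paper ratios are all hypothetical.

```python
from statistics import mean

def normalised_impact(papers, baselines):
    """Mean of (citations / world average for the paper's field and year).

    Averaging per-paper ratios is an assumption made for illustration.
    """
    ratios = [
        p["citations"] / baselines[(p["field"], p["year"])]
        for p in papers
    ]
    return mean(ratios)

# Hypothetical world averages of citations per paper, keyed by (field, year).
baselines = {
    ("high-citation field", 2012): 20.0,
    ("low-citation field", 2012): 0.5,
}

# 99 papers that are exactly average for their field and year.
ordinary = [
    {"field": "high-citation field", "year": 2012, "citations": 20}
    for _ in range(99)
]

# One paper with a few dozen early citations in a field averaging 0.5.
outlier = {"field": "low-citation field", "year": 2012, "citations": 40}

without_outlier = normalised_impact(ordinary, baselines)
with_outlier = normalised_impact(ordinary + [outlier], baselines)

print(f"without outlier: {without_outlier:.2f}")  # 1.00
print(f"with outlier:    {with_outlier:.2f}")     # 1.79
```

On these made-up numbers, a single paper out of a hundred lifts the university from exactly the world average to nearly 80% above it.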

So let's take a look at some of the influential universities in the 2014 THE rankings. Scuola Normale Superiore di Pisa (SNSP) is a small research intensive institution that might not even meet the criteria to be ranked by TR. Its output is modest, 2,407 publications in the Web of Science core collection between 2009 and 2013, although for a small institution that is quite good.

One of those publications is 'Observation of a new boson...' in Physics Letters B in September 2012, which has been cited 1,631 times.

The paper has 2,896 "authors", whom I counted by looking for semicolons in the "find" box, affiliated to 228 institutions. Five of them are from SNSP.

To put it crudely, SNSP is making an "authorship" contribution of 0.17% to the paper but getting 100% of the citation credit, as does every other contributor. Perhaps its researchers are playing a leading role in the Large Hadron Collider project or perhaps it has made a disproportionate financial contribution but TR provide no reason to think so.

The University of the Andes, supposedly the second best university in Latin America, is also a contributor to this publication, as is Panjab University, supposedly the second best institution in the Indian subcontinent.

Meanwhile, Royal Holloway, University of London has contributed to 'Observation of a new particle...' in the same journal and issue. This has received 1,734 citations and involved 2,932 authors from 267 institutions, among them Tokyo Metropolitan University, Federico Santa Maria Technical University, Middle East Technical University and Bogazici University.

The University of California Santa Cruz is one of 119 institutions that contributed to the 2010 'Review of particle physics...', which has been cited 3,739 times to date. Like all the other contributors it gets full credit for all those citations.

It is not just the number of citations that boosts citation impact scores but also their occurrence within a year or two of publication so that the number of citations is much greater than the average for that field and those years.

The proliferation of papers with hundreds of authors is not confined to physics. There are several examples from medicine and genetics as well.

At this point, the question arises why not divide the citations for each paper among the authors of the paper? This is an option available in the Leiden Ranking so it should not be beyond TR's technical capabilities.
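To see what a difference this would make, here is a rough Python sketch using the figures quoted above for the boson paper. It assumes credit is shared equally per listed author, which is a simplification and not necessarily how the Leiden Ranking or anyone else does fractional counting; the function names are invented for the example.

```python
def full_credit(citations):
    """Every contributing institution gets all the citations."""
    return citations

def fractional_credit(citations, institution_authors, total_authors):
    """Citations shared in proportion to the institution's share of authors.

    Equal weight per listed author is an assumption made for illustration.
    """
    return citations * institution_authors / total_authors

# Figures quoted in the post for 'Observation of a new boson...'.
citations = 1631
total_authors = 2896
snsp_authors = 5

share = snsp_authors / total_authors
snsp_full = full_credit(citations)
snsp_fractional = fractional_credit(citations, snsp_authors, total_authors)

print(f"authorship share:  {share:.2%}")            # 0.17%
print(f"full counting:     {snsp_full}")            # 1631
print(f"fractional credit: {snsp_fractional:.1f}")  # 2.8
```

On this crude basis SNSP would be credited with about three citations for the paper instead of 1,631.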

Or why not stop counting multi-authored publications when they exceed a certain quota of authors? This is exactly what TR did earlier this year when collecting data for its new highly cited researchers lists. Physics papers with more than 30 institutional affiliations were omitted, a very sensible procedure that should have been applied across the board.
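A cut-off of that kind is trivial to apply. The sketch below is an illustration only: the record layout and function name are made up, the last paper is hypothetical, and applying the threshold to every paper rather than just physics is the suggestion made here, not TR's practice.

```python
MAX_AFFILIATIONS = 30  # the threshold quoted above for the highly cited researchers lists

def countable_papers(papers, limit=MAX_AFFILIATIONS):
    """Keep only papers whose number of affiliated institutions is within the limit."""
    return [p for p in papers if p["institutions"] <= limit]

papers = [
    # Institution and citation counts as quoted in the post.
    {"title": "Observation of a new boson...", "institutions": 228, "citations": 1631},
    {"title": "Observation of a new particle...", "institutions": 267, "citations": 1734},
    {"title": "Review of particle physics...", "institutions": 119, "citations": 3739},
    # Hypothetical ordinary paper added for contrast.
    {"title": "hypothetical ordinary paper", "institutions": 3, "citations": 12},
]

kept = countable_papers(papers)
print([p["title"] for p in kept])
```

All three of the mega-papers discussed above would drop out, leaving only the ordinary one.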

So basically, one route to success in the rankings is to get into a multi-collaborator mega-cited project.

But that is not enough in itself. There are hundreds of universities contributing to these publications. But not all of them reap such disproportionate benefits. It is important not to publish too much. A dozen LHC papers will do wonders if you publish 400 or 500 papers a year. Four thousand a year and it will make little difference. One reason for the success of otherwise obscure institutions is that the number of papers by which the citations are divided is small.
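The dilution effect is easy to see with some made-up numbers. In the Python sketch below, the impact values (50 times the world average for each mega-cited paper, exactly the world average for everything else) are hypothetical, as is the assumption that the indicator is a plain per-paper average.

```python
def average_impact(total_papers, mega_papers=12, mega_impact=50.0, ordinary_impact=1.0):
    """Average normalised impact per paper for a given annual output.

    All impact values are hypothetical and chosen only for illustration.
    """
    ordinary_papers = total_papers - mega_papers
    total = mega_papers * mega_impact + ordinary_papers * ordinary_impact
    return total / total_papers

small = average_impact(450)    # a university publishing 450 papers a year
large = average_impact(4000)   # a university publishing 4,000 papers a year

print(f"450 papers a year:   {small:.2f}")  # 2.31
print(f"4,000 papers a year: {large:.2f}")  # 1.15
```

The same dozen papers more than double the average of the small publisher but add only about 15% to that of the large one.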

So why on earth are TR using a method that produces such laughable results? Let's face it, if any other ranker put SNS Pisa, Federico Santa Maria or Bogazici at the top of its flagship indicator we would go deaf from the chorus of academic tut-tutting.

TR, I suspect, are doing this because this method is identical or nearly identical to that used for their InCites system for evaluating individual academics within institutions, which appears very lucrative, and they do not want the expense and inconvenience of recalculating data.

Also perhaps, TR have become so enamoured of the complexity and sophistication of their operations that they really do think that they have actually discovered pockets of excellence in unlikely places that nobody else has the skill or the resources to even notice.

But we have not finished. There is one more element in TR's distinctive methodology and that is the regional modification, introduced in 2011.

This means that the normalised citation impact score of the university is divided by the square root of the impact score of the country in which it is located. A university located in a low scoring country will get a bonus that will be greater the lower the country's impact score. This would clearly be an advantage to universities in countries like Chile, India and Turkey.
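As a rough Python sketch of the modification as described here, take two universities with the same normalised impact in two different countries. The scores are invented, and treating 1.0 as the world average is an assumption; TR's actual implementation may differ in detail.

```python
import math

def regionally_modified(university_impact, country_impact):
    """Divide the university's impact by the square root of its country's impact."""
    return university_impact / math.sqrt(country_impact)

# Hypothetical scores, with 1.0 assumed to be the world average.
university_impact = 1.2
low_scoring_country = 0.64   # square root is 0.8
high_scoring_country = 1.44  # square root is 1.2

boosted = regionally_modified(university_impact, low_scoring_country)
reduced = regionally_modified(university_impact, high_scoring_country)

print(f"in the low scoring country:  {boosted:.2f}")  # 1.50
print(f"in the high scoring country: {reduced:.2f}")  # 1.00
```

The same raw performance comes out half as high again in the low scoring country as it does in the high scoring one.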

Every year there are more multi-authored, multi-cited papers. It would not be surprising if university presidents start scanning the author lists of publications like the Review of Particle Physics, send out recruitment letters and get ready for ranking stardom.


Rafael M Santos said...

Nice to see a reference to the Leiden Ranking, which is in many ways superior to other more marketed rankings.

Adrienne Jesse Maleficio said...

This is a very helpful article. For world University rankings, please see this