Times Higher Education (THE) is talking about a 3.0 version of its World University Rankings to be announced at this year's academic summit in Toronto and implemented in 2021, a timetable that may not survive the current virus crisis. I will discuss what is wrong with the rankings, what THE could do, and what it might do.
The magazine has achieved an enviable position in the university rankings industry. Global rankings produced by reliable university researchers with sensible methodologies, such as the CWTS Leiden Ranking, University Ranking by Academic Performance (Middle East Technical University) and the National Taiwan University Rankings, are largely ignored by the media, celebrities and university administrators. In contrast, THE is almost always counted among the Big Four rankings (the others being QS, US News, and Shanghai Ranking), the Big Three or the Big Two, and is sometimes the only global ranking that is discussed.
The exalted status of THE is remarkable considering that it has many defects. It seems that the prestigious name -- there are still people who think that it is the Times newspaper or part of it -- and skillful public relations campaigns replete with events, workshops, gala dinners and networking lunches have eroded the common sense and critical capacity of the education media and the administrators of the Ivy League, the Russell Group and their imitators.
There are few things more indicative of the inadequacy of the current leadership of Western higher education than their toleration of a ranking that puts Aswan University top of the world for research impact by virtue of its participation in the Gates-funded Global Burden of Disease Study, and Anadolu University top for innovation because it reported its income from private online courses as research income from industry. Would they really accept that sort of thing from a master's thesis candidate? It is true that the "Sokal squared" hoax has shown that the capacity for critical thought has been seriously attenuated in the humanities and social sciences, but one would expect better from philosophers, physicists and engineers.
The THE world and regional rankings are distinctively flawed in several ways. First, a substantial amount of their data comes directly from institutions. Even if universities are 100% honest and transparent, the probability that data will flow smoothly and accurately from branch campuses, research centres and far-flung campuses, through the committees tasked with data submission, and on to the THE team is not very high.
THE has commissioned an audit by PricewaterhouseCoopers (PwC), but that seems to be confined to "testing the key controls to capture and handle data, and a full reperformance of the calculation of the rankings" and does not extend to checking the validity of the data before it enters the mysterious machinery of the rankings. PwC states that this is a "limited assurance engagement."
Second, THE is unique among the well-known rankings in bundling 11 of its 13 indicators into three groups with composite scores. That drastically reduces the utility of the rankings, since it is impossible to work out whether, for example, an improvement in the research pillar results from an increase in the number of published papers, an increase in research income, a decline in the number of research and academic staff, a better score for research reputation, or some combination of these. Individual universities can gain access to more detailed information, but that is not necessarily helpful to students or other stakeholders.
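To see why the bundling matters, here is a toy sketch of a composite pillar score; the weights and all the figures are hypothetical, not THE's actual ones:

```python
# Toy composite "research" pillar: a weighted sum of normalised indicator
# scores. The weights and figures below are hypothetical illustrations only.
def research_pillar(reputation, income, productivity, weights=(0.6, 0.2, 0.2)):
    return sum(w * s for w, s in zip(weights, (reputation, income, productivity)))

# Two quite different profiles publish exactly the same pillar score,
# so the published number cannot tell you what actually changed.
print(research_pillar(reputation=55, income=40, productivity=45))  # 50.0
print(research_pillar(reputation=45, income=55, productivity=60))  # 50.0
```

Whatever the real weights are, the point stands: once indicators are rolled up, movements in the published score cannot be traced back to any particular cause by an outside reader.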
Third, the THE rankings give a substantial weighting to various input metrics. One of these is income, which is measured by three separate indicators: total institutional income, research income, and research income from industry. Of the other world rankings, only the Russian Round University Rankings do this.
There is of course some relationship between funding and productivity but it is far from absolute and universal. The Universitas 21 system rankings, for example, show that countries like Malaysia and Saudi Arabia have substantial resources but so far have achieved only a modest scientific output while Ireland has done very well in maintaining output despite a limited and declining resource base.
The established universities of the world seem to be quite happy with these income indicators, which, whatever happens, are greatly to their advantage. If their overall score goes down, this can be plausibly attributed to a decline in funding, a claim that can then be used to demand money from national resources. At a time when austerity has threatened the well-being of many vulnerable groups, with more suffering to come in the next few months, it is arguable that universities are not those most deserving of state funding.
Fourth, another problem arises from THE counting doctoral students in two indicators. It is difficult to see how the number of doctoral students or degrees can in itself add to the quality of undergraduate or master's teaching, and this could act to the detriment of liberal arts colleges like Williams or Harvey Mudd, which have an impressive record of producing employable graduates.
These indicators may also have the perverse consequence of forcing people who would benefit from a master's or postgraduate diploma course into doctoral programs with high rates of non-completion.
Fifth, the two stand-alone indicators are very problematic. The industry income indicator purports to represent universities' contributions to innovation. An article by Alex Usher found that the indicator appeared to be based on very dubious data; see here for a reply by Phil Baty that is almost entirely tangential to the criticism. Even if the data were accurate, it is a big stretch to claim that this is a valid measure of a university's contribution to innovation.
The citations indicator, which is supposed to measure research impact, influence or quality, is a disaster. Or it should be: the defects of this metric seem to have passed unnoticed everywhere it matters.
The original sin of the citations indicator goes back to the early days of the THE rankings after that unpleasant divorce from QS. THE used data from the ISI database, as it was then known, and in return agreed to give prominence to an indicator that was almost the same as the InCites platform, then a big-selling product.
The indicator is assigned a weighting of 30%, which is much higher than that given to publications and higher than the weighting given to citations by QS, Shanghai Ranking, US News or RUR. In fact this understates the weighting. THE has a regional modification, or country bonus, that divides the impact score of a university by the square root of the impact score of the country where it is located. The effect is that the scores of universities in the top-scoring country remain unchanged while everybody else gets an increase, a big one for low-scoring countries and a smaller one for those scoring higher. Previously the bonus applied to the whole of the indicator; now it applies to 50% of it. Basically this means that universities are rewarded for being in a low-scoring country.
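To make the arithmetic concrete, here is a minimal sketch of how the country bonus appears to work, based on the description above; the scores are invented and the scaling is assumed (top-scoring country normalised to 1.0):

```python
import math

def country_bonus(university_score, country_score, bonus_share=0.5):
    """Sketch of the 'regional modification' as described above.

    Half of the citation score (bonus_share) is divided by the square root
    of the country's score, here assumed to be expressed as a fraction of
    the top country's score, so universities in the top country are
    unchanged while everyone else gets a boost.
    """
    adjusted = university_score / math.sqrt(country_score)
    return (1 - bonus_share) * university_score + bonus_share * adjusted

# Invented scores on a 0-1 scale, with the top country at 1.0
print(country_bonus(0.40, 1.00))  # 0.40, no change in the top country
print(country_bonus(0.40, 0.25))  # 0.60, a 50% boost for a low-scoring country
```

The same raw performance is worth substantially more simply because the surrounding country performs badly.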
The reason originally given for the bonus was that some countries lack the networking and funds to nurture citation-rich research. Apparently such a problem has no relevance to the international indicators. This was in fact probably an ad hoc way of getting round the massive gap between the world's elite and other universities with regard to citations, a gap much bigger than for most other metrics.
The effect of this was to give a big advantage to mediocre universities surrounded by low achieving peers. Combined with other defects it has produced big distortions in the indicator.
This indicator is over-normalised. Citation scores are based not on a simple count of citations but on a comparison with the world average of citations according to year of publication, type of publication, and academic field, over three hundred of them. A few years ago someone told THE that absolute counting of citations was a mortal sin, and that seems to have become holy scripture. There is clearly a need to take account of disciplinary variations, such as the relative scarcity of citations in literary studies and philosophy and their proliferation in medical research and physics, but the finer the analysis gets, the more chance there is that outliers will exert a disproportionate effect on the impact score.
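As a rough illustration of what such fine-grained normalisation means in practice, and of how a single outlier can dominate a thinly populated cell, consider this toy calculation; the fields, citation counts and baselines are all invented:

```python
# Toy field-normalised citation impact: each paper's citations are divided
# by an invented world-average ("baseline") for its field and year, and the
# institution's score is the mean of those ratios.
papers = [
    {"field": "philosophy",        "citations": 3,   "baseline": 2.0},
    {"field": "philosophy",        "citations": 1,   "baseline": 2.0},
    {"field": "clinical medicine", "citations": 40,  "baseline": 20.0},
    # A single outlier in a small, narrowly defined field:
    {"field": "parasitology",      "citations": 900, "baseline": 15.0},
]

ratios = [p["citations"] / p["baseline"] for p in papers]
print(sum(ratios) / len(ratios))  # 16.0, driven almost entirely by the outlier
```

The narrower the field-year cells, the smaller the denominators and the easier it is for one anomalous paper to swamp everything else.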
Perhaps the biggest problem with the THE rankings is the failure to use fractional counting of citations. There is a growing problem with papers with scores, hundreds, or occasionally thousands of "authors" in particle physics, medicine and genetics. Such papers often attract thousands of citations, partly because of their scientific importance and partly because many of their authors will find opportunities to cite themselves.
The result was that until 2014-15 a university with a modest contribution to a project like the Large Hadron Collider could get a massive score for citations, especially if its overall output of papers was not high and especially if it was located in a country where citations were generally low.
The 2014-15 THE world rankings included among the world's leaders for citations Tokyo Metropolitan University, Federico Santa Maria Technical University, Florida Institute of Technology and Bogazici University.
Then THE introduced some reforms. Papers with over a thousand authors were excluded from the citation count, the country bonus was halved, and the source of bibliometric data was switched from ISI to Scopus. This was disastrous for those universities that had over-invested in physics, especially in Turkey, South Korea and France.
The next year THE started counting the mega-papers again but introduced a modified form of fractional counting. Papers with a thousand-plus authors were counted according to each university's contribution to the paper, with a minimum of five per cent.
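A minimal sketch of the difference between full counting, straightforward fractional counting, and the modified scheme described above; the paper, author counts and citation figures are all hypothetical:

```python
def credited_citations(citations, institution_authors, total_authors, scheme):
    """Hypothetical illustration of three ways to credit a multi-author paper."""
    share = institution_authors / total_authors
    if scheme == "full":           # every listed institution gets all the citations
        return citations
    if scheme == "fractional":     # credit divided in proportion to authorship
        return citations * share
    if scheme == "modified":       # as described above: proportional crediting,
        if total_authors >= 1000:  # but only for papers with 1,000+ authors,
            return citations * max(share, 0.05)  # with a five per cent floor
        return citations           # smaller papers are still counted in full
    raise ValueError(scheme)

# A hypothetical mega-paper: 5,000 citations, 2,800 authors, 2 from this institution
print(credited_citations(5000, 2, 2800, "full"))        # 5000
print(credited_citations(5000, 2, 2800, "fractional"))  # ~3.6
print(credited_citations(5000, 2, 2800, "modified"))    # 250.0, the 5% floor
# A 700-author paper escapes the modified scheme altogether
print(credited_citations(5000, 2, 700, "modified"))     # 5000
```

Under the modified scheme a marginal contributor to a mega-paper still collects far more credit than fractional counting would give it, and papers just under the thousand-author threshold are not touched at all.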
The effect of these changes was to replace physics privilege with medicine privilege. Fractional counting did not apply to papers with hundreds of authors but fewer than a thousand, and so a new batch of improbable universities started getting near-perfect scores for citations and began to break into the top five hundred or thousand in the world. Last year these included Aswan University, the Indian Institute of Technology Ropar, the University of Peradeniya, Anglia Ruskin University, the University of Reykjavik, and the University of Occupational and Environmental Health, Japan.
They did so because of participation in the Global Burden of Disease Study combined with a modest overall output of papers and/or the good fortune to be located in a country with a low impact score.
There is something else about the indicator that should be noted. THE includes self-citations and on a couple of occasions has said that this does not make any significant difference. Perhaps not in the aggregate, but there have been occasions when self-citers have made a large difference to the scores of specific universities. In 2009 Alexandria University broke into the top 200 world universities by virtue of a self-citer and a few friends. In 2017 Veltech University was the third best university in India and the best in Asia for citations, all because of exactly one self-citing author. In 2018 the university had for some reason completely disappeared from the Asian rankings.
So here are some fairly obvious things that THE ought to do:
- change the structure of the rankings to give more prominence to publications and less to citations
- remove the income indicators or reduce their weighting
- replace the income from industry indicator with a count of patents, preferably those accepted rather than those merely filed
- in general, where possible, replace self-submitted data with third-party data
- if postgraduate students are to be counted then count master's as well as doctoral students
- get rid of the country bonus, which exaggerates the scores of mediocre or sub-mediocre institutions simply because they are located in poorly performing countries
- adopt a moderate form of normalisation with a dozen or a score of fields rather than the present 300+
- use full-scale fractional counting
- do not count self-citations, or, even better, do not count intra-institutional citations
- do not count secondary affiliations, although that is something that is more the responsibility of publishers
- introduce two or more measures of citations.
But what will THE actually do?
Duncan Ross, THE data director, has published a few articles setting out some talking points (here, here, here, here).
He suggests that in the citations indicator THE should take the 75th percentile rather than the mean as the benchmark when calculating field impact scores. If I understand it correctly, this would reduce the extreme salience of outliers in this metric.
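A small sketch of why a percentile benchmark is less sensitive to outliers than the mean, using invented citation counts for a single field-year cell:

```python
import statistics

# Invented citation counts for papers in one field-year cell,
# including a single extreme outlier.
cell = [0, 1, 1, 2, 2, 3, 3, 4, 5, 900]

mean_benchmark = statistics.mean(cell)              # 92.1, dragged up by the outlier
p75_benchmark = statistics.quantiles(cell, n=4)[2]  # ~4, barely affected

print(mean_benchmark, p75_benchmark)
```

Benchmarking against the 75th percentile would therefore blunt, though not eliminate, the kind of distortion described earlier.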
It seems that a number of new citations measures are being considered, with the proportion of most-cited publications apparently getting the most favourable consideration. Unfortunately it seems that THE will not go any further with fractional counting, supposedly because it would discourage collaboration.
Ross mentions changing the weighting of the indicators but does not seem enthusiastic about this. He also discusses the importance of measuring cross-disciplinary research.
THE is also considering supplementing the doctoral student measures with the proportion of doctoral students who eventually graduate. They are thinking about replacing institutional income with "a more precise measure," perhaps spending on teaching and teaching-related activities. That would probably not be a good idea. I can think of all sorts of ways in which institutions could massage the data so that in the end it would be as questionable as the current industry income indicator.
It seems likely that patents will replace income from industry as the proxy for innovation.
So it appears that there will be some progress in reforming the THE world rankings. Whether it will be enough remains to be seen.