Friday, September 24, 2010

Here We Go Again

Times Higher Education have released the ranking of the world's top universities in engineering and technology (subscription required). Caltech is number 1 and MIT is second. So far, nothing strange.

But one wonders about the Ecole Normale Superieure in Paris in 34th place and Birkbeck College London in 43rd.

Clicking on the Citations Indicator, we find that the Ecole has a score of 99.7 and Birkbeck 100.

So Birkbeck has the highest research impact for engineering and technology in the world and ENS Paris the second highest?

4 comments:

Mark Wilson said...

Do you understand the exact methodology used for citations? The g-index computations at highimpactuniversities.com seem quite different from these, and also from Leiden. What citation measure are THE using?

Jason said...

N.B.: The universities fill out forms provided by THE. Richard reported that Texas opted out because it was too much trouble. Leaving it up to the universities is economical for THE but invites creative interpretation by the administrators tasked with filling them out. That is a fundamental flaw that cannot be fixed by changing the instructions or the weighting factors: it rewards dishonesty, and even semi-innocent, self-serving, lax interpretation of whatever system is in place. A commenter on my blog who spent five years at Alexandria University, and knows whereof he speaks, casts doubt not only on the "research impact" figure but on the other statistics Alexandria University provided to THE.

Magnus Gunnarsson said...

(Richard, sorry about the long post and technical details; do not publish it if you do not find it interesting. Also, feel free to correct the language; English isn't my first language.)

We do not know exactly how the citation indicator is calculated, but we know that it is an average field-normalised citation score (AFC). This is not a new measure -- CWTS in Leiden has been using it for several years, as have several other bibliometric groups. It is constructed so that 1 means 'as many citations as the world average for this field, year and publication type', and 1.5 means '50% more citations than the world average for this field, year and publication type'. For small numbers of publications, or for publications in small fields (mostly in the humanities), this measure may behave unexpectedly, but for fairly large institutions (say, more than 1000 publications) it is not a big problem.
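To make the arithmetic concrete, here is a minimal sketch of one way such a score can be computed. The publication counts and field baselines below are invented for illustration, and averaging per-publication ratios is only one variant (CWTS's older 'crown indicator' instead divides total citations by total expected citations); we do not know exactly which variant THE uses.

```python
# Hypothetical sketch of an average field-normalised citation score.
# Each entry: (citations received, world-average citations for the
# publication's field, year and publication type). All numbers invented.
publications = [
    (12, 8.0),   # 50% above its field baseline -> ratio 1.5
    (4, 8.0),    # half its field baseline      -> ratio 0.5
    (10, 5.0),   # twice its field baseline     -> ratio 2.0
]

def afc(pubs):
    """Mean of per-publication citation ratios; 1.0 = world average."""
    return sum(cites / baseline for cites, baseline in pubs) / len(pubs)

print(round(afc(publications), 3))  # mean of 1.5, 0.5, 2.0 -> 1.333
```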

We do not have access to the data THE used, but the CWTS ranking (Leiden) provides us with data that ought to be very similar. Looking at the CWTS top 500 list, we see that more than 50% of the universities have fewer than 7000 publications over a four-year period (the time span used by both CWTS and THE). The AFC score for institutions of this size is fairly stable in the first decimal, but varies considerably in the second decimal. (Just compare the CWTS rankings for 2008 and 2010.) The AFC distribution for the top 500 universities is shown in figure 1; it is pretty much a normal distribution.

So far there is not much controversy regarding the AFC score (as long as citations are accepted as an indicator of quality), at least not compared to raw citation counts or the h-index. You could always discuss the size threshold for including universities (1000 publications? 3000 publications?) but the vast majority of the universities on the THE list are large enough for this citation measure to be meaningful.

However, THE applies a z-normalisation and a cumulative probability score (CP) to the AFC. That is where the controversy starts. This manoeuvre spreads the ranked institutions out so that they are evenly distributed from 1 to 100 (figure 2). Using the CWTS top 500 list again, we find that a move from 1.1 to 1.2 in AFC score pushes an institution past 70 other institutions and causes a jump from 38.6 to 52.6 in CP score. Increasing AFC by 9% here causes an increase in CP of 36%.

(These effects only occur around the center of the AFC distribution. Making these changes at the far ends of the distribution has a much smaller effect: moving from 1.9 to 2.0 in AFC makes the CP go from 98.2 to 99, less than a 1% change.)
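To see roughly how this leverage arises, here is a small sketch assuming (since THE has not published the details) that the CP score is simply the standard normal CDF of the z-scored AFC, scaled to 100. The mean and standard deviation below are hypothetical values chosen only to reproduce the shape of the effect, not THE's actual parameters.

```python
import math

MEAN, SD = 1.08, 0.35   # hypothetical AFC distribution parameters

def cp_score(afc):
    """Map an AFC value to a 0-100 cumulative probability score via
    the standard normal CDF of its z-score."""
    z = (afc - MEAN) / SD
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Around the center a small AFC change moves the CP score a lot...
print(cp_score(1.1), cp_score(1.2))   # roughly 52 -> 63
# ...while the same change in the tail barely moves it.
print(cp_score(1.9), cp_score(2.0))   # roughly 99 -> 99.6
```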

The z-score and the cumulative probability score thus give the citation score a leverage that we are not used to. In practice, they also make changes in the third decimal of the average field-normalised citation score matter. Not good.

The reference panel that THE consulted when deciding on the weights of the indicators placed a lot of importance on citations, and THE responded by giving this indicator a large weight in the total score: 32.5%. However, by using the cumulative probability score they also added extra 'weight' to the citations. I suspect this came as a surprise to the reference panel...
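As a back-of-the-envelope check on what this leverage means for the total score, using the mid-distribution example above together with the 32.5% weight:

```python
# CP jump from the AFC move 1.1 -> 1.2 in the CWTS data (see above).
cp_jump = 52.6 - 38.6
# Citations carry 32.5% of the overall THE score.
overall_gain = cp_jump * 0.325
print(overall_gain)   # 4.55 points on the 100-point overall score
```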

Anonymous said...

I work at the Ecole Normale Supérieure, so I can offer a little background. It is an elite "Grande Ecole" and therefore has only a small number of highly selected students. At the same time it has a relatively large research activity, and many research staff may be only loosely affiliated with it. This explains the high staff-to-student ratio, and probably also the sudden jump in that ratio, which is very unlikely to represent real recruitment or a reduction in student numbers, but rather "different" accounting of the affiliated researchers.

It is nevertheless genuinely very strong in a number of subjects, especially maths and physics.

I think this combination of factors allows the ENS to attain a high score.