
Monday, August 27, 2012

Self Citation

In 2010 Mohamed El Naschie, former editor of the journal Chaos, Solitons and Fractals, embarrassed a lot of people by launching the University of Alexandria into the world's top five universities for research impact in the new Times Higher Education (THE) World University Rankings. He did this partly by diligent self citation and partly by a lot of mutual citation with a few friends and another journal. He was also helped by a ranking indicator that gave the university disproportionate credit for citations in a little cited field, for citations in a short period of time and for being in a country where there are few citations.

Clearly self citation was only part of the story of Alexandria's brief and undeserved success, but it was not an insignificant one.

It now seems that Thomson Reuters (TR), who collect and process the data for THE, are beginning to get a bit worried about "anomalous citation patterns". According to an article by Paul Jump in THE:

When Thomson Reuters announced at the end of June that a record 26 journals had been consigned to its naughty corner this year for "anomalous citation patterns", defenders of research ethics were quick to raise an eyebrow.

"Anomalous citation patterns" is a euphemism for excessive citation of other articles published in the same journal. It is generally assumed to be a ruse to boost a journal's impact factor, which is a measure of the average number of citations garnered by articles in the journal over the previous two years.

Impact factors are often used, controversially, as a proxy for journal quality and, even more contentiously, for the quality of individual papers published in the journal and even of the people who write them.

When Thomson Reuters discovers that anomalous citation has had a significant effect on a journal's impact factor, it bans the journal for two years from its annual Journal Citation Reports (JCR), which publishes up-to-date impact factors.

"Impact factor is hugely important for academics in choosing where to publish because [it is] often used to measure [their] research productivity," according to Liz Wager, former chair of the Committee on Publication Ethics.

"So a journal with a falsely inflated impact factor will get more submissions, which could lead to the true impact factor rising, so it's a positive spiral."

One trick employed by editors is to require submitting authors to include superfluous references to other papers in the same journal.

A large-scale survey by researchers at the University of Alabama in Huntsville's College of Business Administration published in the 3 February edition of Science found that such demands had been made of one in five authors in various social science and business fields.

That TR are beginning to crack down on self citation is good news. But will they follow their rivals QS and stop counting self citations in the citation indicator in their rankings? When I spoke to Simon Pratt of TR at the World Class Universities conference in Shanghai at the end of last year he seemed adamant that they would go on counting self citations.

Even if TR and THE start excluding self citations, it would probably not be enough. It may soon become necessary to exclude intra-journal citations as well.

Sunday, August 13, 2017

The Need for a Self Citation Index

In view of the remarkable performance of Veltech University in the THE Asian Rankings, rankers, administrators and publishers need to think seriously about the impact of self-citation, and perhaps also intra-institutional citation. Here is the abstract of an article by Justin W Flatt, Alessandro Blasimme, and Effy Vayena.

Improving the Measurement of Scientific Success by Reporting a Self-Citation Index

Abstract:
Who among the many researchers is most likely to usher in a new era of scientific breakthroughs? This question is of critical importance to universities, funding agencies, as well as scientists who must compete under great pressure for limited amounts of research money. Citations are the current primary means of evaluating one’s scientific productivity and impact, and while often helpful, there is growing concern over the use of excessive self-citations to help build sustainable careers in science. Incorporating superfluous self-citations in one’s writings requires little effort, receives virtually no penalty, and can boost, albeit artificially, scholarly impact and visibility, which are both necessary for moving up the academic ladder. Such behavior is likely to increase, given the recent explosive rise in popularity of web-based citation analysis tools (Web of Science, Google Scholar, Scopus, and Altmetric) that rank research performance. Here, we argue for new metrics centered on transparency to help curb this form of self-promotion that, if left unchecked, can have a negative impact on the scientific workforce, the way that we publish new knowledge, and ultimately the course of scientific advance.
Keywords: publication ethics; citation ethics; self-citation; h-index; self-citation index; bibliometrics; scientific assessment; scientific success


Sunday, October 17, 2010

Debate, anyone?

Times Higher Education and Thomson Reuters have said that they wish to engage and that they will be happy to debate their new rankings methodology. So far we have not seen much sign of a debate although I will admit that perhaps more things were said at the recent seminars in London and Spain than got into print. In particular, they have been rather reticent about defending the citations indicator which gives the whole ranking a very distinctive cast and which is likely to drag down what could have been a promising development in ranking methodology.

First, let me comment on the few attempts to defend this indicator, which accounts for nearly a third of the total weighting and for more in some of the subject rankings. It has been pointed out that David Willetts, British Minister for Universities and Science, has congratulated THE on its new methodology.


“I congratulate THE for reviewing the methodology to produce this new picture of the best in higher education worldwide. It should prompt all of us who care about our universities to see how we can improve the range and quality of the data on offer. Prospective students — in all countries — should have good information to hand when deciding which course to study, and where. With the world to choose from, it is in the interests of universities themselves to publish figures on graduate destinations as well as details of degree programmes.”
Willetts has praised THE for reviewing its methodology. So have many of us but that is not quite the same as endorsing what has emerged from that review.

Steve Smith, President of Universities UK and Vice-Chancellor of Exeter University is explicit in supporting the new rankings, especially the citations component.


But, as we shall see in a moment, there are serious issues with the robustness of citations as a measure of research impact and, if used inappropriately, they can become indistinguishable from a subjective measure of reputation.

The President of the University of Toronto makes a similar point and praises the new rankings’ reduced emphasis on subjective reputational surveys and refers to the citations (knowledge transfer?) indicator.


It might be argued that this indicator is noteworthy for revealing that some universities possess hitherto unsuspected centres of research excellence. An article by Phil Baty in THE of the 16th of September refers to the most conspicuous case, a remarkably high score for citations by Alexandria University, which according to the THE rankings has had a greater research impact than any university in the world except Caltech, MIT and Princeton. Baty suggests that there is some substance to Alexandria University’s extraordinary score. He refers to Ahmed Zewail, a Nobel prize winner who left Alexandria with a master’s degree some four decades ago. Then he mentions some frequently cited papers by a single author in one journal.

The author in question is Mohamed El Naschie, who writes on mathematical physics and the journals – there are two that should be given the credit for Alexandria’s performance, not one – are Chaos, Solitons and Fractals and the International Journal of Nonlinear Sciences and Numerical Simulation. The first is published by Elsevier and was until recently edited by El Naschie. It has published a large number of papers by El Naschie and these have been cited many times by himself and by some other writers in CSF and IJNSNS.

The second journal is edited by Ji-Huan He of Donghua University in Shanghai, China with El Naschie as co-editor and is published by the Israeli publishing company, Freund Publishing House Ltd of Tel Aviv.

An amusing digression. In the instructions for authors in the journal the title is given as International Journal of Nonlinear Sciences and Numerical Stimulation. This could perhaps be described as a Freundian slip.

Although El Naschie has written a large number of papers and these have been cited many times, his publication and citation record is far from unique. He is not, for example, found in the ISI list of highly cited researchers. His publications and citations were perhaps necessary to push Alexandria into THE’s top 200 universities but they were not enough by themselves. This required a number of flaws in TR’s methodology.

First, TR assigned a citation impact score that compares the actual citations of a paper with a benchmark based on the expected number of citations for a specific subject in a specific year. Mathematics is a field where citations are relatively infrequent and usually occur a few years after publication. Since El Naschie published in a field in which citations are relatively scarce, and published quite recently, this boosted the impact score of his papers. The reason for using this approach is clear and sensible: to overcome the distorting effects of varying citation practices in different disciplines when comparing individual researchers or departments. But there are problems if this method is used to compare whole universities. A great deal depends on when the cited and citing articles are published and in which subject they were classified by TR.
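To make the mechanism concrete, here is a minimal sketch of field- and year-normalised impact, assuming the simple actual-over-expected ratio described above; the benchmark figures are invented for illustration and are not TR's actual values.

```python
# Hypothetical world-average citations per paper, by (field, publication year).
# A low-citation field in a recent year has a very small benchmark.
expected = {
    ("mathematics, interdisciplinary", 2008): 1.2,
    ("cell biology", 2008): 14.5,
}

def normalised_impact(citations, field, year):
    """Ratio of actual citations to the world average for that field and year."""
    return citations / expected[(field, year)]

# Ten citations count for far more against a sparse benchmark than a busy one.
print(normalised_impact(10, "mathematics, interdisciplinary", 2008))  # ~8.3
print(normalised_impact(10, "cell biology", 2008))                    # ~0.7
```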

A question for TR. How are articles classified? Is it possible to influence the category in which they are placed by the use of key words or the wording of the title?

Next, note that TR were measuring average citation impact. A consequence of this is that the publication of large numbers of papers that are cited less frequently than the high fliers could drag down the score. This explains an apparent oddity of the citation scores in the 2010 THE rankings. El Naschie listed nine universities as his affiliation in varying combinations between 2004 and 2008, yet it was only Alexandria that managed to leave the Ivy League and Oxbridge standing in the research impact dust. Recently, El Naschie’s list of affiliations has consisted of Alexandria, Cairo, Frankfurt University and Shanghai Jiao Tong University.

What happened was quite simply that all the others were producing so many papers that El Naschie’s made little or no difference. For once, it would be quite correct if El Naschie announced that he could not have done it without the support of his colleagues. Alexandria University owes its success not only to El Naschie and his citers but also to all those researchers who refrained from submitting articles to ISI–indexed journals or conference proceedings.

TR have some explaining to do here. If an author lists more than one affiliation, are they all counted? Or are fractions awarded for each paper? Is there any limit on the number of affiliations that an author may have? I think that it is two but would welcome clarification.

As for the claim that Alexandria is strong in research, a quick look at the Scimago rankings is enough to dispose of that. It is ranked 1,047th for total publications over a decade in the 2010 rankings, which admittedly include many non-university organizations. Also, one must ask how much of El Naschie’s writing was actually done in Alexandria, seeing that he had eight other affiliations between 2004 and 2008.

It has to be said that even if El Naschie is, as has been claimed in comments on Phil’s THE article and elsewhere, one of the most original thinkers of our time, it is strange that THE and TR should use a method that totally undermines their claim that the new methodology is based on evidence rather than reputation. By giving any sort of credence to the Alexandria score, THE are asking us to believe that Alexandria is strong in research because precisely one writer is highly reputed by himself and a few others. Incidentally, will TR tell us what score Alexandria got in the research reputation survey?

I am not qualified to comment on the scientific merits of El Naschie’s work. At the moment it appears, judging from the comments in various physics blogs, that among physicists and mathematicians there are more detractors than supporters. There are also few documented signs of conventional academic merit in recent years, such as permanent full-time appointments or research grants. None of his papers between 2004 and 2008 in ISI-indexed journals, for example, apparently received external funding. His affiliations, if documented, turn out to be honorary, advisory or visiting. To be fair, readers might wish to visit El Naschie’s site. I will also publish any comments of a non-libellous nature that support or dispute the scientific merits of his writings.

Incidentally, it is unlikely that Alexandria’s score of 19.3 for internationalisation was faked. TR use a logarithm. If there were zero international staff and students a university would get a score of 1 and a score of 19.3 actually represents a small percentage. On the other hand, I do wonder whether Alexandria counted those students in the branch campuses in Lebanon, Sudan and Chad.

Finally, TR did not take the very simple and obvious step of not counting individual self-citations. Had they done so, they would have saved everybody, including themselves, a lot of trouble. It would have been even better if they had excluded intra-institutional and intra-journal citation. See here for the role of citations among the editorial board of IJNSNS in creating an extraordinarily high Journal Impact Factor.

THE and TR have done everyone a great service by highlighting the corrosive effect of self citation on the citations tracking industry. It has become apparent that there are enormous variations in the prevalence of self citation in its various forms and that these have a strong influence on the citation impact score.

Professor Dirk Van Damme is reported to have said at the London seminar that the world’s elite universities were facing a challenge from universities in the bottom half of the top 200. If this were the case then THE could perhaps claim that their innovative methodology had uncovered reserves of talent ignored by previous rankings. But what exactly was the nature of the challenge? It seems that it was the efficiency with which the challengers turned research income into citations. And how did they do that?

I have taken the simple step of dividing the score for citations by the score for the research indicator (which includes research income) and then sorting the resulting values. The top ten are Alexandria, Hong Kong Baptist University, Barcelona, Bilkent, William and Mary, ENS de Lyon, Royal Holloway, Pompeu Fabra, University College Dublin, the University of Adelaide.
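For anyone who wants to reproduce the exercise, the calculation is no more elaborate than the sketch below; the scores here are made-up placeholders rather than the actual 2010 indicator values.

```python
# Divide each university's citations score by its research score and rank the results.
scores = {
    "University A": {"citations": 99.8, "research": 9.5},
    "University B": {"citations": 99.9, "research": 98.2},
    "University C": {"citations": 91.0, "research": 18.0},
}

ratios = {name: s["citations"] / s["research"] for name, s in scores.items()}

# Institutions with modest research scores but inflated citation scores float to the top.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.1f}")
```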

Seriously, these are a threat to the world’s elite?

The high scores for citations relative to research were the result of a large number of citations or a small number of total publications or both. It is of interest to note that in some cases the number of citations was the result of assiduous self-citation.

This section of the post contained comments about comparative rates of self citation among various universities. The method used was not correct and I am recalculating.
As noted already, using the THE iPad app to change the importance attached to various indicators can produce very different results. This is a list of universities that rise more than a hundred places when the citations indicator is set to ‘not important’. They have suffered perhaps because of a lack of super-cited papers, perhaps also because they just produced too many papers.

Loughborough
Kyushu
Sung Kyun Kwan
Texas A and M
Surrey
Shanghai Jiao Tong University
Delft University of Technology
National Chiao Tung University (Taiwan)
Royal Institute of Technology Sweden
Tokushima
Hokkaido

Here is a list of universities that fall more than 100 places when the citations indicator is set to ‘not important’. They have benefitted from a few highly cited papers or low publication counts or a combination of the two.

Boston College
University of California Santa Cruz
Royal Holloway, University of London
Pompeu Fabra
Bilkent
Kent State University
Hong Kong Baptist University
Alexandria
Barcelona
Victoria University Wellington
Tokyo Metropolitan University
University of Warsaw

There are many others that rise or fall seventy, eighty, ninety places when citations are taken out of the equation. This is not a case of a few anomalies. The whole indicator is one big anomaly.

Earlier, Jonathan Adams, in a column that has attracted one comment, said:

"Disciplinary diversity is an important factor, as is international diversity. How would you show the emerging excellence of a really good university in a less well known country such as Indonesia? This is where we would be most controversial, and most at risk, in using the logic of field-normalisation to add a small weighting in favour of relatively good institutions in countries with small research communities. Some may feel that we got that one only partially right."

The rankings do not include universities in Indonesia, really good or otherwise. The problem is with good, mediocre and not very good universities in the US, UK, Spain, Turkey, Egypt, New Zealand, Poland etc. It is a huge weighting, not a small one; the universities concerned range from relatively good to relatively bad; in one case the research community seems to consist of one person; and many are convinced that TR got that one totally wrong.


Indonesia may be less well known to TR but it is very well known to itself and neighbouring countries.


I will publish any comments by anyone who wishes to defend the citation indicator of the new rankings. Here are some questions they might wish to consider.


Was it a good idea to give such a heavy weighting to research impact: 32.5% in the overall rankings and 37.5% in at least two subject rankings? Is it possible that commercial considerations, citations data being a lucrative business for TR, had something to do with it?

Are citations such a robust indicator? Is there not enough evidence now to suggest that manipulation of citations, including self-citation, intra-institutional citation and intra-journal citation, is so pervasive that the robustness of this measure is very slight?

Since there are several ways to measure research impact, would it not have been a good idea to have used several methods? After all, Leiden University has several different ways of assessing impact. Why use only one?


Why set the threshold for inclusion so low, at 50 papers per year?

Thursday, April 06, 2017

Doing Something About Citations and Affiliations

University rankings have proliferated over the last decade. The International Rankings Expert Group's (IREG) inventory of national rankings counted 60 and there are now 40 international rankings including global, regional, subject, business school and system rankings.

In addition, there have been a variety of spin-offs and extracts from the global rankings, especially those published by Times Higher Education, including Asian, Latin American, African, MENA and Young University rankings and a list of the most international universities. The value of these varies but that of the Asian rankings must now be considered especially suspect.

THE have just released the latest edition of their Asian rankings using the world rankings indicators with a recalibration of the weightings. They have reduced the weighting given to the teaching and research reputation surveys and increased that for research income, research productivity and income from industry. Unsurprisingly, Japanese universities, with good reputations but affected by budget cuts, have performed less well than in the world rankings.

These rankings have, as usual, produced some results that are rather counter intuitive and illustrate the need for THE, other rankers and the academic publishing industry to introduce some reforms in the presentation and counting of publications and citations.

As usual, the oddities in the THE Asian rankings have a lot to do with the research impact indicator supposedly measured by citations. This, it needs to be explained, does not simply count the number of citations but compares them with the world average for over three hundred fields, five years of publications and six years of citations. Added to all that is a "regional modification" applied to half of the indicator by which the score for each university is divided by the square root of the score for the country in which the university is located. This effectively gives a boost to everybody except those places in the top scoring country, one that can be quite significant for countries with a low citation impact.
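A simplified sketch of how such a modification might work, based on the description above; the country averages are invented and, as THE describes, only half of the score is modified.

```python
import math

# Invented country-level citation-impact averages (top-scoring country near 1.0).
country_average = {"Japan": 0.81, "Pakistan": 0.25}

def regionally_modified(raw_score, country):
    """Apply the modification to half of the indicator, as described above."""
    boosted = raw_score / math.sqrt(country_average[country])
    return 0.5 * raw_score + 0.5 * boosted

# The same raw impact is worth far more in a country with a low citation average.
print(regionally_modified(1.0, "Japan"))     # ~1.06
print(regionally_modified(1.0, "Pakistan"))  # ~1.50
```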

What this means is that a university with a minimal number of papers can rack up a large and disproportionate score if it can collect large numbers of citations for a relatively small number of papers. This appears to be what has contributed to the extraordinary success of the institution variously known as Vel Tech University, Veltech University, Veltech Dr. RR & Dr. SR University and Vel Tech Rangarajan Dr Sagunthala R & D Institute of Science and Technology.

The university has scored a few local achievements, most recently ranking 58th for engineering institutions in the latest Indian NIRF rankings, but internationally, as Ben Sowter indicated on Quora, it is way down the ladder or even unable to get onto the bottom rung.

So how did it get to be the third best university and best private university in India according to the THE Asian rankings? How could it have the highest research impact of any university in Chennai, Tamil Nadu, India and Asia, and perhaps the highest or second highest in the world?

Ben Sowter of QS Intelligence Unit has provided the answer. It is basically due to industrial scale self-citation.

"Their score of 100 for citations places them as the topmost university in Asia for citations, more than 6 points clear of their nearest rival. This is an indicator weighted at 30%. Conversely, and very differently from other institutions in the top 10 for citations, with a score of just 8.4 for research, they come 285/298 listed institutions. So an obvious question emerges, how can one of the weakest universities in the list for research, be the best institution in the list for citations?
The simple answer? It can’t. This is an invalid result, which should have been picked up when the compilers undertook their quality assurance checks.
It’s technically not a mistake though, it has occurred as a result of the Times Higher Education methodology not excluding self-citations, and the institution appears to have, for either this or other purposes, undertaken a clear campaign to radically promote self-citations from 2015 onwards.
In other words and in my opinion, the university has deliberately and artificially manipulated their citation records, to cheat this or some other evaluation system that draws on them.
The Times Higher Education methodology page explains: The data include the 23,000 academic journals indexed by Elsevier’s Scopus database and all indexed publications between 2011 and 2015. Citations to these publications made in the six years from 2011 to 2016 are also collected.
So let’s take a look at the Scopus records for Vel Tech for those periods. There are 973 records in Scopus on the primary Vel Tech record for the period 2011–2015 (which may explain why Vel Tech have not featured in their world ranking which has a threshold of 1,000). Productivity has risen sharply through that period from 68 records in 2011 to 433 records in 2015 - for which due credit should be afforded.
The issue begins to present itself when we look at the citation picture. "
He continues:
 "That’s right. Of the 13,864 citations recorded for the main Vel Tech affiliation in the measured period 12,548 (90.5%) are self-citations!!
A self-citation is not, as some readers might imagine, one researcher at an institution citing another at their own institution, but that researcher citing their own previous research, and the only way a group of researchers will behave that way collectively on this kind of scale so suddenly, is to have pursued a deliberate strategy to do so for some unclear and potentially nefarious purpose.
It’s not a big step further to identify some of the authors who are most clearly at the heart of this strategy by looking at the frequency of their occurrence amongst the most cited papers for Vel Tech. Whilst this involves a number of researchers, at the heart of it seems to be Dr. Sundarapandian Vaidyanathan, Dean of the R&D Center.
Let’s take as an example, a single paper he published in 2015 entitled “A 3-D novel conservative chaotic system and its generalized projective synchronization via adaptive control”. Scopus lists 144 references, 19 of which appear to be his own prior publications. The paper has been cited 114 times, 112 times by himself in other work."

In addition, the non-self citations are from a very small number of people, including his co-authors. Basically his audience is himself and a small circle of friends.
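As a rough illustration of how a self-citation share like the 90.5% figure quoted above might be computed from a database export, here is a toy sketch; the record layout and numbers are assumptions, not actual Scopus data.

```python
# Each entry is (citing authors, cited authors) for one citation received by the institution.
citations = [
    ({"Author X"}, {"Author X"}),
    ({"Author X", "Coauthor A"}, {"Author X"}),
    ({"Unrelated Author"}, {"Author X"}),
]

# Count a citation as a self-citation when the citing and cited author sets overlap.
self_cites = sum(1 for citing, cited in citations if citing & cited)

print(f"Self-citation share: {self_cites / len(citations):.1%}")  # 66.7% on this toy data
```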

Another point is that Dr Vaidyanathan has published in a limited number of journals and conference proceedings, the most important of which are the International Journal of Pharmtech Research and the International Journal of Chemtech Research, both of which have Vaidyanathan as an associate editor. My understanding of Scopus procedures for inclusion and retention in the database is that the number of citations is very important. I was once associated with a journal that was highly praised by the Scopus reviewers for the quality of its contents but rejected because it had few citations. I wonder if Scopus's criteria include watching out for self-citations.

The Editor in Chief of the International Journal of Chemtech Research is listed as Bhavik J Bhatt, who received his PhD from the University of Iowa in 2013 and does not appear to have ever held a full-time university post.

The Editor in Chief of the International Journal of Pharmtech Research is Moklesur R Sarker, associate professor at Lincoln University College Malaysia, which in 2015 was reported to be in trouble for admitting bogus students.

I will be scrupulously fair and quote Dr Vaidyanathan.

"I joined Veltech University in 2009 as a Professor and shortly, I joined the Research and Development Centre at Veltech University. My recent research areas are chaos and control theory. I like to stress that research is a continuous process, and research done in one topic becomes a useful input to next topic and the next work cannot be carried on without referring to previous work. My recent research is an in-depth study and discovery of new chaotic and hyperchaotic systems, and my core research is done on chaos, control and applications of these areas. As per my Scopus record, I have published a total of 348 research documents. As per Scopus records, my work in chaos is ranked as No. 2, and ranked next to eminent Professor G. Chen. Also, as per Scopus records, my work in hyperchaos is ranked as No. 1, and I have contributed to around 50 new hyperchaotic systems. In Scopus records, I am also included in the list of peers who have contributed in control areas such as ‘Adaptive Control’, ‘Backstepping Control’, ‘Sliding Mode Control’ and ‘Memristors’. Thus, the Scopus record of my prolific research work gives ample evidence of my subject expertise in chaos and control. In this scenario, it is not correct for others to state that self-citation has been done for past few years with an intention of misleading others. I like to stress very categorically that the self-citations are not an intention of me or my University.         
I started research in chaos theory and control during the years 2010-2013. My visit to Tunisia as a General Chair and Plenary Speaker in CEIT-2013 Control Conference was a turning point in my research career. I met many researchers in control systems engineering and I actively started my research collaborations with foreign faculty around the world. From 2013-2016, I have developed many new results in chaos theory such as new chaotic systems, new hyperchaotic systems, their applications in various fields, and I have also published several papers in control techniques such as adaptive control, backstepping control, sliding mode control etc. Recently, I am also actively involved in new areas such as fractional-order chaotic systems, memristors, memristive devices, etc."
...
"Regarding citations, I cite the recent developments like the discovery of new chaotic and hyperchaotic systems, recent applications of these systems in various fields like physics, chemistry, biology, population ecology, neurology, neural networks, mechanics, robotics, chaos masking, encryption, and also various control techniques such as active control, adaptive control, backstepping control, fuzzy logic control, sliding mode control, passive control, etc,, and these recent developments include my works also."


His claim that self citation was not his intention is odd. Was he citing in his sleep, or was he possessed by an evil spirit when he wrote his papers or signed off on them? The claim about citing recent developments that include his own work misses the point. Certainly somebody like Chomsky would cite himself when reviewing developments in formal linguistics, but he would also be cited by other people. Aside from himself and his co-authors, Dr Vaidyanathan is cited by almost nobody.

The problems with the citations indicator in the THE Asian rankings do not end there. Here are a few cases of universities with very low scores for research and unbelievably high scores for research impact.

King Abdulaziz University is ranked second in Asia for research impact. This is an old story and it is achieved by the massive recruitment of adjunct faculty culled from the lists of highly cited researchers.

Toyota Technological Institute is supposedly best in Japan for research impact, which I suspect would be news to most Japanese academics, but 19th for research.

Atilim University in Ankara is supposedly the best in Turkey for research impact but also has a very low score for research.

The high citations score for Quaid i Azam University in Pakistan results from participation in the multi-author physics papers derived from the CERN projects. In addition, there is one hyper productive researcher in applied mathematics.

Tokyo Metropolitan University gets a high score for citation because of a few much cited papers in physics and molecular genetics.

Bilkent university is a contributor to frequently cited multi-author papers in genetics.

According to THE, Universiti Tunku Abdul Rahman (UTAR) is the second best university in Malaysia and best for research impact, something that will come as a surprise to anyone with the slightest knowledge of Malaysian higher education. This is because of participation in the global burden of disease study, whose papers propelled Anglia Ruskin University to the apex of British research. Other universities with disproportionate scores for research impact include Soochow University (China), North East Normal University (China), Jordan University of Science and Technology, Panjab University (India), Comsats Institute of Information Technology (Pakistan) and Yokohama City University (Japan).

There are some things that the ranking and academic publishing industries need to do about the collection, presentation and distribution of publications and citations data.


1.  All rankers should exclude self-citations from citation counts. This is very easy to do, just clicking a box, and has been done by QS since 2011. It would be even better if intra-university and intra-journal citations were excluded as well (a rough sketch of this kind of filtering follows this list).

2.  There will almost certainly be a growing problem with the recruitment of adjunct staff who will be asked to do no more than list an institution as a secondary affiliation when publishing papers. It would be sensible if academic publishers simply insisted that there be only one affiliation per author. If they do not, it should be possible for rankers to count only the first-named affiliation.

3.  The more fields there are, the greater the chance that rankings can be skewed by strategically or accidentally placed citations. The number of fields used for normalisation should be kept reasonably small.

4. A visit to the Leiden Ranking website and a few minutes tinkering with their settings and parameters will show that citations can be used to measure several different things. Rankers should use more than one indicator to measure citations.

5. It defies common sense for any ranking to give a greater weight to citations than to publications. Rankers need to review the weighting given to their citation indicators. In particular, THE needs to think about their regional modification, which has the effect, noted above, of increasing the citations score for nearly everybody and so pushing the actual weighting of the indicator above 30 per cent.

6. Academic publishers and databases like Scopus and Web of Science need to audit journals on a regular basis.
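As promised in point 1, here is a rough sketch of the kind of filtering involved, with successively stricter exclusions; the record fields are assumptions about what a citation database export might contain.

```python
def keep(citation, level):
    """Return True if a citation survives exclusion at the given strictness level."""
    citing, cited = citation["citing"], citation["cited"]
    if level >= 1 and set(citing["authors"]) & set(cited["authors"]):
        return False  # author self-citation
    if level >= 2 and citing["university"] == cited["university"]:
        return False  # intra-university citation
    if level >= 3 and citing["journal"] == cited["journal"]:
        return False  # intra-journal citation
    return True

def citation_count(citations, level=1):
    """Citation count after excluding the chosen categories."""
    return sum(1 for c in citations if keep(c, level))

# Example: a citation where citing and cited author lists overlap is dropped at level 1.
example = {
    "citing": {"authors": ["A"], "university": "U1", "journal": "J1"},
    "cited": {"authors": ["A"], "university": "U1", "journal": "J2"},
}
print(citation_count([example], level=1))  # 0
```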



Thursday, January 27, 2011

All Ears

Times Higher Education and Thomson Reuters are considering changes to their ranking methodology. It seems that the research impact indicator (citations) will figure prominently in their considerations.  Phil Baty writes:

In a consultation document circulated to the platform group, Thomson Reuters suggests a range of changes for 2011-12.


A key element of the 2010-11 rankings was a "research influence" indicator, which looked at the number of citations for each paper published by an institution. It drew on some 25 million citations from 5 million articles published over five years, and the data were normalised to reflect variations in citation volume between disciplines.


Thomson Reuters and THE are now consulting on ways to moderate the effect of rare, exceptionally highly cited papers, which could boost the performance of a university with a low publication volume.


One option would be to increase the minimum publication threshold for inclusion in the rankings, which in 2010 was 50 papers a year.


Feedback is also sought on modifications to citation data reflecting different regions' citation behaviour.


Thomson Reuters said that the modifications had allowed "smaller institutions with good but not outstanding impact in low-cited countries" to benefit.
It would be very wise to do something drastic about the citations indicator. According to last year's rankings, Alexandria University is the fourth best university in the world for research impact, Hong Kong Baptist University is second in Asia, Ecole Normale Superieure Paris best in Europe with Royal Holloway University of London fourth, University of California Santa Cruz fourth in the USA and the University of Adelaide best in Australia.

If anyone would like to justify these results they are welcome to post a comment.

I would like to make these suggestions for modifying the citations indicator.

Do not count self-citations, citations to the same journal in which a paper is published, or citations to the same university. This would reduce, although not completely eliminate, manipulation of the citation system. If this is not done there will be massive self citation and citation of friends and colleagues. It might even be possible to implement a measure of net citation by deducting citations from an institution from the citations to it, thus reducing the effect of tacit citation agreements.

Normalisation by subject field is probably going to stay. It is reasonable that some consideration should be given to scholars who work in fields where citations are delayed and infrequent. However, it should be recognised that the purpose of this procedure is to identify pockets of excellence, and research institutions are not built around a few pockets or even a single one. There are many ways of measuring research impact and this is just one of them. Others that might be used include total citations, citations per faculty, citations per research income and the h-index.

Normalisation by year is especially problematical and should be dropped. It means that a handful of citations to an article classified as being in a low-citation discipline in the same year could dramatically multiply the score for this indicator. It also introduces an element of potential instability. Even if the methodology remains completely unchanged this year, Alexandria and Bilkent and others are going to drop scores of places as papers go on receiving citations but get less value from them as the benchmark number rises.
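A toy illustration of this instability, assuming the score is simply citations divided by the world average for papers of the same field and publication year; the benchmark values are invented.

```python
citations_so_far = 12  # citations received by one recent paper in a low-citation field

# The world-average benchmark for that field and publication year keeps rising
# as papers everywhere continue to accumulate citations.
benchmark = {"after one year": 0.4, "after three years": 2.5}

print(citations_so_far / benchmark["after one year"])     # 30.0 — spectacular at first
print(citations_so_far / benchmark["after three years"])  # 4.8 — same citations, far less value
```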

Raising the threshold of number of publications might not be a good idea. It is certainly true that Leiden University have a threshold of 400 publications a year but Leiden is measuring only research impact while THE and TR are measuring a variety of indicators. There are already too many blank spaces in these rankings and their credibility will be further undermined if universities are not assessed on an indicator with such a large weighting.

Saturday, June 22, 2013

Citation Cartels

An article by Paul Jump in Times Higher Education describes how Thomson Reuters have been excluding an increasing number of journals from their Journal Citation Reports for "anomalous citation patterns" which now includes not just self-citation but excessive mutual citation.

Surely it is now time for Thomson Reuters to stop counting self-citations for the Research Influence indicator in the THE World University Rankings. The threat of the self-citations of Dr El Naschie "of" Alexandria University has receded but there are others who would have a big impact on the rankings if they ever move to a university with a low volume of publications.

TR may not want to follow QS, who no longer count self-citations for their rankings, but excluding excessive mutual citation as well would put them one up again.

Monday, August 31, 2015

Update on changes in ranking methodology

Times Higher Education (THE) have been preparing the ground for methodological changes in their world rankings. A recent article by Phil Baty announced that the new world rankings scheduled for September 30 will not count the citations to 649 papers, mainly in particle physics, with more than 1,000 authors.

This is perhaps the best that is technically and/or commercially feasible at this moment but it is far from satisfactory. Some of these publications are dealing with the most basic questions about the nature of physical reality and it is a serious distortion not to include them in the ranking methodology. There have been complaints about this. Pavel Krokovny's comment was noted in a previous post while Mete Yeyisoglu argues that:
"Fractional counting is the ultimate solution. I wish you could have worked it out to use fractional counting for the 2015-16 rankings.
The current interim approach you came up with is objectionable.
Why 1,000 authors? How was the limit set? What about 999 authored-articles?
Although the institution I work for will probably benefit from this interim approach, I think you should have kept the same old methodology until you come up with an ultimate solution.
This year's interim fluctuation will adversely affect the image of university rankings."

Baty provides a reasonable answer to the question why the cut-off point is 1,000 authors.

But there is a fundamental issue developing here that goes beyond ranking procedure. The concept of authorship of a philosophy paper written entirely by a single person, or a sociological study from a small research team, is very different from that of the huge multinational, capital- and labour-intensive publications in which the number of collaborating institutions exceeds the number of paragraphs and there are more authors than sentences.

Fractional counting does seem to be the only fair and sensible way forward and it is now apparently on THE's agenda although they have still not committed themselves.

The objection could be raised that while the current THE system gives a huge reward to even the least significant contributing institution, fractional counting would give major research universities insufficient credit for their role in important research projects.

A long term solution might be to draw a distinction between the contributors to and the authors of the mega papers. For most publications there would be no need to draw such a distinction, but for those with some sort of input from dozens, hundreds or thousands of people it might be feasible to allot half the credit to all those who had anything to do with the project and the other half to those who meet the standard criteria of authorship. There would no doubt be a lot of politicking about who gets the credit but that would be nothing new.

Duncan Ross, the new Data and Analytics Director at THE, seems to be thinking along these lines.
"In the longer term there are one technical and one structural approach that would be viable.  The technical approach is to use a fractional counting approach (2932 authors? Well you each get 0.034% of the credit).  The structural approach is more of a long term solution: to persuade the academic community to adopt metadata that adequately explains the relationship of individuals to the paper that they are ‘authoring’.  Unfortunately I’m not holding my breath on that one."
The counting of citations to mega papers is not the only problem with the THE citations indicator. Another is the practice of giving a boost to universities in underperforming countries. Another item by Phil Baty quotes this justification from Thomson Reuters, THE's former data partner.

“The concept of the regional modification is to overcome the differences between publication and citation behaviour between different countries and regions. For example some regions will have English as their primary language and all the publications will be in English, this will give them an advantage over a region that publishes some of its papers in other languages (because non-English publications will have a limited audience of readers and therefore a limited ability to be cited). There are also factors to consider such as the size of the research network in that region, the ability of its researchers and academics to network at conferences and the local research, evaluation and funding policies that may influence publishing practice.”

THE now appear to agree that this is indefensible in the long run and hope that a more inclusive academic survey and the shift to Scopus, with broader coverage than the Web of Science, will lead to this adjustment being phased out.

It is a bit odd that TR and THE should have introduced income, in three separate indicators, and international outlook, in another three, as markers of excellence, but then included a regional modification to compensate for limited funding and international contacts.

THE are to be congratulated for having put fractional counting and phasing out the regional modification on their agenda. Let's hope it doesn't take too long.

While we are on the topic, there are some more things about the citation indicator to think about. First, to repeat a couple of points mentioned in the earlier post.

  • Reducing the number of fields or doing away with normalisation by year of citation. The more boxes into which any given citation can be dropped, the greater the chance of statistical anomalies when a cluster of citations meets a low world average of citations for that particular year of citation, year of publication and field (300 in Scopus?).

  • Reducing the weighting for this indicator. Perhaps citations per paper normalised by field is a useful instrument for comparing the quality of research of MIT, Caltech, Harvard and the like, but it might be of little value when comparing the research performance of Panjab University and IIT Bombay or Istanbul University and Bogazici.

Some other things THE could think about.

  • Adding a measure of overall research impact, perhaps simply by counting citations. At the very least, stop calling field- and year-normalised, regionally modified citations per paper a measure of research impact. Call it research quality or something like that.

  • Doing something about secondary affiliations. So far this seems to have been a problem mainly for the Highly Cited Researchers indicator in the Shanghai ARWU, but it may not be very long before more universities realise that a few million dollars for adjunct faculty could have a disproportionate impact on publication and citation counts.

  • Also, perhaps THE should consider excluding self-citations (or even citations within the same institution although that would obviously be technically difficult). Self-citation caused a problem in 2010 when Dr El Naschie's diligent citation of himself and a few friends lifted Alexandria University to fourth place in the world for research impact. Something similar might happen again now that THE are using a larger and less selective database.


Sunday, September 19, 2010

More on the THE Citations Indicator

See this comment on a previous post:


As you can see from the following paragraph(http://www.timeshighereducation.co.uk/world-university-rankings/2010-2011/analysis-methodology.html) Thomson has normalised citations against each of their 251 subject categories (it's extremely difficult to get this data directly from WOS).. They have great experience in this kind of analysis.. to get an idea, check their in-cites website http://sciencewatch.com/about/met/thresholds/#tab3 where they have citations thresholds for the last 10 years against broad fields.

Paragraph mentioned above:
"Citation impact: it's all relative
Citations are widely recognised as a strong indicator of the significance and relevance — that is, the impact — of a piece of research.
However, citation data must be used with care as citation rates can vary between subjects and time periods.
For example, papers in the life sciences tend to be cited more frequently than those published in the social sciences.
The rankings this year use normalised citation impact, where the citations to each paper are compared with the average number of citations received by all papers published in the same field and year. So a paper with a relative citation impact of 2.0 is cited twice as frequently as the average for similar papers.
The data were extracted from the Thomson Reuters resource known as Web of Science, the largest and most comprehensive database of research citations available.
Its authoritative and multidisciplinary content covers more than 11,600 of the highest-impact journals worldwide. The benchmarking exercise is carried out on an exact level across 251 subject areas for each year in the period 2004 to 2008.
For institutions that produce few papers, the relative citation impact may be significantly influenced by one or two highly cited papers and therefore it does not accurately reflect their typical performance. However, institutions publishing fewer than 50 papers a year have been excluded from the rankings.
There are occasions where a groundbreaking academic paper is so influential as to drive the citation counts to extreme levels — receiving thousands of citations. An institution that contributes to one of these papers will receive a significant and noticeable boost to its citation impact, and this reflects such institutions' contribution to globally significant research projects."


The quotation is from the bottom of the methodology page. It is easy to miss since it is separate from the general discussion of the citations indicator.

I will comment on Simon Pratt's claim that "An institution that contributes to one of these papers will receive a significant and noticeable boost to its citation impact, and this reflects such institutions' contribution to globally significant research projects."

First, were self-citations included in the analysis?

Second, do institutions receive the same credit for contributing to a research project by providing one out of twenty co-authors that they would for contributing all of them?

Third, since citation scores vary from one subject field to another, a paper will get a higher impact score if it is classified in a subject that typically receives few citations than in one where citations are plentiful.

Fourth, the obvious problem that undermines the entire indicator is that the impact scores are divided by the total number of papers. A groundbreaking paper with thousands of citations would make little difference to Harvard. Change the affiliation to a small college somewhere and it would stand out (providing the college could reach 50 papers a year).

This explains something rather odd about the data for Alexandria University. Mohamed El Naschie has published many papers with several different affiliations. Yet the many citations to these papers produced a dramatic effect only for Alexandria. This, it seems, was because the total number of Alexandria papers was so low that his contributions had a big effect on the average.
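The arithmetic behind this dilution effect is simple; the figures below are invented purely to show the scale of the difference.

```python
def average_impact(total_normalised_citation_score, n_papers):
    """Average citation impact across an institution's papers."""
    return total_normalised_citation_score / n_papers

# A large university: 20,000 papers already averaging 1.5, plus one mega-cited paper worth 500.
print(average_impact(20000 * 1.5 + 500, 20001))  # ~1.52 — barely moves
# A small institution: 250 papers averaging 0.8, plus the same paper.
print(average_impact(250 * 0.8 + 500, 251))      # ~2.79 — transformed by a single paper
```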

Saturday, October 27, 2012

More on MEPhI

Right after putting up the post on Moscow State Engineering Physics Institute and its "achievement" in getting the maximum score for research impact in the latest THE - TR World University Rankings, I found this exchange on Facebook.  See my comments at the end.

  • Valery Adzhiev So, the best university in the world in the "citation" (i.e. "research influence") category is Moscow State Engineering Physics Institute with maximum '100' score. This is remarkable achivement by any standards. At the same time it scored in "research" just 10.6 (out of 100) which is very, very low result. How on earth that can be?
  • Times Higher Education World University Rankings Hi Valery,

    Regarding MEPHI’s high citation impact, there are two causes: Firstly they have a couple of extremely highly cited papers out of a very low volume of papers. The two extremely highly cited papers are skewing what would ordinarily be a very good normalized citation impact to an even higher level.

    We also apply "regional modification" to the Normalized Citation Impact. This is an adjustment that we make to take into account the different citation cultures of each country (because of things like language and research policy). In the case of Russia, because the underlying citation impact of the country is low it means that Russian universities get a bit of a boost for the Normalized Citation Impact.

    MEPHI is right on the boundary for meeting the minimum requirement for the THE World University Rankings, and for this reason was excluded from the rankings in previous years. There is still a big concern with the number of papers being so low and I think we may see MEPHI’s citation impact change considerably over time as the effect of the above mentioned 2 papers go out of the system (although there will probably be new ones come in).

    Hope this helps to explain things.
    THE
  • Valery Adzhiev Thanks for your prompt reply. Unfortunately, the closer look at that case only adds rather awkward questions. "a couple of extremely highly cited papers are actually not "papers": they are biannual volumes titled "The Review of Particle Physics" that ...See More
  • Valery Adzhiev I continue. There are more than 200 authors (in fact, they are "editors") from more than 100 organisation from all over the world, who produce those volumes. Look: just one of them happened to be affiliated with MEPhI - and that rather modest fact (tha...See More
  • Valery Adzhiev Sorry, another addition: I'd just want to repeat that my point is not concerned only with MEPhI - Am talking about your methodology. Look at the "citation score" of some other universities. Royal Holloway, University of London having justt 27.7 in "res...See More
  • Alvin See Great observations, Valery.
  • Times Higher Education World University Rankings Hi Valery,

    Thanks again for your thorough analysis. The citation score is one of 13 indicators within what is a balanced and comprehensive system. Everything is put in place to ensure a balanced overall result, and we put our methodology up online for
    ...See More
  • Andrei Rostovtsev This is in fact rather philosofical point. There are also a number of very scandalous papers with definitively negative scientific impact, but making a lot of noise around. Those have also high contribution to the citation score, but negative impact t...See More

    It is true that two extremely highly cited publications combined with a low total number of publications skewed the results, but what is equally or perhaps more important is that these citations occur in the year or two after publication, when citations tend to be relatively infrequent compared to later years. The 2010 publication is a biennial review, like the 2008 publication, that will be cited copiously for two years, after which it will no doubt be superseded by the 2012 edition.

    Also, we should note that in the ISI Web of Science, the 2008 publication is classified as "physics, multidisciplinary". Papers listed as multidisciplinary generally get relatively few citations so if the publication was compared to other multidisciplinary papers it would get an even larger weighting. 
    Valery has an excellent point when he points out that these publications have over 100 authors or contributors each (I am not sure whether they are actual researchers or administrators). Why then did not all the other contributors boost their institutions' scores to similar heights? Partly because they were not in Russia and therefore did not get the regional weighting, but also because they were publishing many more papers overall than MEPhI.

    So basically, A. Romaniouk, who contributed 1/173rd of one publication, was considered as having more research impact than hundreds of researchers at Harvard, MIT, Caltech etc. producing hundreds of papers cited hundreds of times. Sorry, but is this a ranking of research quality or a lottery?

    The worst part of THE's reply is this:

    Thanks again for your thorough analysis. The citation score is one of 13 indicators within what is a balanced and comprehensive system. Everything is put in place to ensure a balanced overall result, and we put our methodology up online for all to see (and indeed scrutinise, which everyone is entitled to do).

    We welcome feedback, are constantly developing our system, and will definitely take your comments on board.

    The system is not balanced. Citations have a weighting of 30%, much more than any other indicator. Even the research reputation survey has a weighting of only 18%. And to describe as comprehensive an indicator which allows a fraction of one or two publications to surpass massive amounts of original and influential research is really plumbing the depths of absurdity.

    I am just about to finish comparing the scores for research and research impact for the top 400 universities. There is a statistically significant correlation but it is quite modest. When research reputation, volume of publications and research income show such a modest correlation with research impact it is time to ask whether there is a serious problem with this indicator.

    Here is some advice for THE and TR.

    • First, and surely very obvious, if you are going to use field normalisation then calculate the score for discipline groups, natural sciences, social sciences and so on and aggregate the scores. So give MEPhI a 100 for physical or natural sciences if you think they deserve it but not for the arts and humanities.
    • Second, and also obvious, introduce fractional counting, that is dividing the number of citations by the number of authors of the cited paper.
    • Do not count citations to summaries, reviews or compilations of research.
    • Do not count citations of commercial material about computer programs. This would reduce the very high and implausible score for Gottingen which is derived from a single publication.
    • Do not assess research impact with only one indicator. See the Leiden ranking for the many ways of rating research.
    • Consider whether it is appropriate to have a regional weighting. This is after all an international ranking.
    • Reduce the weighting for this indicator.
    • Do not count self-citations. Better yet, do not count citations from researchers at the same university.
    • Strictly enforce your rule about not including single subject institutions in the general rankings.
    • Increase the threshold number of publications for inclusion in the rankings from two hundred to four hundred.