About RePEc impact factors

July 27, 2009

Impact factors have always been a popular way to measure the influence of academic journals. They have been popularized by ISI, now part of Thomson. RePEc also provides impact factors, and this post is about explaining the differences between the two.

ISI takes a sample of journals and analyzes the citations across those journals. To be eligible, a citation has to appear within two years of the publication of the cited article, the cited article must be printed (not forthcoming, a working paper or a manuscript), and the cited article must be among the analyzed journals (286 in Economics). ISI is currently experimenting with a five-year window, in addition to the existing two-year window.

RePEc considers all publications listed in its bibliographic database. Thus, it also considers publication forms other than journal articles: close to 1000 journals and 2600 working paper series. It imposes no time window: citations of any age qualify. In most cases, a citation of a working paper will count towards its published form once the article is included in RePEc, possibly after the original citation (condition: at least one author has both versions in his/her RePEc profile). This implies that working paper series and book series can also have impact factors. RePEc is thus more comprehensive.

However, the pool of citations RePEc is drawing from is different. It relies very much on working papers (which can later be published), as they are typically openly accessible. Some publishers also provide references in the bibliographic metadata, but not all. One implication of this is that RePEc is more current as it includes citations to and from research that is not yet published. As research gets published, this data gets updated. But as references from many journals are missing, RePEc citation data must still be treated as experimental. Whether these omissions matter remains to be seen. After all, impact factors always have to be considered in relative terms, not in absolute terms, and if omissions were not biased, they would not matter.

Another major difference is that RePEc excludes self-citations. This is an important issue as some journals, explicitly or implicitly, encourage authors to cite other articles published within the two-year window in the same journal. Thus, just as self-citations are excluded for authors, they are excluded for journals. And this can matter a lot.

Finally, the impact factor is determined by dividing the eligible citations by the number of eligible articles. ISI itself determines which articles are eligible for the denominator, and this can even be negotiated with the publisher. In RePEc’s case, if an article (or a working paper) is listed, it counts without adjustment.
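To make the arithmetic concrete, here is a minimal Python sketch of a “simple” impact factor in the RePEc spirit: citations received divided by the number of listed items, with no time window and with citations from the same series left out. The function name, the data layout and the sample records are hypothetical and chosen only for illustration. This is not RePEc’s actual code, and it does not handle author-level self-citations or the linking of different versions of a work.

```python
from collections import Counter

def simple_impact_factors(items, citations):
    """Citations per listed item for each series.

    items: dict mapping item id -> series name (journal or working paper series).
    citations: iterable of (citing_item, cited_item) pairs.
    Citations where citing and cited item belong to the same series are
    skipped (series-level self-citations). No time window is applied.
    """
    items_per_series = Counter(items.values())
    cites_per_series = Counter()
    for citing, cited in citations:
        if citing not in items or cited not in items:
            continue  # assumption: only works listed in the database count
        if items[citing] == items[cited]:
            continue  # series-level self-citation, excluded
        cites_per_series[items[cited]] += 1
    return {series: cites_per_series[series] / n
            for series, n in items_per_series.items()}

# Hypothetical toy data: two articles in one journal, one working paper.
items = {"a1": "Journal A", "a2": "Journal A", "b1": "WP Series B"}
citations = [("b1", "a1"), ("a2", "a1"), ("a1", "b1"), ("a2", "b1")]
print(simple_impact_factors(items, citations))
```

In this toy data, the citation from a2 to a1 stays within Journal A and is dropped, so Journal A is left with one citation for two listed items, while the working paper series has two citations for one item.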

RePEc also publishes variations on the “simple” impact factor: recursive impact factors, where every citation counts with the impact factor of the citing publication, which favors impact over numbers; discounted impact factors, where the impact of a citation decays with time (regardless of the age of the cited item); and a combination of the two, discounted recursive impact factors. Finally, there is now also the h-index. All variations have a different story to tell about the publication, and RePEc offers the reader the choice.

The best top level institutions in Economics

June 21, 2009

RePEc rankings are surprisingly popular, despite their experimental status. In fact, this is the most read topic on this blog. So to cater to the interest of our users, let us add another ranking… RePEc has been ranking institutions for quite a while now, using the institutions listed in EDIRC. This ranks, say, at the department level, not at the university level. This is detrimental to institutions where economists are scattered in various departments, in particular in departments that are not listed in EDIRC, for example law, political science and statistics. A new ranking is now computed that assembles all authors within the top level institution for their affiliation(s), say a university, a government, etc. Current results are here.

The methodology is the following. For affiliations listed in EDIRC, the top level is used. That would typically be a university. For affiliations not listed in EDIRC, the homepage domain of the institution submitted by the author is matched with any institutions listed in EDIRC. If no match is found, it is taken as is. Finally, as usual with multiple affiliations, a weighting scheme is used to distribute the author’s score across all affiliations.

Note a few particularities. All components of the University of London (LSE, Imperial College, etc.) are merged into one. All subdivisions of a national government are also merged. US Federal Reserve Banks, however, are not merged, as they are top level in their respective states.

Tips for authors to improve their RePEc ranking

April 16, 2009

By far the most popular topic on this blog is material about rankings. People love to know who the best are and how they fare. This post is about optimizing one’s ranking within RePEc, and doing so in a way that does not trigger our safeguards against cheating. It turns out all the following points are points we actually want to encourage anyway so as to improve the quality of the data collected in RePEc.

As an author, here is what you can do once you have logged into the RePEc Author Service:

  1. Make sure all your works listed in RePEc are actually in your profile. Thus, do not remove from your profile working papers that have been published. Some working paper series have higher impact factors than many journals, and working papers are much more downloaded than articles. In addition, if all versions are in your profile, we can link between them. (If you previously refused items that were yours, you can recover them by clicking on the “refused” tab in your research page, unrefusing the relevant items, and then redoing the search.)
  2. Make sure the name variations listed in your profile really encompass all possible ways a publisher may have listed your name. The automatic search is only going to find works with such names.
  3. There may be additional citations waiting for your approval. These are those for which we have less confidence that they pertain to the right work. Click on the “citation” tab in your author account.
  4. Link to your profile on EconPapers or IDEAS from your homepage or email signature.
  5. When referring to your works on a web page, put the link to EconPapers or IDEAS. We cannot count downloads that do not transit through RePEc services.
  6. Make sure all your works are listed on RePEc. For the missing ones, encourage the publisher to list them, or get your department to open a working paper series, or upload your works on the Munich Personal RePEc Archive.

As an institution, you can optimize your ranking by making sure your registered authors follow the advice from above and:

  1. Make sure everyone is registered and maintains his/her profile.
  2. Make sure everyone gives the proper affiliation. You can check who is listed with you by finding your institution on EDIRC.
  3. Have your working paper series listed on RePEc. Instructions are here.

If everyone optimizes like this, RePEc data will be more complete, current and useful. Help us make it better!

The best young economists?

March 25, 2009

Who are the best young economists? RePEc publishes all sorts of rankings based on its data, but has so far been missing one that highlights the best young economists. Indeed, they are typically invisible in the general rankings as it takes many years to build up the required body of work and citations to be featured among the top economists.

Unfortunately, authors registering with RePEc do not supply their year of birth or the year they obtained their last graduate degree. However, RePEc has information about the date of most publications, and it is then possible to determine (roughly) when a career started. Here, we do not distinguish by the type of publication (article vs. working paper, for example), as the goal is to try to approximate when the economist started being active in research.

Based on this criterion, two groups of economists are selected: those with their first publication, whatever the medium, less than five years ago, and those less than ten years ago. Quite obviously, there is considerably more measurement error compared to that already present in the general ranking, first because of the imperfect measure of the start of the career, second because the body of work is typically much smaller. But we hope people will still find these rankings useful.

Call for comments: modifications in the rankings of institutions

October 19, 2008

One feature of RePEc is its ability to rank researchers and the institutions they are affiliated with. Researchers create a list of affiliations when they register in the RePEc Author Service. However, this system was devised before rankings started to be computed, and some unforeseen consequences have emerged for authors with multiple affiliations. As there is no way to determine which affiliation is the main one, or what percentage economists would allocate to each, we are forced to treat each affiliation equally for ranking purposes. In several cases, this leads to institutional rankings being “hijacked” by organizations that offer secondary affiliations. See, for example, the overall ranking of institutions. Another consequence can be found in the regional ranking, where individuals with a main affiliation from outside may take the place of legitimate insiders. Prime examples are Massachusetts, the United Kingdom and Germany.

What are the solutions? The obvious one is to modify the RePEc Author Service scripts to allow the declaration of a main affiliation or of affiliation shares. We have pondered that for some time now but find it very difficult to implement, especially as the main resource person for this project is not with us anymore. Thus we need to find some way to proxy the affiliation shares. I want to propose here one way to do this and open it for discussion, with the goal of having a formula in place for the January 2009 rankings.

The logic of the proposed formula is that if there are many people affiliated with a particular institution, then it must be that most of them have courtesy or secondary affiliations. If person A is affiliated with institutions 1 and 2, institution 1 has many people registered and institution 2 few, then the ranking scores of person A should count more toward institution 2 than 1. Of course, such a distribution scheme pertains only to authors with multiple affiliations.

To be precise, let I be the set of affiliations of an author. For each i in I, let Si be the number of authors affiliated with institution i. Compute S as the sum of all Si. The weight of each affiliation is Ti=S/Si. These weights are then normalized to sum to one.

Take the following example. Economist A is affiliated with the Harvard Economics Department (46 registrants), the NBER (324 registrants) and the CEPR (262 registrants). The respective Ti would be 632/46=13.74, 632/324=1.95, and 632/262=2.41, given that 46+324+262=632. After normalizing the Ti to sum to one, Economist A’s ranking scores would count 13.74/18.10=75.9% toward the Harvard Economics Department, 1.95/18.10=10.8% toward the NBER and 2.41/18.10=13.3% toward the CEPR. For regional rankings, 86.7% (75.9% + 10.8%) of his scores would count in Massachusetts and 13.3% in the United Kingdom. Under current rules, scores are distributed fully to affiliated institutions and count fully in each region.
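For readers who prefer code to prose, the proposed formula can be sketched in a few lines of Python. The function and variable names are hypothetical. The registrant counts are the ones from the example above.

```python
def affiliation_weights(registrants):
    """Distribute an author's score across affiliations.

    registrants: dict mapping institution name -> number of people
    affiliated with it (S_i). Returns weights that sum to one.
    """
    total = sum(registrants.values())                              # S
    raw = {name: total / s_i for name, s_i in registrants.items()}  # T_i = S / S_i
    norm = sum(raw.values())
    return {name: t_i / norm for name, t_i in raw.items()}

example = {
    "Harvard Economics Department": 46,
    "NBER": 324,
    "CEPR": 262,
}
for name, weight in affiliation_weights(example).items():
    print(f"{name}: {weight:.1%}")
# Harvard Economics Department: 75.9%
# NBER: 10.8%
# CEPR: 13.3%
```

Note that S cancels out in the normalization, so the weights are simply proportional to 1/Si: the fewer people an institution has registered, the larger its share of the author’s score.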

This is much simpler than I can manage to explain here… But a few additional details are in order. Some variations in definitions can be discussed: Si can represent the number of registrants, the number of authors (registrants with works) or the number of works of authors. The latter would avoid institutions (erroneously) discouraging young faculty with few works from signing up. I favor the number of authors. Also, we need to deal with affiliations that are not listed in the database (EDIRC) and thus do not have a defined number of registrants. One solution is to just ignore such affiliations. The drawback is that the relevant authors may not get ranked in some regions where they are genuinely affiliated. Thus I propose to apply for those institutions the average Si of the other affiliations. If no affiliation is in the database, all get the same weight.

I now welcome comments on how to proceed and hope to implement the new scheme for the January 2009 rankings, which are released in the first days of February 2009.

January 18, 2009 Update: The new ranking method for institutions has now been programmed and is ready for the early February release. The formula discussed above has been adopted with two amendments. The first was discussed in the comments: 50% of the weight is allocated to the institution with the same domain name as the author’s email address. The remaining 50% is allocated over all affiliated institutions by the formula given above. The second amendment pertains to the weights of institutions that are not listed in EDIRC. As there is no author count for them, I put the default at the average number of authors per listed institution, currently 4.55.
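As a rough illustration of the amended scheme, here is a Python sketch. The names are hypothetical, and the institution matching the author’s email domain is passed in directly because the matching step itself is not described here. What happens when no affiliation matches the email domain is also not stated, so falling back to the plain formula is an assumption of this sketch.

```python
DEFAULT_AUTHORS = 4.55  # average number of authors per listed institution (from the update)

def amended_weights(author_counts, email_institution=None):
    """Affiliation weights under the amended scheme.

    author_counts: dict mapping institution -> number of authors, or None
    when the institution is not listed in EDIRC.
    email_institution: the affiliation sharing the author's email domain,
    if any (how the match is made is outside this sketch).
    """
    counts = {name: DEFAULT_AUTHORS if n is None else n
              for name, n in author_counts.items()}
    # Normalized S/S_i weights are proportional to 1/S_i, since S cancels.
    inverse = {name: 1.0 / n for name, n in counts.items()}
    norm = sum(inverse.values())
    formula = {name: inv / norm for name, inv in inverse.items()}
    if email_institution not in formula:
        return formula  # assumption: no email match, use the formula alone
    weights = {name: 0.5 * w for name, w in formula.items()}
    weights[email_institution] += 0.5  # half of the weight goes to the email match
    return weights

counts = {"Harvard Economics Department": 46, "NBER": 324, "CEPR": 262}
for name, w in amended_weights(counts, "Harvard Economics Department").items():
    print(f"{name}: {w:.1%}")
```

With the earlier example and a Harvard email address, the Harvard share rises from about 76% to about 88%, and the other two affiliations split the rest in the same proportions as before.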

February 3, 2009 Update: I am receiving many questions about the sudden changes in the rankings within countries. As authors with multiple affiliations do not count fully in each location any more, their ranking has worsened. Similarly, institutions that have many members with multiple affiliations now look worse. Note also that a few small errors have crept in, and they will be corrected for the February ranking.

The h-index

July 20, 2008

There are many ways to rank researchers, but rarely has one been adopted as fast as the h-index. It was introduced by physicist Jorge E. Hirsch in August 2005, and is defined as the largest number h such that h works from an author have at least h citations each. Compared to “raw” citation counts, which may put too much emphasis on a few much cited works, it highlights the trade-off between quantity and quality of research. Of course, like any research ranking criterion, it is imperfect in many ways and controversial to all but those who rank well. But it does highlight some aspects of research productivity.
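The definition is easy to turn into code. The following is a minimal Python sketch with hypothetical citation counts, not the script RePEc runs:

```python
def h_index(citations):
    """Largest h such that h works have at least h citations each.

    citations: iterable with one citation count per work.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the top `rank` works all have at least `rank` citations
        else:
            break      # counts only decrease from here on
    return h

print(h_index([10, 8, 5, 4, 3]))   # 4
print(h_index([250, 8, 5, 3, 3]))  # 3
```

The second call shows why the h-index differs from raw counts: one very heavily cited work does not raise h by itself.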

RePEc has reported rankings according to the h-index since October 2005 for authors. There is also a variant for institutions and regions, where h is defined as the number of authors with an h-index of at least h. Due to the large number of ties, how far an institution is from reaching the next h is also taken into account.

Quite naturally, the h-index can also be defined for journals and series. Starting this month, RePEc publishes such h-indexes: journals, working paper series (preprints), and all series combined. Obviously, journals and series with longer publishing histories are favored, and we hope this will have the side-effect of publishers making sure to have a complete listing on RePEc.

By the way, the overall h-index for all of RePEc is at 225 as of today.

Addendum (August 3): For authors, there is now also a Wu-index. This has been proposed by Qiang Wu and is defined in a similar way to the h-index, except that one needs 10 citations per paper. Due to the very large number of ties and zeros, this criterion is, however, not integrated in the overall rankings.
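A sketch of the Wu-index in Python follows. It assumes the usual reading of the definition, namely the largest w such that w papers have at least 10w citations each. The text above only says “10 citations per paper”, so this reading, the function name and the sample counts are assumptions.

```python
def wu_index(citations, per_paper=10):
    """Largest w such that w works have at least per_paper * w citations each.

    citations: iterable with one citation count per work.
    """
    ranked = sorted(citations, reverse=True)
    w = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= per_paper * rank:
            w = rank
        else:
            break
    return w

print(wu_index([120, 45, 31, 12, 3]))  # 3
print(wu_index([9, 5]))                # 0
```

The second call illustrates the many zeros mentioned above: an author with no work cited at least ten times has a Wu-index of zero.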

Fluctuations in author citation counts

June 11, 2008

Many authors may have rejoiced about the increase in their citation counts in their last monthly notification. At least part of this increase is due to an error that crept in while fixing a citation display issue for authors on IDEAS. This error is now fixed and next month’s mailing will show a substantial decrease in citation counts for some. While I got no complaints this time, I expect some in a few weeks…

In some cases, counts will be even lower than before the error crept in. This is because now extra care is taken not to double count citations to and from different versions of the same works. As always, self-citations are not counted in totals but still displayed on IDEAS.

My paper got published, what do I do?

May 20, 2008

A typical situation: An author registered on the RePEc Author Service has a working paper, listed on RePEc in his profile, that got published in a journal. Now that the publisher has provided the bibliographic information about this article to RePEc, the author can add it to his profile. What should he do about the working paper?

In an overwhelming majority of cases, the answer is: nothing! Indeed, most publishers accept that pre-prints, even post-prints, remain on authors’ home pages or institutional repositories (which is what department working paper series are, for example). In case of doubt, see the SHERPA/RoMEO list. Thus, the author should not ask for the paper to be removed from wherever it was put up.

Note: removing a paper from an author profile does not remove it from the database. It only makes the system learn that the author is not the author of this particular work. The consequences can be very annoying. For example, it becomes impossible for RePEc to recognize that these are two versions (pre-print and published) of the same work, as they appear to have different authors. Then, someone stumbling on the working paper will not find a link to the published version.

For authors caring about their ranking, there are even more adverse consequences from removing the working paper from the author profile. First, many working paper series have higher impact factors than journals. Second, the authors lose the download statistics of the working paper. Remember, working papers are much more downloaded than articles. And if the article is available only to subscribers, non-subscribers do not have the option of accessing the free working paper version.

And if it is really required that the working paper be removed, ask the RePEc series maintainer to only remove the link to the full text, not the whole record.

75% of the top 1000 economists are now registered with RePEc

January 8, 2008

The RePEc Author Service recently surpassed 15,000 registered authors, and the post relating this mentions the high coverage among top ranked economists. To document this, take one popular ranking, the one by Tom Coupé that is based on publications from 1990 to 2000. Tom Coupé has two rankings, one where publications are weighted by the impact factors of the journals, the other where citations are counted. According to the “publications” ranking, 75% of the top 1000 economists are now registered with RePEc; according to the “citations” ranking, 65%. The difference comes from the fact that the latter also includes non-economists (political scientists, statisticians, demographers, law scholars, and sociologists) who are cited in Economics journals.

One particularly interesting aspect of these rankings is how the proportions of registered authors decline with rankings:

Ranks       Registered,            Registered,
            publication ranking    citation ranking
1-100       93                     77
101-200     81                     72
201-300     78                     69
301-400     73                     76
401-500     77                     66
501-600     71                     61
601-700     73                     54
701-800     77                     55
801-900     62                     62
901-1000    65                     60
Total       750                    652

How can we explain this pattern? Are registered authors more likely to publish well or be cited? This may be true for more recent measures of visibility, but in 1990-2000, the RePEc Author Service was not yet functional. Are then better ranked authors more likely to care more about their visibility and thus more likely to register?

What are the most cited recent papers in Economics?

December 22, 2007

RePEc has been publishing for several years now a list of the most cited papers and articles cataloged in its database according to three criteria, recently expanded to six. By popular demand, we now also publish a list of the most cited recent papers and articles. The selection criterion here is that the last known version has been published five or fewer years ago. That may sound like a long period, but considering the publication lags we suffer, I think it is reasonable. Thus, currently, articles (and papers) published in 2002 or thereafter qualify. Within a few days, those from 2002 will be dropped, so enjoy them while you can.

At the same time, the list of the most cited items has been expanded. Previously, only the top 200 were released, now we show the top 1‰. This list thus gets longer as RePEc expands and stands currently at 559. Again, the list is available according to six different criteria. So, check out whether your favorite papers are listed. And remember, all this citation data is still experimental as we try to improve on its quality, but still quite informative.

