One feature of RePEc is its ability to rank researchers and the institutions they are affiliated with. Researchers create a list of affiliations when they register in the RePEc Author Service. However, this system was devised before rankings started to be computed, and some unforeseen consequences have emerged for authors with multiple affiliations. As there is no way to determine which affiliation is the main one, or what percentage economists would allocate to each, we are forced to treat each affiliation equally for ranking purposes. In several cases, this allows institutional rankings to be “hijacked” by organizations that offer secondary affiliations. See, for example, the overall ranking of institutions. Another consequence can be found in the regional rankings, where individuals whose main affiliation is elsewhere may take the place of legitimate insiders. Prime examples are Massachusetts, the United Kingdom and Germany.
What are the solutions? The obvious one is to modify the RePEc Author Service scripts to allow the declaration of a main affiliation or of affiliation shares. We have pondered that for some time now but find it very difficult to implement, especially as the main resource person for this project is no longer with us. We therefore need some way to proxy the affiliation shares. I want to propose one way to do this here and open it for discussion, with the goal of having a formula in place for the January 2009 rankings.
The logic of the proposed formula is that if many people are affiliated with a particular institution, then most of them likely have courtesy or secondary affiliations. If person A is affiliated with institutions 1 and 2, institution 1 has many people registered and institution 2 few, then the ranking scores of person A should count more toward institution 2 than toward institution 1. Of course, such a distribution scheme pertains only to authors with multiple affiliations.
To be precise, let I be the set of affiliations of an author. For each i in I, let Si be the number of authors affiliated with institution i. Compute S as the sum of all the Si. The weight of each affiliation is Ti = S/Si. These weights are then normalized to sum to one.
Take the following example. Economist A is affiliated with the Harvard Economics Department (46 registrants), the NBER (324 registrants) and the CEPR (262 registrants). Given that 46+324+262=632, the respective Ti are 632/46=13.74, 632/324=1.95, and 632/262=2.41. After normalizing the Ti to sum to one, Economist A’s ranking scores would count 13.74/18.10=75.9% toward the Harvard Economics Department, 1.95/18.10=10.8% toward the NBER and 2.41/18.10=13.3% toward the CEPR. For regional rankings, 86.7% (75.9% + 10.8%) of his scores would count in Massachusetts and 13.3% in the United Kingdom. Under current rules, scores are distributed fully to each affiliated institution and count fully in each region.
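The proposed weighting can be sketched in a few lines of code. This is only an illustration of the arithmetic above, not RePEc's actual implementation; the function name and data layout are my own.

```python
def affiliation_weights(counts):
    """counts: dict mapping institution -> number of registered authors (Si).
    Returns normalized weights: Ti = S/Si, rescaled to sum to one."""
    total = sum(counts.values())                      # S = sum of all Si
    raw = {inst: total / n for inst, n in counts.items()}   # Ti = S/Si
    norm = sum(raw.values())
    return {inst: t / norm for inst, t in raw.items()}

weights = affiliation_weights({
    "Harvard Economics": 46,
    "NBER": 324,
    "CEPR": 262,
})
for inst, w in weights.items():
    print(f"{inst}: {w:.1%}")
# Harvard Economics: 75.9%
# NBER: 10.8%
# CEPR: 13.3%
```

Note that the common factor S cancels in the normalization, so the weights are simply proportional to 1/Si: the fewer registrants an institution has, the larger the share of the author's scores it receives.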
This is much simpler than I manage to explain here… But a few additional details are in order. Some variations in the definitions can be discussed: Si can represent the number of registrants, the number of authors (registrants with works) or the number of works of those authors. The latter would prevent institutions from (erroneously) discouraging young faculty with few works from signing up. I favor the number of authors. Also, we need to deal with affiliations that are not listed in our institutional database (EDIRC) and thus do not have a defined number of registrants. One solution is simply to ignore such affiliations. The drawback is that the relevant authors may not get ranked in some regions where they are genuinely affiliated. Thus I propose to apply to those institutions the average Si of the author's other affiliations. If none of an author's affiliations is in the database, all get the same weight.
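The fallback rules for affiliations missing from EDIRC could look as follows. Again this is a sketch under my own naming conventions; `None` marks an institution with no known author count.

```python
def weights_with_fallback(counts):
    """counts: dict mapping institution -> author count Si, or None if the
    institution is not listed in EDIRC."""
    listed = [n for n in counts.values() if n is not None]
    if not listed:
        # No affiliation is in the database: all get the same weight.
        return {inst: 1 / len(counts) for inst in counts}
    # Unlisted institutions get the average Si of the other affiliations.
    fallback = sum(listed) / len(listed)
    filled = {inst: (n if n is not None else fallback)
              for inst, n in counts.items()}
    raw = {inst: 1 / n for inst, n in filled.items()}  # weight proportional to 1/Si
    norm = sum(raw.values())
    return {inst: w / norm for inst, w in raw.items()}
```

For example, an author with one listed affiliation of 46 registrants and one unlisted affiliation would have the unlisted one imputed at 46 as well, yielding equal 50/50 weights.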
I now welcome comments on how to proceed and hope to implement the new scheme for the January 2009 rankings, which are released in the first days of February 2009.
January 18, 2009 Update: The new ranking method for institutions has now been programmed and is ready for the early February release. The formula discussed above has been adopted with two amendments. The first was discussed in the comments: 50% of the weight is allocated to the institution with the same domain name as the author’s email address. The remaining 50% is allocated over all affiliated institutions by the formula given above. The second amendment pertains to the weights of institutions that are not listed in EDIRC. As there is no author count for them, I put the default at the average number of authors per listed institution, currently 4.55.
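The first amendment can be sketched as follows: 50% of the weight goes to the institution whose domain name matches the author's email address, and the remaining 50% is split over all affiliations (including the matching one) by the 1/Si rule. The domain-matching test and all names here are simplifying assumptions of mine, not the actual RePEc code.

```python
def amended_weights(counts, email_domain, domains):
    """counts: institution -> author count Si;
    domains: institution -> domain name;
    email_domain: domain of the author's registered email address."""
    raw = {inst: 1 / n for inst, n in counts.items()}   # weight proportional to 1/Si
    norm = sum(raw.values())
    base = {inst: w / norm for inst, w in raw.items()}
    matches = [inst for inst, d in domains.items() if d == email_domain]
    if not matches:
        return base  # no domain match: formula applies unchanged
    home = matches[0]  # assume at most one matching institution matters
    # Halve every share, then give the extra 50% to the matching institution.
    result = {inst: 0.5 * w for inst, w in base.items()}
    result[home] += 0.5
    return result
```

Applied to the earlier example with a harvard.edu address, Harvard's share rises from 75.9% to about 87.9%, while the NBER and CEPR shares are halved.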
February 3, 2009 Update: I am receiving many questions about the sudden changes in the rankings within countries. As authors with multiple affiliations do not count fully in each location any more, their ranking has worsened. Similarly, institutions that have many members with multiple affiliations now look worse. Note also that a few small errors have crept in, and they will be corrected for the February ranking.