Call for comments: modifications in the rankings of institutions

One feature of RePEc is its ability to rank researchers and the institutions they are affiliated with. Researchers create a list of affiliations when they register in the RePEc Author Service. However, this system was devised before rankings started to be computed, and some unforeseen consequences have emerged for authors with multiple affiliations. As there is no way to determine which affiliation is the main one, or what percentage economists would allocate to each, we are forced to treat each affiliation equally for ranking purposes. In several cases, this leads to institutional rankings being “hijacked” by organizations that offer secondary affiliations. See, for example, the overall ranking of institutions. Another consequence can be found in the regional rankings, where individuals with a main affiliation from outside may take the place of legitimate insiders. Prime examples are Massachusetts, the United Kingdom and Germany.

What are the solutions? The obvious one is to modify the RePEc Author Service scripts to allow the declaration of a main affiliation or of affiliation shares. We have pondered that for some time now but find it very difficult to implement, especially as the main resource person for this project is not with us anymore. Thus we need to find some way to proxy the affiliation shares. I want to propose here one way to do this and open it for discussion, with the goal of having a formula in place for the January 2009 rankings.

The logic of the proposed formula is that if there are many people affiliated with a particular institution, then it must be that most of them have courtesy or secondary affiliations. If person A is affiliated with institutions 1 and 2, institution 1 has many people registered and institution 2 few, then the ranking scores of person A should count more toward institution 2 than 1. Of course, such a distribution scheme pertains only to authors with multiple affiliations.

To be precise, let I be the set of affiliations of an author. For each i in I, let S_i be the number of authors affiliated with institution i. Compute S as the sum of all S_i. The weight of each affiliation is T_i=S/S_i. These weights are then normalized to sum to one.

Take the following example. Economist A is affiliated with the Harvard Economics Department (46 registrants), the NBER (324 registrants) and the CEPR (262 registrants). The respective T_i would be 632/46=13.74, 632/324=1.95, and 632/262=2.41, given that 46+324+262=632. After normalizing the T’s to sum to one, Economist A’s ranking scores would count 13.74/18.10=75.9% for the Harvard Economics Department, 1.95/18.10=10.8% for the NBER and 2.41/18.10=13.3% for the CEPR. For regional rankings, 86.7% (75.9% + 10.8%) of his scores would count in Massachusetts and 13.3% in the United Kingdom. Under current rules, scores are distributed fully to affiliated institutions and count fully in each region.
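For readers who prefer to see the arithmetic spelled out, here is a minimal Python sketch of the proposed formula applied to the example above. The function name and the dictionary layout are my own illustration, not the actual RePEc ranking scripts.

```python
def affiliation_weights(author_counts):
    """Map institution -> weight, given institution -> number of authors (S_i)."""
    total = sum(author_counts.values())                           # S = sum of all S_i
    raw = {inst: total / n for inst, n in author_counts.items()}  # T_i = S / S_i
    norm = sum(raw.values())
    return {inst: t / norm for inst, t in raw.items()}            # normalize to sum to one


# Counts taken from the example in the text.
weights = affiliation_weights({"Harvard Economics": 46, "NBER": 324, "CEPR": 262})
for inst, w in weights.items():
    print(f"{inst}: {w:.1%}")
```

This should reproduce the shares quoted above: about 75.9% for Harvard, 10.8% for the NBER and 13.3% for the CEPR.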

This is much simpler than I can manage to explain here… But a few additional details are in order. Some variations in definitions can be discussed: S_i can represent the number of registrants, the number of authors (registrants with works) or the number of works of authors. The latter would avoid giving institutions an (erroneous) incentive to discourage young faculty with few works from signing up. I favor the number of authors. Also, we need to deal with affiliations that are not listed in the database (EDIRC) and thus do not have a defined number of registrants. One solution is to just ignore such affiliations. The drawback is that the relevant authors may not get ranked in some regions where they are genuinely affiliated. Thus I propose to apply to those institutions the average S_i of the author’s other affiliations. If no affiliation is in the database, all get the same weight.
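The treatment of unlisted affiliations can be sketched in the same way. In this illustration an unlisted institution is marked with `None`; this convention, the function names and the sample counts are all hypothetical and not taken from the RePEc scripts.

```python
def fill_missing_counts(author_counts):
    """Replace unknown counts (None) with the average of the author's known counts."""
    known = [n for n in author_counts.values() if n is not None]
    if not known:
        # No affiliation is in the database: give every one the same count,
        # which yields equal weights after normalization.
        return {inst: 1 for inst in author_counts}
    average = sum(known) / len(known)
    return {inst: (n if n is not None else average)
            for inst, n in author_counts.items()}


def affiliation_weights(author_counts):
    """Map institution -> weight using T_i = S / S_i, normalized to sum to one."""
    total = sum(author_counts.values())
    raw = {inst: total / n for inst, n in author_counts.items()}
    norm = sum(raw.values())
    return {inst: t / norm for inst, t in raw.items()}


# Made-up counts for illustration only.
filled = fill_missing_counts({"Dept A": 40, "Institute B": 10, "Unlisted C": None})
weights = affiliation_weights(filled)
```

Here the unlisted institution is treated as if it had 25 authors, the average of 40 and 10, so it ends up with a weight between those of the two listed institutions.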

I now welcome comments on how to proceed and hope to implement the new scheme for the January 2009 rankings, which are released in the first days of February 2009.

January 18, 2009 Update: The new ranking method for institutions has now been programmed and is ready for the early February release. The formula discussed above has been adopted with two amendments. The first was discussed in the comments: 50% of the weight is allocated to the institution with the same domain name as the author’s email address. The remaining 50% is allocated over all affiliated institutions by the formula given above. The second amendment pertains to the weights of institutions that are not listed in EDIRC. As there is no author count for them, I put the default at the average number of authors per listed institution, currently 4.55.
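A rough Python sketch of the amended scheme follows. How the email domain is actually matched to an institution is not shown here: the `email_matches` argument is a hypothetical stand-in for that step, and returning the plain formula weights when nothing matches is my reading of the rule, not a confirmed detail of the implementation.

```python
DEFAULT_AUTHOR_COUNT = 4.55  # average authors per listed institution, as quoted in the update


def formula_weights(author_counts):
    """Original formula, with unlisted institutions (None) given the default count."""
    counts = {inst: (n if n is not None else DEFAULT_AUTHOR_COUNT)
              for inst, n in author_counts.items()}
    total = sum(counts.values())
    raw = {inst: total / n for inst, n in counts.items()}
    norm = sum(raw.values())
    return {inst: t / norm for inst, t in raw.items()}


def amended_weights(author_counts, email_matches=()):
    """Give 50% to affiliations matching the email domain, 50% by the formula."""
    base = formula_weights(author_counts)
    matched = [inst for inst in email_matches if inst in author_counts]
    if not matched:
        return base  # assumption: without a match, the formula applies to 100%
    bonus = 0.5 / len(matched)  # split equally if several affiliations match
    return {inst: 0.5 * w + (bonus if inst in matched else 0.0)
            for inst, w in base.items()}


example = amended_weights(
    {"Harvard Economics": 46, "NBER": 324, "CEPR": 262},
    email_matches=["Harvard Economics"],
)
```

With a Harvard email address, the economist from the earlier example would count for roughly 88% at Harvard, with the remainder split between the NBER and the CEPR.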

February 3, 2009 Update: I am receiving many questions about the sudden changes in the rankings within countries. As authors with multiple affiliations do not count fully in each location any more, their ranking has worsened. Similarly, institutions that have many members with multiple affiliations now look worse. Note also that a few small errors have crept in, and they will be corrected for the February ranking.

This entry was posted on Sunday, October 19th, 2008 at 1:21 am and is filed under RePEc rankings.

21 Responses to Call for comments: modifications in the rankings of institutions

Christian Zimmermann says:

October 19, 2008 at 7:44 pm

I forgot to add that the general procedure is based on an idea by Thomas Krichel.
Richard Tol says:

October 20, 2008 at 2:51 am

The proposed solution works for the large virtual centres like NBER, CEPR, and IZA. It does not work for people who have two genuine jobs at real places — say someone in a research institute with a small part-time teaching appointment.

If I understand correctly, the problem is that NBER and the like rank very highly because they have so very many people. A simple solution, without making up any data, is to normalise all scores by the number of researchers. That is, divide Harvard’s score by 46, and NBER’s by 324.
Christian Zimmermann says:

October 20, 2008 at 11:34 am

Richard, there are always going to be cases that will not be reflected properly with any formula. What I am looking for is a way to make attributions in a reasonable way so that, for example, regional rankings become believable.

Dividing by the number of people, however, is the wrong way to go. The research contribution of an institution is also measured by its size. Also, your method would send the wrong incentive: only the top ranked person would register. We want to encourage everyone to participate.
Richard Tol says:

October 21, 2008 at 2:16 am

An alternative solution is to identify the “virtual” institutes and only apply the proposed rule to them.

A virtual place like the NBER is easily found as only a few have the NBER as their only affiliation. Ditto for CEPR, IZA, CESifo etc.
Richard Tol says:

October 21, 2008 at 6:23 am

By the way, the incentives work the other way too. UCD, for example, has Heckman on the payroll for a few hours a year. The more people from UCD register, the lower the affiliation share of Heckman, and the lower the UCD ranking.
Christian Zimmermann says:

October 24, 2008 at 12:17 am

I agree that this could happen, but only in such extreme cases. I am looking for a way to improve on the current situation, and no solution is going to be perfect.
thanasis stengos says:

November 2, 2008 at 9:24 am

What about simply having each person with multiple affiliations declare the weights that they feel best describe the amount of time they spend at each affiliation? This would certainly penalize the “virtual” places, which should be getting smaller weights from each person than “real” affiliations.
Christian Zimmermann says:

November 5, 2008 at 9:19 am

Thanasis: This is the ideal solution, and I have suggested it internally for a long time. However, we currently seem not to have the competency/time/resources to implement it. It means changing a few fundamental things in the internals of the RePEc Author Service. The formula I suggest is a proxy for this, clearly not perfect, but I hope better than the status quo.
bauwens says:

November 6, 2008 at 9:41 am

I have several affiliations but essentially a main one. I have added the others for different reasons (information to everybody, and giving credit to institutions where I have a link, without penalizing my main affiliation). With the new system I am considering deleting my secondary affiliations. I am not bothered that some institutions like CEPR, NBER etc. get high rankings, since I know they are different from usual research centers or departments. I can decide whether I discount them or not in the rankings.
My preference is: do not change the system.
truthteller says:

November 9, 2008 at 6:16 am

If I understand Christian’s proposal correctly it would give James Heckman twice as strong an affiliation to University College Dublin as to the University of Chicago, and would also rank James Smith and James Markusen as more strongly affiliated to UCD than to RAND and Colorado respectively. These are simple examples that demonstrate that a person’s primary affiliation is not necessarily to the smallest unit to which s/he is affiliated. I appreciate that you need a method that can be applied mechanically in the absence of the staffing necessary to do it more sensitively. Perhaps it would be easier simply to use factors from the address that the author provides, e.g. the domain to which their email points, the personal web page they provide or the postal address, and apply a standard secondary ranking to all other affiliations.
Christian Zimmermann says:

November 10, 2008 at 4:25 am

Truthteller’s comment is a very good one: I should be using the email address of the author as a hint where the main affiliation lies. I thus suggest amending the formula in the following way. If the email address can be matched with an affiliation, 50% of the score is by default attributed to that affiliation (NB: it may be split equally among several in the case of multiple affiliations within an institution). The remaining 50% is split according to the original formula I suggested. Fair?

NB: This will not work for those with gmail or yahoo accounts. But remember, this only applies to those with multiple affiliations.
Christian Zimmermann says:

November 10, 2008 at 4:27 am

I am getting private emails in support of my formula or advocating some amendments. I want this to be a public discussion, so please post here.
John Foster says:

November 11, 2008 at 1:48 am

A simple solution would be to attribute scores only to the first named institution. This is generally the main institution and would be made so by authors if they knew the rule. The other affiliations would continue to be listed but for information only. This isn’t perfect because it wouldn’t fit those who have joint 50/50 appointments but it would be a lot more representative than the current system.
Christian Zimmermann says:

November 11, 2008 at 9:38 am

John, I agree that this would also be a step forward. However, there is nothing in the script that would guarantee that the order of the affiliations is maintained. Also, the affiliation page has no way for authors to change their order.
tstengos says:

March 5, 2009 at 4:21 am

I think that the new ranking methodology, although it has corrected some problems in the rankings of “virtual” institutions, has created many more serious anomalies in the rankings of individuals. For example, it is absurd that Heckman and Phillips are only barely ranked in the top 5% in Europe even though they are ranked in the top 5 worldwide. The fact that weights are lowered for institutions should not translate into individuals all of a sudden losing their ranking in the particular group comparison.
Christian Zimmermann says:

March 6, 2009 at 12:40 pm

Tanasios, this is the precise reason why the formula was modified: in several countries the first places were taken over by outsiders and this made the rankings useless or unbelievable. While there are still a few glitches, the rankings now much better reflect the rankings of those who primarily work in the respective region.
adnotten says:

March 19, 2009 at 3:43 pm

Dear Christian, I think there are two problems flowing from the new formula.

The first problem concerns the “virtual” organizations RePEc has tried to rank in a fairer way. Although trying to amend the undue advantage these organizations have is a good start, I would like to extend the category of virtual organizations to include “umbrella organizations” such as faculties or departments which do not have their own direct research output but “acquire” it from their affiliated institutes. Having these “umbrella” organizations appear in the same ranking as those affiliated institutes seriously compromises the validity of this ranking; you just cannot have umbrella organizations and their affiliates in the same ranking, as this would give a very skewed picture. My suggestion, further to Richard Tol’s comments, is to create separate rankings for “virtual” and “umbrella” organizations, and for individual research centers/institutes.

Second, for the author weightings I find there is something missing from the proposition that “if person A is affiliated with institutions 1 and 2, institution 1 has many people registered and institution 2 few, then the ranking scores of person A should count more toward institution 2 than 1”. This proposition does not reflect reality and is perhaps a bit too binary in thinking. I agree with Truthteller and Thanasis Stengos that the new ranking has created serious anomalies where stronger affiliations are automatically assigned to smaller institutes. The idea of “outsiders” taking over regional rankings is in most cases a reflection of reality. Research is international. And, other than the Heckman example, most international researchers have proper FTE proportions at their foreign affiliations.
You could take it even further and say that this would then be an indirect punishment for collaboration, such as for authors joining CEPR and NBER networks (as these networks promote collaboration). The suggestion of Thanasis is, I think, the best solution: authors declare their FTE per affiliation. Adding an “FTE” field to the affiliation list in the Author Service would be fairest.

Looking at our region, the Netherlands, the new ranking drastically changes the landscape and introduces pretty strong new biases. For instance the Tinbergen Institute, the economics research institute composed of researchers belonging to the two Amsterdam universities and the Rotterdam university, now tumbles down the ranking due to the multiple affiliations of its researchers, whereas CenTER, only associated with Tilburg, stays high. All Dutch umbrella faculties whose researchers predominantly have a single Dutch affiliation are now listed at the top of the Dutch institution ranking. With respect to the ranking of researchers in the Netherlands, having multiple national affiliations, often a sign of a position of greater “excellence” in research, is now dramatically punished. UNU-MERIT’s leading researcher, Bronwyn Hall, ranked number 2 until last December, is now ranked 23rd. So I fear that the new weighting method in effect no longer reflects the national “excellence” in economics research that the ranking did project before.
Christian Zimmermann says:

March 22, 2009 at 1:06 am

adnotten: Regarding your first point, there is a problem of classifying institutions as “umbrella” and “not umbrella”. What objective criterion would you use? What to do with people affiliated to the umbrella only? This needs to be practical, uniformly applicable and objective. I am not going to maintain a new database with exception rules.

For the second point: I agree adding an FTE field would be optimal. But I have so far been unable to find resources or a volunteer to modify the scripts accordingly. The data format used in exchanging author data between services needs to be changed as well. This is not easy. I do not think I penalize collaboration. While joining the CEPR reduces the weight for your home institution (and region), it also gives you the opportunity to publish in a well regarded working paper series.

The Netherlands is rather unique in that many researchers have genuinely split FTEs between institutions. This is certainly not the norm elsewhere. As said, getting those FTEs would be ideal, but at this moment not possible.

From the comments I have received so far, it seems to me that the problem is not so much that some institutions get more weight than others, but that an author’s score now gets split between institutions. One cannot count fully everywhere any more. I am sorry here, but I cannot accept that some people were counting fully for up to six institutions. That is just unfair. And I realize that it is impossible to find a Pareto improvement here. So I have to take a stand.
tstengos says:

March 24, 2009 at 2:45 am

One way to proceed is to decouple institutional rankings from individual rankings. In other words let the individuals retain their “world” ranking and have the institutions receive the new amended weight. Individuals will retain their productivity score on a “world” scale, while institutions will be ranked “locally”. The point is that “local” individual rankings do not make sense any more, even though the institutional ones may be more accurate.

Thanasis Stengos
jkoniecz2 says:

April 27, 2009 at 3:39 pm

I understand the need to distinguish between primary and secondary affiliations, and the fact that the previous system could be gamed. The changes, however, do not succeed and the end result is, I think, worse than before. The new system creates paradoxes, ordering reversals and odd results; it is complex and, perhaps most importantly, not transparent, in the sense that it is impossible to figure things out. Here is why:

First, the distinction is artificial and fails whenever the secondary affiliation is smaller. An example (already mentioned): Jim Markusen who works at Boulder and is affiliated with –smaller- University College Dublin.

Second, any system can be gamed. The new system can be gamed by setting up a small center with members from large departments. That would make the center be counted (incorrectly) as their primary affiliation _as well as_ reduce the ranking of their primary affiliations. Here is an example found by Google: http://geary.ucd.ie/news/101-ucd-geary-institute-soars-in-rankings-of-university-economics-groups. The title of the Google entry is self explanatory: “UCD Geary Institute Soars in Rankings of University Economics Groups”.

Third, as the goals are not met, the changes should be evaluated on the basis of the trade-off between benefit and cost. It is unfavourable, in my opinion. As there are many issues I will be brief. For simplicity I assume all people are ranked according to a cardinal measure, for example number of works.

1. Monotonicity. A ranking of total output should meet the Monotonicity Axiom: the ranking of a subgroup cannot be higher than of the group. I think this is non-controversial.

The new ranking violates the axiom. Consider a department with two economists, E1 and E2. Now E1 sets up a center of which she is the only member. If her output is more than three times greater than the output of E2, the center is ranked higher than the department even though E2 has positive output.
This, I think, explains the high ranking of the World Bank Group (fourth). Given the specific nature of the Group, few if any members have multiple affiliations.
More generally, the problem is this: if a member of an institution acquires another affiliation, the ranking of her main institution falls. But it should not change, since output has not been affected.
Moreover, the effect on the own institution depends on how big the new affiliation is. This problem affects everyone with multiple affiliations and, by extension, everyone!

2. Reversals.
(a) departmental:
(i) Assume E3 and E4 are in different departments, which are their sole affiliations and both have four publications. Department 3 consists of only E3; department 4 consists of E4 plus one member with one publication. So ranking of department 4 is higher than of department 3. Now E3 and E4 form a research center of which they are the only members. As a result, department 3 is now ranked higher than department 4.

(ii) Consider two departments, D1 and D2 with very similar ranking but D1 is ranked higher than D2. Now one member of D1 is invited to join NBER. Most people would consider such invitation as a positive signal for D1. But the ranking of D1 may fall below D2. Ranking is reversed even though there is no change in publications and the department falling in rankings looks, if anything, better.

(b) personal: E5 and E6 are, say, UK economists with sole affiliations at different UK universities. E5 is ranked ahead of E6 in the UK (and in the world). They are both invited to become fellows of the NBER. If the department of E5 is sufficiently larger than the department of E6, the latter will now be ranked higher in the UK, but still lower in the world, even though their affiliations are similar (one UK department and NBER).

3. Oddities.
Personal reversals lead to oddities. For example, out of people with a UK affiliation P.C.B. Phillips has the highest world ranking yet in the UK he is ranked 206th, just ahead of Gilles Duranton who is ranked 186th in the world ranking.
James Heckman is not even ranked in Germany even though he is listed as a member of IZA.

4. Complexity and lack of transparency. The oddity above cannot be disentangled. Duranton is listed only at the University of Toronto (so why is he ranked in the UK?) and Phillips is listed in 4 places only; he must have more affiliations to end so low in UK rankings.

What does it mean that there are 200 economists in the UK ranked higher than Phillips? Can it be explained in one sentence without speculating on how much time he spends in the UK? What is the value of a ranking based on speculation?

I think these problems are fatal for the new ranking system.
Better solutions:
– return to the previous one
– add an asterisk to research centers with explanation: for people listed, the main affiliation may be different (or: is usually different). This is a common solution: if something cannot be fixed, add a disclaimer.
– provide both rankings with an explanation of the difference.
Christian Zimmermann says:

April 28, 2009 at 3:28 am

jkoniecz2, let me reply to your long comment by quoting parts of your message; that will make it easier to read.

First, the distinction is artificial and fails whenever the secondary affiliation is smaller. An example (already mentioned): Jim Markusen who works at Boulder and is affiliated with –smaller- University College Dublin.

Correct. But remember that half the score is reserved to the primary affiliation as identified from the email address or the personal homepage. It is only the second half that is being split.

Second, any system can be gamed. The new system can be gamed by setting up a small center with members from large departments. That would make the center be counted (incorrectly) as their primary affiliation _as well as_ reduce the ranking of their primary affiliations. Here is an example found by Google: http://geary.ucd.ie/news/101-ucd-geary-institute-soars-in-rankings-of-university-economics-groups. The title of the Google entry is self explanatory: “UCD Geary Institute Soars in Rankings of University Economics Groups”.

This is an example of how it could be gamed before the change. Then, any member would count for 100%, wherever the person is mainly affiliated. This allowed an institution to “soar” by adding external people. This is not possible anymore.

1. Monotonicity. A ranking of total output should meet the Monotonicity Axiom: the ranking of a subgroup cannot be higher than of the group. I think this is non-controversial.

The new ranking violates the axiom. Consider a department with two economists, E1 and E2. Now E1 sets up a center of which she is the only member. If her output is more than three times greater than the output of E2, the center is ranked higher than the department even though E2 has positive output.

Your example would be correct if 1) the center is not a subentity of the department, otherwise it counts fully in the department; 2) the email or homepage of E1 does not indicate the department is her main affiliation. I am not saying counterexamples cannot be constructed.

This, I think, explains the high ranking of the World Bank Group (fourth). Given the specific nature of the Group, few if any members have multiple affiliations.

I think the World Bank Group is mainly helped by the sheer number of registered economists. A minority of authors have multiple affiliations, at the World Bank and elsewhere.

More generally, the problem is this: if a member of an institution acquires another affiliation, the ranking of her main institution falls. But it should not change, since output has not been affected.

But that is the whole point! If you are at both A and B, you cannot be 100% at A!

Moreover, the effect on the own institution depends on how big the new affiliation is. This problem affects everyone with multiple affiliations and, by extension, everyone!

Correct on the first point. This is mainly to account for those offering many courtesy appointments. But this does not affect everyone; only a minority has multiple affiliations. Those with single affiliations have their score count for 100% no matter what.

(i) Assume E3 and E4 are in different departments, which are their sole affiliations and both have four publications. Department 3 consists of only E3; department 4 consists of E4 plus one member with one publication. So ranking of department 4 is higher than of department 3. Now E3 and E4 form a research center of which they are the only members. As a result, department 3 is now ranked higher than department 4.

Correct. I am not denying that one can find such examples.

(ii) Consider two departments, D1 and D2 with very similar ranking but D1 is ranked higher than D2. Now one member of D1 is invited to join NBER. Most people would consider such invitation as a positive signal for D1. But the ranking of D1 may fall below D2. Ranking is reversed even though there is no change in publications and the department falling in rankings looks, if anything, better.

Correct. But the NBER member will have the opportunity to publish in the NBER working paper series, which has a high impact factor. D1 will ultimately gain from this.

(b) personal: E5 and E6 are, say, UK economists with sole affiliations at different UK universities. E5 is ranked ahead of E6 in the UK (and in the world). They are both invited to become fellows of the NBER. If the department of E5 is sufficiently larger than the department of E6, the latter will now be ranked higher in the UK, but still lower in the world, even though their affiliations are similar (one UK department and NBER).

Correct. But given the size of the NBER, this is very very unlikely.

3. Oddities.
Personal reversals lead to oddities. For example, out of people with a UK affiliation P.C.B. Phillips has the highest world ranking yet in the UK he is ranked 206th, just ahead of Gilles Duranton who is ranked 186th in the world ranking.
James Heckman is not even ranked in Germany even though he is listed as a member of IZA.

Previously, they were ranked first in the UK and Germany and everybody was complaining about this because they were not based there. This is precisely the main reason why the rankings were not credible. Now they show in the first ranks people who are correctly located with their main affiliation. These were oddities before, but not anymore.

4. Complexity and lack of transparency. The oddity above cannot be disentangled. Duranton is listed only at the University of Toronto (so why is he ranked in the UK?) and Phillips is listed in 4 places only; he must have more affiliations to end so low in UK rankings.

Duranton recently removed his UK affiliations, but his Toronto one was considered the main one. The rankings are redone only once a month.

I am not denying it is complex. I agree it is not very transparent. I will try to include the weights in the personal ranking analysis for those with multiple affiliations.

Better solutions:
– return to the previous one
– add an asterisk to research centers with explanation: for people listed, the main affiliation may be different (or: is usually different). This is a common solution: if something cannot be fixed, add a disclaimer.
– provide both rankings with an explanation of the difference.

Both rankings are provided (the number in square brackets is the old ranking). An explanation is provided in the FAQ; I moved it there recently because the rankings became too wordy.

As I have mentioned several times, the new method is not perfect, but it makes the rankings more credible.

The RePEc Blog