The new CollEc: An interactive exploration of the economic literature’s co-authorship network

November 26, 2020

This blog post was written by Christian Düben.

The economic literature is a field comprised of tens of thousands of authors. The American Economic Association alone has more than 20,000 members. In September, the RePEc Author Service passed 60,000 registered users with published research. Around 48,000 of them published at least one co-authored paper with another registered user.

Co-authored research has been on the rise over the past decades, forming collaborations across enormous geographic distances and many fields of research. It is a network that interconnects the vast majority of published economists around the world. While many researchers are aware of collaborations between their close colleagues and prominent figures in their field, it is a challenge to even have a rough idea of what the overall network looks like.

When Thomas Krichel released the CollEc RePEc service in 2011, the co-authorship network’s structure became accessible. With a few clicks users can evaluate which authors form the center of the discipline and who holds a more peripheral position. Each listed person is assigned a centrality value computed using methods from the field of graph theory.

Now in 2020, CollEc enters a new chapter of its existence. After years of maintaining the project and providing intriguing insights into the economic literature’s co-authorship network, Thomas Krichel transferred the RePEc service to me. I used the opportunity to come up with a completely new implementation, re-writing CollEc from scratch. The former network analysis written in Perl took the server hours at full capacity. Migrating it to C and C++ code wrapped in R functions boosted efficiency, cutting the required time and resources to a small fraction of what the previous implementation required, and facilitating extensions to the analysis.

I added weighted edges, bilateral distances, and other results going beyond the centrality measures. The interface through which users view the data changed from a static website to a web application. Web applications are more complex and give me the necessary flexibility to fundamentally redefine how the data is presented. The new CollEc is highly interactive and puts results into perspective through combinations of plots and text. When a user inquires about the distance between two authors, CollEc generates a figure comparing that bilateral distance to the distribution of distances to all other authors in the network. The following plot is the result of requesting the distance between Christian Düben and Thomas Krichel with edges weighted by an inverse transition function. The concept of transition functions and the interpretation of the plot are outlined in the application.

With CollEc’s functionalities you can explore who someone’s co-authors are, how far two people are apart in the network, what the shortest path between them looks like, how centrally located a researcher is, etc. All of the resulting plots are accompanied by a short text stating further information, e.g. on the network size. The web application revolves around the same approach as GraphEc, another recently developed but not yet publicly available RePEc service. It is an interactive tool focused on easily interpretable graphical output presenting results and facilitating comparisons.
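To make the distance and centrality queries described above more concrete, here is a minimal Python sketch on a made-up five-author network. It is not CollEc's implementation (which is written in C, C++ and R). The author labels, the paper counts, and the choice of weighting an edge by one over the number of joint papers are all assumptions for illustration only. The transition functions CollEc actually uses are described in the application. Closeness centrality is shown as one common graph-theoretic centrality measure.

```python
import heapq

# Hypothetical toy network: author pairs mapped to their number of joint papers.
joint_papers = {
    ("A", "B"): 4,
    ("B", "C"): 1,
    ("A", "D"): 1,
    ("D", "C"): 2,
    ("C", "E"): 2,
}


def build_graph(pairs):
    """Undirected graph; edge weight is assumed to be 1 / number of joint papers."""
    graph = {}
    for (u, v), n in pairs.items():
        graph.setdefault(u, {})[v] = 1.0 / n
        graph.setdefault(v, {})[u] = 1.0 / n
    return graph


def shortest_distances(graph, source):
    """Dijkstra's algorithm: weighted distance from source to every reachable author."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(queue, (nd, v))
    return dist


def closeness(graph, author):
    """Closeness centrality: number of other reachable authors over the sum of distances."""
    dist = shortest_distances(graph, author)
    others = [d for a, d in dist.items() if a != author]
    return len(others) / sum(others)


graph = build_graph(joint_papers)
dist_from_a = shortest_distances(graph, "A")

# Put the bilateral distance A-C into perspective: which share of the
# other authors is strictly closer to A than C is?
others = {a: d for a, d in dist_from_a.items() if a != "A"}
share_closer = sum(d < others["C"] for d in others.values()) / len(others)

print(dist_from_a["C"], share_closer, closeness(graph, "A"))
```

With these weights, frequent collaborators end up close to each other, so the shortest path from A to C runs through A's frequent co-author B rather than through D.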

Over the course of the past months, Thomas Krichel and Christian Zimmermann repeatedly reviewed the new CollEc and requested extensions and modifications. Thomas allowed me to host and test the application on his technical infrastructure from an early stage and did not withdraw his permission when I accidentally took his web server offline. Thanks to their great support, the application gradually improved and is now publicly available. To get started, simply visit CollEc and watch the tutorial or read the documentation. Either of the two options provides a brief, intuitive introduction to the basics of graph theory and the interpretation of CollEc’s results. Read the documentation on entry points if you would like to generate a link to a certain output. Before you brag about your network centrality on Twitter, it should be noted, though, that author centrality is not a proxy for author quality. Successful authors can be central or remote. Consult the IDEAS website for a citation-based performance analysis of authors, journals, working paper series etc.

If you would like to contribute to CollEc, ask your colleagues to register with the RePEc Author Service. CollEc’s network only includes authors listed in the RePEc Author Service’s database. The vast majority of published economists is already registered. But some people are still missing. Fill the gaps in the network and ensure the reliability of CollEc’s results by promoting registration with the RePEc Author Service.

You can also decide to support RePEc more generally. IDEAS lists some volunteering options. RePEc is a non-commercial initiative run by volunteers providing openly accessible services. Small contributions like adding RePEc Genealogy entries already help in maintaining and improving this public good. I am a junior researcher who is going to be on the job market next year. Like the rest of my peers, I am under a lot of pressure to produce high quality research. Nonetheless, I do not regret having spent months on developing CollEc. Open science initiatives like RePEc are important contributors to an equitable research environment.

Who are the authors registered with RePEc?

September 24, 2020

The number of authors registered with the RePEc Author Service has surpassed 60’000. We take this opportunity to take a look at some of the characteristics of this group.

For starters, one has to realize that this is a really large group. While anybody can register (for example to exploit some of the personalized RePEc services like MyIDEAS), the 60’000 are those who have any sort of publication listed in RePEc. This group of published economists is much larger than the body of economists who are members of the three largest associations in the profession: The American Economic Association, the European Economic Association, and the Econometric Society. They have a total membership of about 25’000, including individuals who are members of several societies. Does this mean that RePEc is comprehensive? One indicator is to compare those registered to some other listing of economists. For example, a ranking of the top 1000 economists computed in 2000 now shows that about 91% have a RePEc account. Of course, we would welcome a more recent analysis, and RePEc membership is likely “top-heavy,” yet we hope you are as impressed as we are.

How did we get to 60’000? Here is a short timeline:

5’000 May 2004
10’000 June 2006
15’000 December 2007
20’000 April 2009
25’000 August 2010
30’000 October 2011
40’000 April 2014
50’000 May 2017
60’000 September 2020

Then, what is the composition of those 60’000? 25.5% are female, 1% are known to be deceased, another 2.5% have been lost, that is, their email address is bouncing and they may have moved or died (update welcome!). In terms of geographic representation, we find economists in 167 countries and territories:

Africa 2.5%: South Africa 0.5%, Nigeria 0.4%, Tunisia 0.4%, Ghana 0.2%
Asia 11%: China 1.9%, Japan 1.9%, India 1.5%, Turkey 1.4%, Pakistan 0.7%
Europe 49%: UK 6.1%, France 5.9%, Germany 5.7%, Italy 5.1%, Spain 3.7%, Russia 2.3%, Romania 1.9%, Netherlands 1.9%
Latin America/Caribbean 4.4%: Brazil 1.3%, Colombia 1.1%, Chile 0.7%, Mexico 0.6%, Argentina 0.5%
North America 22.4%: United States 19.6%, Canada 2.8%
Oceania 3.2%: Australia 2.4%, New Zealand 0.5%
No affiliation/unknown 7.5%

Defining our authors by field is trickier. They do not declare a field upon registration. We cannot use JEL codes as the coverage in the publisher-contributed data is lacking. We infer fields from the proportion of working papers announced in particular NEP reports. There are eligibility criteria in terms of number of works in a field to be counted. Measured that way for the 46% that qualify, the top fields are (an author may be in several fields, 100% is all qualifying authors):

Macroeconomics 25.2%
Urban and Real Estate 13.3%
Labor 11.2%
Central Banking 10.3%
Monetary 10.2%
Environment 9.6%
Dynamic General Equilibrium 8.8%
Agricultural 8.5%
International Trade 8.5%
Energy 8.3%
Banking 7.9%

A replication database for economics and social sciences: The ReplicationWiki

August 4, 2020

This is a guest post by Jan H. Höffler

The ReplicationWiki currently offers a database of 4,484 studies from the social sciences for which empirical methods were used. It lists which of the studies have data and code available online. In cases where replications are known they are classified by their type and results.

The topic of replication has become more and more prominent in the scholarly discourse in recent years. Yet, much needs to be done to make the availability of code and data more mainstream. To highlight how much work still lies ahead, even recent publications on the topic of replication in leading journals are not replicable and contain major flaws. For example, the authors of a study calling to make replication the norm that was published in Nature do not make their replication material available, ignoring the rules on data availability of the journal and the sponsor, the Berkeley Initiative for Transparency in the Social Sciences. Or, a study published in Research Policy came to the conclusion that work published in the top 5 economics general interest journals is less likely to attract replications published in leading journals, although the authors’ own data shows exactly the opposite.

So how can we get more replications to improve on the state of economics and discuss cases like the ones listed above? One important way is to include replication in the education of economists as was suggested by Daniel Hamermesh in his 2007 article on replication in the Canadian Journal of Economics. The ReplicationWiki followed this approach by setting up a teaching initiative that was presented, among others, at the Research Transparency Forum of the Berkeley Initiative for Transparency in the Social Sciences (BITSS) and Annual Meetings of the American Economic Association (2014, 2016). Seminars on replication were held at universities in Germany, Canada, China, and Switzerland and at a workshop in San Francisco with the Institute for New Economic Thinking Young Scholars Initiative, BITSS, and the Project Teaching Integrity in Empirical Research.

The advantage of the wiki approach lies especially in the fact that users can contribute to it without publishing a journal article. A working paper series was started for this purpose. Forum and blog posts can also be included as long as they have a verifiable author and make a contribution regarding the replicability of a published empirical study. On the studies’ discussion pages even very short comments like “To make the code work I had to add … at line …” or “The data has been moved to the following URL: …” can help other users.

For instructors, the wiki can help to identify examples for coursework as it allows searching for studies for which data and code are available, for which software was used that is accessible to the students, and for which a method was used that they should learn about. With the help of JEL codes and keywords preferred topics can also be searched for. Depending on the location of the students, it can also be motivating for them to see if research is available based on data from their home country (click here for an example). If it is not, they may be encouraged to compare results based on data from their country or region with the existing published research. For the students it can be an additional motivation if they can easily share their results with the research community via the ReplicationWiki.

The ReplicationWiki was described in more detail in a journal article. In the 2017 American Economic Review Papers and Proceedings an overview was given of economics journals’ data policies as well as of the distribution of the use of different software packages and of the geographical origin of the data used. In that article, some evidence was also presented that indicates that studies for which replication material is made available may attract more citations. This should be seen as a motivation for authors of empirical work who are willing to share their material to point this out by adding this information to the wiki. The ReplicationWiki has recently added a number of additional features. Now there are overviews of the methods, data sources and software used in the studies. In addition to replications the wiki now also provides information about corrections that have been published and whether studies have been retracted. Complex searches are now possible with a more user-friendly interface.

Initially the wiki covered studies mainly published in the Journal of Applied Econometrics, which already started an online data archive in 1995, the Journal of Political Economy, the American Economic Review and the four American Economic Journals. Now it covers studies published in 231 journals, 36 working paper series & blogs and 27 books. It lists 652 replications, 23 corrections and 14 retractions. As the wiki has been cited from a number of neighboring fields as an example to follow, it is becoming a hub for all social sciences. There have already been contributions in particular from political science and sociology.

The ReplicationWiki’s pages have been accessed more than 6.6 million times so far. It has been mentioned numerous times in the media, and more than 260 users from around the world have registered. As a wiki, it lives off the contributions of its users. We hope to encourage more users to contribute to this tool, or simply use it. In particular, one site feature that could become more valuable with higher participation is the ability to vote on which studies should be replicated.

In July 2014, a cooperation with RePEc was started via a link exchange. For studies listed in the ReplicationWiki a link appears in the IDEAS section “Related works & more” under “Lists” like in this case, and on the authors’ pages under “Citations/Wikipedia mentions” like here.

Is your work listed? Check in and add it if not!

5000 working paper series on RePEc: working papers are still central to economics

May 31, 2019

RePEc now indexes over 5000 working paper series, and we take this opportunity to highlight how these open-access pre-prints are central to RePEc and economics research in general. Indeed, the peer-review process in economics is particularly excruciating, as it is quite common for the process to take several years from submission to publication. Multiply this if a manuscript needs to be submitted to several journals (the best journals have acceptance rates below 10%), and you quickly understand that published articles often disseminate research that is several years old.

A reaction to these delays has been the introduction of working papers. Initially disseminated on paper among friends and colleagues, they quickly became the go-to medium if you wanted to know where the frontier of research was. Several institutions then institutionalized the practice by creating official working paper series one could subscribe to, in some cases against a fee to cover printing and shipping costs. Working papers, sometimes also called discussion papers, are considered preliminary work that is not definitive and disseminated for discussion and awareness. Yet, they are sometimes refereed within the issuing institutions, as in some ways their reputation rides on the papers. Also, authors often prefer their working papers to the corresponding published articles, as the latter are sometimes altered in unintended ways through the tyranny of referees as well as shortened by editors with space constraints.

RePEc was created to enhance the dissemination of research in economics, and specifically of working papers. Indeed, unlike journals, working papers were disseminated in an informal way, and one needed to be “in the know” to get them. RePEc has helped bridge that gap and make working papers available to everyone. While the dissemination of working papers is now much improved, the publication delays only got worse, hence working papers are still central to following the frontier of research. This is why RePEc disseminates new working papers through NEP and not new journal articles. And we also have noticed that if a working paper and a journal version are available in parallel, the working paper is downloaded many times more than the article (even after removing the NEP downloads).

If your working paper series is not yet available on RePEc, follow these instructions. To see which series are currently indexed, see the listings on EconPapers or IDEAS.

Help build the academic tree of Economics: the RePEc Genealogy

April 22, 2018

Beyond the open bibliography that lays the foundation of RePEc, various services have emerged that enhance the data collected with RePEc. One of them is the RePEc Genealogy. The goal of this initiative is to build an academic family tree for Economics, recording who was advised by whom, where and when. It thus tries to build links among the over 50,000 economists registered with the RePEc Author Service as well as the institutions listed in EDIRC. At the time of writing this, close to 13,000 economists from over 1000 programs are listed in the RePEc Genealogy.

The data is collected by the community: The RePEc Genealogy is a wiki, and all you need is a registration with the RePEc Author Service to add information to it. You can make sure your own record is complete, add your students or those of your advisor, or ensure that your graduate program or alma mater is properly recorded. Over 3,000 economists have already contributed to it. Go to the RePEc Genealogy crowdsourcing tool to participate and see some statistics about the genealogy.

How is the collected data used? Of course, one can browse the site for information. But the data is also used in other ways: IDEAS uses it to complement author profiles, to compute rankings of graduate programs (publications from all years or last 10 years), and to compute a ranking of economists by graduation cohorts. Finally, data from the Genealogy is starting to be used for research, along with data from the rest of RePEc. You could be part of the data that you are analysing! For a listing of papers using RePEc data, see here.

Female representation in RePEc

November 21, 2017

Thanks to the ranking of female authors in RePEc, we have long known the share of women in the RePEc sample of more than 50K authors: 19%. We now also know the shares of women economists by country, US state, field of study and PhD cohort. The following table shows for the largest countries their relative size and the proportion of women in each. European countries are doing better than the world average, especially Latin and Eastern European countries, while Anglo-Saxons are the most masculine (is it that relatively higher salaries for the profession in Anglo-Saxon universities attract the most competitive men?). Latin America is generally below average (except for Colombia and Argentina) while Asia has very low shares of female economists, with less than 6% in Japan, China and India, and 9% in Pakistan (you can sort by column in the link).

The figures by cohort year of doctoral students in economics do not allow much optimism for the future, as shown in the graph below. In terms of fields of study, women are more present than before in all disciplines, but there are more masculine fields than others: finance (10.9% of women), time series (11.4%), sports economics (12.2%)… The most feminine: demography (37.7%), tourism (34%), Eastern economies (33.4%, likely due to the higher share of females in those countries).

IDEAS also lists the RePEc economists active on Twitter (over 1000 registered). While women represent 19% of the RePEc authors, they are only 14% in the Twitter subsample. Looking at the Top 25% of this list of RePEc/Twitter economists by number of followers (3rd row), the proportion of women falls to less than 13%. In fact, the total audience of these women among the top 25% is a little over 3%. The table also provides the names of the most followed female economists on Twitter: only eight are ranked among the top 100. The lack of women at the top is also apparent in the rankings based on the quality of publications and citations that RePEc releases: only two appear among the world’s top 100, Carmen Reinhart and since recently Asli Demirguc-Kunt. The following figure gives some statistics about the top economists in some of the countries with the most registered economists. Again, European countries outperform Anglo-Saxon ones in terms of women among the top 100.

There is a lot of literature documenting this gender bias in the economics profession. These latest RePEc data complement what is well known, allowing international, thematic and temporal comparisons. See also the nice interactive representation of the RePEc network that Christian Mongeau makes by gender.

Ranking optimization

November 18, 2016

RePEc is all about the free dissemination of economic research, but for many economists it is most known for its rankings. While we would really emphasize that the rankings are only a by-product and to some degree a motivator for people and publishers to have their works listed on RePEc, we want to acknowledge that the rankings have become important, as they are used for evaluations in funding agencies and for promotion or tenure. So here are some recommendations on how to optimize rankings, both for authors and institutions.

For authors

  1. Foremost, make sure your profile is current. Go to RePEc Author Service and log in. Click on research to see whether the system has found any suggestions. Make sure you have all the relevant name variations for you so that it can make the best suggestions. Check also if the system needs some help in attributing some citations.
  2. A few publishers still do not participate, particularly among book publishers. Encourage yours to index its works in RePEc.
  3. If you have advised graduate students and they are registered in RePEc, add them to your RePEc Genealogy record. Help your own advisor’s record as well. This is likely the lowest hanging fruit for many economists.
  4. RePEc sometimes fails to find the bibliography for some articles. If this makes you miss some citations, you can help by uploading those references. The full bibliography is required. The input form is here.
  5. Working papers get downloaded many more times than journal articles. Thus make sure to have them listed! Your institution can have its WP series indexed following these instructions. If that does not work out, upload them to MPRA. Most publishers allow it, as long as it is not the final version. See details at SHERPA/RoMEO.
  6. Finally, link to your profile on IDEAS or EconPapers from your webpage.

For institutions

  1. Foremost, make sure that all members of your institution are registered at the RePEc Author Service. You can look up who is already there by finding your record at EDIRC. Note that if someone is listed with a question mark, it means their email address is not valid, and they will not count towards your score. Please get it corrected (or tell us about the new address or whether this person may have died. It happens).
  2. If you have a graduate program, you want to have the graduates listed in the RePEc Genealogy. Your EDIRC record also lists who is already linked. There is already a ranking using these records.
  3. If you have a working paper series or some other serial, make sure it is indexed in RePEc. Instructions.
  4. Of course, have your members follow the recommendations for authors above.

“The Closed Marketplace of Economic Ideas,” a Rebuttal

January 8, 2016

In a Project Syndicate column, Federico Fubini makes the argument that the intellectual leaders in economics from ten years ago are still the leaders of today, and this despite the fact that we have had a financial crisis that was not predicted by the profession. I do not agree with this column on several fronts. As the argument was made using RePEc data, I feel obligated to set the record straight.

One may discuss whether the economics profession has really not seen the crisis coming. But even if this crisis was unforeseen, it is wrong to argue on principle that the best economists from ten years ago should not be considered to be the best today. Indeed, it is not the case that the whole profession is focussed on predicting financial or economic crises. Economics has much more to offer, just see the list of recent Nobel Prize winners or the large variety of fields covered by NEP, RePEc’s research alert service. Most of those fields have nothing to do with crises. For example, the leaders in auction theory from ten years ago are likely to be leaders now, the same applies to development economics, empirical labor economics, or environmental economics. Ten years is short in the evolution of scientists.

But beyond this mischaracterization of what the economics profession does, there is the issue with the use of the data to prove the point. Fubini uses the ranking of economists provided by IDEAS/RePEc, taking the December 2006 and the September 2015 rankings. But before looking at them, one has to understand what they measure. They are an aggregation of 23 (2006) or 35 (2015) criteria, almost all of which pertain to the lifetime output of the economists. The exceptions are four criteria based on readership statistics for the last months and, in the recent rankings, six criteria that discount citations by their age. Basically, the accumulated research and citations that were considered in 2006 are also considered in 2015. There is obviously going to be high persistence among the best economists. And keep in mind that critiquing someone will earn him a citation.

What you want to do is use two datasets that do not overlap, one that considers publications until 2006, and one since that year. One can get very close by using another ranking, the one that considers only the publications from the last 10 years. In the following I will use the one that was published today and pertains to December 2015. We have thus only one year of overlap in the publication dates. And the results look quite different (I did this in a couple of hours, I cannot vouch that the numbers are totally correct).

Incumbency rate by cohort
Cohort     Fubini   Better
Top 10     90%      70%
Top 20     95%      75%
Top 50     98%      72%
Top 100    94%      61%
Top 200    65%      42%

The rest of the arguments in the Fubini column also change quite a bit once you look at better data.

“You must not be an economist. In fact, Lucas and Fama both moved up in the RePEc rankings during the period I examined, from 30 to nine and from 23 to 17, respectively. And the persistence at the top is striking across the board. Among the top ten economists in September 2015, six were already there in December 2006, and another two were ranked 11 and 13.”

Lucas (Robert E. Jr.) and Fama actually dropped, Lucas from 30 to 33, Fama from 23 to … not being ranked in the top 10%. And in the top ten for December 2015, only three were there in 2006, another two ranked 11 and 13 in 2006 (the same).

“Mobility in the RePEc rankings remains subdued even after widening the sample. For example, of the top 100 economists in September 2015, only 14 were absent from the much wider top 5% in 2006, and only two others had advanced more than 200 spots over the previous decade. Among those recently ranked from 101 to 200, just 24 were not in the top 5% in 2006, and only ten others had moved up by more than 200 places. The rate of renewal among the 200 most influential economists was as low as 25% – and just 16% among the top 100 – during a decade in which the explanatory power of prevailing economic theory had been found severely wanting.”

Again with better data, this looks different. From the 2015 top 100, 41 were absent from the top 5% in 2006. Nine advanced more than 200 spots. For 101 to 200, 45 were not in the top 5%, and seven moved up more than 200 spots. Using Fubini’s definitions, the renewal rate is thus 51% in the top 200 and 50% in the top 100. Hardly a stagnation.
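The renewal rates can be checked directly from the counts quoted above. The short Python sketch below assumes that "renewal" counts economists who were either absent from the 2006 top 5% or had moved up by more than 200 spots, which is the reading that reproduces both Fubini's percentages and the revised ones. The function and variable names are made up for illustration.

```python
def renewal_rate(absent_from_top5, moved_up_200, cohort_size):
    """Share of a cohort that is new: absent from the 2006 top 5% or up by more than 200 spots."""
    return (absent_from_top5 + moved_up_200) / cohort_size


# Counts quoted from Fubini's column (September 2015 lifetime ranking).
fubini_top100 = renewal_rate(14, 2, 100)
fubini_top200 = renewal_rate(14 + 24, 2 + 10, 200)

# Counts from the ranking restricted to the last 10 years of publications (December 2015).
better_top100 = renewal_rate(41, 9, 100)
better_top200 = renewal_rate(41 + 45, 9 + 7, 200)

for label, rate in [
    ("Fubini, top 100", fubini_top100),
    ("Fubini, top 200", fubini_top200),
    ("Better data, top 100", better_top100),
    ("Better data, top 200", better_top200),
]:
    print(f"{label}: {rate:.0%}")
```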

“In the rankings of economists, by contrast, criteria such as gender or geographic origin confirm the overall inertia. Only four women made the RePEc top 200 in September 2015, compared to three in December 2006, and two were included on both lists. Likewise, emerging countries – which represent more than 90% of the world’s population, three-quarters of global GDP growth over the last decade, and nearly half of total income in current dollar terms – supplied just 11 of the top 200 economists in September 2015, up from ten in December 2006. And ten of those 11 – three Iranians, four Indians, two Turks, and one Chinese – have lived and worked in the US or the United Kingdom since their student days.”

With better data, this changes as well. There are now seven women. Still too few, though. There are 18 economists from emerging countries: two Turks, one Egyptian, seven Indians, two Iranians, two Pakistanis, one from Cameroon, two Chinese, one Bangladeshi. Eight of them live in an emerging economy.

I’ll let the reader judge whether there is still a “closed market place for ideas in economics.” But the picture certainly looks different from what Fubini seems to imply.

Economics Replication Wiki now on IDEAS

July 16, 2014

A major part of the scientific process is the replication of previous studies, something necessary to confirm that things were done right, that they are not sensitive to details and that results have not changed with the passage of time, either because the methods got better or the data has evolved. Unfortunately, there is little replication in economics, and if there is some, it is difficult to publish it. One can theorize why this may be the case, but it is clear replication studies are little valued and not particularly welcome in journals. It is also quite difficult to determine whether a particular study has been replicated.

To help with all that, the Center for Statistics at the University of Göttingen (Germany) has launched a Wiki to index replicated and replicating studies in economics, with funding from the Institute for New Economic Thinking. As it is a wiki, it is crowd-sourced in the sense that any registered person can amend the records, and in particular add replication studies. One can also add to a list of articles published in top journals that should warrant replication and vote (anonymously) from that list (current winners).

The listings on this Replication Wiki are now indexed on IDEAS as well. The principle is similar to the indexation of Wikipedia articles: if a study on the Wiki has a link to IDEAS (or EconPapers), IDEAS will link back. Those adding or amending entries on the Wiki are thus encouraged to link to the IDEAS abstract page to create the backlink on IDEAS.

Like any crowd-sourced project, the Replication Wiki will only live through participation from the public. If you know of replication studies, consider spending a few minutes to add them to this wiki.

The Job Market Paper archive

September 19, 2013

A graduating economics PhD or doctoral student who is looking for a job in academia or policy circles is typically doing so with a “job market paper.” The JMP is the one that many recommendation letters from faculty focus on, it is the one that is mostly talked about in job interviews, and it is presented during campus visits. It is thus fair to say that the JMP is the best this student has done so far, and a lot of effort goes into this paper. Shouldn’t this work then be more widely disseminated than a few recruiting committees?

We are thus introducing the Job Market Paper archive on RePEc. Job candidates can upload their paper, which gets the standard treatment of any new working paper in RePEc: it gets listed on the many services using RePEc data, including the websites EconPapers and IDEAS, as well as the email notification service NEP. In addition, the papers are hosted by a RePEc server for posterity. This is important, as job market candidates tend to find jobs and often move their web page as a consequence, resulting in broken links. Finally, the presence of the papers in this series clearly identifies the author as a new economist one may want to look at for a hire. Recruiters can simply follow what is new in this archive.

As expected, certain restrictions apply. To learn more, see here.

Note that for those who are not on the job market and do not have access to a local working paper series that participates in RePEc (instructions), MPRA is still available.