A replication database for economics and social sciences: The ReplicationWiki

August 4, 2020

This is a guest post by Jan H. Höffler

The ReplicationWiki currently offers a database of 4,484 studies from the social sciences for which empirical methods were used. It lists which of the studies have data and code available online. In cases where replications are known they are classified by their type and results.

The topic of replication has become more and more prominent in the scholarly discourse in recent years. Yet, much needs to be done to make the availability of code and data more mainstream. To highlight how much work still lies ahead, even recent publications on the topic of replication in leading journals are not replicable and contain major flaws. For example, the authors of a study calling to make replication the norm that was published in Nature do not make their replication material available, ignoring the rules on data availability of the journal and the sponsor, the Berkeley Initiative for Transparency in the Social Sciences. Or, a study published in Research Policy came to the conclusion that work published in the top 5 economics general interest journals is less likely to attract replications published in leading journals, although the authors’ own data shows exactly the opposite.

So how can we get more replications to improve on the state of economics and discuss cases like the ones listed above? One important way is to include replication in the education of economists as was suggested by Daniel Hamermesh in his 2007 article on replication in the Canadian Journal of Economics. The ReplicationWiki followed this approach by setting up a teaching initiative that was presented, among others, at the Research Transparency Forum of the Berkeley Initiative for Transparency in the Social Sciences (BITSS) and Annual Meetings of the American Economic Association (2014, 2016). Seminars on replication were held at universities in Germany, Canada, China, and Switzerland and at a workshop in San Francisco with the Institute for New Economic Thinking Young Scholars Initiative, BITSS, and the Project Teaching Integrity in Empirical Research.

The advantage of the wiki approach lies especially in the fact that users can contribute to it without publishing a journal article. A working paper series was started for this purpose. Forum and blog posts can also be included as long as they have a verifiable author and make a contribution regarding the replicability of a published empirical study. On the studies’ discussion pages, even very short comments like “To make the code work I had to add … at line …” or “The data has been moved to the following URL: …” can help other users.

For instructors, the wiki can help to identify examples for coursework as it allows searching for studies for which data and code are available, for which software was used that is accessible to the students, and for which a method was used that they should learn about. With the help of JEL codes and keywords preferred topics can also be searched for. Depending on the location of the students, it can also be motivating for them to see if research is available based on data from their home country (click here for an example). If it is not, they may be encouraged to compare results based on data from their country or region with the existing published research. For the students it can be an additional motivation if they can easily share their results with the research community via the ReplicationWiki.

The ReplicationWiki was described in more detail in a journal article. In the 2017 American Economic Review Papers and Proceedings an overview was given of economics journals’ data policies as well as of the distribution of the use of different software packages and of the geographical origin of the data used. In that article, some evidence was also presented that indicates that studies for which replication material is made available may attract more citations. This should be seen as a motivation for authors of empirical work who are willing to share their material to point this out by adding this information to the wiki. The ReplicationWiki has recently added a number of additional features. Now there are overviews of the methods, data sources and software used in the studies. In addition to replications the wiki now also provides information about corrections that have been published and whether studies have been retracted. Complex searches are now possible with a more user-friendly interface.

Initially the wiki covered studies mainly published in the Journal of Applied Econometrics, which already started an online data archive in 1995, the Journal of Political Economy, the American Economic Review and the four American Economic Journals. Now it covers studies published in 231 journals, 36 working paper series & blogs and 27 books. It lists 652 replications, 23 corrections and 14 retractions. As the wiki has been cited from a number of neighboring fields as an example to follow, it is becoming a hub for all social sciences. There have already been contributions in particular from political science and sociology.

The ReplicationWiki’s pages have been accessed more than 6.6 million times so far. It has been mentioned numerous times in the media, and more than 260 users from around the world have registered. As a wiki, it lives off the contributions of its users. We hope to encourage more users to contribute to this tool, or simply use it. In particular, one site feature that could become more valuable with higher participation is the ability to vote on which studies should be replicated.

In July 2014, a cooperation with RePEc was started via a link exchange. For studies listed in the ReplicationWiki a link appears in the IDEAS section “Related works & more” under “Lists” like in this case, and on the authors’ pages under “Citations/Wikipedia mentions” like here.

Is your work listed? Check in and add it if not!


How RePEc can help you in times of upheaval, and how you can help RePEc

March 31, 2020

The academic, business and policy worlds are currently going through quite a bit of upheaval as people work from home and classes have moved on-line or have been canceled. People have to adapt to working differently. In various ways RePEc can help.


Bibliographic tools available off-campus

EconPapers and IDEAS are bibliographic websites for Economics that are accessible from anywhere. No need to be on campus or connecting through VPN to access a proprietary bibliographic tool.

Links to open versions of gated articles

Similarly to the above, if you cannot access some articles behind a publisher’s pay-gate, IDEAS often offers you another version in the form of an open-access working paper. Relevant links are on the article pages on EconPapers and IDEAS.

Covid-19 related material updated daily

Material on RePEc is updated daily with feeds from over 2000 publishers. You can find material about Covid-19 easily by searching EconPapers and IDEAS. For example, this search on IDEAS gives you all the listed material, sorted by most recently indexed. The match count increases hourly.

Get rapid dissemination of Covid-19 related material

You did a study and want it rapidly disseminated? If your institution has its publications already indexed in RePEc, you are fine. If not, you can upload your study at MPRA for rapid dissemination through the various RePEc services, including NEP.

Find topical material about pandemics

The RePEc Biblio has curated listings of the most relevant works in various fields, including a topic on the Economics of pandemics and its sub-topics.

The current situation may also imply that some people have more time than usual, or have a need for some distractions. This may be a good opportunity to help RePEc in various ways. Some opportunities are below.


  • Offer to create a RePEc Biblio topic in your area of specialization

  • Contribute information about your students, advisors, and former students in your graduate program to the RePEc Genealogy. Note that the collected information is used for the ranking of graduate programs, so in a way you are helping yourself.

  • Take a moment to check that your RePEc Author Service profile is still current, in particular that there are no works waiting to be claimed, contact details are OK (many personal homepages are not), and that affiliations are fine. And if you do not yet have a profile, create one!

  • Correct broken links in the directory of economic institutions, EDIRC. They are all marked with a red broken chain link.

  • We lost contact with some of our registered authors. Give us their new email address! They are listed with a red question mark on IDEAS and EDIRC, or all together here. If they have unfortunately died, we want to record that, too!


How to contribute data to CitEc

November 26, 2019

CitEc is the RePEc citation indexing service. CitEc extracts reference data from documents directly or from data about references provided by publishers. Then CitEc links references to find citations between documents available in RePEc. All data produced by CitEc is freely available. It is distributed to other RePEc services to enrich the services provided to researchers and authors. It is used to build citation profiles for registered authors, for working paper series and journals. I created CitEc in 1998. Nowadays, it contains over 41 million references and 14 million citations. These may be impressive numbers, but the data comes from 1,376,000 papers. Note that there are 2,737,000 downloadable items in RePEc. Thus, I have been able to process only about 50% of the downloadable items.
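As a quick check of the figures above, here is a minimal sketch of the arithmetic. The variable names are mine, and the reference and citation counts are the rounded figures quoted in this post, so the derived ratios are approximate.

```python
# Figures quoted in the post; variable names are illustrative only.
papers_processed = 1_376_000
downloadable_items = 2_737_000
references = 41_000_000  # "over 41 million", so treat as a lower bound
citations = 14_000_000   # rounded figure from the post

# Share of downloadable items whose references have been processed.
coverage = papers_processed / downloadable_items

# Average number of references extracted per processed paper.
refs_per_paper = references / papers_processed

# Share of references that were linked to a document in RePEc.
matched_share = citations / references

print(f"Coverage of downloadable items: {coverage:.1%}")
print(f"References per processed paper: {refs_per_paper:.1f}")
print(f"Share of references matched as citations: {matched_share:.1%}")
```

The first ratio is the "about 50%" mentioned above; the other two are only rough orders of magnitude.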

There is still a lot of work to do. You can help. Let’s see how.

1.- Providing references

One important approach to get references is to use the full text of documents. In many cases, we can extract that from the PDF files when we have them. However, PDFs are often behind toll-gated portals. That is often due to economic issues like payment licenses. Sometimes it is due to technical barriers, like the archive maintainer not providing a URL with direct access to the PDF file.

In these cases, you can help by submitting the full list of references cited. You do not need to be the author of the paper to submit the references. You may, for instance, submit references for a document that cites your work.

Thus, go ahead and use the web user input form. Thanks!

2.- Providing citations

Citations are relationships between two documents. Both the citing and the cited document must have a RePEc handle, that is, be indexed in RePEc. The majority of citations are identified automatically by CitEc software. In addition, there are several ways to contribute citations to the database when the system has failed to find them.

A.- Register with the RePEc Author Service and use the search engine to add citations to your profile

B.- If you already know the paper which cites your work, follow the instructions in our FAQ (3.5)

C.- Otherwise, you can use the main CitEc search engine to look for the cited work in the references database. Just enter one author’s surname and publication date for the cited work. You will get two types of results: citations and references. When you see a citation, CitEc has been able to match the reference to the cited document in RePEc. When you see a reference, CitEc has been unable to find the cited document in RePEc. In some cases, the cited document is in RePEc but the system has not found it. You can help us to solve the problem by providing the link to the cited document. Just click on the “add citation now” link and give the handle of the cited document.
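To make the distinction between a reference and a citation concrete, here is a minimal sketch of how one might model it. This is not CitEc's actual data model: the `Reference` class, the `add_citation` helper and the example handles are hypothetical, and the check on the `RePEc:` prefix is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reference:
    """One entry in the reference list of a citing document (hypothetical model)."""
    citing_handle: str                  # handle of the document containing the reference
    raw_text: str                       # the reference string as extracted
    cited_handle: Optional[str] = None  # set once the cited document is matched in RePEc

    @property
    def is_citation(self) -> bool:
        # A reference counts as a citation only when both ends are known in RePEc.
        return self.cited_handle is not None


def add_citation(ref: Reference, cited_handle: str) -> Reference:
    """Turn an unmatched reference into a citation by supplying the cited handle."""
    # Assumption for this sketch: handles start with the "RePEc:" prefix.
    if not cited_handle.startswith("RePEc:"):
        raise ValueError("expected a RePEc handle")
    ref.cited_handle = cited_handle
    return ref


# Made-up example: an unmatched reference, then a manual contribution.
ref = Reference(
    citing_handle="RePEc:xxx:wpaper:0001",
    raw_text="Doe, J. (2015). An example paper.",
)
print(ref.is_citation)
add_citation(ref, "RePEc:yyy:journl:v1y2015p1-10")
print(ref.is_citation)
```

Supplying the handle through the “add citation now” link plays the role of `add_citation` in this sketch.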

Many thanks for your contribution. If you have any questions, contact us at citechelp. Also, if you would like to get more involved with citation analysis, contact us!


What a RePEc Author Service account is good for

October 3, 2019

A little more than 20 years ago, the RePEc Author Service was launched (then under the name of HoPEc) as a self-registering service. This allows economists to create an account with RePEc. What for? This blog post is trying to enumerate all the uses of this account that were created since.

Unique identification

Before all the other identification services for academics and researchers, we created the RePEc short-ID, a unique identifier attached to a registered person. This identifier is used throughout RePEc much in the same way other objects are identified through handles: series, journals, papers, articles, institutions, archives… They can reference each other, and they can be used to draw statistics (including rankings). The use is not limited to RePEc: we see it for example in Wikipedia, Wikidata, and elsewhere.

Research record

Creating an account in the RePEc Author Service also allows an economist to establish and maintain a record of their scholarly output. The RePEc Author Service tries to match works indexed in RePEc with name variations provided by the author and asks the author to validate the potential matches. Not only does this establish a research record for the person, it also makes it possible to disambiguate homonyms or authors with the same initials and last names. The research records are public and used by other RePEc services like EconPapers and IDEAS. The RePEc Author Service also helps in the discovery of citations for CitEc, which also maintains author pages.

The records from the RePEc Author Service facilitate other data improvements in RePEc. For example, affiliation data is leveraged in EDIRC, the directory of economics institutions, to provide member lists. In addition, if several works within an author’s record have very similar titles, we deem them to be different versions of each other and we can link across them in bibliographic records.
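To illustrate the idea of linking versions through similar titles, here is a minimal sketch using Python's standard `difflib`. The actual matching rule used by RePEc is not described in this post; the normalization, the 0.9 threshold and the function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title: str) -> str:
    """Lower-case the title and collapse whitespace before comparing."""
    return " ".join(title.lower().split())


def likely_versions(titles, threshold=0.9):
    """Return pairs of titles similar enough to be treated as versions.

    The threshold is an arbitrary choice for this sketch.
    """
    pairs = []
    for a, b in combinations(titles, 2):
        ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Made-up titles from a single author's record.
record = [
    "Replication in Economics",
    "Replication in economics.",
    "A Study of Trade",
]
for a, b in likely_versions(record):
    print(f"Possible versions: {a!r} <-> {b!r}")
```

Restricting the comparison to works within one author's record, as described above, keeps the number of pairs small and makes false matches less likely than comparing titles across the whole database.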

Access to personalized services

Everything on RePEc is available for free and without registration because we believe this is how you provide the widest dissemination of research. Yet, there are some enhanced services that are impossible without providing personalization. The following examples do not require one to be an author, only to have an account with the RePEc Author Service:


  • MyIDEAS allows you to create a personalized bibliography while browsing IDEAS and then export it in various formats. It also allows you to follow authors, serials, JEL codes or search keywords either through the website or weekly email digests.

  • MyCitEc allows an author to manage their citation profile and get alerts about new citations, including citations to other authors’ works.

  • Authors can get a personalized ranking analysis.

Authentication for other tools

The RePEc Author Service uses OpenID, which is a protocol that allows other websites to leverage the authentication on the RePEc Author Service to log in elsewhere. This is similar to using Google or Facebook credentials to identify yourself on other sites. This is used across RePEc wherever credentials are necessary to identify a person. Examples are:


New initiative to help with discovery of dataset use in scholarly work

September 1, 2019

Many of our readers will have heard of the push for evidence based policy making at the federal level in the United States. The recent Foundations of Evidence Based Policy Making Act and the Federal Data Strategy have provided social scientists in general and economists in particular with a new opportunity to highlight the value of their data and their empirical work. Similar opportunities have appeared in other countries.

A major challenge in highlighting the value of data, however, is that it is currently almost impossible to find out which datasets are used by which researchers on which topics. RePEc is partnering with a new initiative that is combining natural language processing and machine learning techniques to automate dataset search and discovery from social science and economics publications. Some authors will start receiving an email from Christian Zimmermann this month asking them to validate the results of machine learning models. They can also contribute any additional links to the corpus right away at this link.

We hope eventually to automate the search and discovery of datasets and highlight their value as a scholarly contribution in the same way we collect information about publications and citations. The results should help inform government agencies about the value of data that they produce and work with, help empirical researchers find and discover valuable datasets and data experts in their scientific fields, and help policy makers realize the value of supporting investments in data.

Thank you in advance for your support!


Why authors should have an account with RePEc

March 27, 2019

Among the many services that RePEc provides, the RePEc Author Service (RAS) holds a special place. Indeed, this service provides multiple utilities for the authors, the other RePEc services, the general user community, and beyond. This blog post goes through some of these utilities.

Author identification

RAS is pretty much the first service in the scientific community at large to offer self-serve author identification, which it has done since 1999. Once an author is registered, a RePEc Short-ID is created. This unique and permanent code is then used throughout RePEc services as well as by others (for example, Wikipedia, WikiData) to uniquely identify authors.

Author disambiguation

When authors register, they claim as theirs the works that are suggested by RePEc. This function is important as the author name listed on a particular work may be shared by several people, especially if only the initial of the first name is provided. Even with a full first name, there are many homonyms in the profession; see a list of examples here.

Author profile

Thanks to author registrations, RePEc has information about the name, affiliation and work of authors. This information is used by the various RePEc services to create author profiles that link people, institutions, and works with each other. This provides users with more options when they are browsing through the bibliographic databases that are the core of RePEc.

Notifications

Registered authors are notified every month about newly found citations, along with various statistics about the visibility of their works. Users can also receive news about their favorite authors: MyIDEAS allows you to follow additions to authors’ profiles, among other things.

OpenID credentials

The credentials that authors have with RAS can be leveraged elsewhere thanks to the OpenID protocol. This is in particular used by other RePEc services wherever a login is required. For example, this is used by MyIDEAS for the user-specific services it provides, by CitEc for the submission of references, or by the RePEc Genealogy to crowd-source its content.

Author and institution rankings

All the data collected (author profiles, affiliations, citations and more) are used to compute various rankings that have become quite popular. Of course, this means that authors need to keep their profiles current with any work additions and affiliation changes.

And more

Author data is used to determine co-authorship networks (CollEc project), to create an academic genealogy tree for economics (RePEc Genealogy), as well as for research on the economics profession (works using RePEc data).

You can help further

The vast majority of the data gathered in all of the above is supplied by authors and publishers. All activity is logged and reviewed. But mistakes can happen, and the RAS administrator welcomes emails with correction suggestions. In addition, authors with whom RAS has lost contact (listing) need to enter their new email address so that they can continue receiving their suggestions for newly discovered works. Note that those authors do not count towards institution and regional rankings, as having an expired email address indicates that the person has moved or died. In both cases, the RAS administrator welcomes notification of the new address or death. The maintenance of the profiles of deceased authors is taken over by an administrator (listing).


How to make sense of the RePEc alphabet soup

September 29, 2018

RePEc is a uniquely organized initiative that brings with it an alphabet soup that confuses a lot of people, and we cannot blame them. Is a paper listed on IDEAS or RePEc? How is NEP different from RePEc? What is the difference between IDEAS and EconPapers? Etc.

For starters, RePEc (Research Papers in Economics, hence the capitalization) is a way to organize the data about publications of all sorts in Economics, and make all that available. Note that there is no central database, as every contributing publisher makes the data available on its own website following the rules set by RePEc. Beyond those rules, RePEc only maintains the list of pointers to where the publishers have put their RePEc archives.

Then, basically anybody can come and use that data. Some have decided to do that more formally and have their service listed in the repec.org domain. Examples would be EconPapers, IDEAS, and NEP. Others prefer not to or integrate the data in a larger scheme that spans more fields. Examples are Econlit, WorldCat, EBSCO, Google Scholar or ResearchGate. A third type of service uses part of RePEc data to enhance it and feed it back to RePEc. Examples for that are CitEc, EDIRC, and the RePEc Author Service. For a full list of the RePEc services that we know of, see the RePEc site.

Thus, RePEc is the basis for these services, to varying degrees, but they are independently run and RePEc has no say in how they should be run. In fact, RePEc is not even a formal organization. Thus IDEAS is using RePEc data just as EconPapers is using RePEc data, but they are in no way directed by RePEc. And the big difference between IDEAS and EconPapers is that they were initiated by different people. In the spirit of healthy competition, use the one you prefer.


Help build the academic tree of Economics: the RePEc Genealogy

April 22, 2018

Beyond the open bibliography that lays the foundation of RePEc, various services have emerged that enhance the data collected with RePEc. One of them is the RePEc Genealogy. The goal of this initiative is to build an academic family tree for Economics, recording who was advised by whom, where and when. It thus tries to build links among the over 50,000 economists registered with the RePEc Author Service as well as the institutions listed in EDIRC. At the time of writing this, close to 13,000 economists from over 1000 programs are listed in the RePEc Genealogy.

The data is collected by the community: The RePEc Genealogy is a wiki, and all you need is a registration with the RePEc Author Service to add information to it. You can make sure your own record is complete, add your students or those of your advisor, or ensure that your graduate program or alma mater are properly recorded. Over 3,000 economists have already contributed to it. Go to the RePEc Genealogy crowdsourcing tool to participate and see some statistics about the genealogy.

How is the collected data used? Of course, one can browse the site for information. But the data is also used in other ways: IDEAS uses it to complement author profiles and to compute rankings of graduate programs (publications from all years or last 10 years) and a ranking of economists by graduation cohort. Finally, data from the Genealogy is starting to be used for research, along with data from the rest of RePEc. You could be part of the data that you are analysing! For a listing of papers using RePEc data, see here.


IDEAS turns 20

September 27, 2017

IDEAS just turned 20. Launched in September 1997 on a web server sponsored by Université du Québec à Montréal and adapted from scripts written for WoPEc by José Manuel Barrueco Cruz (who is now in charge of citation analysis at CitEc), the site initially displayed 40,000 papers and articles. Now, there are sixty times more documents. A screen shot from the early days is below.

In 2002, IDEAS moved to the University of Connecticut, followed by the Federal Reserve Bank of St. Louis, where it is still hosted. Over time, the site served 3.6 billion pages, although the vast majority were requested by web spiders for the major search engines and some page skimmers (who should really use the API). Once all this robotic access is cleared, the abstract pages alone were read almost 300 million times (or an average of 120 times for each listed item) and 70 million downloads were recorded (or an average of 31 times for each document available for download).
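For the curious, the averages above imply roughly how many items are behind them. The following is a minimal back-of-the-envelope sketch. The inputs are the rounded figures from this post and the variable names are mine, so the results are orders of magnitude, not exact counts.

```python
# Rounded figures quoted in the post; names are illustrative only.
abstract_views = 300_000_000  # "almost 300 million"
views_per_item = 120
downloads = 70_000_000
downloads_per_document = 31
initial_items = 40_000        # items displayed at launch in 1997
growth_factor = 60            # "sixty times more documents"

# Totals divided by per-item averages give the implied item counts.
implied_listed_items = abstract_views / views_per_item
implied_downloadable = downloads / downloads_per_document
current_items_from_growth = initial_items * growth_factor

print(f"Listed items implied by abstract views: {implied_listed_items:,.0f}")
print(f"Downloadable documents implied by downloads: {implied_downloadable:,.0f}")
print(f"Items implied by the sixty-fold growth: {current_items_from_growth:,}")
```

All three figures land in the same range of a little over two million items, so the numbers in the post are consistent with one another.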

A few dates relevant for the history of IDEAS:


  • September 1997: IDEAS opens for business at the Université du Québec à Montréal
  • June 1998: the first ranking is published, covering abstract views for items and serials
  • August 2000: the first author ranking
  • February 2001: the first institution ranking
  • October 2002: IDEAS is now at the University of Connecticut
  • June 2011: IDEAS moves to the St. Louis Fed
  • January 2013: MyIDEAS is available
  • December 2014: IDEAS becomes mobile friendly


Why linking to research on RePEc sites makes sense

August 30, 2017

If you participate in online discussions about economics research, if you have an online syllabus, or if you share some literature through email, you are likely providing a link to some full text on a publisher’s site. I want to argue here that it is a better idea to link to a RePEc service (abstract pages on EconPapers and IDEAS or links from NEP reports). The reasons are the following:


  1. Links to full texts go stale. RePEc URLs are permanent and contain updated links to full texts.
  2. If the full text link is gated behind a paywall, the RePEc link can still provide context and often a link to a free version.
  3. Alternatively, if the full text link is going to a working paper, a RePEc page may have a link to a version published in a journal.
  4. Clicking on a RePEc link will give the author(s) credit; this cannot happen if the link goes directly to the full text.
  5. A RePEc abstract page also provides related research (cites, references) and links to author profiles. The interested reader can thus explore for more.

EconPapers and IDEAS each have easy tools if you want to share a link through social media or email. Use them!