How to contribute data to CitEc

November 26, 2019

CitEc is the RePEc citation indexing service. CitEc extracts reference data from documents directly or from data about references provided by publishers. Then CitEc links references to find citations between documents available in RePEc. All data produced by CitEc is freely available. It is distributed to other RePEc services to enrich the services provided to researchers and authors. It is used to build citation profiles for registered authors, for working paper series and journals. I created CitEc in 1998. Nowadays, it contains over 41 million references and 14 million citations. These may be impressive numbers, but the data comes from 1,376,000 papers, while there are 2,737,000 downloadable items in RePEc. Thus, I have been able to process only about 50% of the downloadable items.
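As a quick check of the coverage figure, the share of downloadable items processed can be computed from the two counts quoted above. This is only an arithmetic sketch; the variable names are illustrative and not part of any CitEc tool.

```python
# Counts quoted in this post (November 2019).
processed_papers = 1_376_000     # papers from which CitEc data was obtained
downloadable_items = 2_737_000   # downloadable items in RePEc

# Share of downloadable items that have been processed.
coverage = processed_papers / downloadable_items
print(f"Coverage: {coverage:.1%}")  # prints "Coverage: 50.3%"
```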

There is still a lot of work to do. You can help. Let’s see how.

1.- Providing references

One important approach to get references is to use the full text of documents. In many cases, we can extract that from the PDF files when we have them. However, PDFs are often behind toll-gated portals. That is often due to economic issues like payment licenses. Sometimes it is due to technical barriers, like the archive maintainer not providing a URL with direct access to the PDF file.

In these cases, you can help by submitting the full list of references cited. You do not need to be the author of the paper to submit the references. You may, for instance, submit references for a document that cites your work.

Thus, go ahead and use the web user input form. Thanks!

2.- Providing citations

Citations are relationships between two documents. Both the citing and the cited document must have a RePEc handle, that is, be indexed in RePEc. The majority of citations are identified automatically by the CitEc software. In addition, there are several ways to contribute citations to the database when the system has failed to find them.

A.- Register with the RePEc Author Service and use the search engine to add citations to your profile

B.- If you already know the paper which cites your work, follow the instructions in our FAQ (3.5)

C.- Otherwise, you can use the main CitEc search engine to look for the cited work in the references database. Just enter one author’s surname and publication date for the cited work. You will get two types of results: citations and references. When you see a citation, CitEc has been able to match the reference to the cited document in RePEc. When you see a reference, CitEc has not been able to find the cited document in RePEc. In some cases, the cited document is in RePEc but the system has not found it. You can help us to solve the problem by providing the link to the cited document. Just click on the “add citation now” link and give the handle of the cited document.
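The distinction between a citation and a reference can be sketched as a small data model. This is a hypothetical illustration, not CitEc's actual code: the class, the function names, and the sample handles are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferenceEntry:
    """One entry extracted from a citing document's reference list (hypothetical model)."""
    citing_handle: str                   # RePEc handle of the citing document
    raw_text: str                        # reference string as extracted
    cited_handle: Optional[str] = None   # filled in once the cited document is matched


def kind(entry: ReferenceEntry) -> str:
    """Return 'citation' if the cited document has been matched, else 'reference'."""
    return "citation" if entry.cited_handle else "reference"


def add_citation(entry: ReferenceEntry, cited_handle: str) -> ReferenceEntry:
    """Mimic the 'add citation now' step: attach the handle supplied by a user."""
    if not cited_handle.startswith("RePEc:"):
        raise ValueError("expected a RePEc handle starting with 'RePEc:'")
    entry.cited_handle = cited_handle
    return entry


# Made-up handles, for illustration only.
entry = ReferenceEntry("RePEc:xxx:wpaper:0001", "Doe, J. (2001). An example title.")
print(kind(entry))                                   # unmatched so far
add_citation(entry, "RePEc:yyy:journl:v1y2001p1-10")
print(kind(entry))                                   # matched after the user supplies the handle
```

The point of the sketch is that an unmatched reference becomes a citation as soon as the handle of the cited document is known, which is exactly the information a contributor supplies.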

Many thanks for your contribution. If you have any questions, contact us at citechelp. Also, if you would like to get more involved with citation analysis, contact us!


What a RePEc Author Service account is good for

October 3, 2019

A little more than 20 years ago, the RePEc Author Service was launched (then under the name of HoPEc) as a self-registering service. This allows economists to create an account with RePEc. What for? This blog post tries to enumerate all the uses of this account that have been created since.

Unique identification

Before all the other identification services for academics and researchers, we created the RePEc short-ID, a unique identifier attached to a registered person. This identifier is used throughout RePEc much in the same way other objects are identified through handles: series, journals, papers, articles, institutions, archives… They can reference each other, and they can be used to draw statistics (including rankings). The use is not limited to RePEc: we see it for example in Wikipedia, Wikidata, and elsewhere.

Research record

Creating an account in the RePEc Author Service also allows an economist to establish and maintain a record of their scholarly output. The RePEc Author Service tries to match works indexed in RePEc with name variations provided by the author and asks the author to validate the potential matches. Not only does this establish a research record for the person, it also makes it possible to disambiguate homonyms or authors with the same initials and last names. The research records are public and used by other RePEc services like EconPapers and IDEAS. The RePEc Author Service also helps in the discovery of citations for CitEc, which also maintains author pages.

The records from the RePEc Author Service facilitate other data improvements in RePEc. For example, affiliation data is leveraged in EDIRC, the directory of economics institutions, to provide member lists. In addition, if several works within an author’s record have very similar titles, we deem them to be different versions of each other and we can link across them in bibliographic records.
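To illustrate the idea of linking versions by title similarity, here is a minimal sketch using Python's standard `difflib`. It is not the method RePEc actually uses, which this post does not describe; the function names, the normalization, and the 0.9 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())


def likely_versions(titles: list[str], threshold: float = 0.9) -> list[tuple[str, str]]:
    """Return pairs of titles whose similarity ratio meets the (assumed) threshold."""
    pairs = []
    for a, b in combinations(titles, 2):
        ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Made-up titles within one author's record.
record = [
    "Money and Prices in Small Open Economies",
    "Money and prices in small open economies",
    "Labor Supply over the Life Cycle",
]
for a, b in likely_versions(record):
    print(f"Possible versions: {a!r} <-> {b!r}")
```

Restricting the comparison to works within a single author's record, as the text describes, keeps the number of pairs small and reduces false matches between unrelated papers that happen to share a title.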

Access to personalized services

Everything on RePEc is available for free and without registration because we believe this is how you provide the widest dissemination of research. Yet, there are some enhanced services that are impossible without providing personalization. The following examples do not require one to be an author, only to have an account with the RePEc Author Service:


  • MyIDEAS allows users to create a personalized bibliography while browsing IDEAS and then export it in various formats. It also allows them to follow authors, serials, JEL codes or search keywords either through the website or weekly email digests.

  • MyCitEc allows an author to manage their citation profile and get alerts about new citations, including citations to other authors’ works.

  • Authors can get a personalized ranking analysis.

Authentication for other tools

The RePEc Author Service uses OpenID, which is a protocol that allows other websites to leverage the authentication on the RePEc Author Service to log in elsewhere. This is similar to using Google or Facebook credentials to identify yourself on other sites. This is used across RePEc wherever credentials are necessary to identify a person. Examples are:


New initiative to help with discovery of dataset use in scholarly work

September 1, 2019

Many of our readers will have heard of the push for evidence-based policy making at the federal level in the United States. The recent Foundations for Evidence-Based Policymaking Act and the Federal Data Strategy have provided social scientists in general and economists in particular with a new opportunity to highlight the value of their data and their empirical work. Similar opportunities have appeared in other countries.

A major challenge in highlighting the value of data, however, is that it is currently almost impossible to find out which datasets are used by which researchers on which topics. RePEc is partnering with a new initiative that is combining natural language processing and machine learning techniques to automate dataset search and discovery from social science and economics publications. Some authors will start receiving an email from Christian Zimmermann this month asking them to validate the results of machine learning models. They can also contribute any additional links to the corpus right away at this link.

We hope eventually to automate the search and discovery of datasets and highlight their value as a scholarly contribution in the same way we collect information about publications and citations. The results should help inform government agencies about the value of the data that they produce and work with, help empirical researchers find and discover valuable datasets and data experts in their scientific fields, and help policy makers realize the value of supporting investments in data.

Thank you in advance for your support!


Why authors should have an account with RePEc

March 27, 2019

Among the many services that RePEc provides, the RePEc Author Service (RAS) holds a special place. Indeed, this service provides multiple utilities for the authors, the other RePEc services, the general user community, and beyond. This blog post goes through some of these utilities.

Author identification

RAS is pretty much the first service in the scientific community at large to provide self-serve author identification, which it has done since 1999. Once an author is registered, a RePEc Short-ID is created. This unique and permanent code is then used throughout RePEc services as well as by others (for example, Wikipedia, WikiData) to uniquely identify authors.

Author disambiguation

When authors register, they claim as theirs the works that are suggested by RePEc. This function is important as the author name listed on a particular work may be shared by several people, especially if only the initial of the first name is provided. Even with a full first name, there are many homonyms in the profession; see a list of examples here.

Author profile

Thanks to author registrations, RePEc has information about the name, affiliation and work of authors. This information is used by the various RePEc services to create author profiles that make it possible to link people, institutions, and works with each other. This provides users more options when they are browsing through the bibliographic databases that are the core of RePEc.

Notifications

Registered authors are notified every month about newly found citations, along with various statistics about the visibility of their works. Users can also receive news about their favorite authors: MyIDEAS allows users to follow additions to authors’ profiles, among other things.

OpenID credentials

The credentials that authors have with RAS can be leveraged elsewhere thanks to the OpenID protocol. This is in particular used by other RePEc services wherever a login is required. For example, this is used by MyIDEAS for the user-specific services it provides, by CitEc for the submission of references, or by the RePEc Genealogy to crowd-source its content.

Author and institution rankings

All the data collected (author profiles, affiliations, citations and more) is used to compute various rankings that have become quite popular. Of course, this means that authors need to keep their profiles current with any work additions and affiliation changes.

And more

Author data is used for determining co-authorship networks (CollEc project), creating an academic genealogy tree for economics (RePEc Genealogy), as well as for research on the economics profession (works using RePEc data).

You can help further

The vast majority of the data gathered in all of the above is supplied by authors and publishers. All activity is logged and reviewed. But mistakes can happen, and the RAS administrator welcomes emails with correction suggestions. In addition, authors with whom RAS has lost contact (listing) need to enter their new email address so that they can continue receiving their suggestions for newly discovered works. Note that those authors do not count towards institution and regional rankings, as having an expired email address indicates that the person has moved or died. In both cases, the RAS administrator welcomes notification of the new address or death. The maintenance of the profiles of deceased authors is taken over by an administrator (listing).


How to make sense of the RePEc alphabet soup

September 29, 2018

RePEc is a uniquely organized initiative that brings with it an alphabet soup that confuses a lot of people, and we cannot blame them. Is a paper listed on IDEAS or RePEc? How is NEP different from RePEc? What is the difference between IDEAS and EconPapers? Etc.

For starters, RePEc (Research Papers in Economics, hence the capitalization) is a way to organize the data about publications of all sorts in Economics, and make all that available. Note that there is no central database, as every contributing publisher makes the data available on its own website following the rules set by RePEc. Beyond those rules, RePEc only maintains the list of pointers to where the publishers have put their RePEc archives.

Then, basically anybody can come and use that data. Some have decided to do that more formally and have their service listed in the repec.org domain. Examples would be EconPapers, IDEAS, and NEP. Others prefer not to or integrate the data in a larger scheme that spans more fields. Examples are Econlit, WorldCat, EBSCO, Google Scholar or ResearchGate. A third type of service uses part of RePEc data to enhance it and feed it back to RePEc. Examples for that are CitEc, EDIRC, and the RePEc Author Service. For a full list of the RePEc services that we know of, see the RePEc site.

Thus, RePEc is the basis for these services, to varying degrees, but they are independently run and RePEc has no say in how they should be run. In fact, RePEc is not even a formal organization. Thus IDEAS is using RePEc data just as EconPapers is using RePEc data, but they are in no way directed by RePEc. And the big difference between IDEAS and EconPapers is that they were initiated by different people. In the spirit of healthy competition, use the one you prefer.


Help build the academic tree of Economics: the RePEc Genealogy

April 22, 2018

Beyond the open bibliography that lays the foundation of RePEc, various services have emerged that enhance the data collected with RePEc. One of them is the RePEc Genealogy. The goal of this initiative is to build an academic family tree for Economics, recording who was advised by whom, where and when. It thus tries to build links among the over 50,000 economists registered with the RePEc Author Service as well as the institutions listed in EDIRC. At the time of writing this, close to 13,000 economists from over 1,000 programs are listed in the RePEc Genealogy.

The data is collected by the community: the RePEc Genealogy is a wiki, and all you need is a registration with the RePEc Author Service to add information to it. You can make sure your own record is complete, add your students or those of your advisor, or ensure that your graduate program or alma mater is properly recorded. Over 3,000 economists have already contributed to it. Go to the RePEc Genealogy crowdsourcing tool to participate and see some statistics about the genealogy.

How is the collected data used? Of course, one can browse the site for information. But the data is also used in other ways: IDEAS uses it to complement author profiles and to compute rankings of graduate programs (publications from all years or last 10 years) and a ranking of economists by graduation cohort. Finally, data from the Genealogy is starting to be used for research, along with data from the rest of RePEc. You could be part of the data that you are analysing! For a listing of papers using RePEc data, see here.


IDEAS turns 20

September 27, 2017

IDEAS just turned 20. Launched in September 1997 on a web server sponsored by Université du Québec à Montréal and adapted from scripts written for WoPEc by José Manuel Barrueco Cruz (who is now in charge of citation analysis at CitEc), the site initially displayed 40,000 papers and articles. Now, there are sixty times more documents. A screen shot from the early days is below.

In 2002, IDEAS moved to the University of Connecticut, followed by the Federal Reserve Bank of St. Louis, where it is still hosted. Over time, the site served 3.6 billion pages, although the vast majority were requested by web spiders for the major search engines and some page skimmers (who should really use the API). Once all this robotic access is cleared, the abstract pages alone were read almost 300 million times (or an average of 120 times for each listed item) and 70 million downloads were recorded (or an average of 31 times for each document available for download).

A few dates relevant for the history of IDEAS:


  • September 1997: IDEAS opens for business at the Université du Québec à Montréal
  • June 1998: the first ranking is published, covering abstract views for items and serials
  • August 2000: the first author ranking
  • February 2001: the first institution ranking
  • October 2002: IDEAS is now at the University of Connecticut
  • June 2011: IDEAS moves to the St. Louis Fed
  • January 2013: MyIDEAS is available
  • December 2014: IDEAS becomes mobile friendly