Probably the most frequent request RePEc receives comes from authors who want us to add some publications to the database and wonder why our “spider” has not picked them up. The second most frequent comes from publishers wondering why RePEc is neglecting to disseminate their output. The problem is that this is not at all how RePEc functions. This short post explains the basics of how the metadata (the data describing the research documents) gets into RePEc.
The principle is that metadata comes directly from the providers. By providers we mean commercial publishers for their books and journals, university departments for their working papers, research centers for their papers, and policy institutions for their various publications. Thus, RePEc does not have a spider that surfs the entire Internet and tries to infer what it stumbles upon. Rather, RePEc knows exactly where to look for the information, which the providers have formatted to optimize its usefulness. If an author finds that some publications are missing, there are two possible reasons. Either the provider is not (yet) participating in RePEc, in which case the provider can follow these instructions, or the provider has incomplete data, in which case the technical contact listed on the RePEc page of the relevant journal or series can help.
Why is RePEc data collection organized this way? We want RePEc to be free for everyone, so it needs to be set up in a way that does not generate costs. Thus, we put the burden of indexing on those who benefit the most from it, the providers. Close to 1700 of them are willing to do so. Any remaining central duties are picked up by the RePEc team.