Why is RePEc 25 years old?

May 5, 2022

I once read a quote that claimed that the reason why humanity has never reached its full potential, and never will reach it, is meetings. Interestingly enough, RePEc was “made” at a meeting. That meeting took place on 12 May 1997. It is considered the birthday of RePEc. That is now 25 years ago.

RePEc really started with the NetEc project. An account of February 1997 is in my note “About NetEc, with special Reference to WoPEc” at http://openlib.org/home/krichel/hisn.html. This gives a reasonable idea of the state of play before the meeting. In some ways that piece is an infomercial. It highlights the role that JISC funding played at that time.

What it does not mention are the plans to build a Swedish branch of WoPEc. The idea arose at a meeting in London where I met Frans Lettenström. He worked for the Swedish Royal Library. I suggested they fund Sune Karlsson for a project to build a Swedish economics working paper system.
On January 16, 1997, Sune reported:

“We had a meeting with our potential funders today and have reached a preliminary agreement on what to do. The idea is that we, as a pilot project, should get all the economics working paper series in Sweden on-line and into WoPEc.”

In March 1997, I received a cold email from Thomas W. Place of the library of Tilburg University. He was the technical lead for the DEGREE project. This project coordinated the publication of economics working papers by Dutch universities. I was aware of the project. I had tried to contact them on several occasions before, but never heard from them. He proposed to furnish me data directly in the internal format used by WoPEc. This was an unprecedented act. As far as I can remember, until that point, Jose Manuel Barrueco Cruz (henceforth: JMBC) and I always had to take data from a provider and do the conversions ourselves. But rather than accepting this offer with the extreme enthusiasm it deserved, I wrote:

“In the medium term I think we need to think over the whole structure of a distributed, mirrored archive system. I have already proposed that we use the list wopec-admin@mailbase to discuss a successor format to the WoPEc format. That would allow for administrative metadata, series descriptor, archive descriptions, permissions to mirror etc. This is longer term effort. I will publish some reflections soon.”

In fact, the email from Thomas W. Place gave me the impetus to actually proceed in the direction outlined above. On 15 April I wrote to him:

“My plan is to radically overhaul the structure of what we are doing, and I am writing a document that contains proposals for doing this. I have shown a draft to JMBC and he thinks it is very unclear at this stage … It is called the Guildford protocol.”

Thomas W. Place indicated he would be in London for a meeting on May 13, so 12 or 14 May would be good for him. Sune expressed a preference for 12 May. He used travel funds from the Swedish project and couch-surfed at my flat in Martyr Court, Guildford. Sune arrived on the 8th at about 16:00. We went out for a walk to St. Martha’s Hill. On the hike, I popped the question to him. What did he think about my drafts? I was much relieved when he revealed that he thought they were reasonable.

The meeting as such was rather uneventful. The attendees were Corry Stuyts, who was the head of DEGREE, JMBC, Sune, Thomas, and myself. My office was too small and had too many computers in it, so we met in David Hawden’s office across the corridor. We basically sat down and worked in great detail through the two documents I had prepared: the ReDIF specification and the Guildford protocol. That’s all we did. We did not finish them, as Thomas and Corry had to leave early. Both documents are still the basis of RePEc. Sune contributed important corrections on May 16.

RePEc is a grass-roots initiative. Typically, grass-roots initiatives take time to grow. Thus the precise start of such an initiative is not that easy to fix. The date of May 12, 1997 is generally accepted as the birthday of RePEc. But 25 years later, we need new directions. I have ideas, but unfortunately I am funded at this time to work on other business.


Quality control committee: looking for a volunteer

June 13, 2016

The RePEc community is looking for a volunteer to head a committee on quality control for journals admitted to be indexed in RePEc. Here is some background.

There is a growing number of journal-like outlets that pretend to be normal open access journals. But in reality, all they do is take authors’ money and put the content up on a web site. They do no quality control. They have no editorial board that does any work. In fact, many times people on the board do not even know that they are on it.

Traditionally, RePEc has not done any quality control prior to listing additional journals. We believe that quality can best be assessed by users of the RePEc dataset. However, we have been criticized for helping these deceitful outlets gain a mantle of respectability through their RePEc listing. Therefore we take this step forward. We expect quality control also to be an issue with toll-gated journals.

The volunteer we are looking for will determine the exact name of the committee and its remit. (S)he would recruit a few committee members. (S)he would run the mailing list and maintain some web pages for the committee. RePEc can provide both. Anybody who is interested in this work should contact repec@repec.org.

We expect that this will not be a lot of work. We are sure that this is a duty that any academic can itemize as a professional service on their CV.


New linkages with RePEc

October 23, 2014

In my previous post, I alluded to the fact that the value of RePEc comes from linkages between identified elements. In the next post, I will set out a working example of linkage usage in the CollEc project. In this post, I’m discussing a direction for future work. It’s about creating new linkage types. Much of this is already implemented at SocioNet. SocioNet is a RePEc service that originated in Russia in the 1990s. They hold RePEc data and combine it with local data.

Recently, login data from the RePEc Author Service (RAS) has become available to other RePEc services via a protocol known as OpenID. Soon RAS-registered users will be able to log in to SocioNet without having to create a SocioNet account, simply by using their RAS account. SocioNet then knows that you are an identified author. When you are logged into SocioNet in this way, SocioNet knows that you have written a bunch of papers, which I will now call “your papers”. Based on the knowledge of your authorship, it can assume that you know your work and the surrounding literature. It can give you a personalized web interface based on RAS data. In that interface you will be able to conveniently supply further details about your work.

First, SocioNet can enquire about the role of your collaborators in a given research paper. In conventional abstracting and indexing data, all contributors to a paper are placed into a list of authors. But usually, the co-authors each have different roles in the paper’s writing process. You can indicate the roles using a simple controlled vocabulary.

Second, using SocioNet you will be able to provide linkages between papers. One of the linked papers has to be yours. The other paper may be yours, but it may not be.

Let’s look at cases where you wrote both papers that you want to link. One thing you may want to tell users is how the papers relate to each other. So you can say that one paper is an abridged version of another, or that a third paper is a development of a fourth. Eventually, such relationships could be picked up by RePEc services to create commented links between your papers. This is particularly useful if you have a version of a paper you don’t like any more. You can point users to a better version of the paper.

When you only wrote one of the papers, the other paper has to be on the reference list of one of your papers. In that case you can bring in a vocabulary containing terms like “develops model from”, “uses software from” or “uses data from”. There are two aspects to these document-to-document relations.

One is that guessing the context of a citation is really difficult with the automated methods by which citation data is actually produced. If users can take a small amount of time to classify citations according to a simple menu, then we would be able to get more valuable information about the structure of ideas across papers.

The other is that building relationships with sources of data and software would advertise the data and software and promote the sharing of these resources. RePEc already works with software. It would be great if it could work with datasets as well. As and when reusable datasets come to be considered publications in their own right, users could point to a dataset used in a publication right in the metadata. It would then be possible to create a list of all the publications using a certain dataset. That would be a great way to unify papers on a certain topic and, of course, to promote dataset maintenance as an additional academic endeavour.
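To make the idea concrete, here is a minimal sketch in Python of how such typed relations could be recorded and then queried for all the papers that use a given dataset. It is only an illustration, not a description of how SocioNet or any RePEc service stores the data. The three vocabulary terms are the ones quoted above; the class, the function names and the handles in the usage note are hypothetical.

```python
from dataclasses import dataclass

# Controlled vocabulary: the three example terms quoted in the text.
# A real vocabulary would likely be larger; this set is an assumption.
RELATION_TERMS = {"develops model from", "uses software from", "uses data from"}


@dataclass(frozen=True)
class Relation:
    """One typed link from the author's own paper to another item."""
    source: str  # handle of the paper written by the logged-in author
    term: str    # a term from RELATION_TERMS
    target: str  # handle of the linked paper, software item or dataset


def make_relation(source: str, term: str, target: str) -> Relation:
    """Create a relation, rejecting terms outside the controlled vocabulary."""
    if term not in RELATION_TERMS:
        raise ValueError(f"unknown relation term: {term!r}")
    return Relation(source, term, target)


def papers_using_dataset(relations, dataset: str) -> list:
    """Return the sorted handles of all papers that declare use of a dataset."""
    return sorted(
        {r.source for r in relations
         if r.term == "uses data from" and r.target == dataset}
    )
```

With made-up handles, recording `make_relation("paper:A", "uses data from", "dataset:X")` and `make_relation("paper:C", "uses data from", "dataset:X")` and then calling `papers_using_dataset(relations, "dataset:X")` would give the list of both papers. Restricting the terms to a fixed set is what keeps the classification menu simple for authors.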


The value of RePEc — an introduction

September 13, 2014

I am Thomas Krichel, the principal founder of RePEc. This is my second contribution here. I plan to write more on fundamental aspects of RePEc. And I’ll give some explanation about RePEc history. My particular expertise is how RePEc came about.

Today let me try to say something about the value of RePEc. In some, though not all, aspects, RePEc is a digital and open equivalent of what librarians have long been calling abstracting and indexing (A&I) databases. A&I data is most common for academic journal literature. It lists descriptive information about journal articles past and present. These days, such databases appear to be of declining value. Librarians have been canceling them with the argument that users want full text, not just an abstract. In this view, the description of the paper is a poor (wo)man’s version of the document itself, which of course would contain that description. For WoPEc, the forerunner of RePEc, I took the opposite view. The full-text location was simply an attribute of the description of the paper.

In the early 90s, when I started the work on WoPEc, the fact that anything was freely available on the web was seen with some suspicion. I recall a radio comment at that time about some company that went something like “They are now on the Internet, which is a euphemism for saying that they have gone out of business”. Among economists in particular, the notion that free means cheap and cheap means bad seemed to have a lot of appeal. Therefore I was keen that RePEc should not just be cheaper, but also better than existing A&I databases. In 1998, I started to work on the key component of that vision, the RePEc Author Service. I designed the service and my student Markus J.R. Klink implemented it. At that point, I was not aware of any A&I product that implemented author identification. And as such, there was no way that anybody would have implemented a service that allowed authors to claim papers. Of course, the fact that Christian had already worked on collecting institutional data was of great help in making this even more attractive.

Well, enough about pioneering works. I did promise to write about the value of RePEc, didn’t I? The key value I see is in identifying documents, authors and institutions and building linkages based on these identifications. Thus even if all papers in economics were freely available, in open access journals or on the working paper sites of institutions, and they stayed there, we still would not have implemented the value of RePEc. The value does not come from individuals using a search engine and finding something of interest. Our value comes from linkages like “this working paper was never published”, “this paper is cited by this other paper”, or “these two authors are co-authors”. If the coverage of economics through RePEc is complete, we can make such assertions with certainty. And we can make the assertions without further human work. For example, when a paper has two identified authors, we can say that the two are co-authors. Since the data is freely available, it can be used in a co-authorship system. Or if we know that one paper cites another, we can export this into a system that solicits information about why the citation took place. Linkages and open information go hand in hand in RePEc.
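The co-authorship case can be sketched in a few lines of Python. This is only an illustration under assumed data: the input is a hypothetical mapping from paper handles to lists of identified author identifiers, not the actual RePEc data format, and the function name is made up.

```python
from itertools import combinations


def coauthor_pairs(papers: dict) -> set:
    """Derive co-author pairs from papers whose authors are identified.

    `papers` maps a paper handle to the list of identified author ids
    on that paper (a hypothetical structure used for illustration).
    Each pair is returned once, as a tuple in alphabetical order.
    """
    pairs = set()
    for authors in papers.values():
        # Sorting and de-duplicating gives every pair a single canonical form.
        for first, second in combinations(sorted(set(authors)), 2):
            pairs.add((first, second))
    return pairs
```

No human judgement enters at any point. Once authors are identified, the assertion “these two authors are co-authors” follows mechanically from the data, which is the point made above.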


Cloud computing and RePEc

December 13, 2012

Hosting RePEc services has been both a technical and an organizational challenge. Historically, the first hosting of what was to become RePEc goes back to late 1992. Manchester Computing Center, as it was known then, agreed to create a WAIS-indexed Gopher for the BibEc and WoPEc projects created by Thomas Krichel. The site was converted to the web in 1993. Manchester Computing Center was a national center for academic computing, providing services to the UK academic community. Fortunately, they were forward-looking when they started to work with NetEc. It was broadly within their remit, as Thomas Krichel worked in UK academia at the time. They continued to sponsor RePEc-related sites until the end of the decade. But they were not the only ones. Washington University in St. Louis, where EconWPA was living, contributed a NetEc mirror, and so did Hitotsubashi University, where Satoshi Yasuda kept a server in his documentation centre for Japanese economic statistics. So generally, it was for the sponsoring institution where a RePEc volunteer lived to take up the hosting. If they agreed, there were usually stringent conditions. Machines were locked in a facility closed after hours, and there were rules on firewalls. Or, when the machine was based in somebody’s office, a cleaner could unplug a cable, electricity cuts could damage the motherboard, and failing air conditioning could damage disks. The list may look comical now, but at the time each incident was a disaster. There was not much of an alternative. Commercial solutions were too expensive to be paid for by an individual, and project funding would come to an end.

Things are looking better now. Cloud computing has become much cheaper. In 2006, the RePEc OAI gateway, sponsored by the Central Library of Economics (ZBW) in Germany, was the first sponsored RePEc service. The CollEc service has become the second sponsored RePEc service. The server runs at a hosting company. It is a dedicated machine with 8 CPUs. They run at 100% constantly, as the calculations for CollEc are very heavy at this time. A single sponsor covers the fee of 50 euros a month for the machine. In November 2012 the ZBW sponsorship moved to a similar machine. In December 2012, the NEP service followed. It uses a similar machine. The NEP team had several offers of sponsorship and chose the one by Victoria University of Wellington, mainly because they were the first to offer. We think the CitEc service will follow suit, but we still have to find a sponsor. We could also move the main RePEc site to a similar machine. While a single site may not require a powerful computer, we still need backups. Case in point: in 2008, staff at the hosting company discovered that the server sponsored by ZBW did not have a sticker on it. They proceeded to dismantle the machine. No data was recoverable. Fortunately, Thomas Krichel kept a backup.

We expect that RePEc will be using more sponsored hosting. It is a very good thing. RePEc volunteers have spent countless hours on more broken disks, failing power supplies, and loose network cables than you can shake a stick at. Using sponsored hosting can leave more time to improve services.


CitEc machine moves

November 11, 2009

On 2009-11-10, the Instituto Valenciano de Investigaciones Económicas took over mutabor, the machine that makes CitEc, from the Universidad Politécnica de Valencia. The RePEc community is grateful to Fernando Ferrer, who helped run the machine at the Universidad Politécnica de Valencia. We cheer Rodrigo Aragón Rodríguez, who will be helping to maintain the machine at its new location.

CitEc is the citation analysis project within RePEc. At the time of this writing, it has analysed 230,279 documents, finding 5,130,205 references and 2,176,994 citations. The software side of the project is maintained by José Manuel Barrueco Cruz.