RePEcFB – An integration of your RePEc data into your Facebook profile

September 9, 2009

Following a suggestion on this blog and the creation of a RePEc Facebook group, we are happy to announce that a new service went online last week. The Facebook application RePEcFB allows Facebook users to integrate their RePEc data into Facebook. Economists on Facebook can create a small profile box listing their recent work, or a “My research” tab in the Facebook profile giving information about their working papers, publications and other research output. Users can list their affiliations and professional contact data, announce recent papers authored by their Facebook friends, or announce conferences and other academic events they are going to attend. New papers or affiliations can be posted directly to the Wall, where friends can comment on them.

To use the application you need both a Facebook account and an author profile with the RePEc Author Service. Detailed instructions can be found on the Notes tab of the application’s homepage.

RePEcFB was written by Ben Greiner with the help of László Kóczy, Sune Karlsson, and Thomas Krichel. The software is hosted on Sune’s server at Örebro University. The software is under ongoing development, so feel free to send comments to the author.


MPRA, the Munich Personal RePEc Archive

August 27, 2009

The Munich Personal RePEc Archive (MPRA) was started three years ago. It has developed into one of the largest archives within the RePEc network, comprising roughly 9000 items at the time of writing. Christian Zimmermann has suggested that I share some thoughts about its history and functioning.

The initial idea occurred to me when I heard that the Economics Working Paper Archive (EconWPA), run by Bob Parks, was discontinued in 2005. EconWPA offered the possibility for individual authors to make their contributions accessible to the community through the RePEc network, given that only institutions can set up RePEc archives. Although we have in Munich our discussion paper series integrated into RePEc, not all economists are so fortunate, and the need for a personal archive (as distinct from an institutional archive) was apparent.

Given that we had successfully established our department’s discussion paper series with the EPrints software, it appeared technically feasible to clone the software and use it for a personal RePEc archive. Discussion on the internal RePEc list led to the name “Munich Personal RePEc Archive,” the main concern being to clarify that the archive was intended as a RePEc service, rather than something original, and that the name would not exclude other personal RePEc archives in other locations. (If one of the other Munich universities wants to start another personal archive, we may get into a problem…)

I asked Volker Schallehn from the University Library, who had implemented the EPrints software for our university archives, whether he could help with such a project. He agreed. The next step was to convince the president of the university as well as the director of the library to dedicate some resources to an endeavor that would not serve people from Munich at all. They were in favor, and so we got started on September 19, 2006.

From a technical point of view, the main problem was to automate as much as possible, as we could not supply manpower: the generation of title pages, the creation of metadata in the ReDIF format required by the RePEc harvester, and the linking to the RePEc Author Service. With the help of Thomas Krichel, Christian Zimmermann, Kit Baum, Sune Karlsson, Ivan Kurmarov, and others we managed to solve these problems and set up the website. We found editors, who now do the main job. The English editors often handle more than 50 submissions per day.

As the EPrints software permits establishing series in different languages, we decided to use this feature and to offer the service in all languages for authors who deal with country-specific issues and want to make their research available in their local language. However, we require English abstracts for all submissions, so that all users can get an impression of what economists writing in other languages do and, if necessary, contact them. This feature has led to quite a number of submissions in languages like Spanish or French, and to some smaller sets in Turkish, Arabic, and others. (Some of them look extremely pretty.) Maybe this feature creates a sense that all economists world-wide see themselves as members of a community with the common purpose of helping to improve living conditions around the globe.

A central motivation for establishing a pre-print archive like MPRA was to enable authors to secure the copyrights for their pre-print versions in case the copyright for the final article goes to the publisher. This permits open access to their work, even if publishers try to make the final work inaccessible for the non-paying public. This is a great convenience for academics and, I hope, generates a countervailing power that keeps a check on journal prices. Further, this arrangement provides a means for the authors to make their work accessible to others through the RePEc services.

As an unintended by-product, some authors have received requests from publishers to publish their contribution in a volume or journal. This may indicate a trend for the future: whereas authors used to submit their work to publishers (and pay for it), in the future they may simply put their work on the net, and publishers will approach them in order to create collections that generate value added beyond mere publication, such that people and libraries are willing to pay for them. If MPRA could contribute to such a development, this would be nice.

It is quite astonishing to me how many good papers we obtain, in spite of the fact that we do no refereeing at all. (The editors check only some formal aspects, making sure that the submission is of academic nature, and a certain convention has emerged in this respect.)

MPRA offers a public forum for publishing papers, but not only that: it also offers the possibility to publish comments on papers in the archive. This feature is not used. Maybe somebody has a suggestion on how to organize discussions around papers such that people actually feel inclined to use such a feature.

So much about MPRA. If you have any suggestions, please feel free to communicate and discuss them on this blog.


On versioning in RePEc

August 21, 2009

RePEc carries research in various formats. While journal articles are unique (with very few exceptions), working papers, as they are pre-prints, may be duplicates of listed articles, and they may even appear in different versions, either because they are published in different series, or because there may be updates within a series. We believe that it is important to carry all versions, not just the last one, for the following reasons.


  1. Time-stamps: A working paper makes it possible to establish when some research was conducted and thus determines the precedence of research ideas. Given publication delays in Economics, this can be important.
  2. Open access: Many journal articles have gated access. Such restrictions can be bypassed by reading working papers, which are mostly open access.
  3. Link to published version: It is still preferred to use published versions in citations, especially once a paper is accepted in a journal. The originally cited working paper is often linked to its published version.
  4. Visibility: Working papers are read much more than journal articles, both because they are more current and because they are freely available. In addition, working papers are disseminated through NEP.

The process of linking the various versions of the same work is not obvious, however. With about 800,000 works in RePEc, performing matches on titles is a daunting task, especially as fuzzy matching is necessary due to slight variations in punctuation and spelling. For this reason, we do the matching only across the works listed in an author’s profile. This ensures that the likelihood that two matched works are different versions of the same one is very close to 100%. But this also means that such matching cannot be done for works where none of the authors is registered, or where a registered author did not add all versions to the profile, thereby indicating, rightly or wrongly, that he/she is not the author of this particular version.

In some cases, titles change across versions, or journal editors require a title change. In such cases, a manual link between versions can be added: just contact a member of the RePEc team with the relevant RePEc handles.
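The matching within a single author profile described above can be illustrated with a minimal sketch. It is not the code RePEc runs: the handles, the titles, the normalization rule and the 0.9 similarity threshold are all assumptions made for illustration, and the comparison uses Python's standard difflib module.

```python
from difflib import SequenceMatcher
from itertools import combinations
import re


def normalize(title):
    """Lowercase a title and replace punctuation by spaces,
    so that small variations in punctuation and case do not matter."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split())


def likely_versions(profile, threshold=0.9):
    """Return pairs of handles from ONE author profile whose titles nearly match.

    `profile` maps a handle to a title. The threshold is an illustrative
    guess, not a value taken from RePEc.
    """
    pairs = []
    for (h1, t1), (h2, t2) in combinations(profile.items(), 2):
        score = SequenceMatcher(None, normalize(t1), normalize(t2)).ratio()
        if score >= threshold:
            pairs.append((h1, h2))
    return pairs


# Hypothetical handles and titles, for illustration only.
profile = {
    "RePEc:aaa:wpaper:1": "Money, Prices and Growth: A Re-examination",
    "RePEc:bbb:journl:2": "Money, prices, and growth - a reexamination",
    "RePEc:aaa:wpaper:3": "Trade Costs and Regional Integration",
}
matches = likely_versions(profile)
print(matches)
```

Restricting the comparison to one profile keeps the number of title pairs small and makes a near match between two titles much more likely to be a genuine pair of versions.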


Suggestion box

May 23, 2009

RePEc is entirely driven by volunteers, who are also users. Most current volunteers came to RePEc either because they wanted to help with a current project or because they had some idea they wanted implemented in RePEc. We are opening this suggestion box for several reasons: as a way to encourage feedback, to encourage more volunteers to come forward and pick a suggestion, and finally to have users and RePEc team members discuss the proposed suggestions.

At RePEc, we like to be open. After all, we are creating open bibliographies using open source software, and we encourage open access. RePEc is there for you, so tell us how you want it to be. So, make your suggestion in the comment section below.


Institutional data in RePEc

December 19, 2008

RePEc gathers information not only about publications and authors, but also about institutions. Specifically, the EDIRC project (Economics Departments, Institutes and Research Centers) has catalogued since 1995 all academic and government institutions that employ a significant share of economists, including think tanks and associations. For-profit organizations (banks, consultants, etc.) are listed if they contribute their publications to RePEc. As of today, 11,000 institutions are listed, including over 600 associations. Over 4000 have at least one registered author and about 1000 have some publication in RePEc.

The collected institutional data is used and displayed in various ways throughout RePEc. Authors use it when they register to determine their affiliations. So do RePEc archives for their publications. Author and institution data are combined on EDIRC to compile the publication output of all institutions. This output is in turn combined with citation data from CitEc and download data from LogEc to determine institutional rankings.

Note that all the information about institutions has been gathered with the help of a lot of people.


RePEc as a bibliographic tool

September 14, 2008

RePEc is a scheme to collect bibliographic information about publications and pre-publications in Economics. Publishers provide all the relevant information, which is then displayed in various ways by RePEc services. This gives users access to the data. While it is useful to find items of research while browsing or searching through these services, it is even better when one can import the relevant bibliographic data directly into one’s bibliographic tool.

Every abstract page on IDEAS has links for downloading such bibliographic information in various formats: as an HTML citation, a plain text citation, the BibTeX entry familiar to LaTeX users, the RIS format used in various software like EndNote, and the ReDIF format used by RePEc. For registered authors, it is also possible to obtain these records for all their publications in one download. If other formats are used in the research community, they can be provided as well. Just ask.
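To give an idea of what two of these formats look like, here is a minimal sketch that renders one record as a BibTeX entry and as an RIS record. The record itself, its field names and the choice of the entry types (techreport for BibTeX, RPRT for RIS) are assumptions made for illustration; the files that IDEAS actually produces may contain more fields and differ in detail.

```python
def to_bibtex(rec):
    """Render a working paper record as a BibTeX @techreport entry."""
    authors = " and ".join(rec["authors"])
    lines = [
        f"@techreport{{{rec['key']},",
        f"  author = {{{authors}}},",
        f"  title = {{{rec['title']}}},",
        f"  institution = {{{rec['institution']}}},",
        f"  year = {{{rec['year']}}},",
        "}",
    ]
    return "\n".join(lines)


def to_ris(rec):
    """Render the same record in RIS: one tagged line per field,
    one AU line per author, closed by an ER line."""
    lines = ["TY  - RPRT"]
    lines += [f"AU  - {author}" for author in rec["authors"]]
    lines += [f"TI  - {rec['title']}", f"PY  - {rec['year']}", "ER  - "]
    return "\n".join(lines)


# Hypothetical record, for illustration only.
record = {
    "key": "doe2008",
    "authors": ["Doe, Jane", "Roe, Richard"],
    "title": "Trade Costs and Regional Integration",
    "institution": "Example University, Department of Economics",
    "year": 2008,
}
print(to_bibtex(record))
print(to_ris(record))
```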


NEP alerts now available through RSS

August 13, 2008

NEP (New Economics Papers) is an email service that alerts subscribers to new online working papers in their area of interest. About 80 fields are currently available, and the roughly weekly emails are sent free of charge. While the RePEc team thought email dissemination was sufficient, there also appears to be demand for RSS feeds, like those offered by this and other blogs. These are now available, and the RSS feeds can be subscribed to by clicking on the relevant field report on the NEP home page.

This new feature was added in typical RePEc fashion: David Hugh-Jones inquired with Marco Novarese why there was no RSS feed, Thomas Krichel encouraged David to set it up, and two days later, it was up.

If you think new features should be added to RePEc, we always welcome suggestions, especially if you are willing to implement them yourself… much like many of the NEP editors, who are volunteers who just wanted a particular field to be covered.


Using RePEc for syllabi, bibliographies and publication lists

July 13, 2008

As highlighted in a recent post, we encourage deep linking in RePEc services. This is particularly useful for reading lists and syllabi. In fact, IDEAS provides simple tools to create such lists on its web site.

The first one creates reading lists from code that is similar to HTML and includes the handles of items listed in RePEc. Each of these items is then automatically matched with other versions, which makes it possible to find a free version of a password-protected article, or the latest version of a working paper as published in a journal. Different layouts are possible: one for a course syllabus, one for reading lists.

The second one creates a list of publications from a set of authors registered on RePEc. Existing examples include ex-pats from some countries, graduates from programs, winners of prizes, etc. Note that such lists are automatically computed for members of research units or departments; see the listings on EDIRC. For other lists, this tool comes in handy.


Why hotlinking to a RePEc service makes sense

June 27, 2008

Hotlinking is the practice of linking to a web page deep in a web site, instead of its front page. This practice is discouraged by many news sites, both because they prefer users to browse through the site and because links may become obsolete.

At RePEc, we actually encourage hotlinking. Links in RePEc services are designed to stay current (in principle). Also, abstract pages on EconPapers or IDEAS are much more stable than a PDF file on a researcher’s web page, which may disappear. In addition, these abstract pages may provide links to other versions of the paper. This proves particularly useful if the user does not have access to a password-protected article from a commercial publisher, or if the user wishes to know whether the paper has been published. Other links on the abstract page can also be valuable, like those to author profiles, references, citations and related works. Finally, authors always appreciate it when paper downloads are counted towards their statistics. Indeed, RePEc can only monitor traffic routed through its services.

Therefore, we encourage hotlinks to RePEc services on blogs, online syllabi, personal web pages, online bibliographies, etc.


The citation extraction process in CitEc

January 16, 2008

CitEc is an experimental autonomous citation index, that is, it is a software system which is able to automatically extract references out of the full texts of documents and create links between citing references and cited papers.

With its latest update, the CitEc database has reached almost three million references and more than one million citations between documents available in RePEc. This is an important threshold, but it is still far from being a complete set of citations. There are some limits in the reference extraction process:

First, the system needs to have open access to an electronic version of the document’s full text. Many journals listed in RePEc have restricted access and are therefore excluded from CitEc unless they grant special access or push the citations to RePEc in other ways. We are working with some publishers that kindly provide us with metadata about references. We try to get as many publishers as possible on board, but unfortunately not all of them are willing to collaborate with us at this time. As a result, the data set is still made up mainly of references extracted from working papers. This has the advantage of providing the most up-to-date citation data, since working papers contain the most recent research results.

Second, the URL provided by the RePEc archive maintainer must be correct and must point to the PDF file containing the document’s full text, not to an intermediate abstract page or similar. Some archives provide this kind of link to force researchers to pass through their institutional web pages. The system is unable to follow the links to the hidden papers, which are therefore missed in the reference extraction process.

The third limit is more technical. In order to extract references, the PDF files need to be converted into plain ASCII text. This step is key to successfully completing the process, since a good-quality text representation of the document makes the identification of references easier. There is a wide variety of PDF files created in different ways, and not all of them can be converted.

Finally, the system parses the references section, which first needs to be isolated, to identify each reference and split it into its parts: title, author, year, etc. The parsing is done using pattern matching techniques, which in some cases are not able to identify the full list of existing references.
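The two steps just described, isolating the references section and splitting each reference by pattern matching, can be sketched as follows. This is only an illustration and not the CitEc code: the heading names, the single "Authors (Year). Title." pattern and the sample text are all assumptions. It also shows how a reference written in a different style escapes the pattern.

```python
import re

# Assumed heading names that mark the start of the references section.
HEADING = re.compile(r"^[ \t]*(references|bibliography)[ \t]*$",
                     re.IGNORECASE | re.MULTILINE)

# One illustrative pattern: "Authors (Year). Title. Rest of the reference".
REFERENCE = re.compile(
    r"^(?P<authors>.+?)\s*\((?P<year>\d{4})[a-z]?\)[.,:]?\s*"
    r"(?P<title>[^.]+)\."
)


def references_section(text):
    """Return everything after the references heading, or '' if none is found."""
    match = HEADING.search(text)
    return text[match.end():] if match else ""


def parse_reference(line):
    """Split one reference into authors, year and title; None if the pattern fails."""
    match = REFERENCE.match(line.strip())
    if match is None:
        return None
    return {key: value.strip() for key, value in match.groupdict().items()}


# Hypothetical document text, for illustration only.
text = """1 Introduction
Some body text.

References
Doe, J. and Roe, R. (2005). Trade costs and growth. Journal of Examples, 12(3), 1-20.
Roe, R., Trade costs, 2005
"""
lines = [line for line in references_section(text).splitlines() if line.strip()]
parsed = [parse_reference(line) for line in lines]
print(parsed)
```

The second sample reference has no year in parentheses, so the pattern does not match and the reference is missed, which is the kind of failure mentioned above.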

As of the last update on December 31, 2007, the CitEc numbers are as follows: 527,357 articles and working papers are available in RePEc. Of these, 343,441 cannot be processed by the system due to the limitations mentioned in the first two points above, namely:

  • 101,886 have no electronic version
  • 216,110 have restricted access
  • 19,174 have no direct link to the document’s full text
  • 6,271 have a wrong URL

That leaves 183,916 documents available to be processed by CitEc. Of these, the process was successfully completed for 134,130 papers, that is, 73% of the available documents. The complete list of sources and the number of processed documents for each series or journal is available here.
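The figures above can be checked with a few lines of arithmetic. The sketch uses only the numbers quoted in this post; the variable names are ours.

```python
total = 527_357  # articles and working papers available in RePEc

# Documents that cannot be processed, by reason.
excluded = {
    "no electronic version": 101_886,
    "restricted access": 216_110,
    "no direct link to the full text": 19_174,
    "wrong URL": 6_271,
}

unprocessable = sum(excluded.values())  # should equal the 343,441 quoted above
available = total - unprocessable       # documents CitEc can attempt to process
processed = 134_130                     # documents processed successfully
success_rate = processed / available

print(unprocessable, available, f"{success_rate:.0%}")
```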

All the previous considerations should be taken into account when CitEc data is used for scientific evaluation purposes. We still consider the data to be experimental.

From the point of view of RePEc archive maintainers there are a few basic steps they can take to improve the situation. For example:

  • provide direct and correct URLs to the document’s full text
  • make use of the X-File-Ref to give the system an ASCII version of the references section of a particular document
  • help us to lobby the publishers and editors of the restricted journals asking them to send us metadata about references.