Why discussion paper archives should not allow the removal of items

August 20, 2011

The archives listed in RePEc differ in their policies regarding withdrawal of items, or replacement of an old item by a newer one. Some archives, like NBER, permit withdrawals and replacements, while others, like IZA or MPRA, permit neither withdrawals nor replacements. (ArXiv, the leading archive for physics, has adopted a no-withdrawal policy as well.)

I manage MPRA, which publishes unrefereed discussion papers in economics. In the following, I detail the reasoning underlying MPRA’s policy choice. As the case for prohibiting withdrawals seems to be strong, it is hoped that other RePEc archives will adopt a similar policy if they have not done so already.

Discussion papers are preliminary versions of articles that may appear in their final form in the future. Discussion of these preliminary versions serves to improve them.

Discussion of a discussion paper requires that it can be cited. Citation requires that you can find the cited item, and even the cited phrase at the page given in the citation. In short: The cited item must remain reliably unchanged and retrievable.

In the old days, you mailed typed manuscripts to colleagues, and successively revised your papers in response to their suggestions and criticism. This entailed the problem that your colleagues would refer to different versions. In order to correctly grasp their points, you had to keep track of the different versions you had mailed around. (I never managed.) With a stable Internet address for each version, this tracking can be done over the Internet with ease. Permitting substitution of old versions by new versions under the same Internet address would invite confusion and would make citations unreliable.

So the alternative seems to be: Either you keep your papers private and have your discussion in the form of private correspondence, or you put them on the Net for public discussion. The second alternative is implied by placing the paper in a discussion paper archive, and this seems to require that identifiable versions remain accessible concurrently.

In addition, there are further reasons for favoring a “no withdrawal” policy by archive maintainers.

– If the final version of a paper ends up in a toll-gated journal, this excludes the majority of economists from reading the final version. The presence of a preliminary version mitigates the problem.

– If the preliminary version is referred to by a hyperlink and is then withdrawn, the reference becomes largely useless. NEP reports will, for instance, show dead links in such cases. This is a nuisance.

– If problems about priority of findings arise, these may be settled more easily if all versions are available on the Net.

– For archive maintainers, the manual handling of withdrawals requires considerable work. This speaks against the possibility of withdrawals as well. (For large archives, this reason is overwhelming. At MPRA we initially permitted withdrawals, but this proved impracticable and provided the proximate cause for adopting the no-withdrawal policy.)

– Further, the fight against plagiarism is eased by adopting a no-withdrawal policy. Typically, plagiarists ask for removal of their contribution if detection is imminent. This tends to obscure the case. If a plagiarized item remains in the archive, the case remains transparent. If an item is identified as plagiarized, it is to be marked as such, and the original source indicated. This has additional advantages:

– the interested reader is referred to the original source

– the plagiarist cannot undo the plagiarism and thereby hide the offense from scrutiny by potential future employers

– because of that threat, plagiarism becomes more risky and is discouraged.

– problems with plagiarism may be settled more easily and be handled more transparently if all versions are available on the Net. Otherwise, a paper may be plagiarized, the original paper substituted by a revised version, and priority will go to the plagiary, while the revised version will be counted as a result of plagiarism! This ought to be avoided.

The common objection against a no withdrawal policy is that authors would prefer readers to read the newest version. Yet RePEc provides information about all versions, and the metadata at IDEAS or EconPapers provide alerts about other existing versions. So the readers may choose the most recent one. (Such problems occur all the time, but it would be impractical to introduce the possibility of withdrawing everything, including published papers. For example, I have recently updated a paper published in a journal in 2008 and would like to refer the reader to the new version in the format of a discussion paper which contains important improvements and new material, but there is no way to do that, other than hoping that the reader searches through RePEc or sees the different versions in Google.)

There is, thus, a conflict between the interest of the author to have only his or her favorite version on the Net, and the public that is interested in transparency and unmanipulated documentation. At MPRA, we try to take account of that by indicating if a paper is superseded by a newer version. Further, we offer the possibility to watermark papers as withdrawn by the author, but leave them in the archive.
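
The policy just described (items may be flagged as superseded or withdrawn, but never removed or replaced) can be illustrated with a small sketch. This is not MPRA's software; the class names, fields and method names below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Item:
    item_id: int
    title: str
    superseded_by: Optional[int] = None  # id of a newer version, if any
    withdrawn_by_author: bool = False    # watermark flag; the item itself stays


@dataclass
class Archive:
    """Hypothetical append-only archive: no delete and no replace operation."""
    items: Dict[int, Item] = field(default_factory=dict)

    def deposit(self, title: str, supersedes: Optional[int] = None) -> Item:
        """Store a new item under a fresh id; optionally flag an older item."""
        if supersedes is not None and supersedes not in self.items:
            raise KeyError(f"unknown item id {supersedes}")
        new_id = len(self.items) + 1  # ids are never reused, as nothing is deleted
        item = Item(new_id, title)
        self.items[new_id] = item
        if supersedes is not None:
            self.items[supersedes].superseded_by = new_id
        return item

    def mark_withdrawn(self, item_id: int) -> None:
        """Watermark an item as withdrawn by the author without removing it."""
        self.items[item_id].withdrawn_by_author = True

    def get(self, item_id: int) -> Item:
        """Every id ever issued remains retrievable."""
        return self.items[item_id]
```

In this sketch a revision is always a new deposit with its own id, so a citation of the first version still resolves to the first version, with a pointer to the newer one.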


Three new fields covered by NEP

July 25, 2011

NEP (New Economics Papers) is the RePEc service in charge of disseminating recent working papers that are available online. This dissemination occurs through email lists and RSS feeds. Given the large number of them, about 400-500 a week, they are split into field specific reports, each headed by an editor who chooses what is relevant to the field of interest, aided by an expert system. About 90 fields are currently covered, and volunteers are welcome to edit any area that is currently not represented.

We take this opportunity to highlight three new reports that have recently been opened:

  • NEP-DEM (Demographic Economics), edited by Clarence Nkengne Tsimpo (Université de Montréal and World Bank). Note that there is also a report for migration (NEP-MIG).
  • NEP-IUE (Informal and Underground Economics), edited by Catalina Granda Carvajal (Universidad de Antioquia).
  • NEP-LMA (Labor Markets: Supply, Demand, and Wages), edited by Erik Jonasson (Lund University). There is also a general labor economics report (NEP-LAB) and one dedicated to unemployment, inequality and poverty (NEP-LTV).

Subscriptions are of course free, as is everything in RePEc. Details, including those for the many other reports, are available at NEP.


NEP: 30000 reports and going

November 24, 2010

NEP (New Economics Papers) is an important element in the collection of services that use RePEc data. It disseminates through email and RSS weekly reports about new working papers in 85 different fields, each compiled by a volunteer editor. This project has recently surpassed 30000 reports sent since 1998. It currently has over 60000 email subscriptions from close to 30000 unique email addresses, and has announced over 150000 papers, each in two field reports on average.

The quantity of information digested by this project has grown considerably over the years. Currently about 500 new papers a week are analyzed, a number too large for editors to manage. Thus several years ago an expert system was put in place that learns from the choices of the editors and offers them every week the complete list of papers for selection, but with the most likely choices placed first. It is remarkable how well this works, thereby saving our volunteers considerable time.
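
To give a feel for the idea of learning from past editor choices and showing the likeliest papers first, here is a deliberately naive sketch. It is not the expert system used by NEP, whose method is not described here. It assumes a simple word-count score on titles, and all function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def tokenize(title: str) -> List[str]:
    """Lowercase the title and keep words longer than three characters."""
    return [w.strip(".,:;()").lower() for w in title.split() if len(w) > 3]


def learn_weights(selected: Iterable[str], rejected: Iterable[str]) -> Counter:
    """Words from selected titles count +1, words from rejected titles -1."""
    weights = Counter()
    for title in selected:
        weights.update(tokenize(title))
    for title in rejected:
        weights.subtract(tokenize(title))
    return weights


def rank(candidates: Iterable[str], weights: Counter) -> List[str]:
    """Return the complete list of candidates, most likely choices first."""
    return sorted(
        candidates,
        key=lambda title: sum(weights[w] for w in tokenize(title)),
        reverse=True,
    )


# Usage: past choices of a (hypothetical) demographic economics editor.
weights = learn_weights(
    selected=["Migration and fertility in Europe"],
    rejected=["Monetary policy rules"],
)
ranked = rank(
    ["Optimal monetary policy", "Fertility decline and migration flows", "Trade costs"],
    weights,
)
```

As in the description above, nothing is hidden from the editor: the full list is returned and only its order changes.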

Volunteers are still welcome, for example to help with the general management of the project, help with existing reports or open new reports in fields not yet covered. Interested people should contact Marco Novarese.


NEP: Dissemination of new research through email and RSS

September 26, 2010

NEP (New Economics Papers) is a free, email-based service that disseminates weekly new working papers appearing in RePEc, currently 300-500 a week. It has 85 different, field-specific mailing lists, each managed by a volunteer editor who determines which papers are relevant for his/her field, with the help of an expert system.

So far, 150,000 papers have been disseminated in about 30,000 reports, each paper being on average presented in two different reports. The subscriber base is close to 30,000 people, with over 60,000 subscriptions. NEP also offers RSS feeds, as well as some blogs discussing research (see sidebar). Of course, like everything in RePEc, all of this is free and supported by volunteers.
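
As a back-of-the-envelope check on what these figures imply, the arithmetic can be written out as follows. The inputs are the rounded numbers quoted above ("about", "close to", "over"), so the results are only rough orders of magnitude; the variable names are ours.

```python
papers = 150_000          # papers disseminated so far
reports_per_paper = 2     # average number of reports presenting each paper
reports_sent = 30_000     # reports issued so far
subscriptions = 60_000    # current subscriptions
subscribers = 30_000      # current subscribers

# Each paper counts once for every report it appears in.
announcements = papers * reports_per_paper

# Average number of papers announced in a single report.
papers_per_report = announcements / reports_sent

# Average number of field reports each subscriber follows.
subscriptions_per_subscriber = subscriptions / subscribers

print(papers_per_report, subscriptions_per_subscriber)
```

With these rounded inputs, a report carries roughly ten papers and a subscriber follows roughly two reports.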

We encourage you to use these services, and also to volunteer to help with running NEP, for example, if your field is not yet covered. NEP is currently headed by Marco Novarese and hosted by SUNY Oswego. Thomas Krichel and William Goffe offer technical support.


A few new features on RePEc services

August 30, 2010

RePEc services display the data collected through RePEc to end-users, be it through the web or email. These services constantly improve with new features. We recently reported about some that users may have overlooked. Here are some new ones:

  • On EconPapers, abstract pages now include a “share” button, which makes it easy to share or save the page with hundreds of other services.
  • There are now RSS feeds for new citations, for specific articles, papers, series and journals. EconPapers and IDEAS provide links on their pages to the feeds. There are no feeds for authors, as they have been receiving this information for years through email (if registered).
  • A big part of RePEc is driven by user submissions, and here is a nice example: a script that parses NEP reports, downloads the PDF files and puts the references in a BibTeX file.
  • Another user contribution, not new but I forgot it last time: RePEcfb, a Facebook application that displays your latest works in your profile.


Improved usage statistics for RePEc

August 6, 2010

Usage statistics for RePEc services are collected by LogEc. Producing meaningful statistics for accesses to web servers is a difficult task, especially so since we are merging data from several different sites. Rather than just counting the number of times a page or file is accessed (by a human or a piece of software indexing the web) the goal is to get as close as possible to a measure of the number of people showing an interest in a paper by reading the abstract page or downloading the full text file.

We have always applied very strict criteria for what should be counted as a download or abstract view, but over time it has become clear that the simple filtering for robots and removal of double clicks is not enough. Many new practices have developed on the web, some for a good purpose, some for a more questionable purpose. There are spam-bots, referer spamming (a stupid idea if there ever was one), anti-malware software that checks links on a webpage and warns users about dangerous links, and much, much more that should not be counted. And, yes, there appears to be the occasional attempt to manipulate the statistics.

Starting from July 2010 we apply an additional set of heuristics to filter out these accesses. In conjunction with this we have also recalculated the statistics going back to January 2008. The overall effect is relatively small but there are substantial reductions in the number of accesses for a small number of papers.
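
For readers curious what the basic filtering mentioned above (robots and double clicks) might look like, here is a minimal sketch. It is not LogEc's code and does not attempt the additional heuristics, which are not detailed here. The record layout, the robot markers and the ten-second window are all assumptions made for illustration.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Access(NamedTuple):
    ip: str
    paper: str
    timestamp: float  # seconds since an arbitrary starting point
    user_agent: str


ROBOT_MARKERS = ("bot", "crawler", "spider")  # assumed markers, not LogEc's list
DOUBLE_CLICK_WINDOW = 10.0                    # seconds; an assumed threshold


def count_downloads(accesses: Iterable[Access]) -> Dict[str, int]:
    """Count accesses per paper, skipping robots and rapid repeat clicks."""
    last_seen: Dict[Tuple[str, str], float] = {}
    counts: Dict[str, int] = {}
    for a in sorted(accesses, key=lambda a: a.timestamp):
        # Skip anything whose user agent looks like indexing software.
        if any(marker in a.user_agent.lower() for marker in ROBOT_MARKERS):
            continue
        key = (a.ip, a.paper)
        previous = last_seen.get(key)
        last_seen[key] = a.timestamp
        # Skip a repeat of the same paper from the same address within the window.
        if previous is not None and a.timestamp - previous < DOUBLE_CLICK_WINDOW:
            continue
        counts[a.paper] = counts.get(a.paper, 0) + 1
    return counts


# Usage: five raw accesses to one paper.
log = [
    Access("1.1.1.1", "p1", 0.0, "Mozilla/5.0"),
    Access("1.1.1.1", "p1", 2.0, "Mozilla/5.0"),    # double click
    Access("2.2.2.2", "p1", 5.0, "Googlebot/2.1"),  # robot
    Access("3.3.3.3", "p1", 6.0, "Mozilla/5.0"),
    Access("1.1.1.1", "p1", 100.0, "Mozilla/5.0"),  # genuine later return
]
downloads = count_downloads(log)
```

Here the five raw accesses reduce to three counted downloads, which is the sense in which a filtered count aims to measure interested readers instead of raw hits.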

More information at LogEc.


RePEc Author Service reaches major mark

August 4, 2010

The RePEc Author Service has just welcomed the 25,000th author! This service allows economists to build an online profile with all the works they have authored and that are listed in RePEc. Apart from having this profile displayed and linked to from individual works on RePEc services like EconPapers and IDEAS, this allows authors to obtain monthly statistics about the popularity of their works, along with new citations discovered by the CitEc project. Collected data is also used to compute various rankings. Note that the 25,000 count only includes registered people who have at least one work listed in the profile. There are about 7,000 other registrations with empty profiles from people who have either overlooked this feature or not yet published any works. A listing of all registered authors is available on EconPapers and IDEAS.

RePEc currently lists 940,000 works from close to 3000 working paper series and 1150 journals, among others, contributed by over 1200 archives. It has become the standard bibliographic database in Economics, with RePEc services recording the 50 millionth download during July 2010. All RePEc activities are driven by volunteers as RePEc is not funded.

