Why discussion paper archives should not allow the removal of items

August 20, 2011

The archives listed in RePEc differ in their policies regarding withdrawal of items, or replacement of an old item by a newer one. Some archives, like NBER, permit withdrawals and replacements, while others, like IZA or MPRA, permit neither withdrawals nor replacements. (ArXiv, the leading archive for physics, has adopted a no-withdrawal policy as well.)

I am managing MPRA, which publishes unrefereed discussion papers in economics. In the following, I detail the reasoning underlying MPRA’s policy choice.  As the case for prohibiting withdrawals seems to be strong, it is hoped that other RePEc archives adopt a similar policy if they have not done so already.

Discussion papers are preliminary versions of articles that may appear in their final form in the future. Discussion of these preliminary versions serves to improve them.

Discussion of a discussion paper requires that it can be cited. Citation requires that you can find the cited item, and even the cited phrase at the page given in the citation. In short: The cited item must remain reliably unchanged and retrievable.

In the old days, you mailed typed manuscripts to colleagues, and successively revised your papers in response to their suggestions and criticism. This entailed the problem that your colleagues would refer to different versions. In order to correctly grasp their points, you had to keep track of the different versions you had mailed around. (I never managed.) With a stable Internet address for each version, this tracking can be done over the Internet with ease. Permitting substitution of old versions by new versions under the same Internet address would invite confusion and would make citations unreliable.

So the alternative seems to be: Either you keep your papers private and have your discussion in the form of private correspondence, or you put them on the Net for public discussion. The second alternative is implied by placing the paper in a discussion paper archive, and this seems to require that identifiable versions remain accessible concurrently.

In addition, there are further reasons for favoring a “no withdrawal” policy by archive maintainers.

– If the final version of a paper ends up in a toll-gated journal, this excludes the majority of economists from reading the final version. The presence of a preliminary version mitigates the problem.

– If a preliminary version that is referred to by a hyperlink is withdrawn, the reference becomes largely useless. NEP reports will, for instance, show dead links in such cases. This is a nuisance.

– If problems about priority of findings arise, these may be settled more easily if all versions are available on the Net.

– For archive maintainers, the manual handling of withdrawals requires considerable work. This speaks against the possibility of withdrawals as well. (For large archives, this reason is overwhelming. At MPRA we initially permitted withdrawals, but this proved impracticable and provided the proximate cause for adopting the no-withdrawal policy.)

– Further, the fight against plagiarism is eased by adopting a no-withdrawal policy. Typically, plagiarizers ask for removal of their contribution if detection is imminent. This tends to obscure the case. If a plagiarized item remains in the archive, the case remains transparent. If an item is identified as plagiarized, it is to be marked as such, and the original source indicated. This has additional advantages:

  – the interested reader is referred to the original source

  – the plagiarizer cannot undo the plagiarism and thereby hide the offense from scrutiny by potential future employers

  – because of that threat, plagiarism becomes more risky and is discouraged.

– problems with plagiarism may be settled more easily and be handled more transparently if all versions are available on the Net. Otherwise, a paper may be plagiarized, the original paper substituted by a revised  version, and priority will go to the plagiary, while the revised version will be counted as a result of plagiarism! This ought to be avoided.

The common objection against a no withdrawal policy is that authors would prefer readers to read the newest version. Yet RePEc provides information about all versions, and the metadata at IDEAS or EconPapers provide alerts about other existing versions. So the readers may choose the most recent one. (Such problems occur all the time, but it would be impractical to introduce the possibility of withdrawing everything, including published papers. For example, I have recently updated a paper published in a journal in 2008 and would like to refer the reader to the new version in the format of a discussion paper which contains important improvements and new material, but there is no way to do that, other than hoping that the reader searches through RePEc or sees the different versions in Google.)

There is, thus, a conflict between the interest of the author in having only his or her favorite version on the Net, and that of the public, which is interested in transparency and unmanipulated documentation. At MPRA, we try to take account of that by indicating if a paper is superseded by a newer version. Further, we offer the possibility to watermark papers as withdrawn by the author, but leave them in the archive.


Three new fields covered by NEP

July 25, 2011

NEP (New Economics Papers) is the RePEc service in charge of disseminating recent working papers that are available online. This dissemination occurs through email lists and RSS feeds. Given the large number of them, about 400-500 a week, they are split into field specific reports, each headed by an editor who chooses what is relevant to the field of interest, aided by an expert system. About 90 fields are currently covered, and volunteers are welcome to edit any area that is currently not represented.

We take this opportunity to highlight three new reports that have recently been opened:

  • NEP-DEM (Demographic Economics), edited by Clarence Nkengne Tsimpo (Université de Montréal and World Bank). Note that there is also a report for migration (NEP-MIG).
  • NEP-IUE (Informal and Underground Economics), edited by Catalina Granda Carvajal (Universidad de Antioquia).
  • NEP-LMA (Labor Markets: Supply, Demand, and Wages), edited by Erik Jonasson (Lund University). There is also a general labor economics report (NEP-LAB) and one dedicated to unemployment, inequality and poverty (NEP-LTV).

Subscriptions are of course free, as is everything in RePEc. Details are available at NEP, including for the many other reports.


RePEc now indexes over one million works

January 25, 2011

Over the last weekend, RePEc reached a historic mark: one million works in Economics and neighboring sciences are now indexed, of which 87.5% are available for download. The bibliographic database consists of 59.2% journal articles, 38.5% working papers, 1.3% book chapters, 0.8% books, and 0.2% software components. All this material has been indexed by volunteers maintaining close to 1300 archives. As RePEc bears no costs, all the data is made available for free.

When RePEc started in June 1997, it built on a stock of metadata with 40,000 entries from its precursor NetEc, which started in 1992. Since then, data holdings have grown at an ever-increasing pace:

Items        Date
100,000      August 2000
200,000      July 2003
300,000      January 2005
400,000      July 2006
500,000      September 2007
600,000      June 2008
700,000      January 2009
800,000      September 2009
900,000      April 2010
1,000,000    January 2011

The data collected by RePEc is used by a large number of free core services, including EconPapers, EconomistsOnline, IDEAS, NEP and Socionet. Other services that use RePEc data, though without reporting back usage statistics, include Econlit, Google Scholar, Inomics, Microsoft Academic Search, and Worldcat, among others.


NEP: 30000 reports and going

November 24, 2010

NEP (New Economics Papers) is an important element in the collection of services that use RePEc data. It disseminates through email and RSS weekly reports about new working papers in 85 different fields, each compiled by volunteer editors. This project has recently surpassed 30000 reports sent since 1998. It currently has over 60000 email subscriptions from close to 30000 unique email addresses, and it has announced over 150000 papers, each appearing on average in two field reports.

The quantity of information digested by this project has grown considerably over the years. Currently about 500 new papers a week are analyzed, a number too large for editors to manage. Thus several years ago an expert system was put in place that learns from the choices of the editors and offers them every week the complete list of papers for selection, with the most likely choices placed first. It is remarkable how well this works, thereby saving our volunteers considerable time.
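To make the idea of "learning from the editors' choices" concrete, here is a deliberately naive sketch. It is not NEP's actual expert system, whose method is not described in this post; the word-count scoring and the names `learn_weights` and `rank_papers` are assumptions made purely for illustration. The one property it shares with the description above is that the complete list is returned, only reordered.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def learn_weights(past_choices: Iterable[Tuple[str, bool]]) -> Counter:
    """Add 1 per word in titles the editor kept, subtract 1 per word in titles skipped."""
    weights: Counter = Counter()
    for title, selected in past_choices:
        for word in set(title.lower().split()):
            weights[word] += 1 if selected else -1
    return weights


def rank_papers(titles: Iterable[str], weights: Counter) -> List[str]:
    """Return every title, likeliest selections first; nothing is dropped."""
    def score(title: str) -> int:
        return sum(weights[w] for w in set(title.lower().split()))
    return sorted(titles, key=score, reverse=True)


# Hypothetical editing history and new arrivals, for illustration only.
past = [
    ("Migration and wages", True),
    ("Wages and unemployment", True),
    ("Monetary policy rules", False),
]
new_titles = [
    "Optimal monetary policy",
    "Minimum wages and migration",
    "Trade costs",
]
ranked = rank_papers(new_titles, learn_weights(past))
```

A real system would presumably use abstracts, authors and classification codes as well as titles, and a proper statistical model instead of raw word counts.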

Volunteers are still welcome, for example to help with the general management of the project, help with existing reports or open new reports in fields not yet covered. Interested people should contact Marco Novarese.


NEP: Dissemination of new research through email and RSS

September 26, 2010

NEP (New Economics Papers) is a free, email-based service that disseminates weekly new working papers appearing in RePEc, currently 300-500 a week. It has 85 different, field-specific mailing lists, each managed by a volunteer editor who determines which papers are relevant for his/her field, with the help of an expert system.

So far, 150,000 papers have been disseminated in about 30,000 reports, each paper being on average presented in two different reports. The subscriber base is close to 30,000 people, with over 60,000 subscriptions. NEP also offers RSS feeds, as well as some blogs discussing research (see sidebar). Of course, as with everything in RePEc, the service is free and supported by volunteers.

We encourage you to use these services, and also to volunteer to help with running NEP, for example, if your field is not yet covered. NEP is currently headed by Marco Novarese and hosted by SUNY Oswego. Thomas Krichel and William Goffe offer technical support.


About author rights

February 17, 2010

Authors are always very happy when their paper is accepted for publication in a journal, as this shows that their work was deemed important by editors and referees. But they also want to make sure that their work gets read and does not disappear behind a subscription wall. There are several steps an author can take here.

Retain copyright

The author is the copyright holder until this is transferred to someone else. Publishers ask very soon after a paper is accepted for publication that the copyright be transferred to them. Typically, the form asks for all rights, which implies that the author cannot use her own work in other publications or in presentations, even in her own classroom. There are two ways to avoid this: 1) ask for the “other” copyright form, which publishers provide upon request only. This form allows the author to retain certain rights. 2) amend the copyright form. SPARC has developed a standard form that is available here [pdf]. See further details regarding this procedure.

Keep pre-prints online

In many cases, a paper was previously made available online as a working paper. Do not remove it. Indeed, you are the copyright holder and do not have to relinquish this. Even if you did not follow the steps above, in most cases, you can still keep your working paper online. Many publishers have made public that they tolerate, to various degrees, that these pre-prints remain in place. You can check this at SHERPA/RoMEO.

Provide post-prints

You can even archive so-called post-prints. These are accepted versions of your article. Many universities and research funders actually require that post-prints be publicly archived, for example in an institutional repository. In Economics, it is also common to publish an accepted work in a working paper series. Again, to see what publishers officially allow in this respect, see SHERPA/RoMEO. You have more rights, of course, if you took steps to retain them.


Why Journals?

December 16, 2009

When I started studying economics in the ‘sixties, there was no Xerox. Journals were printed, and then mailed. Because printing (type-setting by hand, no computer at that time) was expensive, only selected articles were distributed through journals, and journal editors had to select carefully. Researchers and even students subscribed to journals in order to have articles of interest available; otherwise they had to copy them by hand, or excerpt them, or go to the library to have a look. Distribution by print was the cheapest and most economic way of distributing research.

Hence the journals had a dual function: 1.) They selected research articles and 2.) they distributed them. The first function (selection) was necessary because printing–especially printing of mathematical formulae–was quite expensive. So the bundling of selection and distribution had an economic reason.

This reason has vanished. It is possible to distribute practically for free (through MPRA for instance). So the question is: Do we need journals simply for the purpose of selecting articles, now that the function of distributing articles is redundant? Let me share some thoughts on the issue. I concentrate on research journals, whether open access or not. Survey journals like the Journal of Economic Perspectives or commentary journals like Economists’ Voice are another matter.

Do We Need Quality Stamps?

Some people argue that journals provide a “quality stamp” for scientific contributions, just like rating agencies assess firms or assets. We know that rating agencies may induce unwarranted herding effects, yet the point that journals perform a rating function is true. But is it needed? And if needed, can’t it be provided more cheaply?

As to whether a quality stamp is needed: This may be different for different groups of users. So look at different groups that may benefit from a quality stamp.

Researchers

In my fields of research, I certainly do not use journal names for selecting articles. I search the Web and have my subscription to NEP. Most articles (99%) in top-ranking journals are of no concern to me because they discuss issues I don’t work on and seem too specialized, technical and boring to make it worthwhile to read more than the abstract. But I get the abstract much earlier through NEP and other services. And further, I obtain the articles I am really interested in much earlier (one or two years earlier) over the net than through the journals. (Note that the articles in good journals are typically available on the Net at the time of publication.) If I find an article on the net that I like, typically a pre-print, and see it later published in a good journal, I feel a kind of satisfaction about the journal, but this does not seem to justify the existence of journals.

Further, I am not interested in seeing only the good papers that some referees approve of. As I know my field, I do not think that referees know better. Actually many papers in top journals are not so good, and mediocre journals publish excellent papers. Further, many rejected papers are rejected for reasons such as being badly written, ill organized, or employing faulty reasoning, but they often do contain useful references and interesting ideas, and therefore they interest me as much as a superbly crafted paper elaborating on rather sterile detail.

Yet there may be the benefit coming with having an article revised during the refereeing process. The probability that the mathematics are correct is slightly increased. Typically, the exposition is improved, too. Further, the references are enlarged by adding some quite relevant stuff, but also by adding things suggested by the referees for sundry reasons that hurt overall consistency. But this does not hurt much.

The benefits going with having an article refereed carry side-effects, however: Sometimes the editors’ and referees’ demands make papers worse. In the same vein, have a look at Bruno Frey’s amusing paper, and especially at what he reports about Robert Frank.

Regarding the publishing of my own research I see that publishing in a journal does not affect citations, but making a paper available on the net does so. Hence journal publication is of very limited value to me (but I don’t have to care about the journals I am publishing in because I am close to retirement).

So, overall, I think that researchers do not benefit significantly from journals that publish research papers.

Hiring Committees

A benefit from having quality stamps is that this helps hiring committees to select candidates under conditions of ignorance. This may be true, but I would consider this a dysfunction: In the first place, hiring committees should comprise knowledgeable members; otherwise you would not need hiring committees and leave the decision to bureaucrats; and second, citation numbers are much better indicators for the impact of an author’s work than the journals the author has published in. So ignorant hiring committees may better resort to RePEc citation scores, rather than being enthused by journal titles. (But then they will end up with hiring candidates who work in fields many people work in. So they end up with conventional candidates, rather than creative ones. But this will be the case whenever you have incompetent hiring committees.) In any case, hiring committees won’t need journals, as RePEc citation scores are independent of journal names and do not rely on the existence of journals.

However, the reliance of hiring committees on journal rankings may entail strictly negative consequences. I read, for instance, that Notre Dame University intends to dissolve the department of economic history because the economic historians do not publish in mainstream journals.

It seems to me that hiring committees do not benefit from the existence of journals either.

Libraries

It is sometimes said that journals permit journal rankings, and this is a help for librarians for deciding which journal to subscribe to. This is, of course, not an argument for supporting journals. Without journals, there would be no problem of selecting journals, and the librarians could concentrate on selecting books.

So I conclude that libraries would perform better if we had no journals.

Economics Without Journals

Imagine we had no economics journals. What would happen? Presumably people would write more books. I would consider this an advantage, as knowledge is much too fragmented at the moment. Further, there would be demand for institutions that channel the flow of information better than is possible through journals, such as blogs specializing in some topic or another, and meta-blogs like Econ Academics. I could imagine that collections of papers on certain topics would emerge. The Special Issues feature of the economics E-journal provides an example.

A Suggestion for a Next Step

My impression is that the existence of journals is a feature of the past. Journals will die, and this will be an improvement for academic economics. The process will be sped up if new ways of channeling information are devised. So here is just one idea:

I could imagine devising, for RePEc, a feature that lists the papers related to any given paper. Google Scholar has a feature like that, but I think that it could be improved tremendously for our specific purposes. An easy way would be to look at the works cited by any given paper and list all papers that cite a similar set of works. This could, theoretically, be achieved by building on the citation data created by the CitEc project. If someone with programming expertise could adopt such a project, this would be a great help for economists world-wide. (As a side effect, such a feature would put pressure on Elsevier to release its citation data.)
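For anyone tempted to take this up, the "similar citations" idea can be sketched in a few lines. This is only an illustration under stated assumptions: it measures overlap between reference lists with the Jaccard index (one possible similarity measure among several), the function names are made up, and the paper handles and reference lists are hypothetical placeholders, not CitEc data or its format.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared references divided by all references of both papers (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_papers(
    target: str,
    references: Dict[str, Set[str]],
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    """Rank other papers by how closely their reference lists match the target's."""
    target_refs = references[target]
    scored = [
        (paper, jaccard(target_refs, refs))
        for paper, refs in references.items()
        if paper != target
    ]
    # Keep only papers that share at least one reference with the target.
    scored = [(paper, score) for paper, score in scored if score > 0]
    # Best match first; ties are broken alphabetically by handle.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Hypothetical handles and reference lists, for illustration only.
references = {
    "paper:A": {"ref:1", "ref:2", "ref:3"},
    "paper:B": {"ref:2", "ref:3", "ref:4"},
    "paper:C": {"ref:3", "ref:9"},
    "paper:D": {"ref:7", "ref:8"},
}
result = related_papers("paper:A", references)
```

Here paper B shares two of four distinct references with paper A and paper C shares one of four, so B is listed before C, while D shares none and is left out. Comparing every pair of papers this way would be slow on a million items; a practical version would first look up candidates through an index from cited work to citing papers.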

There are certainly many more suggestions. I am looking forward to seeing them, perhaps in comments to this blog. And certainly my general point must be controversial. I must have overlooked some important aspects. The world cannot be as inefficient as I portray it. Otherwise we would have no journals right now.

Maybe we can have an exchange of ideas.

