Openness of Economic Data and Code

August 9, 2013

The publication of an article or a working paper is only part of the scientific process. Scrutiny by the scientific community during the peer-review process and later through replication attempts and extensions of the original work should be part of it. Unfortunately, very little of that is happening in economics. Indeed, a significant hurdle is that very often the computer code and/or the data used for the analysis are not disseminated. While some journals now make this a requirement for publication, there is otherwise very little incentive for researchers to make them available. In part, this is also a question of culture, as we are not used to citing datasets, for example, and prefer to acknowledge their use in a footnote.

To change this culture and push for making code and data more readily available, the Open Knowledge Foundation has put together a set of Principles on Open Economics. Read them and sign on if you are willing to endorse them.

On the RePEc front, we are working to get datasets indexed as well. If interested in participating in this, contact me.

EconStor: A RePEc Archive for Research from Germany

September 15, 2011

This guest post was written by Jan Weiland.

EconStor is a subject-based repository for economics and business administration maintained by the German National Library of Economics / Leibniz Information Centre for Economics (ZBW). It provides free access to all kinds of scholarly publications, including working and discussion papers, conference papers, journal articles, research reports, and dissertations. The main content so far comes from German research institutions and university departments. But as a disciplinary repository, EconStor of course welcomes any research institution worldwide seeking a reliable storage and publishing infrastructure for its research papers in the field of economics and business administration, especially those institutions without access to a local repository infrastructure.

EconStor’s main objectives are

  • to offer scholarly publications without access restrictions (‘Open Access’),
  • to assure free and durable accessibility via fixed and stable links (‘Persistent Identifier’),
  • to provide consistent bibliographic data (‘Metadata’) like author, title, abstract, keywords, and JEL codes, and
  • to disseminate the publications via databases, search engines and social media.

In order to achieve these goals we decided to make a “Full-Service Offer” to the editors of publications being considered for EconStor: the EconStor team organizes the full-text upload and metadata recording free of charge, based on a publication agreement that is required for copyright reasons.

Besides complete working paper series or e-journals, EconStor is also open for single authors wishing to self-archive their own publications like pre- and post-prints, research reports, or theses. For this purpose we have prepared the ‘special community’ EconStor Direct, separated into collections covering common document types.

For the dissemination of scholarly output in economics, RePEc is an ideal service. We therefore started feeding publications from our repository into the RePEc database in 2006. Further requests followed from other institutions, and gradually the idea developed of building a national RePEc input service, similar to DEGREE for the Netherlands or S-WoPEc for Scandinavia. Although some institutions from Germany already were (and still are) providing their research series to RePEc themselves, there was still enough demand to set up such a national service. First, however, a more flexible repository system had to be implemented. ZBW chose DSpace, still the most widely used repository software in the world.

What were the reasons that led to this decision? First of all, DSpace offers an interface for bulk ingest. This is very helpful when some metadata is already available in a structured format, such as Excel or CSV files, e.g. from conference management tools. Furthermore, it can handle Unicode/UTF-8 encoding (very important for non-Latin characters, e.g. Cyrillic), it uses the Handle System from CNRI as its persistent identifier system by default, and its inherent community-and-collections structure fits our needs best: covering series, journals, and conference proceedings. So it is no surprise that AgEcon Search, a very similar approach in agricultural economics, uses the same software!

The idea of building up a ‘national RePEc input service’ convinced the German Research Foundation (DFG), which decided in 2009 to supply some extra funding for the implementation. The funding enabled us to transfer the RePEc export interface to DSpace and to prepare additional publications for integration into EconStor. These include several ‘back files’ from the early 1990s, which in some cases had originally been published in formats like PostScript, DVI/TeX, or pure HTML, and are now available in PDF on EconStor and in RePEc.

In the meantime, EconStor is hosting the full texts of more than 100 ‘series’ (including conferences and journals) from 75 German research institutions and university departments in RePEc. And with more than 7,500 downloadable items, EconStor is now a major contributor to RePEc. The demand shows that the ‘RePEc input service’ constitutes an important incentive for an institution to participate in EconStor.

Publications from individual authors are also provided to RePEc, e.g. doctoral theses are listed within ZBW’s series ‘EconStor Theses’. And as ‘theses’ are tagged as ‘books’ within this series, those documents will be displayed correspondingly within a personal RePEc author profile. So if you wish to add your PhD thesis to your RePEc profile, listed separately from your papers and articles (see example), you are very welcome to submit your work to EconStor!

Although RePEc is a very important dissemination point for EconStor content, there are some more distribution channels making it potentially interesting to participate in EconStor: All records are fed into EconBiz (ZBW’s search engine for economics and business studies), Google Scholar, BASE (Bielefeld Academic Search Engine) and OAIster. A certain portion of content from EconStor is provided to Economists Online and the Social Science Research Network (SSRN).

Why discussion paper archives should not allow the removal of items

August 20, 2011

The archives listed in RePEc differ in their policies regarding withdrawal of items, or replacement of an old item by a newer one. Some archives, like NBER, permit withdrawals and replacements, while others, like IZA or MPRA, permit neither withdrawals nor replacements. (ArXiv, the leading archive for physics, has adopted a no-withdrawal policy as well.)

I am managing MPRA, which publishes unrefereed discussion papers in economics. In the following, I detail the reasoning underlying MPRA’s policy choice. As the case for prohibiting withdrawals seems to be strong, it is hoped that other RePEc archives will adopt a similar policy if they have not done so already.

Discussion papers are preliminary versions of articles that may appear in their final form in the future. Discussion of these preliminary versions serves to improve them.

Discussion of a discussion paper requires that it can be cited. Citation requires that you can find the cited item, and even the cited phrase at the page given in the citation. In short: The cited item must remain reliably unchanged and retrievable.

In the old days, you mailed typed manuscripts to colleagues, and successively revised your papers in response to their suggestions and criticism. This entailed the problem that your colleagues would refer to different versions. In order to correctly grasp their points, you had to keep track of the different versions you had mailed around. (I never managed.) With a stable Internet address for each version, this tracking can be done over the Internet with ease. Permitting substitution of old versions by new versions under the same Internet address would invite confusion and would make citations unreliable.

So the alternative seems to be: Either you keep your papers private and have your discussion in the form of private correspondence, or you put them on the Net for public discussion. The second alternative is implied by placing the paper in a discussion paper archive, and this seems to require that identifiable versions remain accessible concurrently.

In addition, there are further reasons for favoring a “no withdrawal” policy by archive maintainers.

– If the final version of a paper ends up in a toll-gated journal, this excludes the majority of economists from reading the final version. The presence of a preliminary version mitigates the problem.

– If a preliminary version that is referred to by a hyperlink is withdrawn, the reference becomes largely useless. NEP reports will, for instance, show dead links in such cases. This is a nuisance.

– If problems about priority of findings arise, these may be settled more easily if all versions are available on the Net.

– For archive maintainers, the manual handling of withdrawals requires considerable work. This speaks against the possibility of withdrawals as well. (For large archives, this reason is overwhelming. At MPRA we initially permitted withdrawals, but this proved impracticable and provided the proximate cause for adopting the no-withdrawal policy.)

– Further, the fight against plagiarism is eased by adopting a no-withdrawal policy. Typically, plagiarizers ask for removal of their contribution if detection is imminent. This tends to obscure the case. If a plagiarized paper remains in the archive, the case remains transparent. If an item is identified as plagiarized, it is to be marked as such, and the original source indicated. This has additional advantages:

– the interested reader is referred to the original source

– the plagiarizer cannot undo the plagiarism and thereby hide the offense from scrutiny by potential future employers

– because of that threat, plagiarism becomes more risky and is discouraged.

– problems with plagiarism may be settled more easily and be handled more transparently if all versions are available on the Net. Otherwise, a paper may be plagiarized, the original paper replaced by a revised version, and priority will go to the plagiarized copy, while the revised version will be counted as a result of plagiarism! This ought to be avoided.

The common objection against a no withdrawal policy is that authors would prefer readers to read the newest version. Yet RePEc provides information about all versions, and the metadata at IDEAS or EconPapers provide alerts about other existing versions. So the readers may choose the most recent one. (Such problems occur all the time, but it would be impractical to introduce the possibility of withdrawing everything, including published papers. For example, I have recently updated a paper published in a journal in 2008 and would like to refer the reader to the new version in the format of a discussion paper which contains important improvements and new material, but there is no way to do that, other than hoping that the reader searches through RePEc or sees the different versions in Google.)

There is, thus, a conflict between the interest of the author in having only his or her favorite version on the Net, and that of the public, which is interested in transparency and unmanipulated documentation. At MPRA, we try to take account of that by indicating if a paper is superseded by a newer version. Further, we offer the possibility to watermark papers as withdrawn by the author, but leave them in the archive.

About self-archiving your research

May 15, 2009

When you write a paper, you typically pursue several goals. One is to publish it in a good journal in order to get recognition for your work. The other is to get read and have an impact (and get citations). While publishing in a good journal may help you achieve the second goal, this is not necessarily so as the access to most journal articles is restricted by subscriptions. One way around this is to make some version of your work available in other ways. This is referred to as self-archiving.

This can be done in several ways, greatly helped by the availability of the Internet:

  1. Have a copy on your web page.
  2. Have a copy in your local working paper series.
  3. Have a copy in your institutional repository, usually managed by the library.
  4. Host a copy elsewhere.

The first solution is clearly not efficient, as people would only find your work there by chance. This would also be the case for the other solutions, but there are good ways to make such works more widely available, RePEc being a major one. Indeed, once a working paper series is indexed in RePEc, it will be available in thematic search engines dedicated to Economics (EconPapers and IDEAS), disseminated through mailing lists and RSS (NEP) and further pushed to other indexers (EconLit, Google Scholar, OAIster, etc.). But for this to happen, the working paper series would need to be indexed in RePEc (instructions). The same applies to an institutional repository (see more about that).

If these options are not available, the paper can be hosted elsewhere. For RePEc, the Munich Personal RePEc Archive is ready to accept uploads, and has in a couple of years accepted more than 8000 papers, including quite a few older ones that researchers wanted to make available to anyone. Another option is SSRN, but this archive does not participate in RePEc.

Regarding self-archiving, the most frequently asked question is: am I violating a copyright when uploading a working paper somewhere? The short answer is that in the vast majority of cases, no copyright that you may have signed away to a publisher is violated by uploading a pre-print, i.e., a previous version of your work. In many cases, it is sufficient that the working paper simply does not have the published layout, or that it not be the final version. Many publishers even allow post-prints, that is, uploads of final versions onto institutional repositories, as these are more and more mandated by institutions and sponsors. To check what the policy of each publisher is, consult SHERPA/RoMEO. Only in very rare cases does a working paper need to be withdrawn once published in a journal.

Note that when both a self-archived and a published version of a paper are listed in RePEc with the same title, and both are present in an author’s profile, RePEc will link between them. This allows the reader to find where a working paper was ultimately published, or to read a paper hidden behind a journal’s subscription wall. Thus authors: never remove from your profile works that you have authored.

Finally, for more about self-archiving, check out the Self-Archiving FAQ hosted by e-prints.

The Economics of Open Access Publishing

April 24, 2009

Open Access Publishing is the free distribution of research, whether it is as a pre-print (working paper) or a peer-reviewed article. Since the creation of the web, more and more journals are choosing open access as their business model. One of them was recently Economic Analysis and Policy, published by the Economic Society of Australia (Queensland). To celebrate this, EAP has just published a special issue dedicated to the Economics of Open Access Publishing. Articles are written by economists discussing their experience with open access as well as by others involved in open access publishing. They cover the transition the publishing industry is currently undergoing, the surprisingly low cost of publishing an open access journal, the impact of open access and various open source aspects of open access.

1000 archives participating in RePEc

March 10, 2009

With last week’s additions, RePEc is now carrying bibliographic data from over 1000 archives. This is a good opportunity for a reminder of how data actually makes it to RePEc. Indeed, there is no staff at RePEc inputting data; it is all provided directly by the publishers. These, be they commercial publishers, economics departments, research centers or central banks, put files at a predetermined address on their web or ftp server, following the Guildford protocol. These files follow a set syntax codified by ReDIF (Research Documents Information Format). The RePEc services then gather this bibliographic data on a regular schedule (typically every night) and display it to the public.
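To give a feel for what such files contain, here is a minimal Python sketch that reads a ReDIF-style record made of "Field: value" lines. It is only an illustration under stated assumptions: the sample record, the archive code `aaa` and the series `wpaper` are made up, the helper names `parse_redif` and `field_values` are hypothetical, and the parsing rules are simplified compared with the full ReDIF specification. It is not the code the RePEc services actually run.

```python
# Hypothetical sample record; the archive code "aaa" and all values are invented.
SAMPLE = """\
Template-Type: ReDIF-Paper 1.0
Author-Name: Jane Doe
Author-Name: John Smith
Title: An Example Working Paper
  with a Title Continued on a Second Line
Creation-Date: 2009-03
Handle: RePEc:aaa:wpaper:0901
"""


def parse_redif(text):
    """Split text into templates, each a list of (field, value) pairs.

    Simplifying assumptions: a new template starts at every Template-Type
    line, and an indented line continues the previous field's value.
    """
    templates = []
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        if raw[0].isspace() and current:
            # Indented line: append it to the value of the last field seen.
            field, value = current[-1]
            current[-1] = (field, value + " " + raw.strip())
            continue
        field, sep, value = raw.partition(":")
        if not sep:
            continue  # not a "Field: value" line, ignore it
        field, value = field.strip(), value.strip()
        if field.lower() == "template-type":
            current = []
            templates.append(current)
        if current is not None:
            current.append((field, value))
    return templates


def field_values(template, name):
    """Return all values of a (possibly repeated) field, in file order."""
    return [v for f, v in template if f.lower() == name.lower()]


if __name__ == "__main__":
    for template in parse_redif(SAMPLE):
        print(field_values(template, "Handle"), field_values(template, "Title"))
```

Keeping each template as a list of pairs rather than a dictionary preserves repeated fields such as several authors, which is why `field_values` returns a list.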

Thus, if you are a publisher and want something listed on RePEc, follow our step-by-step instructions. If you are an author unhappy that some of your works are missing, encourage your publisher(s) to participate. Alternatively, have your institution participate with its working papers (most publishers allow pre-prints or post-prints to be posted) or upload your works to the Munich Personal RePEc Archive.

Update (March 13): We have now also surpassed 2500 working paper series…

RePEc archives: AgEcon Search

February 18, 2009

This guest post was written by Julie Kelly and Louise Letnes.

Over the past few months, the papers that make up AgEcon Search have been added to RePEc. All papers are available in full text, and they include working papers, conference papers and articles from smaller journals.

AgEcon Search includes a wide range of topics in applied economics, including agricultural, development, energy, environmental, and resource economics. Over 170 groups from 20+ countries contribute their work. As of early 2009, 27 journals are included.

The journals that are included in AgEcon Search are mostly small press journals with limited circulation, and for many it is the only electronic access that is available. Some have volumes back to the 1940s, and a number obtained small grants for the digitizing of older materials. A few have one or two year embargoes on the newest issues, but most do not. Recently, several of the journals have dropped their embargoes.

AgEcon Search began in 1994 as a local solution for the applied and agricultural economics working papers from the University of Minnesota and the University of Wisconsin. It is housed at the University of Minnesota, and co-sponsored by the Agricultural and Applied Economics Association (AAEA).

The involvement of the large professional associations has been critical to the success of AgEcon Search. Economists presenting Contributed Papers at the annual AAEA meeting must submit their full papers to AgEcon Search prior to the meeting, or they will be dropped from the program. The European Association of Agricultural Economists and the International Association of Agricultural Economists have adopted similar procedures.

Two librarians, Louise Letnes and Julie Kelly, serve as coordinators of AgEcon Search. They work at the University of Minnesota in the Department of Applied Economics and the University Libraries, respectively. Among their duties, they attend agricultural and applied economics conferences to promote AgEcon Search and recruit new material.

RePEc archives: BC Statistical Software Components archive

January 16, 2009

One of the most actively accessed RePEc series is the Boston College Statistical Software Components (SSC) archive. This was the first RePEc series to list software, rather than working papers, journal articles, books or chapters. It currently contains 1,275 items in a number of programming languages, over 1,000 of which relate to the Stata statistical package. Stata has the unique capability to download user-written components and install them over the web, and its developers have in fact written an ‘ssc’ command that accesses the archive. Users may search the SSC Archive from within Stata or from the web interface of IDEAS or EconPapers.

The series is the 7th most popular series (in terms of total downloads) over the past 12 months, as documented by LogEc for downloads through RePEc services. Downloads of Stata components (“ado-files”), including directly from Stata, are tracked separately, and total over 100,000 per month. A custom Perl script is used to translate the RePEc template for each package into the Stata .pkg file format used by web-aware Stata. The availability of a single, reliable site from which user-written routines can be easily downloaded has made the SSC Archive a very important part of the Stata user community.

October 14, 2008, Open Access Day

October 14, 2008

October 14, 2008, has been declared Open Access Day to increase the awareness of Open Access. RePEc, and its predecessors, have been promoting open access for 15 years now, by enhancing the dissemination of preprints, which in Economics are usually called working papers or discussion papers. A quarter million of them are now listed, with many of them being close versions of published articles that are hidden behind a publisher’s paywall. Whenever possible, we link the two versions. The conditions are that the titles be very similar and the author be registered in the RePEc Author Service, having claimed all versions in the research profile. RePEc also indexes numerous open access journals, with their articles labeled to recognize free downloads.
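The two linking conditions can be illustrated with a short Python sketch. This is not the matching code RePEc uses: the function names, the 0.9 similarity threshold, the normalization rules and the sample handles are all assumptions made for the example. It only shows the general idea of comparing the titles of the items one author has claimed in a profile.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase a title and drop punctuation so trivial differences vanish."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split())


def titles_match(a, b, threshold=0.9):
    """True when two titles are 'very similar'; the threshold is an assumption."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def link_versions(profile_items, threshold=0.9):
    """Pair up items claimed in ONE author's profile whose titles match.

    profile_items is a list of (handle, title) tuples. Restricting the
    comparison to a single profile stands in for the condition that the
    author has claimed every version.
    """
    links = []
    for i, (handle_a, title_a) in enumerate(profile_items):
        for handle_b, title_b in profile_items[i + 1:]:
            if titles_match(title_a, title_b, threshold):
                links.append((handle_a, handle_b))
    return links


if __name__ == "__main__":
    # Hypothetical profile: a working paper, its journal version, another paper.
    profile = [
        ("RePEc:aaa:wpaper:0901", "Open Access and Citations"),
        ("RePEc:bbb:journl:v1p1", "Open access and citations."),
        ("RePEc:aaa:wpaper:0902", "Monetary Policy in Small Economies"),
    ]
    print(link_versions(profile))
```

Comparing only within one profile keeps the number of title pairs small and avoids linking unrelated papers that happen to share a generic title.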

In this respect, it is important to note that the vast majority of publishers allow authors to publish working papers, in many cases even as post-prints (after publication of the journal article). Through the linking between versions we do in RePEc, this essentially makes pay journals open access. For a list of publishers and their policies, see SHERPA/RoMEO.

Are Open Access works popular? We have not systematically studied this so far, but consider the following. A working paper available online was downloaded on average 1.77 times in September 2008 (after numerous corrections to eliminate robots and multiple downloads), while the figure stands at 0.97 for journal articles (including those that are open access). Also, many working paper series have impact factors superior to those of many journals, highlighting that researchers in Economics do not hesitate to cite pre-prints.

Call for Papers: The Economics of Limited and Open Access Publishing

May 29, 2008

Economic Analysis and Policy (EAP) is a 38-year-old journal published by the Economic Society of Australia (Queensland branch) that has just adopted an open access policy. To celebrate this important step, EAP intends to publish in 2009 a special issue on the Economics of publishing, with special reference to different business models, like the commercial, university press, open access and pre-print models. Academic publishing is undergoing a profound transformation that we wish to better understand.

EAP particularly seeks to publish passionate, critical, and controversial articles. It is open to orthodox but also unorthodox approaches.

We expect to publish 5 to 8 articles. They will be peer-reviewed under the guest editorship of Christian Zimmermann (University of Connecticut). Please submit your manuscript in PDF format through the journal’s online submission.

Update: The submission deadline is set for November 1, 2008.


