NEP: 30000 reports and going

November 24, 2010

NEP (New Economics Papers) is an important element in the collection of services that use RePEc data. It disseminates weekly reports about new working papers through email and RSS in 85 different fields, each compiled by a volunteer editor. This project has recently surpassed 30000 reports sent since 1998. These reports currently go to over 60000 email subscriptions from close to 30000 unique email addresses, and have announced over 150000 papers, each in two field reports on average.

The quantity of information digested by this project has grown considerably over the years. Currently about 500 new papers a week are analyzed, a number too large for editors to manage. Thus several years ago an expert system was put in place that learns from the choices of the editors and offers them every week the complete list of papers for selection, placing the most likely choices first. It is remarkable how well this works, and it saves our volunteers considerable time.
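
To illustrate the idea of learning from an editor's past choices and placing the most likely picks first, here is a minimal sketch. It assumes a simple word-based log-odds score computed from titles; the actual NEP expert system is not described in this post, and all function names and sample titles are hypothetical.

```python
import math
from collections import Counter


def tokenize(title):
    """Lowercase a title and keep its purely alphabetic words."""
    return [w for w in title.lower().split() if w.isalpha()]


def train(selected, rejected):
    """Count word frequencies in titles the editor kept and skipped."""
    kept = Counter(w for t in selected for w in tokenize(t))
    skipped = Counter(w for t in rejected for w in tokenize(t))
    return kept, skipped


def score(title, kept, skipped):
    """Sum of smoothed log-odds that each word signals a selection."""
    kept_total = sum(kept.values())
    skipped_total = sum(skipped.values())
    vocab = len(set(kept) | set(skipped)) or 1
    total = 0.0
    for w in tokenize(title):
        p_kept = (kept[w] + 1) / (kept_total + vocab)
        p_skipped = (skipped[w] + 1) / (skipped_total + vocab)
        total += math.log(p_kept / p_skipped)
    return total


def rank(new_titles, kept, skipped):
    """Return every new title, most likely choices first."""
    return sorted(new_titles, key=lambda t: score(t, kept, skipped), reverse=True)
```

The full list is always returned, so the editor still sees every paper; only the order changes.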

Volunteers are still welcome, for example to help with the general management of the project, help with existing reports or open new reports in fields not yet covered. Interested people should contact Marco Novarese.


NEP: Dissemination of new research through email and RSS

September 26, 2010

NEP (New Economics Papers) is a free, email-based service that disseminates weekly new working papers appearing in RePEc, currently 300-500 a week. It has 85 different, field-specific mailing lists, each managed by a volunteer editor who determines which papers are relevant for his/her field, with the help of an expert system.

So far, 150,000 papers have been disseminated in about 30,000 reports, each paper being on average presented in two different reports. The subscriber base is close to 30,000 people, with over 60,000 subscriptions. NEP also offers RSS feeds, as well as some blogs discussing research (see sidebar). Of course, as with everything in RePEc, all of this is free and supported by volunteers.

We encourage you to use these services, and also to volunteer to help with running NEP, for example, if your field is not yet covered. NEP is currently headed by Marco Novarese and hosted by SUNY Oswego. Thomas Krichel and William Goffe offer technical support.


A few new features on RePEc services

August 30, 2010

RePEc services display the data collected through RePEc to end-users, be it through the web or email. These services constantly improve with new features. We recently reported on some that users may have overlooked. Here are some new ones:

  • On EconPapers, abstract pages now include a “share” button, which makes it easy to share or save the page with hundreds of other services.
  • There are now RSS feeds for new citations, for specific articles, papers, series and journals. EconPapers and IDEAS provide links on their pages to the feeds. There are no feeds for authors, as they have been receiving this information for years through email (if registered).
  • A big part of RePEc is driven by user submissions, and here is a nice example: a script that parses NEP reports, downloads the PDF files and puts references in a BibTeX file.
  • Another user contribution, not new but I forgot it last time: RePEcFB, a Facebook application that displays your latest works in your profile.


Improved usage statistics for RePEc

August 6, 2010

Usage statistics for RePEc services are collected by LogEc. Producing meaningful statistics for accesses to web servers is a difficult task, especially so since we are merging data from several different sites. Rather than just counting the number of times a page or file is accessed (by a human or a piece of software indexing the web) the goal is to get as close as possible to a measure of the number of people showing an interest in a paper by reading the abstract page or downloading the full text file.

We have always applied very strict criteria for what should be counted as a download or abstract view, but over time it has become clear that simple filtering for robots and removal of double clicks is not enough. Many new practices have developed on the web, some for a good purpose, some for a more questionable purpose. There are spam-bots, referer spamming (a stupid idea if there ever was one), anti-malware software that checks links on a webpage and warns users about dangerous links, and much, much more that should not be counted. And, yes, there appears to be the occasional attempt to manipulate the statistics.

Starting from July 2010 we apply an additional set of heuristics to filter out these accesses. In conjunction with this we have also recalculated the statistics going back to January 2008. The overall effect is relatively small but there are substantial reductions in the number of accesses for a small number of papers.

More information at LogEc.


RePEc Author Service reaches major mark

August 4, 2010

The RePEc Author Service has just welcomed the 25,000th author! This service allows economists to build an online profile with all the works they have authored and that are listed in RePEc. Apart from having this profile displayed and linked to from individual works on RePEc services like EconPapers and IDEAS, this allows authors to obtain monthly statistics about the popularity of their works, along with new citations discovered by the CitEc project. Collected data is also used to compute various rankings. Note that the 25,000 count only includes registered people who have at least one work listed in the profile. There are about 7,000 other registrations with empty profiles from people who have either overlooked this feature or not yet published any works. A listing of all registered authors is available on EconPapers and IDEAS.

RePEc currently lists 940,000 works from close to 3000 working paper series and 1150 journals, among others, contributed by over 1200 archives. It has become the standard bibliographic database in Economics, with RePEc services recording the 50 millionth download during July 2010. All RePEc activities are driven by volunteers as RePEc is not funded.


500,000 journal articles listed on RePEc

February 25, 2010

The number of articles indexed on RePEc has recently surpassed half a million, with 88% linked to an online version. All these articles have been published in over 1000 journals listed on RePEc.

Journal articles now comprise the majority of the research material on RePEc, but this has not always been so. In fact, in the early days of RePEc, working papers (pre-prints) constituted the vast majority. But as commercial publishers started noticing how popular RePEc and its services were becoming, they wanted to be listed as well. They made the effort of converting their metadata to our format and making it available at no charge. A few years back, this would have been unimaginable. Little by little, all major publishers opened RePEc archives. Nowadays, it is small independent publishers who join, along with various open access journals that look for free and efficient dissemination of their content through RePEc.


How abstract views and downloads are counted

September 19, 2009

Authors and RePEc archive maintainers receive monthly emails with various statistics, and among the most anticipated statistics are our abstract views and download counts. It is important to understand how those statistics are collected and what they measure (and do not measure). Full statistics are available on the LogEc website managed by Sune Karlsson from Örebro University (Sweden).

Participating RePEc services (EconPapers, IDEAS, NEP and Socionet) keep a log of all activity on their sites. This allows us to count page views for the abstract pages of each item in the database (excluding NEP, as abstracts are listed in emails). Logs also record outclicks as users leave the RePEc services for the sites containing the full texts they seek to download. This allows us to count “downloads”. Quotation marks are required as it is impossible to record whether the download was successful, for example in the case of gated publisher sites. Note also that this means that downloads that have not transited through a RePEc service cannot be counted, as we do not have access to local logs.

LogEc gathers the logs from the participating services and aggregates the statistics. This involves much more than bean counting, though. Indeed, one first needs to exclude robot activity, as only human activity is of interest. Some robots declare themselves as such, but others hide their identity. One thus has to infer from various patterns which IP addresses are likely robots. This is an important step, as robots typically represent 75% of raw abstract views. Robots include spiders from many search engines as well as other initiatives on the Internet.

One also needs to weed out multiple views or downloads by the same user. This brings us to detecting attempts by authors at increasing their counts. Obviously, we cannot reveal here how this is done, but let it be known that we have detected fraud even by authors using multiple Internet service providers. The methods used lead to some undercounting, though. Multiple users behind the same cache server may be counted only once, as may happen, for example, with employees of the US Federal Reserve Banks who use RePEc.
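
The first two pruning steps, dropping self-declared robots and collapsing repeated views by the same user, can be sketched as below. This is a simplified illustration and not LogEc's actual method, which is deliberately undisclosed. The log record layout, the `ROBOT_MARKERS` list and the rule of one count per IP address and item are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical substrings that self-declared robots put in their user agent.
ROBOT_MARKERS = ("bot", "spider", "crawler")


def is_declared_robot(user_agent):
    """True when the user agent announces itself as a robot."""
    agent = user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def count_views(log_records):
    """Count abstract views per item, at most once per (IP, item) pair.

    Each record is an (ip, user_agent, item_handle) tuple; this layout is
    an assumption made for the example.
    """
    seen = set()
    counts = Counter()
    for ip, user_agent, item in log_records:
        if is_declared_robot(user_agent):
            continue  # declared robots do not reflect human interest
        if (ip, item) in seen:
            continue  # repeated view of the same item from the same address
        seen.add((ip, item))
        counts[item] += 1
    return counts
```

Counting once per address is what produces the undercounting mentioned above: several readers behind one cache server share an IP address and collapse into a single view. Robots that hide their identity pass this filter and would need the pattern-based detection described earlier.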

And we are still not done pruning. LogEc then checks for additional patterns that need to be vetted by a human eye. Unusual activity is then checked and often reconciled with traffic from popular blogs, magazines and newspapers. But on other occasions, traffic surges cannot be explained in licit ways and need to be cleaned out.

After all these manipulations, statistics are published and disseminated. And despite substantial pruning, RePEc services still get over 2,000,000 abstract views and 600,000 downloads every month. See LogEc for details.


RePEcFB – An integration of your RePEc data into your Facebook profile

September 9, 2009

Following a suggestion on this blog and the creation of a RePEc Facebook group, we are happy to announce that a new service went online last week. The Facebook application RePEcFB allows Facebook users to integrate their RePEc data into Facebook. Economists on Facebook can create a small profile box listing their recent work, or a “My research” tab in the Facebook profile giving information about their working papers, publications and other research output. Users can list their affiliations and professional contact data, announce recent papers authored by their Facebook friends, or inform about conferences and other academic events they are going to attend. New papers or affiliations can be directly posted to the Wall and can be commented on by friends.

To use the application you need both a Facebook account and a RePEc author profile with the RePEc Author Service. Detailed instructions can be found on the Notes tab of the application’s homepage.

RePEcFB was written by Ben Greiner with the help of László Kóczy, Sune Karlsson, and Thomas Krichel. The software is hosted on Sune’s server at Örebro University. The software is under ongoing development, so feel free to send comments to the author.


MPRA, the Munich Personal RePEc Archive

August 27, 2009

The Munich Personal RePEc Archive (MPRA) was started three years ago. It has developed into one of the largest archives within the RePEc network, comprising roughly 9000 items at the time of writing. Christian Zimmermann has suggested that I share some thoughts about its history and functioning.

The initial idea occurred to me when I heard that the Economics Working Paper Archive (EconWPA), run by Bob Parks, was discontinued in 2005. EconWPA offered the possibility for individual authors to make their contributions accessible to the community through the RePEc network, given that only institutions can set up RePEc archives. Although we have in Munich our discussion paper series integrated into RePEc, not all economists are so fortunate, and the need for a personal archive (as distinct from an institutional archive) was apparent.

Given that we had successfully established our department’s discussion paper series with the EPrints software, it appeared technically feasible to clone the software and use it for a personal RePEc archive. Discussion on the internal RePEc list led to the name “Munich Personal RePEc Archive,” the main concern being to clarify that the archive was intended as a RePEc service, rather than something original, and that the name would not exclude other personal RePEc archives in other locations. (If one of the other Munich universities wants to start another personal archive, we may get into a problem…)

I asked Volker Schallehn from the University Library, who had implemented the EPrints software for our university archives, whether he could help with such a project. He agreed to help. The next step was to convince the president of the university as well as the director of the library to agree to dedicate some resources to an endeavor that would not serve people from Munich at all. They were in favor, and so we got started on September 19, 2006.

From a technical point of view the main problem was to automate as much as possible, as we could not supply manpower: the generation of title pages, the creation of metadata in the ReDIF format required by the RePEc harvester, and the linking to the RePEc Author Service. With the help of Thomas Krichel, Christian Zimmermann, Kit Baum, Sune Karlsson, Ivan Kurmanov, and others we managed to solve these problems and set up the website. We found editors. They do the main job now. The English editors often handle more than 50 submissions per day.

As the EPrints software makes it possible to establish series in different languages, we decided to use this feature and to offer the service in all languages for authors who deal with country-specific issues and want to make their research available in their local language. However, we require English abstracts for all submissions, so that all users can obtain an impression of what economists writing in other languages do and, if necessary, contact them. This feature has led to quite a number of submissions in languages like Spanish or French, and to some smaller sets in Turkish, Arabic, and others. (Some of them look extremely pretty.) Maybe this feature creates a sense that all economists world-wide see themselves as members of a community with the common purpose of helping to improve living conditions around the globe.

A central motivation for establishing a pre-print archive like MPRA was to enable authors to secure the copyrights for their pre-print versions in case the copyright for the final article goes to the publisher. This permits open access to their work, even if publishers try to make the final work inaccessible for the non-paying public. This is a great convenience for academics and, I hope, generates a countervailing power that keeps a check on journal prices. Further, this arrangement provides a means for the authors to make their work accessible to others through the RePEc services.

As an unintended by-product some authors have received requests from publishers to publish their contribution in a volume or journal. This may indicate a trend for the future: while in the past authors submitted their works to publishers (and paid for it), in the future you simply put your stuff on the net, and publishers approach you in order to create collections that generate value added beyond mere publication, such that people and libraries are willing to pay for it. If MPRA could contribute to such a development, this would be nice.

It is quite astonishing to me how many good papers we obtain, in spite of the fact that we do no refereeing at all. (The editors check only some formal aspects, making sure that the submission is of academic nature, and a certain convention has emerged in this respect.)

MPRA offers a public forum for publishing papers, but not only that: It offers the possibility to publish comments on papers in the archive. This feature is not used. Maybe somebody has a suggestion how to organize discussions around papers such that people actually feel inclined to use such a feature.

So much about MPRA. If you have any suggestions, please feel free to communicate and discuss them on this blog.


On versioning in RePEc

August 21, 2009

RePEc carries research in various formats. While journal articles are unique (with very few exceptions), working papers, as they are pre-prints, may be duplicates of listed articles, and they may even appear in different versions, either because they are published in different series, or because there may be updates within a series. We believe that it is important to carry all versions, not just the last one, for the following reasons.


  1. Time-stamps: A working paper makes it possible to establish when some research was conducted and thus determines the precedence of research ideas. Given publication delays in Economics, this can be important.
  2. Open access: Many journal articles have gated access. Such restrictions can be bypassed by reading working papers, which are mostly open access.
  3. Link to published version: It is still preferred to use published versions in citations, especially once a paper is accepted in a journal. The originally cited working paper is often linked to its published version.
  4. Visibility: Working papers are much more widely read than journal articles, both because they are more current and because they are freely available. In addition, working papers are disseminated through NEP.

The process of linking the various versions of the same work is not obvious, however. With about 800,000 works in RePEc, performing matches on titles is a daunting task, especially as fuzzy matching is necessary due to slight variations in punctuation and spelling. For this reason, we do the matching only across the works listed in an author’s profile. This ensures that the likelihood of two matched works being different versions of the same one is very close to 100%. But this also means that such matching cannot be done for works where none of the authors is registered, or where a registered author did not add all versions to the profile, thereby indicating, rightly or wrongly, that he/she is not the author of this particular version.
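
As a rough illustration of fuzzy title matching restricted to one author's profile, here is a minimal sketch using Python's standard `difflib`. It is not the matching code RePEc actually runs: the normalization rules, the 0.9 threshold and the sample handles are assumptions chosen for the example.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title):
    """Lowercase, drop punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return " ".join(cleaned.split())


def find_versions(profile, threshold=0.9):
    """Return pairs of handles in one author's profile with similar titles.

    `profile` maps a handle to a title; `threshold` is an illustrative
    cut-off, not a value used by RePEc.
    """
    pairs = []
    for (h1, t1), (h2, t2) in combinations(profile.items(), 2):
        ratio = SequenceMatcher(None, normalize(t1), normalize(t2)).ratio()
        if ratio >= threshold:
            pairs.append((h1, h2))
    return pairs
```

Comparing only within a profile keeps the number of pairs small and avoids linking works by different authors that happen to share a title.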

In some cases, titles change across versions, or journal editors require a title change. In such cases, a manual link between versions can be added: just contact a member of the RePEc team with the relevant RePEc handles.

