NEP: 30000 reports and going

November 24, 2010

NEP (New Economics Papers) is an important element in the collection of services that use RePEc data. It disseminates through email and RSS weekly reports about new working papers in 85 different fields, each compiled by volunteer editors. This project has recently surpassed 30000 reports sent since 1998. It currently serves over 60000 email subscriptions from close to 30000 unique email addresses, and has announced over 150000 papers, each appearing on average in two field reports.

The quantity of information digested by this project has grown considerably over the years. Currently about 500 new papers a week are analyzed, a number too large for editors to manage unaided. Thus several years ago an expert system was put in place that learns from the choices of the editors and offers them every week the complete list of papers for selection, with the most likely choices placed first. It is remarkable how well this works, and it saves our volunteers considerable time.
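To illustrate the idea of learning from editors' choices and placing the most likely picks first, here is a minimal sketch. It is not the actual NEP expert system, whose method is not described here. It assumes a simple word-based score (smoothed log-odds over title words) and uses hypothetical function names (`train`, `score`, `rank`) and made-up titles.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase the title and keep purely alphabetic words.
    return [w for w in text.lower().split() if w.isalpha()]

def train(history):
    """history: (title, selected) pairs from an editor's past decisions."""
    picked, skipped = Counter(), Counter()
    for title, selected in history:
        (picked if selected else skipped).update(tokenize(title))
    return picked, skipped

def score(title, picked, skipped):
    """Sum of smoothed log-odds that each word comes from a selected paper."""
    n_picked = sum(picked.values())
    n_skipped = sum(skipped.values())
    vocab = len(set(picked) | set(skipped)) or 1
    total = 0.0
    for word in tokenize(title):
        p = (picked[word] + 1) / (n_picked + vocab)    # add-one smoothing
        q = (skipped[word] + 1) / (n_skipped + vocab)
        total += math.log(p / q)
    return total

def rank(titles, picked, skipped):
    # The full list is kept; only the order changes, most likely picks first.
    return sorted(titles, key=lambda t: score(t, picked, skipped), reverse=True)
```

An editor would still see every paper of the week. The ranking only changes where in the list the likely candidates appear, so nothing is hidden by the system.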

Volunteers are still welcome, for example to help with the general management of the project, help with existing reports or open new reports in fields not yet covered. Interested people should contact Marco Novarese.

NEP: Dissemination of new research through email and RSS

September 26, 2010

NEP (New Economics Papers) is a free, email-based service that disseminates weekly new working papers appearing in RePEc, currently 300-500 a week. It has 85 different, field-specific mailing lists, each managed by a volunteer editor who determines which papers are relevant for his/her field, with the help of an expert system.

So far, 150,000 papers have been disseminated in about 30,000 reports, each paper being on average presented in two different reports. The subscriber base is close to 30,000 people, with over 60,000 subscriptions. NEP also offers RSS feeds, as well as some blogs discussing research (see sidebar). Of course, as with everything in RePEc, all of this is free and supported by volunteers.

We encourage you to use these services, and also to volunteer to help with running NEP, for example, if your field is not yet covered. NEP is currently headed by Marco Novarese and hosted by SUNY Oswego. Thomas Krichel and William Goffe offer technical support.

A few new features on RePEc services

August 30, 2010

RePEc services display the data collected through RePEc to end-users, be it through the web or email. These services constantly improve with new features. We recently reported about some that users may have overlooked. Here are some new ones:

  • On EconPapers, abstract pages now include a “share” button, which makes it easy to share or save the page with hundreds of other services.
  • There are now RSS feeds for new citations, for specific articles, papers, series and journals. EconPapers and IDEAS provide links on their pages to the feeds. There are no feeds for authors, as they have been receiving this information for years through email (if registered).
  • A big part of RePEc is driven by user submissions, and here is a nice example: a script that parses NEP reports, downloads the PDF files and puts the references in a BibTeX file.
  • Another user contribution, not new but I forgot it last time: RePEcfb, a Facebook application that displays your latest works in your profile.

Improved usage statistics for RePEc

August 6, 2010

Usage statistics for RePEc services are collected by LogEc. Producing meaningful statistics for accesses to web servers is a difficult task, especially so since we are merging data from several different sites. Rather than just counting the number of times a page or file is accessed (by a human or a piece of software indexing the web) the goal is to get as close as possible to a measure of the number of people showing an interest in a paper by reading the abstract page or downloading the full text file.

We have always applied very strict criteria for what should be counted as a download or abstract view, but over time it has become clear that the simple filtering for robots and removal of double clicks is not enough. Many new practices have developed on the web, some for a good purpose, some for a more questionable purpose. There are spam-bots, referer spamming (a stupid idea if there ever was one), anti-malware software that checks links on a webpage and warns users about dangerous links, and much, much more that should not be counted. And, yes, there appears to be the occasional attempt to manipulate the statistics.

Starting from July 2010 we apply an additional set of heuristics to filter out these accesses. In conjunction with this we have also recalculated the statistics going back to January 2008. The overall effect is relatively small but there are substantial reductions in the number of accesses for a small number of papers.

More information at LogEc.

RePEc Author Service reaches major mark

August 4, 2010

The RePEc Author Service has just welcomed the 25,000th author! This service allows economists to build an online profile with all the works they have authored and that are listed in RePEc. Apart from having this profile displayed and linked to from individual works on RePEc services like EconPapers and IDEAS, this allows authors to obtain monthly statistics about the popularity of their works, along with new citations discovered by the CitEc project. Collected data is also used to compute various rankings. Note that the 25,000 count only includes registered people who have at least one work listed in the profile. There are about 7,000 other registrations with empty profiles from people who have either overlooked this feature or not yet published any works. A listing of all registered authors is available on EconPapers and IDEAS.

RePEc currently lists 940,000 works from close to 3000 working paper series and 1150 journals, among others, contributed by over 1200 archives. It has become the standard bibliographic database in Economics, with RePEc services recording the 50 millionth download during July 2010. All RePEc activities are driven by volunteers as RePEc is not funded.

500,000 journal articles listed on RePEc

February 25, 2010

The number of articles indexed on RePEc has recently surpassed half a million, with 88% linked to an online version. All these articles have been published in over 1000 journals listed on RePEc.

Journal articles now comprise the majority of the research material on RePEc, but this has not always been so. In fact, in the early days of RePEc, working papers (pre-prints) constituted the vast majority. But as commercial publishers started noticing how popular RePEc and its services were becoming, they wanted to be listed as well. They made the effort of converting their meta-data to our format and making it available at no charge. A few years back, this would have been unimaginable. Little by little, all major publishers opened RePEc archives. Nowadays, it is small independent publishers who join, along with various open access journals that look for free and efficient dissemination of their content through RePEc.

How abstract views and downloads are counted

September 19, 2009

Authors and RePEc archive maintainers receive monthly emails with various statistics, and among the most anticipated statistics are our abstract views and download counts. It is important to understand how those statistics are collected and what they measure (and do not measure). Full statistics are available on the LogEc website managed by Sune Karlsson from Örebro University (Sweden).

Participating RePEc services (EconPapers, IDEAS, NEP and Socionet) keep a log of all activity on their sites. This allows us to count page views for the abstract pages of each item in the database (excluding NEP, as abstracts are listed in emails). Logs also record outclicks as users leave the RePEc services for the sites containing the full texts they seek to download. This allows us to count “downloads”. Quotation marks are required as it is impossible to record whether the download was successful, for example in the case of gated publisher sites. Note also that this means that downloads that have not transited through a RePEc service cannot be counted, as we do not have access to local logs.

LogEc gathers the logs from the participating services and aggregates the statistics. This involves much more than bean counting, though. Indeed, one first needs to exclude robot activity, as only human activity is of interest. Some robots declare themselves as such, but others hide their identity. One thus has to infer from various patterns which IP addresses are likely robots. This is an important step, as robots typically represent 75% of raw abstract views. Robots include spiders from many search engines as well as other initiatives on the Internet.

One also needs to weed out multiple views or downloads by the same user. This brings us to detecting attempts by authors at increasing their counts. Obviously, we cannot reveal here how this is done, but let it be known that we have detected fraud even by authors using multiple Internet service providers. The methods used lead to some undercounting, though. Multiple users behind the same cache server may be counted only once, as may happen, for example, with employees of the US Federal Reserve Banks who use RePEc.

And we are still not done pruning. LogEc then checks for additional patterns that need to be vetted by a human eye. Unusual activity is then checked and often reconciled with traffic from popular blogs, magazines and newspapers. But on other occasions, traffic surges cannot be explained in licit ways and need to be cleaned out.

After all these manipulations, statistics are published and disseminated. And despite substantial pruning, RePEc services still get over 2,000,000 abstract views and 600,000 downloads every month. See LogEc for details.
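The first two pruning steps described above (dropping self-declared robots, then counting each user only once per item) can be sketched as follows. This is only an illustration under stated assumptions, not LogEc's actual method, which is deliberately not disclosed. The `Hit` record, the `ROBOT_MARKERS` list and the choice of one count per IP address, item and month are all hypothetical.

```python
from collections import namedtuple

# Hypothetical log record: one line of a service's access log.
Hit = namedtuple("Hit", "ip user_agent item month")

# Hypothetical markers for robots that declare themselves in the user agent.
ROBOT_MARKERS = ("bot", "spider", "crawler")

def is_declared_robot(hit):
    ua = hit.user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)

def count_views(hits):
    """Count abstract views per item, once per (ip, item, month)."""
    seen = set()
    counts = {}
    for hit in hits:
        if is_declared_robot(hit):
            continue  # only human activity is of interest
        key = (hit.ip, hit.item, hit.month)
        if key in seen:
            continue  # repeat view by the same address
        seen.add(key)
        counts[hit.item] = counts.get(hit.item, 0) + 1
    return counts
```

Keying on the IP address reproduces the undercounting mentioned above: several readers behind one cache server share an address and are counted once. Robots that hide their identity would pass this filter, which is why further pattern-based checks are needed.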

