Improved usage statistics for RePEc

August 6, 2010

Usage statistics for RePEc services are collected by LogEc. Producing meaningful statistics for accesses to web servers is a difficult task, especially since we are merging data from several different sites. Rather than just counting the number of times a page or file is accessed (by a human or a piece of software indexing the web), the goal is to get as close as possible to a measure of the number of people showing an interest in a paper by reading the abstract page or downloading the full text file.

We have always applied very strict criteria for what should be counted as a download or abstract view, but over time it has become clear that simple filtering for robots and removal of double clicks is not enough. Many new practices have developed on the web, some for a good purpose, some for a more questionable one. There are spam-bots, referer spamming (a stupid idea if there ever was one), anti-malware software that checks links on a webpage and warns users about dangerous links, and much, much more that should not be counted. And, yes, there appears to be the occasional attempt to manipulate the statistics.
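To make the basic filtering concrete, here is a minimal sketch of what removing robots and double clicks could look like. It is not LogEc's actual code: the robot markers, the 30-second window, the hit format and the function names are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical substrings that mark a user agent as a robot (not LogEc's list).
ROBOT_MARKERS = ("bot", "crawler", "spider")
# Hypothetical window within which a repeated hit counts as a double click.
DOUBLE_CLICK_WINDOW = timedelta(seconds=30)


def is_robot(user_agent):
    """Return True if the user agent contains one of the robot markers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def count_accesses(hits):
    """Count accesses per paper, skipping robots and double clicks.

    Each hit is a tuple (timestamp, ip, user_agent, paper_id), and the
    hits are assumed to be sorted by timestamp.
    """
    last_seen = {}  # (ip, paper_id) -> time of the most recent hit
    counts = {}
    for timestamp, ip, user_agent, paper_id in hits:
        if is_robot(user_agent):
            continue
        key = (ip, paper_id)
        previous = last_seen.get(key)
        last_seen[key] = timestamp
        # A repeat from the same address shortly after the last hit is
        # treated as a double click and not counted again.
        if previous is not None and timestamp - previous <= DOUBLE_CLICK_WINDOW:
            continue
        counts[paper_id] = counts.get(paper_id, 0) + 1
    return counts
```

A filter this simple only catches software that identifies itself and clicks that arrive close together, so it does nothing about the spam-bots, link checkers and manipulation attempts mentioned above.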

Starting in July 2010, we apply an additional set of heuristics to filter out these accesses. In conjunction with this, we have also recalculated the statistics going back to January 2008. The overall effect is relatively small, but there are substantial reductions in the number of accesses for a small number of papers.

More information at LogEc.