On RePEc traffic numbers

RePEc provides traffic statistics to authors, editors, and archive and series maintainers. LogEc displays them to the public. These download and abstract-view counts are compiled from the server log files of some, but not all, RePEc services: Economists Online, EconPapers, IDEAS, NEP and Socionet. The raw logs are purged of anything that does not look like human traffic (robots, spiders), of repeat traffic, and of anything that does not seem licit. This typically divides traffic numbers by four or five. More details are available at LogEc.

Several people have noticed that RePEc traffic numbers have exhibited a downward trend over the last couple of years. We do not have a definite explanation for this trend, but we have a few possible ones.

Tightening of criteria

The criteria for what is considered licit traffic were tightened in July 2010, retroactively to January 2008 (see blog post). This can explain a one-time decrease in reported traffic, but not the trend.

Server caches

We noticed that several institutions run server caches of IDEAS. This means that every request to an IDEAS page from within such an institution comes from that one server, and thus requests from separate people count as one. Worse, if there are many such requests, the server is considered a robot and none count. While this affects traffic numbers, we do not believe the impact is major.
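To make the mechanism concrete, here is a minimal sketch in Python of how counting by address collapses several readers into one, and how a busy shared address can drop out entirely. It is an illustration under stated assumptions, not LogEc's actual code: the function name `count_views`, the cutoff `ROBOT_THRESHOLD` and its value, and the sample addresses and item names are all hypothetical.

```python
from collections import Counter

# Hypothetical cutoff for illustration only; not LogEc's actual rule.
ROBOT_THRESHOLD = 100


def count_views(requests, robot_threshold=ROBOT_THRESHOLD):
    """Count views from (ip_address, item_id) pairs.

    Assumed rules: an address with more requests than the cutoff is
    treated as a robot and dropped entirely; repeat requests for the
    same item from the same address count once.
    """
    requests = list(requests)
    per_ip = Counter(ip for ip, _ in requests)
    # Drop every request from an address that exceeds the robot cutoff.
    human = [(ip, item) for ip, item in requests if per_ip[ip] <= robot_threshold]
    # Collapse repeats: one view per distinct (address, item) pair.
    return len(set(human))


# Three readers on separate addresses each view the same paper once.
direct = [("10.0.0.1", "paper-A"), ("10.0.0.2", "paper-A"), ("10.0.0.3", "paper-A")]
# The same three readers behind one cache share a single address.
cached = [("192.0.2.1", "paper-A")] * 3
# A busy cache sends 150 requests, exceeding the assumed cutoff.
busy = [("192.0.2.1", f"paper-{i}") for i in range(150)]

print(count_views(direct))  # 3
print(count_views(cached))  # 1
print(count_views(busy))    # 0
```

Under these assumed rules, the three direct readers count as three views, the same readers behind a cache count as one, and the busy cache counts as none.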

Proxy servers

These are servers through which all web traffic of an institution is routed. The main goal is security: all computers behind the proxy server appear to have the same IP address, and a potential attacker cannot find the individual IP addresses. The impact is the same as with server caches: all traffic from an institution (for example the hundreds of economists from the US Federal Reserve System) is considered to come from a single person, and possibly from an illicit robot. This can explain stagnating traffic and, if the adoption of proxy servers is increasing, even decreasing traffic.

Non-reporting RePEc services

Several RePEc services are not reporting traffic logs to LogEc. If they are diverting traffic away from reporting services, the visible statistics suffer. As RePEc is used more and more by bibliographic services, the declining recorded traffic then gives a false picture of how usage is evolving.

Google Scholar priorities

Substantial traffic comes from Google Scholar, which relied heavily on RePEc at its initial launch, at least for economics material. As Google established partnerships with commercial publishers, commercial material is now favored in search results and RePEc is often confined to the much less prominent “other versions.”

Better usage

It could also be that users are getting more efficient at finding what they are looking for, either because they are becoming more skilled at searching, or because the RePEc services have improved their websites. If one needs to read fewer abstracts before finding the right works, this will lower traffic and increase user satisfaction.

Reduced popularity

Finally, it could also be that RePEc services are becoming less popular, given the existence of several good alternatives (almost all of which use RePEc data). The fact that raw traffic, which includes everything LogEc eliminates, is still going up would invalidate this argument, unless this increase is entirely due to an increase in robot traffic.

All in all, we are not sure why we have the decrease in the “visible” traffic numbers. Maybe our loyal readers have further suggestions.

This entry was posted on Saturday, May 19th, 2012 at 12:06 pm and is filed under Use of RePEc data.

The RePEc Blog