Why reported traffic is declining

September 28, 2014

Back in May 2012, we were complaining that reported traffic on RePEc sites was declining. This trend has continued and we need to revisit the issue.

Looking at the graphs at LogEc, it is quite obvious that traffic is not increasing as one would expect, especially given that more and more material is indexed on RePEc. Before looking for the reasons, we need to explain how these statistics are computed.

Only a limited set of RePEc services report the detailed traffic statistics needed to compute this: EconPapers, IDEAS, NEP and Socionet. Aggregate numbers are not sufficient for other RePEc services to report their statistics: one needs a lot of detail to determine whether traffic is robotic or human, to remove duplicates and to detect fraud attempts. In fact, about 90% of total traffic is rejected for statistical purposes on those grounds. Because of this complexity, several sites that use RePEc data are not reporting anything about their traffic. These include EconLit, EconStor, Google Scholar, Inomics, Microsoft Academic Search, OAISter/WORLDCAT, Scirus, Sciverse and very likely more. The fact that the data collected by RePEc is used in many places is not contrary to our mission: we want to improve the dissemination of research in Economics. But we seem to be able to track only a fraction of its use. As the number of RePEc services reporting statistics has not increased, while the number of sites using RePEc data has, we could explain the decrease in reported traffic as cannibalization. The overall use may have increased, and user satisfaction too, but we cannot demonstrate it.

Of course, given that we are filtering the traffic statistics, we may be filtering too much, and increasingly so. We have indeed tightened some rules over time, mostly to avoid counting new traffic patterns that are visibly not legitimate. For example, IDEAS threw out 3.4 million abstract views (or two per listed abstract) in July 2014 thanks to a single pattern rule that was introduced about a year ago. But this pattern was previously not problematic, so it is difficult to conclude that such tightening can explain a reduction in traffic. It remains a fact that the proportion of traffic that is excluded is steadily increasing. In raw numbers, IDEAS keeps breaking records. In filtered numbers, traffic is declining. Is it because there are really more and more robots out there?

The same applies to other potential explanations. Several institutions are caching our websites. Several have all their members access the web through a single IP address, and those members are thus indistinguishable to us. In both cases, downloads by different users look to us like they are coming from the same person and are counted only once. Is this more prevalent than before? Yes in both cases, but caching is very minor, and IP bundling pertains mostly to governmental institutions and corporate networks. How much this matters is difficult to evaluate.
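To make the counting problem concrete, here is a minimal sketch of how filtered statistics can undercount readers who share an IP address. It is only an illustration: the record layout, the `ROBOT_MARKERS` list and the function names are assumptions made for this example, not the actual rules used by RePEc services.

```python
from collections import Counter

# Assumed robot markers for illustration only; real filtering rules are far more detailed.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def is_robot(user_agent):
    """Very rough robot check based on the user agent string."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def count_views(records):
    """Count at most one view per (IP address, item) pair, ignoring robots.

    Each record is a hypothetical (ip_address, user_agent, item_id) tuple.
    """
    seen = set()
    views = Counter()
    for ip, user_agent, item in records:
        if is_robot(user_agent):
            continue  # robotic traffic is rejected
        if (ip, item) in seen:
            continue  # same IP and item: treated as a duplicate
        seen.add((ip, item))
        views[item] += 1
    return views


records = [
    ("10.0.0.1", "Mozilla/5.0", "paper-A"),
    ("10.0.0.1", "Mozilla/5.0", "paper-A"),     # a second reader behind the same IP
    ("10.0.0.2", "Mozilla/5.0", "paper-A"),
    ("10.0.0.3", "ExampleBot/1.0", "paper-A"),  # robot, rejected
]
print(count_views(records)["paper-A"])
```

In this toy data, three human readers looked at the paper, but only two views are counted because two of them share an IP address.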

The elephant in the room is traffic coming from search engines, most importantly Google. Google has changed its ranking criteria over time. Google Scholar started privileging the original source over aggregators like RePEc several years ago, and the impact has been increasing as more publishers give Google Scholar direct access to their repositories. This pertains also to the general Google search engine. For example, traffic from Google to IDEAS dropped by a third from one day to the next on May 22, 2014, after Google decided to penalize the search ranking of aggregator web pages.

Finally, we cannot exclude that RePEc services are indeed less popular, which is bad. But if this is because people are more easily finding what they are looking for, then this is good, as the core mission of RePEc is to improve the dissemination of research in economics.


The value of RePEc — an introduction

September 13, 2014

I am Thomas Krichel, the principal founder of RePEc. This is my second contribution here. I plan to write more in the com on fundamental aspects of RePEc. And I’ll give some explanation about RePEc history. My particular expertise is how RePEc came about.

Today let me try to say something about the value of RePEc. In some, though not all aspects, RePEc is a digital and open equivalent of what librarians have long been calling abstracting and indexing (A&I) databases. A&I data is most common for academic journal literature. It lists descriptive information about journal articles past and present. These days, such databases appear to be of declining value. Librarians have been canceling them with the argument that users want full text, not just an abstract. Here the description of the paper is a poor (wo)man’s version of the document itself, which of course would have that description. For WoPEc, the forerunner of RePEc, I took the opposite view. The full-text location was simply an attribute of the description of the paper.

In the early 90s, when I started the work on WoPEc, the fact that anything was freely available on the web was seen with some suspicion. I recall a radio comment at that time, about some company, and the comment about them was something like “They are now on the Internet, which is a euphemism for saying that they have gone out of business”. Among economists in particular, the notion that free means cheap and cheap means bad seemed to have a lot of appeal. Therefore I was keen that RePEc should not just be cheaper, but also be better than existing A&I databases. In 1998, I started to work on the key component of that vision, the RePEc Author Service. I designed the service and my student Markus J.R. Klink implemented it. At that point, I was not aware of any A&I product that implemented author identification. As such, there was no way that anybody would have implemented a service that would allow authors to claim papers. Of course the fact that Christian had already worked on collecting institutional data was of great help to make this even more attractive.

Well, enough about pioneering works. I did promise to write about the value of RePEc, didn’t I? The key value I see is in identifying documents, authors and institutions and building linkages based on these identifications. Thus even if all papers in economics were freely available, in open access journals or on working paper sites of institutions, and they stayed there, we still would not have implemented the value of RePEc. The value does not come from individuals using a search engine and finding something of interest. Our value comes in the linkages like “this working paper was never published” or “this paper is cited by this other paper”, or “these two authors are co-authors”. If the coverage of economics through RePEc is complete, we can make such assertions with certainty. And we can make the assertions without further human work. For example, through the fact that we have papers that have identified authors, we can say that two authors are co-authors. Since the data is freely available, it can be used in a co-authorship system. Or if we know that one paper cites another, we can export this into a system that solicits information about why the citation took place. Linkages and open information go hand in hand in RePEc.
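The co-authorship example can be sketched in a few lines. This is only an illustration of the idea that linkages follow mechanically from identifications: the identifiers below and the `coauthor_pairs` function are made up for this example and do not reflect actual RePEc handles or software.

```python
from itertools import combinations


def coauthor_pairs(paper_authors):
    """Derive co-author pairs from a mapping of paper id -> identified author ids."""
    pairs = set()
    for authors in paper_authors.values():
        # Every pair of identified authors on the same paper is a co-author link.
        for a, b in combinations(sorted(set(authors)), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical identifiers, for illustration only.
papers = {
    "paper-1": ["author-x", "author-y"],
    "paper-2": ["author-y", "author-z", "author-x"],
}
for pair in sorted(coauthor_pairs(papers)):
    print(pair)
```

No human work is needed beyond the original identification of papers and authors; the pairs are a pure by-product of the data.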


Get your latest economic research through Twitter: NEP now tweets

August 24, 2014

Almost from its start in 1997, RePEc has been disseminating new working papers through NEP (New Economics Papers). When RSS feeds became popular, that means of dissemination was added. And now that scholars have adopted Twitter, NEP is there, too.

Each of the roughly 90 field-specific reports is now tweeting. The papers are hand-picked by academic volunteer editors among all new working papers that are available online. As the email reports may contain several dozen papers in each weekly mailing, the tweets are throttled to no more than one an hour for each report. To subscribe, log into Twitter, go to the NEP home page, click on the report(s) relevant to you, then click on the Twitter link and finally the follow button.
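For the curious, throttling of this kind can be pictured as spreading one report's papers over hourly slots. The sketch below is a guess at the general idea, not the actual NEP code; `schedule_tweets` and the sample titles are hypothetical.

```python
from datetime import datetime, timedelta


def schedule_tweets(papers, start, min_gap=timedelta(hours=1)):
    """Assign each paper of one report a posting time, at least min_gap apart."""
    return [(start + i * min_gap, paper) for i, paper in enumerate(papers)]


# Hypothetical papers from a single weekly report.
report = ["Paper one", "Paper two", "Paper three"]
for when, title in schedule_tweets(report, datetime(2014, 8, 24, 9, 0)):
    print(when.isoformat(), title)
```

With several dozen papers in a weekly report, a schedule like this spreads the tweets over a day or two instead of flooding followers all at once.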

Like all RePEc projects, this service is of course free.


Conference on the challenges of economic information and data

June 17, 2014

The Research Division of the Federal Reserve Bank of St. Louis is hosting a conference on the challenges of economic information and data. The event will take place September 29 and 30, 2014, and will feature Hal Varian (Chief economist, Google) and Neil Fantom (Open Data manager, the World Bank). Submissions are invited until July 9, 2014. The conference’s website is here.


Let the Reader Decide!

May 27, 2014

I just learned about a new journal with a new concept that sounds interesting: Royal Society Open Science. It has a review process and will publish all articles which are scientifically sound, leaving judgement of importance and impact to the reader.

This seems apposite because printing costs and distribution costs are practically absent in the Internet age. So there is no big point in rationing publication space (that is not scarce anymore) by “importance” or “impact”.

Unfortunately this journal does not cover economics.


How is RePEc collecting its data?

March 26, 2014

Likely the most frequent request RePEc gets is from an author who wants us to add some publications to the database and wonders why our “spider” has not picked them up. The second most frequent is from a publisher wondering why RePEc is neglecting to disseminate its output. The problem is that this is not at all the way RePEc functions. This short post provides the basics of how the metadata (the data describing the research documents) gets into RePEc.

The principle is that metadata comes directly from the providers. By providers we mean commercial publishers for their books and journals, or university departments for their working papers, or research centers for their papers, or policy institutions for their various publications. Thus, RePEc does not have a spider that surfs the entire Internet and tries to infer what it is that it stumbles upon. Rather, RePEc knows exactly where to look for the information that has been formatted in a way to optimize its usefulness. And if an author finds some publications are missing, it is either because the provider is not (yet) participating in RePEc, in which case it can follow these instructions, or because the provider has incomplete data, in which case a technical contact is listed on the RePEc page of the relevant journal or series and can help.

Desperate authors can also upload their works at the Munich Personal RePEc Archive, sponsored by the Library of the University of Munich, as long as they have the rights to do so (check here).

Why is RePEc data collection organized in such a way? We want RePEc to be free for all, so it needs to be set up in a way that does not generate costs. Thus, we put the burden of indexing on those who benefit the most from it, the providers. And close to 1700 are willing to do so. Any remaining central duties are picked up by the RePEc team.


How to follow what is new in economics research

February 20, 2014

RePEc offers various tools to keep abreast of latest research developments in economics. Keep in mind that due to the unusually long refereeing and publication process in this field, following what is coming out in journals is often not the best way to keep current. The research frontier is advancing with working papers, and this is why RePEc puts a special focus on those. Note that all resources below are free, as always for RePEc services.

NEP

NEP (New Economics Papers) offers email lists and RSS feeds that disseminate approximately every week the latest online working papers across over 90 fields. Field-relevance is determined by volunteer editors who pick the appropriate papers among all working papers newly listed on RePEc during the previous week. Note that if you think a topic is not appropriately covered, you can volunteer as editor of a new report.

MyIDEAS

MyIDEAS allows you to follow new additions to JEL codes, author profiles, series and journals. This is done through the creation of an account on the IDEAS website. Once logged in, you can add the relevant items while navigating the site.

EconPapers Search

EconPapers allows you to limit the search results to documents added recently to RePEc. Use the “Modified last” selection at the bottom left of the search form. One can also limit the list of items by JEL code and recency here.

IDEAS Search

Similarly, IDEAS allows you to restrict search results to specific years. When looking up by JEL code, items are sorted with the most recent first.

EconAcademics

EconAcademics follows the latest discussion of research on the blogosphere. While it does not necessarily mean this is the most recent research, it is often the case.

