Improved usage statistics for RePEc

August 6, 2010

Usage statistics for RePEc services are collected by LogEc. Producing meaningful statistics for accesses to web servers is a difficult task, especially so since we are merging data from several different sites. Rather than just counting the number of times a page or file is accessed (by a human or a piece of software indexing the web) the goal is to get as close as possible to a measure of the number of people showing an interest in a paper by reading the abstract page or downloading the full text file.

We have always applied very strict criteria for what should be counted as a download or abstract view, but over time it has become clear that simple filtering for robots and removal of double clicks is not enough. Many new practices have developed on the web, some for a good purpose, some for a more questionable purpose. There are spam-bots, referer spamming (a stupid idea if there ever was one), anti-malware software that checks links on a webpage and warns users about dangerous links, and much, much more that should not be counted. And, yes, there appears to be the occasional attempt to manipulate the statistics.

Starting from July 2010 we apply an additional set of heuristics to filter out these accesses. In conjunction with this we have also recalculated the statistics going back to January 2008. The overall effect is relatively small but there are substantial reductions in the number of accesses for a small number of papers.

More information at LogEc.
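
To make the two basic filters mentioned above concrete, here is a minimal Python sketch of robot filtering and double-click removal. It is not the LogEc implementation: the robot markers, the 30-minute window, the tuple layout of a log entry and the function name are all assumptions chosen for illustration, and the additional heuristics introduced in July 2010 are not represented.

```python
from datetime import datetime, timedelta

# Hypothetical markers of automated clients; a real list would be far longer.
ROBOT_MARKERS = ("bot", "crawler", "spider")

# Assumed window within which repeated requests count as one access.
DOUBLE_CLICK_WINDOW = timedelta(minutes=30)


def count_accesses(log_entries):
    """Count accesses per item, skipping robots and repeated clicks.

    Each entry is a (timestamp, ip, user_agent, item) tuple.
    """
    last_seen = {}  # (ip, item) -> time of the most recent request
    counts = {}     # item -> number of counted accesses
    for timestamp, ip, user_agent, item in sorted(log_entries):
        agent = user_agent.lower()
        if any(marker in agent for marker in ROBOT_MARKERS):
            continue  # automated client, never counted
        key = (ip, item)
        previous = last_seen.get(key)
        last_seen[key] = timestamp
        if previous is not None and timestamp - previous < DOUBLE_CLICK_WINDOW:
            continue  # same address, same item, too soon: a repeat
        counts[item] = counts.get(item, 0) + 1
    return counts


log = [
    (datetime(2010, 7, 1, 10, 0, 0), "192.0.2.1", "Mozilla/5.0", "RePEc:aaa:wpaper:9602"),
    (datetime(2010, 7, 1, 10, 0, 5), "192.0.2.1", "Mozilla/5.0", "RePEc:aaa:wpaper:9602"),
    (datetime(2010, 7, 1, 10, 1, 0), "192.0.2.2", "ExampleBot/1.0", "RePEc:aaa:wpaper:9602"),
    (datetime(2010, 7, 1, 11, 0, 0), "192.0.2.3", "Mozilla/5.0", "RePEc:aaa:wpaper:9602"),
]
print(count_accesses(log))  # {'RePEc:aaa:wpaper:9602': 2}
```

Of the four requests, the second is a double click and the third comes from a client that identifies itself as a robot, so only two accesses are counted.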


RePEc Author Service reaches major mark

August 4, 2010

The RePEc Author Service has just welcomed the 25,000th author! This service allows economists to build an online profile with all the works they have authored and that are listed in RePEc. Apart from having this profile displayed and linked to from individual works on RePEc services like EconPapers and IDEAS, this allows authors to obtain monthly statistics about the popularity of their works, along with new citations discovered by the CitEc project. Collected data is also used to compute various rankings. Note that the 25,000 count only includes registered people who have at least one work listed in the profile. There are about 7,000 other registrations with empty profiles from people who have either overlooked this feature or not yet published any works. A listing of all registered authors is available on EconPapers and IDEAS.

RePEc currently lists 940,000 works from close to 3000 working paper series and 1150 journals, among others, contributed by over 1200 archives. It has become the standard bibliographic database in Economics, with RePEc services recording the 50 millionth download during July 2010. All RePEc activities are driven by volunteers as RePEc is not funded.


Little known features on RePEc sites

July 28, 2010

Various sites display information collected by RePEc, and they do so in ways that are not always similar. In particular, there are features that may not be noticeable to the casual user. Here are some of those found on EconPapers, EconomistsOnline and IDEAS.


  1. EconPapers and IDEAS allow users to download bibliographic records in various formats, such as BibTeX, RIS (used by EndNote, ProCite and RefMan), plain text or HTML. IDEAS also provides this for all works of a registered author.
  2. RePEc services link different versions of a paper and article, as long as at least one of the authors has them listed in his/her profile and the titles are close. Contact RePEc for cases where titles differ.
  3. URLs on RePEc services are permanent, and can thus safely be used for referencing.
  4. EconomistsOnline allows users to refine search results dynamically.
  5. Some services have advanced search features: EconPapers, IDEAS.
  6. One can navigate EconomistsOnline in four languages.
  7. IDEAS has tools to create reading lists and publication compilations of a group of people.
  8. EDIRC lists all publications of authors affiliated with an indexed institution.
  9. EconPapers provides a syntax and URL checker for the metadata submitted to RePEc.
  10. Both EconPapers and IDEAS provide links to citing and cited papers on each abstract page.
  11. Download statistics for series, journals, papers and authors are available at LogEc.


RePEc in June 2010

July 3, 2010

We just concluded a very calm month, with people visibly watching the World Cup more than doing bibliographic searches. We counted 608,512 file downloads and 2,029,548 abstract views, the lowest monthly numbers in two years. Still, 12 new RePEc archives joined: Universität Duisburg-Essen, SEACEN, La Trobe University, Oxford University (III), Sociedad Española de Historia Agraria, South Asian Network for Development and Environmental Economics, Universidad de Zaragoza, Australian National University (V), Universidad Politécnica de Valencia, Economic Research Forum (Egypt), US Environmental Protection Agency, Auburn University.

We also passed a few important thresholds, and next month looks even more promising:

800000 full texts listed
350000 listed working papers
300000 items with citations
70000 articles with references


The FRED Network, a social network for economists

June 12, 2010

Guest post by Richard Anderson

On June 3, 2010, the Research Division of the Federal Reserve Bank of St. Louis introduced the first Internet social networking web site for economics and business. The new web site <www.thefrednetwork.com> is a namesake of the Bank’s popular FRED data service, the most widely used free source of United States economic data on the Internet.

The FRED Network will permit economists and the public to communicate more easily with the data analysts that support FRED, as well as with economists both at St. Louis and elsewhere in the world. The web site is a “dual-threaded” design, meaning each user can select both the topics of interest and the site’s users with whom they wish to communicate. Unlike unfiltered Internet blogs, users will not have to sort through commentary of little interest to locate useful information.

Social networking web sites help people find others with similar interests, exchange knowledge about both data and research projects, solve problems, and develop new ideas. Also, companies learn from their customers by reading customer comments.

“The FRED Network will improve communication between the Bank staff and our customers,” said one staff member. “In the past, we answered most questions via email and only one person saw the information. Now, we can answer within The FRED Network and all customers with that interest will see the information.”

Social networking brings together like-minded people, whether users of FRED data or top-level professionals pursuing complex research.

A unique feature of The FRED Network is the ability for each user to write their own blog. Each user’s blog is available to all other users who sign up to receive that user’s commentary. The unique design of The FRED Network allows each user to read blogs from selected users while excluding ones they do not care to read. This feature allows users who are passionate to write about their interests and expertise, while users who do not wish to receive that commentary need not do so.


RePEc in May 2010

June 6, 2010

May was a rather unusual month. Traffic has been lighter than usual, with 751,319 file downloads and 2,512,833 abstract views, but a large number of bibliographic items were added to RePEc, about 14,000. At this pace, we should be reaching a million within the year!

Also, 12 new archives joined RePEc: Swiss Economics, National Bank of Serbia, Toulouse School of Economics, University of Copenhagen (II), Universität Basel, US Department of Justice, Kasetsart University, Center for Strategic Research and Analysis, Institutul de Economie Mondiala, University of Warwick (III), Universität Münster (II) and the European Commission (II).

Finally, these are the thresholds we passed during this past month:
1000000 book abstract views
550000 listed articles
400000 book chapter downloads
250000 articles with abstracts
24000 registered authors
12000 online chapters


Improving metadata

May 21, 2010

The bibliographic data used by RePEc services can only be as good as what publishers provide. While a post last month discussed how to improve citation coverage, the present one gives some advice to RePEc archive maintainers on how to improve their metadata to optimize its use in RePEc services. For starters, here is what a well-formed RePEc template looks like:

Template-Type: ReDIF-Paper 1.0
Author-Name: Daniel Rais
Author-Name: Peter Lawater
Author-Email: p.lawater@grandiose.edu
Author-Workplace-Name: Department of Economics, Grandiose University
Author-Name: Jonathan Goldman
Author-Workplace-Name: Department of Finance, Grandiose University
Author-Name: Zhiwei Chui
Title: Phases of Imitation and Innovation in a North-South Endogenous Growth Model
Abstract: In this paper, we develop a North-South endogenous growth model to examine three phases of development in the South: imitation of Northern products, imitation and innovation and finally, innovation only. In particular, the model has the features of catching up (and potentially overtaking) which are of particular relevance to the Pacific Rim economies. We show that the possible equilibria depend on cross-country assimilation effects and the ease of imitation. We then apply the model to analyze the impact of R&D subsidies. There are some clear global policy implications which emerge from our analysis. Firstly, because subsidies to Southern innovation benefit the North as well, it is beneficial to the North to pay for some of these subsidies. Secondly, because the ability of the South to assimilate Northern knowledge and innovate depends on Southern skills levels, the consequent spillover benefits on growth make the subsidizing of Southern education by the North particularly attractive.
Length: 26 pages
Creation-Date: 1996-07
Revision-Date: 1998-01
Publication-Status: Published in Review of Economics, March 1999, pages 1-23
File-URL: ftp://ftp.grandiose.edu/pub/econ/WorkingPapers/surrec9602.pdf
File-Format: Application/pdf
File-Function: First version, 1996
File-URL: ftp://ftp.grandiose.edu/pub/econ/WorkingPapers/surrec9602R.pdf
File-Format: Application/pdf
File-Function: Revised version, 1998
Number: 9602
Classification-JEL: E32, R10
Keywords: North-South, growth model, innovation assimilation
Handle: RePEc:aaa:wpaper:9602

This is just an example, and there are more fields that can be used. See the step-by-step instructions for opening a RePEc archive for many more details. But let me point out a few recommendations:


  1. The more fields are filled, the better it is. But these fields must be legal, as defined in the documentation, or the entire template is rejected.
  2. Put in each field only the information relevant to it. For example, there should be no affiliation in Author-Name:, and no paper number in Title:.
  3. The Author-Name field should contain only one author. Repeat the field for multiple authors! This is important, as otherwise we have difficulties attributing papers to authors on the RePEc Author Service.
  4. If there are multiple versions of a paper within the same series, repeat the File-* block as in the example above instead of creating a new template.
  5. Handle is a unique identifier that should not be changed, and in particular that should not be recycled. The latter point is very important, as handles provide links between papers, authors, references, citations, statistics, etc. Handle recycling introduces errors that are very, very cumbersome to correct.
  6. Do not confuse fields. Too often, an abstract is put in Title:.
  7. If you are not sure, check your template here.
  8. And check your monthly emails for any errors we may have detected, or check your archive here. There is even a URL checker to help your work!
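
Some of the recommendations above can be checked mechanically before a template is submitted. The Python sketch below is not the official RePEc checker, only an illustration under stated assumptions: it reads simple "Field: value" lines, treats an indented line as a continuation of the previous value, lowercases field names for convenience, and uses a hypothetical handle pattern, a hypothetical list of author separators and an arbitrary 300-character threshold for titles. The function names are made up for this example.

```python
import re

# Assumed handle shape: "RePEc:" followed by archive, series and item codes.
HANDLE_PATTERN = re.compile(r"^RePEc:[A-Za-z0-9]+:[A-Za-z0-9]+:\S+$")


def parse_template(text):
    """Return the template as a list of (field, value) pairs, in order."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and fields:
            # Treat an indented line as a continuation of the previous value.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
            continue
        name, _, value = line.partition(":")
        fields.append((name.strip(), value.strip()))
    return fields


def lint_template(fields):
    """Return warnings for a few of the problems listed above."""
    warnings = []
    values = {}
    for name, value in fields:
        values.setdefault(name.lower(), []).append(value)

    for author in values.get("author-name", []):
        # Several people in one Author-Name field (recommendation 3).
        if " and " in author or ";" in author or "&" in author:
            warnings.append(f"Author-Name may hold several authors: {author!r}")

    for title in values.get("title", []):
        # A very long title may be an abstract pasted in (recommendation 6).
        if len(title) > 300:
            warnings.append("Title is unusually long; is it an abstract?")

    handles = values.get("handle", [])
    if len(handles) != 1:
        warnings.append("Exactly one Handle field is expected")
    elif not HANDLE_PATTERN.match(handles[0]):
        warnings.append(f"Handle has an unexpected shape: {handles[0]!r}")
    return warnings


sample = """Template-Type: ReDIF-Paper 1.0
Author-Name: Daniel Rais and Peter Lawater
Title: Phases of Imitation and Innovation in a North-South Endogenous Growth Model
Handle: RePEc:aaa:wpaper:9602
"""
for warning in lint_template(parse_template(sample)):
    print(warning)
```

Run on the sample, the sketch flags only the Author-Name line, which packs two authors into one field. Checks of this kind can only hint at problems; whether a field is legal is decided by the documentation.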


RePEc is not a spider

May 13, 2010

We frequently get requests for inclusion in RePEc, and often these are complaints that some papers on a university department web page or a personal home page are not being picked up. RePEc is not Google. RePEc does not have a web spider that wanders the web and looks for research in Economics. I do not even think it would be possible to do so, as identifying research and Economics on an automatic basis is very difficult.

Material listed on RePEc is submitted either by one of about 1200 participating archives, each of which has followed our instructions, or by authors themselves at the Munich Personal RePEc Archive (MPRA). No need to send us links or papers. Just make sure your publisher participates, and if not, upload your papers at MPRA (which you can in general also do for published material, see this previous blog entry).

And if you are really interested in a web spider for Economics, there is the Economic Search Engine (ESE), which uses in part RePEc data to search and index the subset of the web most likely related to Economics.


RePEc in April 2010

May 5, 2010

April has been a memorable month, with close to 40,000 works newly indexed. We thus reached 900,000 indexed works, of which 750,000 are online. Also, we have now counted 200 million abstract views since the start of the project, a highly filtered number as the raw count before removing multiple views, robots and other “illegal” activity is four times that.

The 13 new archives that joined RePEc in April were: Chapman University, Africa Growth Institute, Università Cattolica del Sacro Cuore (II), Programme National Persée, United Nations Development Programme, University of Winnipeg, University of Valencia (III), Leuphana University Lüneburg, Revista de Economía Crítica, Università Roma 1 (III), University of California-Davis, Pontificia Universidad Católica del Perú, Università Roma 2 (II).

And, finally, we reached an impressive list of thresholds during the month:
200,000,000 cumulative abstract views
900,000 listed works
750,000 listed online works
500,000 abstracts
24,000 registered authors
3,000 online books


How to improve citation coverage in RePEc

April 28, 2010

One aspect of RePEc that has grown in importance over the last few years is its citation analysis, provided by the CitEc project, in particular due to its use in rankings. Citation extraction is a complex process. First, one needs to be able to access texts and find where references are (see details), then one needs to be able to interpret those references and match them with some work already listed in RePEc (see details). At this time, 5,400,000 references could be extracted from 240,000 works, with 2,300,000 matched to an item listed in RePEc. While these numbers may sound impressive, it still means that only about a third of online texts (240,000 out of roughly 750,000) could be parsed successfully. To improve on this we rely on the RePEc archive maintainers to help us do a better job. Here is some advice in this regard that they should heed, as any linked reference allows links back and forth between the citing and cited works, thus increasing visibility.


  1. Check out how successful CitEc is in extracting references from your series and journals. Maintainers receive monthly statistics about coverage that they can monitor. In addition, they can look up on CitEc the reasons why some items were not processed. For the series with the best coverage, see here.
  2. Make sure links in the metadata go directly to a pdf file, and not to an intermediate abstract page. CitEc does not go further than the link that is provided to it. If you really want the abstract page present in the metadata, provide it as a second link.
  3. Make sure that CitEc is actually allowed to get to the pdf. If the pdfs are gated, consider allowing CitEc to access with its IP, which will be provided upon request.
  4. If the above is not possible, or if for some other reason references cannot be parsed, one can also transfer references to CitEc by using the X-File-Ref construct in the metadata, as described here.
  5. For larger archives, an alternative way of transferring references can be arranged.
  6. Also, CitEc sometimes grabs too many references. This happens for working papers when a list of other papers in the series is appended. This is also a waste of paper. We strongly recommend not having such lists and, where they are present, alerting CitEc so that these errors can be remedied.
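
The second recommendation above can be roughly pre-checked from the metadata alone. The following Python sketch is only a heuristic and not something CitEc itself runs: it guesses from the URL path whether a link points straight at a pdf file, and puts such links first so that an abstract page, if present, comes second. The function names and the abstract-page URL are hypothetical; only the ftp link is taken from the sample template in the metadata post.

```python
from urllib.parse import urlparse


def looks_like_direct_pdf(url):
    """Guess from the URL alone whether it points straight at a pdf file."""
    path = urlparse(url).path
    return path.lower().endswith(".pdf")


def order_links(urls):
    """Put links that look like direct pdf links first, keeping relative order."""
    # sorted() is stable and False sorts before True, so pdf links move up
    # without otherwise reshuffling the list.
    return sorted(urls, key=lambda url: not looks_like_direct_pdf(url))


urls = [
    "http://www.grandiose.edu/econ/abstracts/9602.html",  # hypothetical abstract page
    "ftp://ftp.grandiose.edu/pub/econ/WorkingPapers/surrec9602.pdf",
]
for url in order_links(urls):
    print(url)
```

A link can of course serve a pdf without ending in ".pdf", or sit behind a gate, so a check like this complements the coverage statistics rather than replacing them.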

Any request should be sent to José Manuel Barrueco, who is in charge of the CitEc project.