Institutional repositories and RePEc

January 10, 2009

More and more institutions are adopting mandates that force their researchers to put their works, published or not, in institutional repositories. The idea is that this research should be openly accessible to all, instead of being locked behind the password of an online publisher. Such mandates are, however, of little use if those works cannot be found by others. General search indexes like Google (Scholar) or OAIster are often not capable of sorting results efficiently for the purposes of a researcher. It is therefore important that works from institutional repositories also be indexed in field-specific indexes, like RePEc for economics.

RePEc does not house files; it only indexes them. Thus, the goal is not to push PDFs to RePEc, but rather to push the appropriate metadata about those PDFs. Software used in institutional repositories typically generates metadata, but unfortunately not in the format required by RePEc (which predates any other format). Thus, the metadata needs to be converted. We make available a variety of scripts, typically written in Perl and easily customizable to local needs, in particular for DigitalCommons, DSpace and EPrints. Additional converters are always welcome on the list.
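To give an idea of what such a conversion involves, here is a minimal sketch in Python (not one of the Perl scripts mentioned above). It assumes the repository record has already been read into a plain dictionary; the dictionary keys, the function name and the archive and series codes are hypothetical. The output field names follow RePEc's ReDIF paper template as best recalled and should be checked against the ReDIF documentation before use.

```python
def to_redif(record, archive="aaa", series="wpaper"):
    """Render one repository record as a ReDIF-Paper template.

    `record` is a hypothetical dict with keys such as 'creators',
    'title', 'abstract', 'date', 'url' and 'number'. The archive and
    series codes are placeholders, not real RePEc codes.
    """
    lines = ["Template-Type: ReDIF-Paper 1.0"]
    # One Author-Name line per author, in the order given.
    for author in record["creators"]:
        lines.append(f"Author-Name: {author}")
    lines.append(f"Title: {record['title']}")
    # Optional fields are only written when present.
    if record.get("abstract"):
        lines.append(f"Abstract: {record['abstract']}")
    if record.get("date"):
        lines.append(f"Creation-Date: {record['date']}")
    if record.get("url"):
        lines.append(f"File-URL: {record['url']}")
        lines.append("File-Format: application/pdf")
    lines.append(f"Number: {record['number']}")
    # The handle identifies the item within RePEc.
    lines.append(f"Handle: RePEc:{archive}:{series}:{record['number']}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {
        "creators": ["Jane Doe", "John Smith"],
        "title": "An Example Working Paper",
        "date": "2009-01",
        "url": "http://example.org/papers/2009-01.pdf",
        "number": "2009-01",
    }
    print(to_redif(example))
```

A real converter would read the records from the repository's own export (for instance its OAI output) rather than from a hand-written dictionary, and the PDF itself would stay on the repository's server.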


RePEc in December 2008, and what we have done over the year 2008

January 2, 2009

Some may validly argue that 2008 was not the best year, but the RePEc project certainly cannot complain. Whichever statistics you look at, it has been a stellar year: 10,000 bibliographic items were added every month, including from 700 new series and journals and 126 new participating archives, 3,500 authors joined, we recorded 31,000,000 abstract views and 8,000,000 document downloads and 4,000 NEP reports announcing new working papers were sent. Such intense activity is going to be difficult to beat in 2009.

In terms of new features, we added RSS feeds for the NEP reports, introduced the RePEc Input Service to facilitate data entry in some cases, and IDEAS moved to a new server sponsored by the Society for Economic Dynamics.

Now for the monthly report about December 2008: we surpassed 400,000 listed journal articles and witnessed 704,217 file downloads and 2,619,953 abstract views. We also added content from the following 12 new archives: Nordic Journal of Political Economy, Petru Maior University, Universidad de Murcia, University of Applied Sciences Berlin, Journal of Economic Education, Central Bank of Cyprus, University of Oradea, Savez ekonomista Vojvodine, Universidade Nova de Lisboa, Tulane University, Romanian Academy of Economic Sciences, Ave Maria University.

In terms of thresholds, we passed the following over the last month:

100,000,000 cumulative abstract views of working papers
35,000,000 cumulative file downloads through RePEc services
30,000,000 cumulative abstract views on EconPapers
25,000,000 yearly abstract views on IDEAS
400,000 listed articles
275,000 listed working papers
200,000 cumulative chapter downloads
150,000 cumulative book downloads
11,000 institutions listed
2,400 working paper series


The worldwide reach of RePEc

December 26, 2008

Following up on the post two weeks ago about how RePEc tries to contribute to the democratization of research, it is interesting to see how far RePEc reaches in the world. While we do not have any recent study looking at who uses the RePEc services as a reader, we know much more about who the contributors are. First the authors, of which about 18,500 are distributed over 118 countries (and all US states). Then, the 960+ RePEc archives, which each contribute bibliographic data to the project, are spread over 64 countries. But some of those archives collect data from several institutions. Thus, we actually have publications from 70 countries (and all but five US states: AK, CO, NE, NH and SD). And this is what it looks like on a world map:

Distribution of publications across the world


Institutional data in RePEc

December 19, 2008

RePEc gathers information not only about publications and authors, but also about institutions. Specifically, the EDIRC project (Economics Departments, Institutes and Research Centers) has catalogued since 1995 all academic and government institutions that employ a significant share of economists, including think tanks and associations. For-profit organizations (banks, consultants, etc.) are listed if they contribute their publications to RePEc. As of today, 11,000 institutions are listed, including over 600 associations. Over 4,000 have at least one registered author and about 1,000 have some publication in RePEc.

The collected institutional data is used and displayed in various ways throughout RePEc. Authors use it when they register to determine their affiliations. So do RePEc archives for their publications. Author and institution data are combined on EDIRC to compile the publication output of all institutions. This is in turn combined with citation data from CitEc and download data from LogEc to determine institutional rankings.

Note that all the information about institutions has been gathered with the help of a lot of people.


RePEc and the democratization of research

December 9, 2008

In the last issue of the American Economic Review, the following article caught my eye: Restructuring Research: Communication Costs and the Democratization of University Innovation by Ajay Agrawal & Avi Goldfarb. In short, it documents who gained in electrical engineering faculties from the reduced cost of collaboration through the introduction of Bitnet, in the early Internet days. The basic result is that the middle-tier universities benefited the most. Indeed, the top ones were already well connected with each other, and the middle ones took advantage of collaborating with the top ones.

The main goal of RePEc is precisely the democratization of research. Given publication delays in Economics, if one wants to stay abreast of developments at the frontier of research, one needs to read working papers. Before the Internet, the only way to get hold of them was either if you were already at a top ranked Economics department, or if you were somehow within a club of well connected researchers. Just being aware of the most current research was a challenge for anybody outside these circles. This is what motivated Thomas Krichel, as a research assistant in 1991, to find ways to learn about new working papers, and share what he found. This initiative evolved into RePEc in 1997.

Are Elite Universities Losing Their Competitive Edge? by E. Han Kim, Adair Morse & Luigi Zingales documents that Economics faculty in elite universities were more productive in the 1970s at least in part due to their location, and that this location effect had disappeared by the 1990s. While it is an open question whether RePEc has contributed to such democratization, we have always favored it: everybody should be able to learn about current research, and everybody should be able to contribute to it.


RePEc in November 2008

December 2, 2008

We have just experienced a tremendous month. First, about 25,000 works were added. Second, we have seen traffic like never before. The only downside was that we had to move the blog due to various issues.

The push in new material was due partly to additions from AgEcon Search, partly to a lot of activity from many other archives, and finally to 13 new archives, more than usual: ADRES, Universität Wuppertal, CORE, INRA, University of Ottawa, University of Osijek, University of Nevada, Las Vegas, Pion Ltd, University of Texas at San Antonio, University of Indonesia, Asociación Española de Profesores Universitarios de Contabilidad, University of Lancaster (II), Bilgesel Yayincilik.

In terms of traffic, we counted 860,187 file downloads and 3,292,711 abstract views on EconPapers, IDEAS, NEP and Socionet. Both are new records by a wide margin.

Which brings us to the thresholds we passed during the past month, an impressive list:
50,000,000 cumulative article abstract views
12,000,000 cumulative article downloads
7,000,000 cumulative downloads on EconPapers
3,000,000 monthly abstract views
800,000 monthly downloads
650,000 works listed
550,000 online works listed
350,000 abstracts listed
270,000 working papers listed
200,000 online working papers listed
200,000 working paper abstracts listed
125,000 working papers with references
120,000 articles with citations
4,000 institutions with registered authors
2,000 books listed
800 journals


Parsing citations

November 22, 2008

One of the services RePEc offers to authors is the discovery of citations, CitEc. This is a difficult undertaking, as it needs to be done entirely automatically. As project leader José Manuel Barrueco Cruz discusses in a previous post, the reference section of a paper is extracted through a series of steps: PDF download, file conversion to PostScript, further conversion to plain text, and identification of the reference section. In each of these steps there are losses.
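To illustrate only the last of these steps, here is a minimal sketch in Python of how a reference section might be located once a paper is available as plain text. This is not the CitEc code: the list of heading words and the function name are assumptions, and the heuristic (take everything after the last line that looks like a reference heading) is deliberately crude.

```python
import re

# Hypothetical list of headings that may introduce a reference section.
HEADING = re.compile(
    r"^\s*(references|bibliography|literature cited|works cited)\s*:?\s*$",
    re.IGNORECASE,
)


def extract_reference_section(text):
    """Return the text after the last reference-like heading, or None."""
    lines = text.splitlines()
    start = None
    for i, line in enumerate(lines):
        if HEADING.match(line):
            # Keep the last match: the word may also appear as a
            # stand-alone line in a table of contents near the top.
            start = i
    if start is None:
        return None  # one of the "losses": no heading survived conversion
    return "\n".join(lines[start + 1:]).strip()


if __name__ == "__main__":
    paper = "1. Introduction\nSome text.\n\nReferences\nBecker GS (1990) ...\n"
    print(extract_reference_section(paper))
```

A paper whose heading was mangled during conversion, or which uses an unusual heading, simply returns nothing here, which is one way a paper can drop out along the way.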

But even once the reference section is in hand, we are not out of trouble. One needs to identify where each reference starts and ends, then try to match it with something already in RePEc. Considering all the different citation styles, typos, and plain errors, this is a daunting task. Matches that are sufficiently close are counted as citations; matches that fall in some grey zone are fed to the RePEc Author Service to solicit the author’s help in sorting them out. Below are a few examples of what is offered to authors, for the case of a classic article by Gary Becker, Kevin Murphy and Robert Tamura, Human capital, fertility and economic growth:


  • [3] Becker, G.; Murphy, K. ald Tamura, R. (1993)Humall capital, fertility ald ecollomic growth 01 Humall Capital, third editioll, Gary Becker.
  • Becker, Gary S.; Murphy, Kevin M.; and Tamura, Robert. Human Capital, Fertility, and Econonric Growth, Journal of Political Economy, October 1990 98(5) Part 2, pp. S12-S37.
  • Becker GS, Murphy KM, Tamura R (1990) Human capital, fertility and economic growth. J Polit Econ 98:S12–S37.
  • 1-25. Kevin M. Murphy, and Robert Tamura, Human Capital, Fer- tility and Economic Growth, Journal of Political Economy, October
  • BECKER, 0. S., K. M. MURPHY and R. TAMURA (1990) Human Capital, Fertility and Economic Growth, Journal of Political Economy 98, S 12-37.
  • [6] Becker, G., Murphy, K. and Tamura, R. (1990), Human capital, fertility, and economic growth, Journal of Political Economy, vol. XCVIII, pp.12-37.
  • Population and Development Review, Vol.12, Supplement: Below-Replacement Fertility in Industrial Societies: Causes, Consequences, Policies, pp. 69-76. Becker, Gary; Kevin Murphy, y Robert Tamura. (1990). Human Capital, Fertility and Economic Growth. The Journal of Political Economy, Vol.98, No.5, Part 2: The Problem of Development: A Conference of the Institute for the Study of Free Enterprise System, S12-S37.
  • (March/April 1973 Supplement), S279-88. ______________ Kevin M. Murphy, and Robert Tamura, Human Capital, Fertility, and Economic Growth, Journal of Political Economy, XCVIII
  • Becker, S. Gary, Kevin, M. Murphy and Tamura, Robert (1990). `Human Capital, Fertility, and Economic Growth The Journal of Political Economy, Vol. 98, Issue 5, Part 2, Oct. 1990, pp. S12-S37.
  • Bankconference on developmenteconomics. ecker, Gary, KevinMurphy, and RobertTamura. 1990. Human Capital, Fertility, and EconomicGrowth., Journal of PoliticalEconomy 98, 5, Part 2, pp. S12-S37.

These examples show what can go wrong in the file conversion and how citing authors can make mistakes. Still, CitEc has been able to recognize these references, but is not sure enough about them.

This also highlights that we try to minimize errors, even if this means leaving good citations out. Other citation services may take a different approach.
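The idea of counting close matches, asking the author about grey-zone ones and discarding the rest can be sketched in Python as follows. This is only an illustration and not how CitEc actually scores matches: the similarity measure (the share of the known title recovered in the reference string, using the standard library's difflib), the two threshold values and the function names are all assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and reduce everything but letters and digits to single spaces."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def title_score(reference, known_title):
    """Share of the known title's characters found, in order, in the reference."""
    title = normalize(known_title)
    ref = normalize(reference)
    if not title:
        return 0.0
    matcher = SequenceMatcher(None, title, ref, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(title)


def classify(reference, known_title, accept=0.9, review=0.6):
    """Apply two hypothetical thresholds to the similarity score."""
    score = title_score(reference, known_title)
    if score >= accept:
        return "citation"    # close enough to count automatically
    if score >= review:
        return "ask author"  # grey zone: let the author decide
    return "ignore"          # too far off; risk of a false citation


if __name__ == "__main__":
    title = "Human capital, fertility and economic growth"
    ref = ("Becker GS, Murphy KM, Tamura R (1990) Human capital, "
           "fertility and economic growth. J Polit Econ 98:S12-S37.")
    print(classify(ref, title))
```

Raising the `accept` threshold pushes more references into the grey zone or out altogether, which is the trade-off described above: fewer wrong citations at the cost of missing some good ones. A real matcher would also compare authors, year and journal, not just the title.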


Looking for a deep link?

November 21, 2008

If you were following a link and were expecting to find a specific post on the RePEc blog, we unfortunately had to move to a different host and links were broken. Please look for your post in the archives. Or if you were using one of the RSS feeds, please use the new ones: entries or comments. We apologize for the inconvenience.


The blog has moved to a new host

November 19, 2008

Due to chronic problems with DoS attacks and spamming that have crippled the host server several times, the RePEc blog has now moved to a new host. It is still available under the old https://blog.repec.org/ address, but no longer under the alternative http://repec.org/blog/. Also, the addresses within the blog have all changed, which breaks deep links. Finally, old RSS feeds may still work as they are redirected, but it is safer to recreate them.

Users who created accounts at the old location will have to create new ones, unless they already have one on WordPress. I am very sorry for the trouble, and especially for the violation of the RePEc principle that links should never break. But I think we now have a permanent home for this blog and this should not happen again.


RePEc in October 2008

November 5, 2008

The major development this past month is that the contents of AgEcon Search are now listed on RePEc. About 30,000 works will gradually be integrated over the next weeks. Also, October is traditionally a busy month, which is reflected by a large number of new participants (authors and institutions) and high traffic. We recorded 701,893 file downloads and 2,757,234 abstract views. In addition, the following publishers joined us during this month: British University in Egypt, Migration Letters, Universitatea “Al. I. Cuza”, Université du Littoral, AgEcon Search, EERI, University of Suceava, Econometica, Spiru Haret University, WorldFish Center, Université Libre de Bruxelles, esocialsciences.com, Scuola Superiore Sant’Anna, Tufts University.

In terms of thresholds passed this month, we have:
150,000,000 cumulative abstract views
25,000,000 year-to-date abstract views
640,000 items listed
375,000 articles listed
18,000 authors registered
3,000 series and journals indexed
2,500 book chapters listed