RePEc Genealogy, the academic family tree of economists

August 5, 2022

The RePEc Genealogy has now reached 15,000 entries. This site describes where and when economists got their terminal degree, as well as who their advisors were. This makes it possible to build a family tree of the economics profession as well as to gather information about graduate programs.

The data is gathered by crowdsourcing, much like a wiki: users registered with the RePEc Author Service can log in to the RePEc Genealogy and add or amend the records for themselves, their students, or their advisors. They can also add former students of the graduate programs they currently work with or graduated from.

Beyond being displayed on the RePEc Genealogy site, the collected data is used in a myriad of ways:

  • Various studies on economists have leveraged this data.
  • Profiles of economists on IDEAS display part of the RePEc Genealogy information where available.
  • EDIRC, the directory of economics institutions, has for each relevant institution a list of alumni and a link to a compilation of their publications.
  • Statistics on female representation in economics are computed by cohort.
  • How well graduate students do is a criterion for the rankings of economists and institutions.
  • The year of graduation is also used for rankings of economists by cohort.

Over 4000 people have already contributed to the RePEc Genealogy, and everyone is welcome to make it more complete and useful.

RePEc in June 2022

July 11, 2022

CitEc is continuing a remarkable effort at citation matching, adding several million in a month. We welcomed a few new archives: International Association on Public and NonProfit Marketing, Yildiz Social Science Review, Bulletin of Political Economy, Spanish National Markets and Competition Commission (CNMC), Universidad ORT Uruguay. We counted 428,920 file downloads and 1,735,689 abstract views last month. And we reached the following milestones:
25,000,000 matched citations
1,000,000 journal articles with extracted references
600,000 working papers with extracted references
30,000 books available online
30,000 book chapters with citations
20,000 books with citations

How publishers can ensure their data looks right on RePEc

July 4, 2022

All material indexed in RePEc is provided by the respective publishers. They make this information available using a metadata syntax that RePEc defined in 1997 and that has not changed since, except for a few additions. Adhering to this syntax is important, as errors disqualify items from indexing and other problems may lead to various issues. If something is amiss or missing, every IDEAS or EconPapers page has an email contact listed for alerting the maintainer of the relevant data.

That said, RePEc helps the maintainers in various ways so that they can proactively address any problems. Each month they receive an email with various statistics and a link to their “problems” page on the EconPapers checker (add the three-letter archive code to the URL to get more details), which shows data download problems, detected syntax issues, and bad URLs to full text. EconPapers and IDEAS also provide FAQs. Re-reading the initial setup instructions or the ones for new maintainers can also prove useful.

The most frequent issues that appear in the EconPapers checker are:

  • RePEc archive has moved from http to https: the maintainer needs to change the URL line in the archive template and alert someone in the RePEc team about the new location to fix the download process.
  • A series or journal is missing the corresponding series template.
  • A handle (identifier) is used multiple times. Handles are supposed to uniquely and permanently define any item in RePEc. Re-using them is a source of major problems.
  • Missing end-of-line that merges two fields.
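Two of these issues, re-used handles and fields merged by a missing end-of-line, lend themselves to a quick local check before files are uploaded. The following is a minimal sketch, not the EconPapers checker itself: the field names (Handle, Title, Author-Name, and so on) and the function names are assumptions made for illustration, and the merged-field test is only a heuristic.

```python
import re
from collections import Counter

# Assumption: records use "Field-Name: value" lines, with a "Handle:" field.
HANDLE_RE = re.compile(r"^Handle:\s*(\S+)", re.MULTILINE)

# Hypothetical subset of field names; extend it to match your own templates.
KNOWN_FIELDS = ("Template-Type", "Author-Name", "Title", "Abstract",
                "Handle", "File-URL")


def find_duplicate_handles(text: str) -> list[str]:
    """Return every handle that appears more than once in the text."""
    counts = Counter(HANDLE_RE.findall(text))
    return sorted(handle for handle, n in counts.items() if n > 1)


def find_merged_fields(text: str) -> list[int]:
    """Return 1-based line numbers where a field name starts mid-line,
    which suggests that an end-of-line went missing before it."""
    suspects = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Search from index 1 so a field name at the line start is ignored.
        if any(line.find(field + ":", 1) > 0 for field in KNOWN_FIELDS):
            suspects.append(number)
    return suspects
```

Running both functions over the concatenated contents of all files in an archive, rather than file by file, would also catch handles that are re-used across files.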

Other problems cannot be detected through an automated process. Here, maintainers need to follow appropriate conventions or check that the visuals on the RePEc sites look right. Examples are:

  • Inappropriate use of a data field. Examples are putting a working paper number in a title, adding affiliations to an author name, putting an abstract in a title, or putting keywords and JEL classifications in the abstract. Each piece of information has its own field so that appropriate bibliographic records can be created.
  • Each author needs to be in their own author name field. Lumping them together in one field makes it impossible to attribute the work to registered authors.
  • When some work is available in multiple languages or is translated, each title goes into its own title field instead of being merged into one. Also, the mention of the language goes into the language field, not in the title.
  • Errors in character encoding lead to records with funny-looking characters. This happens when strings are cut and pasted from a file in one encoding to a file with a different encoding. Characters with accents (é, ñ, ü, ç, å), ligatures (ff, fi, ffl, æ, ß), non-Latin character sets (Cyrillic, Arabic), and other special characters (long hyphens, Windows quotation marks and apostrophes) are especially problematic. They also make author or citation matching more difficult. The solutions are to fix these individually in the RePEc files and, if those are encoded as UTF-8, to use a .redif extension instead of .rdf (be careful not to have both files in the RePEc archive, as this leads to duplicated handles).
  • No HTML markup should be present. The result in RePEc services and sites is unpredictable. The only exception is the markup used to separate paragraphs in an abstract. The same applies to LaTeX or TeX markup.
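Although these problems are not caught by the automated checker, a maintainer can screen for some of them with simple heuristics. The sketch below is illustrative only and will produce both false positives and misses. The field name Author-Name, the function name check_field, and the choice of the paragraph tag as the one permitted HTML tag are all assumptions, not something RePEc prescribes here.

```python
import re

# Captures the tag name of anything that looks like an HTML tag.
HTML_TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>")

# Assumption: the paragraph tag is the permitted paragraph separator.
ALLOWED_TAGS = {"p"}

# Typical debris left when UTF-8 text is read as Latin-1 or Windows-1252,
# plus the Unicode replacement character.
MOJIBAKE_RE = re.compile("[\u00c3\u00c2][\u0080-\u00bf]|\u00e2\u20ac|\ufffd")

# Separators hinting that several authors share one field (heuristic).
MULTI_AUTHOR_RE = re.compile(r";|&|\band\b", re.IGNORECASE)


def check_field(name: str, value: str) -> list[str]:
    """Return a list of suspected problems for one metadata field."""
    problems = []
    tags = HTML_TAG_RE.findall(value)
    if any(tag.lower() not in ALLOWED_TAGS for tag in tags):
        problems.append("html markup")
    if MOJIBAKE_RE.search(value):
        problems.append("encoding debris")
    # Assumption: authors are listed in a field named "Author-Name".
    if name.lower() == "author-name" and MULTI_AUTHOR_RE.search(value):
        problems.append("several authors in one field")
    return problems
```

The encoding pattern reflects how the damage arises: the two UTF-8 bytes of an accented letter such as é, when decoded as Latin-1, turn into two unrelated characters, and the check looks for exactly such pairs.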

RePEc in May 2022

June 8, 2022

There has been much reason for celebration last month. RePEc is now 25 years old. It reached 4 million indexed research items. And a rewrite of the CitEc matching algorithm increased the number of matched citation references by 8%. In addition, we welcome some new RePEc archives: State Agency for Intellectual Property (Moldova), Universidad de Santiago de Chile, Institute for Management and Planning Studies (Iran). We counted 507,098 file downloads and 2,000,566 abstract views. And we reached the following milestones:

20,000,000 matched citations
4,000,000 indexed items
1,600,000 items with extracted references
500,000 cited working papers

RePEc celebrates 25 years and 4 million indexed items

May 12, 2022

25 years ago, on 12 May 1997, a meeting among a few economists and librarians laid the foundation for RePEc. Thomas Krichel describes this meeting in a recent RePEc blog post. As more research was starting to get shared on the web, it became infeasible to index all of it by hand. A new scheme was agreed on that, in essence, set rules for sharing metadata about research publications in economics. These rules still apply today, despite the tremendous growth that RePEc enjoyed. Over 2000 publishers maintain RePEc archives, carrying over 10,000 serials, including close to 4,000 journals. 25 years ago, no one was expecting that much.

Coincidentally, a few days ago RePEc surpassed 4 million indexed research items. The graph above shows the evolution of the number of research items. What is striking is that there is steady growth and that each additional million takes less time. Thus it is not that there was a big stash of research that was waiting to be tapped. Rather, the body of research evolved steadily with the popularity of RePEc. Its composition changed over time, though. The goal of RePEc was always to enhance the dissemination of research in economics, and early on the biggest need was for working papers (pre-prints) that did not enjoy the marketing or networking of commercial publishers. But soon the latter realized that they needed to participate in RePEc as well, as RePEc became the central point of dissemination in the field for big and small publishers. As all RePEc services are free for users, authors, and publishers, RePEc can thus democratize access to research.

Calling it a central point is somewhat ironic, because RePEc is anything but centralized. The scheme relies on each publisher maintaining the relevant metadata on their own ftp or web site. The only central aspect of RePEc is a file directory containing pointers to where those decentralized RePEc archives sit. All data is public, and other services can leverage it to disseminate economic research in any way they see fit. Now most dissemination services, not just those within the domain, use RePEc data one way or another. This makes RePEc an extremely efficient dissemination tool. It reaches a lot of users at minimal cost, as the publishers are in charge of hosting content and indexing. Even running a service using RePEc data is cheap, as the full-text content is still with the publishers. Various sponsors take care of the hosting costs or host a few servers themselves.

To be fair, there are still some non-monetary costs, though. A team of volunteers takes care of new RePEc archives, answers queries, monitors data quality, provides updates to participants, and maintains some important RePEc websites. For more details, see a short history of RePEc, instructions on how publishers participate in RePEc, and a list of RePEc archives, which are currently located in 103 countries.

RePEc in April 2022

May 6, 2022

Shortly before RePEc celebrates its 25th birthday, we have to deplore the closure of Socionet. It used to display RePEc data for Russian users but ran into legal issues. We welcomed a few new RePEc archives: Superintendence of Companies of Ecuador, Omsk Humanitarian Academy, Hong Kong University of Science and Technology, Lodz University Press, Strategic Management Business Journal. We counted 493,485 file downloads and 1,888,113 abstract views. And we reached the following milestones:
125,000,000 cumulative downloads from reporting RePEc services
120,000,000 cumulative abstract views on EconPapers

Why is RePEc 25 years old?

May 5, 2022

I once read a quote that claimed that the reason why humanity has never reached its full potential, and never will reach it, is meetings. Interestingly enough, RePEc was “made” at a meeting. That meeting took place on 12 May 1997. It is considered the birthday of RePEc. Now that is 25 years ago.

RePEc really started with the NetEc project. An account of February 1997 is in my note “About NetEc, with special Reference to WoPEc”. This gives a reasonable idea of the state of play before the meeting. In some ways that piece is an infomercial. It highlights the role that JISC funding played at that time.

What it does not mention are the plans to build a Swedish branch of WoPEc. The idea arose at a meeting in London where I met Frans Lettenström. He worked for the Swedish Royal Library. I suggested they fund Sune Karlsson for a project to build a Swedish economics working paper system.
On January 16, 1997, Sune reported:

“We had a meeting with our potential funders today and have reached a preliminary agreement on what to do. The idea is that we, as a pilot project, should get all the economics working paper series in Sweden on-line and into WoPEc.”

In March 1997, I received a cold email from Thomas W. Place of the library of Tilburg University. He was the technical lead for the DEGREE project. This project coordinated the publication of economics working papers by Dutch universities. I was aware of the project. I had tried to contact them on several occasions before, but never heard from them. He proposed to furnish me data directly in the internal format used by WoPEc. This was an unprecedented act. As far as I can remember, until that point, Jose Manuel Barrueco Cruz (henceforth: JMBC) and I always had to take data from a provider and do conversions ourselves. But rather than accepting this offer with the extreme enthusiasm it deserved, I wrote:

“In the medium term I think we need to think over the whole structure of a distributed, mirrored archive system. I have already proposed that we use the list wopec-admin@mailbase to discuss a successor format to the WoPEc format. That would allow for administrative metadata, series descriptor, archive descriptions, permissions to mirror etc. This is longer term effort. I will publish some reflections soon.”

In fact, the email from Thomas W. Place gave me the impetus to actually proceed in the direction outlined above. On 15 April I wrote to him:

“My plan is to radically overhaul the structure of what we are doing, and I am writing a document that contains proposals for doing this. I have shown a draft to JMBC and he thinks it is very unclear at this stage … It is called the Guildford protocol.”

Thomas W. Place indicated he would be in London for a meeting on May 13, so 12 or 14 May would be good for him. Sune expressed a preference for 12 May. He used travel funds from the Swedish project and couch-surfed at my flat in Martyr Court, Guildford. Sune arrived on the 8th at about 16:00. We went out for a walk to St. Martha’s Hill. On the hike, I popped the question to him. What did he think about my drafts? I was much relieved when he revealed that he thought they were reasonable.

The meeting as such was rather uneventful. The attendees were Corry Stuyts, who was the head of DEGREE, JMBC, Sune, Thomas, and myself. My office was too small and had too many computers in it, so we met in David Hawden’s office across the corridor. We basically sat down and worked through the documents I had prepared. That’s all we did. We did not finish them. Thomas and Corry had to leave early. We went in great detail through the two documents I had prepared. They are the ReDIF specification and the Guildford protocol. Both documents are still the basis of RePEc. Sune contributed important corrections on May 16.

RePEc is a grass-roots initiative. Typically, grass-roots initiatives take time to grow. Thus the precise start of such initiatives is not that easy to fix. The date of May 12, 1997 is generally accepted as the birthday of RePEc. But 25 years later, we need new directions. I have ideas but unfortunately, I am funded at this time to work on other business.

RePEc in March 2022

April 4, 2022

Over the past month, we counted 532,252 file downloads and 2,018,681 abstract views. We welcomed the following new RePEc archives: Institute of Economic Growth, International Association of Deposit Insurers, Pontifical Catholic University of Argentina, Pressburg Economic Centre Ltd, Libertas International University. And we reached:

3,600,000 items available online
2,000,000 articles with abstracts
70,000 book chapters available online

RePEc introduces NFT registration for academic papers

April 1, 2022

Every item (paper, article, etc.) in RePEc is identified by a unique and persistent handle, and has been for 25 years. Still, there is constant demand for additional persistent identifiers; see, for example, the introduction of the DOI. Yet none of those identifiers clearly indicates who the owner of that item is. RePEc now introduces a way to take care of that by leveraging blockchain technology. A non-fungible token (NFT) is a non-interchangeable unit of data stored on a blockchain, a form of digital ledger (Wikipedia).

Authors can create a record on a blockchain (an NFT) for their, say, article, by specifying the relevant RePEc handle. Then, they can log into their profile in the RePEc Author Service and register the NFT with the corresponding item. For this purpose, a new NFT section was created on the site. They just need to find the work this token applies to, and add it through a menu.

Authors need to be aware of certain limitations, though:

  1. The RePEc registration is not a wallet. While the registered token is checked against its blockchain, it is not a proof of ownership. The author still needs a crypto wallet to store the token securely.
  2. Registration is on a first-come, first-served basis in the sense that if a co-author already registered a token for a RePEc handle, no other can be added.
  3. There are many blockchains and new ones are continually created. The RePEc form is populated with 50 popular blockchains, but one can add another one in free text.
  4. Keep in mind that having a token on one blockchain does not prevent somebody from obtaining a token for the same RePEc handle on another blockchain. Thus one needs to secure NFTs on several blockchains. One can register with RePEc the tokens from several blockchains for the same handle.
  5. Registered tokens are only for the RePEc handle. The actual full text is still with the relevant publisher, who keeps the appropriate rights.
  6. RePEc handles are created by the publishers indexing their works in RePEc. They are free to delete those handles.

If this does not make sense to you, don’t worry, it works.

Women economists on RePEc

March 7, 2022

On the occasion of International Women’s Day on 8 March 2022, we take the opportunity to present all that RePEc is doing to highlight the work of female economists.

The first step is to identify them. When authors register with the RePEc Author Service, they are not asked for their gender. Hence, we need an additional step. This is performed based on the analysis of their names. We leverage NamSor, which uses an algorithm that includes guesses on the ethnicity to make more accurate gender attributions. Checks on the data revealed that this works well except for Chinese and Korean names, for which volunteers manually complement the assessments. Authors can also adjust their attribution from a link that is sent in their monthly email updates from RePEc.

The second step is to use the collected data. This page documents the proportion of females in the profession overall and by country, US state, research field and year of graduation. These statistics are updated every month. This page lists all the female economists who registered their Twitter handle with RePEc. This tool makes it possible to identify female economists in some geographic areas and/or research fields, for example to find a speaker for a research seminar. Finally, we have the ability to identify the best female economists, based on all their publications or on the last 10 years.

Use of this data is not limited to RePEc. It is available through the RePEc API and has already been leveraged for some research and has been presented numerous times in symposia.