Economics Search Engine

February 11, 2009

The Economics Search Engine (ESE) is a subset of the Google search engine that restricts its searches to 23,000 economics web sites. It is an outgrowth of Resources for Economists on the Internet (RFE), which lists and describes items for economists. Today many users prefer search engines for finding resources of interest, so ESE was developed with the assistance of Hal Varian and Othar Hansson, both of Google. ESE searches not only the web sites listed in RFE, but also web sites from RePEc Author Services (over 19,000 economists have registered) and from Economics Departments, Institutes and Research Centers in the World (EDIRC), which lists more than 11,000 such sites. Thus, by searching at ESE, a user interested in an economic topic is searching a substantial fraction of the web devoted to economic issues.

ESE is implemented with a Google Custom Search Engine, which lets users set up a site that restricts a Google search to a user-selected set of sites. It takes some work to set up one as large as ESE, but smaller ones are quite straightforward, and doubtless many would benefit from setting one up for their own needs. As with many Google services, it is currently in beta test mode, so the results might be problematic at times.
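For readers curious what sits under the hood, a Custom Search Engine's list of included sites can be maintained as an annotations file. A minimal sketch follows; the site patterns are just examples, and the label name is hypothetical (Google assigns a unique label to each engine):

```xml
<!-- Hypothetical annotations file for a small custom search engine.
     Each Annotation adds a site pattern to the engine; the Label ties
     the pattern to a particular engine (Google assigns the real name). -->
<Annotations>
  <Annotation about="econpapers.repec.org/*">
    <Label name="_cse_examplelabel"/>
  </Annotation>
  <Annotation about="ideas.repec.org/*">
    <Label name="_cse_examplelabel"/>
  </Annotation>
</Annotations>
```

For a handful of sites the web interface is easier; a file like this matters mainly when, as with ESE, thousands of sites must be loaded.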

To make ESE particularly easy to use, it includes a “Search Plugin” for users of Internet Explorer 7.0 or Firefox 2.0 and 3.0. This allows you to initiate searches directly from a search box in your browser, so you don’t even need to visit ESE directly. When you are at the ESE web site, your browser’s search-plugin list should offer one for ESE.

Citation Accuracy

December 19, 2007

Open Access News pointed out a very interesting article in the Journal of Cell Biology, Show Me the Data. Written by that journal’s executive editor, the executive editor of Journal of Experimental Medicine, and the Executive Director of The Rockefeller University Press, it first reiterates many quality issues with journal impact factors that seem to be well-known among biologists, but I suspect that they are news to many economists. Many of these issues also hold for citation rankings for individuals. Beyond that, there are other issues that make citation data suspect. Fortunately, there are potential solutions to many of these problems.

First, it helps to describe impact factors as they are calculated by Thomson Scientific (previously the Institute for Scientific Information, or ISI). A journal’s impact factor in year t is the number of citations in year t to items the journal published in years t-1 and t-2, divided by the number of research or review articles it published in those two years. Criticisms include

  • the numerator and denominator are not consistent: citations to all items are counted in the numerator, but only research or review articles enter the denominator
  • Thomson is unclear on what exactly defines a research or review article
  • some journals have negotiated with Thomson on exactly what defines the article type
  • retracted papers are not excluded
  • of course, the mean is inflated by a few star papers
  • editors can game the system; apparently some do and some don’t (I’ve even seen this in the Wall Street Journal)
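To make the arithmetic concrete, here is a minimal sketch of the calculation; all the counts are invented for illustration:

```python
# Sketch of the Thomson impact-factor arithmetic for year t.
# The citation and article counts below are invented for illustration.

def impact_factor(cites_to_prev_two_years, research_or_review_articles):
    """Citations in year t to items the journal published in t-1 and t-2,
    divided by the count of research/review articles from t-1 and t-2."""
    return cites_to_prev_two_years / research_or_review_articles

# Suppose a journal's 2006-2007 output drew 500 citations in 2008, and
# Thomson classifies 200 of its 2006-2007 items as research or review
# articles.  Note the inconsistency criticized above: the 500 may include
# citations to editorials and letters, while the 200 does not count them.
print(impact_factor(500, 200))  # 2.5
```

The classification step is exactly where the negotiation and gaming described above enter: shifting a few items out of the “research or review” category shrinks the denominator without touching the numerator.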

The authors go on to say that they contacted Thomson and received some of their data. They found numerous errors in how articles were categorized. Further, “The total number of citations for each journal was substantially fewer than the number published” as reported by Thomson. When they requested further data from Thomson, the data still didn’t add up. They conclude, “It became clear that Thomson Scientific could not or (for some as yet unexplained reason) would not sell us the data used to calculate their published impact factor.”

Their bottom line is even more clear: “If an author is unable to produce original data to verify a figure in one of our papers, we revoke the acceptance of the paper. We hope this account will convince some scientists and funding organizations to revoke their acceptance of impact factors as an accurate representation of the quality—or impact—of a paper published in a given journal. Just as scientists would not accept the findings in a scientific paper without seeing the primary data, so should they not rely on Thomson Scientific’s impact factor, which is based on hidden data.”

Besides the points reiterated and raised in the Journal of Cell Biology, there are further accuracy issues with Thomson data. For example, to identify authors they use only initials for first and middle names. As they pool papers from all fields, this is a more severe problem than one might first guess. Thomson reports that Kit Baum (known to Thomson as CF Baum) has publications in the Fordham Law Review (on nuclear waste) and the Sociology of Education (on group leadership).
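The kind of collision this produces is easy to mimic. A small sketch of keying authors by surname plus initials (the names, other than the pattern they illustrate, are invented):

```python
from collections import defaultdict

# Illustration of why keying authors by surname plus initials merges
# distinct people.  All names below are invented for illustration.
authors = [
    ("Christopher F. Baum", "economics"),
    ("Carol F. Baum", "law"),
    ("Charles F. Baum", "sociology"),
]

def initials_key(name):
    """Reduce a full name to 'Surname, Initials', as Thomson does."""
    parts = name.split()
    surname = parts[-1]
    initials = "".join(p[0] for p in parts[:-1])
    return f"{surname}, {initials}"

merged = defaultdict(list)
for name, field in authors:
    merged[initials_key(name)].append(field)

# All three distinct people collapse into the single key "Baum, CF",
# so one "author" appears to publish across three unrelated fields.
print(dict(merged))
```

Pooling all academic fields makes such collisions far more likely than they would be within economics alone.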

A further issue is Thomson’s coverage; EconLit lists some 1,240 journals in our field, while the last time I checked Thomson covered but a fraction of these. I don’t have recent data on their coverage, but in total Thomson covers 8,700 journals across all academic fields, so it seems doubtful that Thomson has substantially expanded its economics coverage.

A further problem plaguing all citation analysis is simply extracting citation data with software. After all, citations are written for people, not machines. I haven’t seen data for Thomson on this (one wonders if it is public), but I do know that CitEc has faced a very real challenge here.
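A sketch of why machine extraction is hard: the same (invented) article cited in three slightly different hand-written styles defeats a naive pattern. The reference strings and the regular expression are my own illustration, not anyone's actual extraction code:

```python
import re

# Three plausible hand-written citations of the same invented article.
refs = [
    "Jones, S. (2005). On widgets. Journal of Widgetry, 12(3), 45-67.",
    "S. Jones, 'On Widgets', J. Widgetry 12 (2005) 45-67.",
    "Jones S. On widgets. Journal of Widgetry 2005;12:45-67.",
]

# A naive pattern expecting "Author (Year). Title. ..." -- one of many
# citation styles in actual use.
pattern = re.compile(r"^(?P<author>[^(]+)\((?P<year>\d{4})\)\.\s*(?P<title>[^.]+)\.")

for r in refs:
    print(bool(pattern.match(r)), r)
# Only the first style matches; each of the others needs its own rule,
# and real bibliographies add typos, abbreviations, and line breaks.
```

Services like CitEc must cope with all of these variants at once, which is why hand corrections through RePEc Author Services matter.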

There would seem to be several solutions to these problems. First, all of us should treat impact factors and citation data with considerable caution. Basing journal rankings, tenure, promotion, and raises on uncritical acceptance of this data is a poor idea. In the extreme, one could imagine legal action in a tenure case.

Second, as the authors of the Journal of Cell Biology argue, this data should be public, just as research findings should be. One initiative here is a Petition for OA [open access] to bibliographic data. My understanding is that through a “RePEc service” like EconPapers or IDEAS, raw CitEc data can be accessed by the public. Further, CitEc works with RePEc Author Services to correct citations. Here’s one more reason to join those 15,000 who have registered with it!

Third, we should investigate putting unique identifiers into each reference so that software can easily read them. That is, besides listing the journal, its volume, and so on, a reference would also include a unique identifier for the cited paper. DOIs are one possibility, but it is prohibitively expensive to get a license to dispense DOIs. However, “RePEc handles,” which identify papers in RePEc, are permanent and also cover working papers, so we might start including them in each reference. This highlights a further issue: authors have little incentive to add identifiers to their citations, as doing so mainly aids others. Perhaps one step in this direction would be for sites like IDEAS, which provide references for papers in formats like BibTeX or EndNote, to include the RePEc handle along with the current author, title, journal, and so on.
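As a sketch of what this could look like, here is a BibTeX entry carrying a RePEc handle in an extra field. The entry details and the handle are invented for illustration; since BibTeX silently ignores fields it does not recognize, the addition would be harmless to existing workflows:

```bibtex
@article{jones2005widgets,
  author  = {Jones, Sam},
  title   = {On Widgets},
  journal = {Journal of Widgetry},
  year    = {2005},
  volume  = {12},
  pages   = {45--67},
  repec   = {RePEc:aaa:jrnl:v:12:y:2005:p:45-67}
}
```

Software could then resolve the citation from the `repec` field alone, with no need to parse the author, title, and journal strings at all.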

Further Thoughts on “New Peer Review Systems”

November 5, 2007

Several thoughts on various points raised in New Peer Review Systems and the comments that followed.

  • In a sense, the lag in the review process might be optimal. A publication of most any sort is valuable to the author, and one in a leading journal of course has a very substantial return. Journals thus have good reason to deter papers that aren’t at all appropriate; this was pointed out by Ofer Azar, “The Slowdown in First-Response Times of Economics Journals: Can It Be Beneficial?,” Economic Inquiry, 2007, 45 (1). The constraint here would seem to be editors’ and referees’ time. In economics, the most common cost that journals impose on authors is a lengthy review process. I’d hazard a guess that bepress gets around this by ranking papers into different tiers; they don’t have to deter less-than-stellar papers, as those will likely find a home there. This is combined with their system in which authors who submit agree to review two papers quickly (a nice example of a virtuous cycle). Another way to speed up the referee process is a system where any reader can submit comments on a paper. But, as Christian points out, this doesn’t seem to attract many comments. It turns out I’ve looked a bit at this and found 5 journals that have tried a reader rating system; none has attracted enough comments to make it fly. From here, one option is something that ranks papers after they’ve been out, such as citations. Paul Ginsparg has some thoughts on one approach, as does Hal Varian (now the chief economist at Google). But these might take years to generate sufficient data to render a judgment on a paper. I think many of us want something quicker.

    Another possibility is something like Faculty of 1000 in biology and medicine, where a level of review beyond journals takes place. I very much like the summaries in their sample web pages; you don’t see that in economics. One could imagine it working on top of our working paper culture. But I wonder if some of their success comes from the grant culture in those fields, as this is a fee-based service that appears to pay its reviewers. How might one set up something similar in economics with our working paper culture? Journals would likely see it as preempting their role.

  • I agree with Preston that perhaps the most interesting parts of Economics E-Journal are the open review system (all can read reviews) and the feature that allows authors to publicly respond to referee reports. Both would seem to give referees correct incentives, and I would think that journals could implement them quickly and easily at low cost. Also, while deep thinking and working through a paper you are writing is extremely valuable, so is getting feedback and discussing a paper and the ideas in it. I have a first draft of a paper on these topics (see below), but in writing this blog entry I have developed some new insights. I would add that prompt discussion is something the Internet can aid for those of us without local colleagues in our fields. A very minor point on his post: if you count economics journals by the number in EconLit, there are about 1,240. In short, most any paper should certainly be published!
  • I certainly second Christian’s point about blogs and research. The economics blogs I know of rarely, if ever, discuss serious research. Much more common are discussions of current economic events and policy that members of the public find interesting. I don’t know of one where someone might say, “Say, that paper by Sam Jones in Computational Economics is interesting because the algorithm he used to calculate…” Also, debates conducted through papers take years, given the time to write one and respond. This seems rather silly given today’s technology; after all, journals in their current form came about when information traveled at the speed of a horse or ship. Perhaps a blog is too quick for complete works, but I understand that in law their use is leading to a relative decline in the importance of law journals. One example (the first paper found in a Google search) is Guest Blogger: The Start of the Supreme Court’s 2007-08 Employment Discrimination Docket: Federal Express Corporation v. Holowecki. Yes, it discusses current events, but the topic is one that only specialists would seem to care about. Nothing similar appears in the economics blogs I am aware of.

Where does this leave RePEc? Well, I’m not really sure. It is hard to change norms in a field and I am not sure that RePEc could swing it. But, I do not think that a comment system on individual papers would get that much traction. Somehow you want to get the judgment of peers (for promotion, tenure, and annual raises) in a speedy manner. Those two criteria seem to be at odds with each other.

It turns out I have done some thinking on these issues; at the risk of self-promotion, it can be found in Next Steps in the Information Infrastructure in Economics. Note that this draft was written for a conference of non-economists, so some parts will strike a very obvious chord to an economist’s ear. I also have yet to incorporate some very useful comments that Christian kindly wrote. In a nice mix of blog and papers, further comments on the paper are greatly appreciated.