Open Access to the Scientific Journal Literature: Situation 2009

Bo-Christer Björk, Patrik Welling, Mikael Laakso, Peter Majlender, Turid Hedlund, Guðni Guðnason | PLoS ONE | June 23, 2010

The Emerging Phenomenon of Open Access

During the past two decades, scientific journal publishing has undergone a veritable revolution, enabled by the emergence of the World Wide Web. This revolution consists of two interconnected phases. The first, and to date the most visible, is the rapid shift from print-only journals to parallel print and electronic publishing [1]. Ten years ago scholars and scientists did almost all their reading from paper journal issues, obtained as personal copies, circulated inside their organisations, or retrieved from library archives. Today the predominant mode is to download a digital copy and read it either directly off the screen or as a printout. This has been facilitated by publishers' electronic licensing of bundles of journals (“big deals”) and by awareness tools such as emails containing the tables of contents of new issues of favourite journals. Today the average researcher at a university has instant access to a much broader range of journal articles than was ever possible during the print era.

The second phase of this revolution is access to articles without any restrictions posed by subscriptions, commonly referred to as Open Access (OA). Open Access emerged in the early 1990s, triggered by the possibilities offered by the web, but also partly as a reaction to the so-called “serials crisis”, in which subscription prices seemed to be constantly rising faster than the rate of inflation. In the early days most Open Access journals were small-scale operations run by individual scientists or groups of scientists, much in the same spirit as Open Source Software projects. Since 2000 an increasing number of professional Open Access publishers have emerged (e.g. BioMed Central, Public Library of Science, Hindawi, Bentham Open). These publishers typically finance their operations by publication charges levied on the authors of the articles. This reverses the business model: the publishers become dissemination service providers rather than content sellers, and their clients are the authors rather than the readers. Today there are around 5000 peer-reviewed OA journals, well documented in the Directory of Open Access Journals (DOAJ). In addition to journals that are fully Open Access, there are journals that operate via subscriptions as mainstream journals do, but that offer open access to the electronic versions of their articles either after a delay, usually of a year, or selectively for individual articles whose authors have paid an additional charge to “open up” the articles.

Open Access journals provide one solution to the problem of restricted access to the results of publicly funded research. The other is to supplement the dominant subscription-based literature with free copies of the manuscripts, posted by the authors or their institutions on different types of web sites. In the early days the home pages of the authors or their departments were the typical place, and often the only place, to put such copies. Today digital copies are increasingly posted in subject-specific repositories such as the renowned arXiv, which started out focused on physics papers but has since expanded its disciplinary scope, or alternatively in repositories maintained by individual universities to archive and provide access to the output of their faculty. A majority of international publishers actually allow the posting of some version of published articles in such repositories, sometimes after a delay. OA activists often call this latter solution to the access problem the “green route”, as opposed to the “gold route” of direct OA journal publishing. Green copies come in a number of variations, listed here in order of decreasing value to the readers. The most useful ones are direct digital copies or scanned-in versions of the articles as published. Most publishers prefer to allow posting of the authors' manuscripts after acceptance for publication, but before final copy-editing and pagination. The author manuscripts as originally submitted for peer review differ the most from the final published articles. In a few scientific disciplines, such as physics or economics, there are long-standing traditions of circulating such copies widely via preprint servers, or as so-called working papers.

Over the past fifteen years there has been considerable debate about the economics of OA versus subscription-based publishing, as well as about the advantages and disadvantages of gold OA publishing versus green parallel publishing. Proponents have emphasised the direct cost savings that OA can bring to the publishing system and also the positive indirect effects on R&D thanks to increased access. There have also been several studies showing that openly available articles are cited more by peers, which provides a strong incentive for authors to post green copies. Opponents have warned of possible dangers to the peer review process and its level of quality control if publishers are forced to move to OA.

Consequently, a central question many policymakers ask is how common Open Access is today and how fast the share of OA is increasing. What proportion of journal articles are OA, and to what extent do researchers post OA copies in repositories? Accurate answers to such questions would be very valuable for, among others, research funders, university administrators and publishers. The purpose of the study reported in this paper is to provide answers to these types of questions.

Earlier Research

Although some estimates of OA prevalence have been published over the last few years, there is a clear need for rigorously conducted and up-to-date studies. So far the volume of OA has been studied in, for instance, the following ways.

For gold OA publishing it has been possible to establish an overall share of OA journals by comparing the number of OA journals listed in the DOAJ index to the total number of active peer-reviewed scholarly journals listed in Ulrich's Periodicals Directory.

For green OA there are directories (DOAR, ROAR) that list repositories and provide statistics on how many documents these contain.

For particular limited disciplines it is possible to take the content of a few leading journals and check the availability of OA full-text copies using web search robots and manual checking.

Broader studies can be conducted on discipline-specific or global samples of article titles taken from indexing services (ISI, Scopus or PubMed), which are then searched for using popular web search engines.

For larger masses of articles the availability of OA full-text versions can be checked by web crawling robots that are fed with article titles from indexing services.

All these methods suffer from limitations. On average OA journals publish far fewer articles per annum than subscription-based ones, and thus the share of OA articles in the total global article volume is much lower than the share of OA titles. The criteria for inclusion in DOAJ and Ulrich's might also differ, so that the numbers of journals may not be directly comparable. The share of existing OA journals that are listed in DOAJ has also changed over time. Counting the number of documents in repositories may tell a lot about the growth of the repositories, but the counts usually cannot easily distinguish between copies of articles published elsewhere and a wide range of other materials (theses, working papers, research data, teaching material, etc.). The OA figures obtained for a few select narrow disciplines are interesting but do not give the broad picture. Often many journals, even within the disciplines studied, are not included in the sample. This method also works better for green copies than for gold OA. Web robots offer a very cost-effective way of identifying copies but are prone to mistakes of many sorts, and it is very difficult to classify the copies found into types. The most precise and comprehensive method is manual checking of titles obtained from general indexing services. The downside of this method is that the amount of work is considerable and increases in direct proportion to the number of articles in the sample.
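The first limitation, that a share of journal titles overstates the share of articles, can be illustrated with a small calculation. The sketch below is only an illustration: the function name `oa_shares` and all the journal and article counts are hypothetical values chosen for the example, not figures from DOAJ, Ulrich's or this study.

```python
def oa_shares(oa_journals, oa_articles_per_journal,
              sub_journals, sub_articles_per_journal):
    """Return (OA share of journal titles, OA share of articles).

    All inputs are hypothetical illustration values, not measured data.
    """
    oa_articles = oa_journals * oa_articles_per_journal
    sub_articles = sub_journals * sub_articles_per_journal
    title_share = oa_journals / (oa_journals + sub_journals)
    article_share = oa_articles / (oa_articles + sub_articles)
    return title_share, article_share


# Hypothetical example: 20 OA journals publishing 30 articles a year each,
# and 80 subscription journals publishing 100 articles a year each.
title_share, article_share = oa_shares(20, 30, 80, 100)
print(f"OA share of titles:   {title_share:.1%}")
print(f"OA share of articles: {article_share:.1%}")
```

With these made-up inputs the OA share of titles is 20%, while the OA share of articles is only about 7%, because each OA journal contributes fewer articles to the total.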

Aim of This Study

Our objective in this study was to make a rigorous assessment of the overall share of the peer-reviewed article literature that is available as OA, either published directly or made available as copies in different sorts of repositories. Furthermore, the variations in OA availability across scientific disciplines were also of interest, as was the breakdown of the available OA copies into gold and green publishing and, for green copies, by the quality of the copy.