(This Opinion piece presents the opinions of the author. It does not necessarily reflect the views of D-Lib Magazine, its publisher, the Corporation for National Research Initiatives, or its sponsor.)
Libraries and publishers are increasingly using download statistics to measure cost-effectiveness. Proponents of open access have also used download statistics to argue that open access journals are more cost-effective than subscription-based journals. In this article, I argue that these calculations are misleading because they do not consider the age of the articles downloaded. Some implications and recommendations for standards of measurement are discussed.
The Number of Downloaded Articles Considered as Return on Investment
As illustrated in a previous article (Holmström, 2004), libraries seeking some indication of the cost-effectiveness of electronic journals have related annual subscription costs to annual download statistics to arrive at a cost-per-download measure (Note 1). Some journal publishers have used the same measure to determine usage (Note 2). Proponents of open access have used the average number of times articles are downloaded to compare the effectiveness of open access journals with that of subscription-based journals (Note 3). All of these measures implicitly treat downloads as a kind of return on investment (ROI), where the investment is the subscription fee.
However, none of the calculations described in the preceding paragraph takes into account that libraries usually get perpetual access to the journals they subscribed to in a given year. The sum a library invested in 2002, for example, was spent to gain access to articles published in 2002. Because ROI relates an investment to its return over the whole lifetime of that investment, any ROI calculation should relate the 2002 sum to the total number of downloads of the 2002 articles. Current practice instead calculates the ROI of 2003 expenditures by relating the money invested in access to the journals published in 2003 to the total number of downloads in 2003 from the publisher's collection for all years. This makes ROI appear larger than it actually is, because it includes downloads of older articles that the 2003 expenditure did not pay for. At the same time, the current method neglects future downloads of the 2003 articles as they age, and therefore makes ROI appear smaller than it actually is.
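The difference between the two ways of counting can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, and every figure in the example is invented, since download statistics by publication year are exactly what is not yet available.

```python
# All figures below are HYPOTHETICAL and chosen only to illustrate the arithmetic.

def cost_per_download_current(subscription_cost, downloads_in_year):
    """Current practice: one year's cost divided by every download made that
    year, whatever the publication year of the downloaded article."""
    return subscription_cost / downloads_in_year


def cost_per_download_cohort(subscription_cost, cohort_downloads_by_year):
    """Cohort view: one year's cost divided by all downloads, in any year, of
    the articles published in the subscription year."""
    return subscription_cost / sum(cohort_downloads_by_year.values())


cost_2002 = 10_000.0            # hypothetical subscription fee for 2002
all_downloads_2002 = 5_000      # hypothetical: downloads in 2002, all article ages
downloads_of_2002_articles = {  # hypothetical: downloads of 2002 articles, per year
    2002: 3_000,
    2003: 800,
    2004: 400,
}

current = cost_per_download_current(cost_2002, all_downloads_2002)
cohort = cost_per_download_cohort(cost_2002, downloads_of_2002_articles)
print(f"current method: {current:.2f} per download")
print(f"cohort method:  {cohort:.2f} per download")
```

Whether the cohort figure comes out higher or lower than the current one depends on the balance between the older-article downloads that are wrongly included and the future downloads that are wrongly left out.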
The current version of the COUNTER Code of Practice (Note 4) does not yet require publishers to keep download statistics by time period published; therefore, calculating ROI accurately is nearly impossible. Nonetheless, previous research on the age of article readings can provide some guidance for estimating ROI.
The Age of Article Readings
Tenopir and King (2000) reported on a number of surveys performed between 1993 and 1998. They found that the age of articles read by university scientists in a given year was remarkably similar to the results of a study they had performed in 1960. The result of their 2000 research is presented in Table 1.
Table 1. Proportion of Readings by Age of Scholarly Articles by University Scientists
Adapted from table 25 on page 189 in Tenopir & King 2000.
Together with others, the same authors have recently performed two further studies, both of which conclude that the introduction of electronic journals has produced no observable difference in the age distribution of articles read (King et al. 2003; Tenopir et al. 2003) (Note 5).
The fact that articles become obsolete over time is nothing new. For example, libraries have used different kinds of citation analysis in their collection management work to determine weeding procedures. The novelty of this article is the application of article obsolescence to downloads and ROI.
The percentages in Table 1 can be used as a substitute for download statistics from a given publisher. In Table 2, download statistics from the National Electronic Library of Finland (FinELib) are used to estimate ROI on the 1998 subscription to 175 journals from Academic Press (Note 6). Publishers do not yet have many years of back files, and this is also true of the FinELib collection: most of the journals are available only from 1993 onwards (Note 7). The figures in Table 2 are therefore not exact, but they still show how ROI is distributed over several years.
Table 2. Estimated ROI for Academic Press Articles Published in 1998
*Total number of downloads from articles published in 1998 for 1998-2002.
Table 2 shows that, of the 29,242 downloads in 1998, only 17,107 actually derive from articles published in 1998. In addition, the percentages of readings in years 2-5 sum to only 26.2% (see column 4), yet the downloads in those years make up 58.2% of the ROI (see column 6). This is because the number of downloads from the journal collections has increased considerably during the last few years. The numbers also show that the ROI on the 1998 articles is 40,891 downloads rather than 29,242, and this excludes the 15.5% of readings still to come. (It may also be relevant to point out that according to Tenopir and King (2000), readings of older articles are more valuable than readings of newer articles, but that is not taken into consideration in this article, where all downloads are considered to be of equal value.)
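The estimation behind these figures can be sketched as follows. This is a minimal illustration rather than the exact procedure used for Table 2: the function names are hypothetical, and the first-year share of 0.585 is an assumption inferred from the ratio 17,107 / 29,242 quoted above, not a value copied from Table 1. Only the three download counts are taken from the text.

```python
def cohort_downloads(total_downloads_by_year, share_by_age, cohort_year):
    """Estimate, per calendar year, the downloads attributable to articles
    published in cohort_year.

    total_downloads_by_year: {calendar year: downloads from the whole collection}
    share_by_age: share_by_age[0] is the share of readings going to articles in
                  their first year, share_by_age[1] to their second year, etc.
    """
    estimate = {}
    for year, total in sorted(total_downloads_by_year.items()):
        age = year - cohort_year
        if 0 <= age < len(share_by_age):
            estimate[year] = total * share_by_age[age]
    return estimate


def later_year_fraction(first_year_downloads, lifetime_downloads):
    """Fraction of a cohort's downloads that occur after the first year."""
    return (lifetime_downloads - first_year_downloads) / lifetime_downloads


# Figures quoted in the text: 29,242 downloads in 1998, of which 17,107 are
# attributed to 1998 articles, and 40,891 downloads of 1998 articles in 1998-2002.
# ASSUMPTION: the 0.585 first-year share is inferred from 17,107 / 29,242.
first_year = cohort_downloads({1998: 29242}, [0.585], 1998)
print(round(first_year[1998]))

share_after_first_year = later_year_fraction(17107, 40891)
print(f"{share_after_first_year:.1%}")
```

Passing the collection totals for 1999-2002 and the remaining age shares to cohort_downloads would give the per-year estimates; those inputs are not repeated here.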
Implications for Comparison between Open Access and Subscription-based Journals
Most open access publishers began publishing rather recently, so their journal collections have a low average age, whereas most subscription-based publishers have been publishing much longer and the average age of their articles is higher. Any fair comparison should take this factor into consideration. Proponents of open access have neglected to do so and have used the average number of times articles are downloaded to compare the effectiveness of open access journals with that of subscription-based journals. This has led some to conclude that open access journals provide 89 times more downloads per article than subscription-based journals (see Note 3). In a recent presentation, BioMed Central (BMC) quoted estimates by BNP Paribas Equities that the total average number of downloads per new article is 350 from Elsevier's ScienceDirect and more than 2,000 from BMC (Note 8). It is somewhat unclear what "new" means, but because the recency of the articles is taken into account, the BMC comparison is nevertheless the more accurate one.
The lack of download statistics by time period published is an obstacle to measuring ROI reliably, but ROI can still be estimated using data about the average age of articles read. However, the reliability of these estimates is affected by the lack of back files, by changes in the number of journals in the collections, and by similar factors. The estimates of ROI presented in Table 2 are based on aggregated readings and will not be applicable to smaller, more specialized journal collections. In order to measure ROI accurately, we need download statistics by time period published. Results from the 2002 Librarian Survey by COUNTER showed that 64% of the respondents considered statistics of this kind essential or desirable (Note 9), and David Goodman, who is on the Board of Directors of COUNTER, speculates that the next version of COUNTER will require statistics by time period published (Note 10). The arguments put forward in this article strongly support doing so.
The author wishes to thank Philip M. Davis, Life Sciences Bibliographer, Mann Library, Cornell University for providing feedback on a draft of the article.
1. See, for example, <http://www.lib.helsinki.fi/finelib/aineistot/tunnusluvut/>.
3. See, for example, slide 19 of the presentation "Two Roads, One Destination: The Interaction of Self Archiving and Open Access Journal" by David Prosser from SPARC Europe <http://agenda.cern.ch/fullAgenda.php?ida=a035925>. These numbers originate from Peter Suber's newsletter Open Access News (see <http://www.earlham.edu/~peters/fos/2003_08_31_fosblogarchive.html>).
5. They note that this may change with the increased availability of journal back files.
[King et al.] King, Donald W., Carol Tenopir, Carol Hansen Montgomery, Sarah E. Aerni. Patterns of Journal Use by Faculty at Three Diverse Universities. D-Lib Magazine 9(10). 2003 <doi:10.1045/october2003-king>.
[Tenopir and King] Tenopir, Carol and Donald W. King. Towards Electronic Journals. Washington (D.C.), Special Libraries Association, 2000. ISBN 0-87111-507-7.
[Tenopir et al.] Tenopir, Carol, Donald W. King, Peter Boyce, Matt Grayson, Yan Zhang, Mercy Ebuen. Patterns of journal use by scientists through three evolutionary phases. D-Lib Magazine, 9(5). 2003 <doi:10.1045/may2003-king>.
Copyright © 2004 Jonas Holmström