


D-Lib Magazine
September/October 2007

Volume 13 Number 9/10

ISSN 1082-9873

Overview - Repositories by the Numbers


Chuck Thomas
Florida Center for Library Automation

Robert H. McDonald
San Diego Supercomputer Center

Cat S. McDowell
University of North Carolina, Greensboro



Scholarly digital repositories continue to be one of the most dynamic and varied components of the emerging digital research library. Little consensus is evident on matters such as whether to deposit content in disciplinary repositories, institutional repositories, or both. Debates about deposit mandates and access to research have spilled into the political arena and have focused much attention on various aspects of digital repositories, including the economics and patterns of scholarly publishing, systems and technology, governmental and organizational policies, access, accountability, research impact, and the motivations of individual researchers. Scholarly digital repositories are a rich area for both empirical research and philosophical debate, and are the central theme of a growing body of published literature.

It is surprising, therefore, that so much is still unknown about the basic nature of digital repositories, including how they differ and what they have in common. As the two Repositories by the Numbers articles in this issue show, digital scholarly repositories are diversifying both in their general nature and in the information they contain. Because there is still much to be discovered or understood at the most basic levels of digital repositories, co-authors Chuck Thomas and Robert H. McDonald and author Cat McDowell offer readers two different but complementary statistical studies of various types of institutional and disciplinary repositories. Reiterating a theme of many of the recent works presented at the 2nd International Conference on Institutional Repositories, Thomas and McDonald apply statistical techniques to explore patterns of scholarly participation by more than 30,000 authors in several categories of repositories. McDowell reports on her ongoing analysis of the growth and development of institutional repositories in American universities and colleges. Together, these articles reveal new aspects of the digital repository landscape, and present data that will be of immense interest to repository planners and sponsors.

While each article is concerned with different aspects and measurements of repositories, two themes are common to both. First, the authors of both articles explain the difficulties involved in gathering and comparing data from a variety of systems and organizations. Just as the definition and purpose of a digital scholarly repository are likely to vary among scholars, disciplines and organizations, obtaining and comparing similar data about each analyzed repository was a major challenge for both studies. In some instances, data from individual repositories were so dissimilar, or introduced so many uncertainties, that those repositories could not be measured as part of the study. Co-authors Thomas and McDonald give a lengthy explanation of these issues, and call for the scholarly digital repository community to begin work on some common reporting standards and guidelines.

The second theme common to both articles is the value of automated harvesting and analysis of data from repositories. Harnad (2006) explained the value of real-time data gathered from repositories; by contrast, the Repositories by the Numbers authors detail the manual tabulation and analysis that were required in an environment of inconsistent and uncertain data gathered from a variety of scholarly digital repositories. Both articles accordingly acknowledge the need for more automated harvesters to gather and analyze various data on the characteristics and contents of repositories.

Each article analyzes different characteristics of repositories. Even so, topics like deposit mandates, research impact, and repository categorization inevitably arise as important considerations when evaluating many different aspects of scholarly digital repositories. By drawing upon the latest and most authoritative literature across the spectrum of digital repository research and debate, and by introducing new and thoughtful insights into the current state of digital repositories, the authors of both Repositories by the Numbers articles give D-Lib's readers useful reports to consider as they construct their respective pieces of the emerging digital repository.


Harnad, S. (2006). Online, continuous metrics-based research assessment.

Copyright © 2007 Chuck Thomas, Robert H. McDonald, and Cat S. McDowell
