Oya Y. Rieger
Many cultural and educational institutions are in the process of selecting or developing repositories to support a wide range of digital curation activities, including content management, submission, ingest, archiving, publishing, discovery, access, and preservation. In addition, there is an increasing emphasis on deploying systems that support content re-purposing and delivery of a wide range of web services. This article offers strategies to match specific institutional requirements with repository system features and functionalities. The repository model selection process involves several essential stages including stakeholder analysis, needs assessment, service definition, and identification of use cases and governance-related matters. Also, it is essential to conduct a thorough resource requirements analysis to ensure the sustainability of development, management, marketing, and assessment activities. Equally important is taking into consideration the existing and evolving work culture and practices of repository stakeholders. The ultimate success of a repository implementation is often determined by how well it supports organizational procedures, policies, practices, and collaborations.
Role of Repositories in Digital Curation
Within an extended information lifecycle framework, digital curation involves a series of technical, intellectual, and managerial activities in support of creating, acquiring, appraising, repurposing, accessing, and preserving digitized and born-digital information assets (Figure 1). One of the critical processes in bringing efficiencies to the curatorial workflow is the deployment of a repository system. The goal of this article is to discuss selected key attributes in evaluating the suitability of a repository system for a particular curatorial setting. Although digital curation can be generalized by the common processes it encompasses, the selection criteria for a digital repository in support of curatorial practice should be tailored to the unique characteristics of a given organization, its culture, and its content types.
Figure 1: Lifecycle Management Process
Table 1 captures some of the key motivating factors behind the development of a repository in support of curatorial activities. There is a continuum of use levels and types from enabling access to providing stewardship.
Processes Involved in Selecting a Repository Model
1. Identify Key Stakeholders
According to Wikipedia, a stakeholder is a person or organization that has a legitimate interest in a project or entity. Stakeholders who are involved in a given initiative have an impact on the deliverables or are affected by the results of the initiative. Stakeholders can influence, support, or resist the outcomes of a given initiative. In the case of a repository for curatorial purposes, the stakeholder group would include archivists, technical staff, funders, administrators, institutional leaders, content creators, researchers, scholarly and cultural societies, and users (including both service staff and end-users).
Several recent studies highlight the challenges involved in promoting the use of repositories among targeted users (such as faculty or research staff) both as end-users and content contributors. This may be partially an outcome of the fact that the repository selection and implementation process is often seen as a special project, one that doesn't include representatives of key stakeholders. For example, Markey et al. raise the question of limited involvement from archivists in the design, implementation, and management of repositories. Table 2 highlights some of the key benefits of conducting a stakeholder analysis at the outset of a repository selection process. It is critical to involve, influence, inform, and listen to stakeholders to ensure that the system selected will gain broad acceptance and bring efficiencies to the curatorial process.
2. Conduct a Needs Assessment Analysis
A needs assessment is instrumental in gathering information about various aspects of a new service or a system to ensure that there is a common understanding of needs, opportunities, goals, and impediments. The assessment is a diagnostic process that surveys the existing or anticipated demand, curatorial needs, content characteristics, and technical and organizational infrastructure (Figure 2). The process helps to determine current demand for a repository, specific service needs, management issues, policy development requirements, and content types. Potential problems in recruiting content providers or unique aspects of ingest workflow are likely to be discovered during the systematic needs assessment process. Another benefit of the needs assessment process is building a common agenda and increasing awareness among system designers and developers, managers, funders, and users. Findings of the assessment process will also be instrumental in defining required services and creating use cases.
Figure 2: Factors in Needs Assessment
It is common to omit a systematic needs assessment stage due to its time-consuming nature.1 However, as Stewart points out, avoiding assessment may lead us to the "risk of making poor or uninformed decisions."
Cervone's article on selecting digital library software recommends starting with the question: "What problems are we trying to solve?" Unfortunately, according to his experience in many digital library projects, the answer to this focal question is not always obvious. This approach is especially problematic for curatorial initiatives that have an implicit or explicit goal of changing existing work practices (such as scholarly communication).
3. Identify Resource Requirements
According to the Association of Research Libraries SPEC Kit 292, repository implementers surveyed reported their start-up costs to be in a range of $8,000-$1,800,000 (with a mean of $182,550) and an average ongoing operating cost of $113,500. Costs for staff and vendors represent about 75% of the institutional repository (IR) budget. However, it is not clear how comprehensive these data are, as many institutions either do not keep track of their repository-related expenses or neglect some of the embedded and transparent costs such as those for planning, information gathering, promotion, and policy development. Nevertheless, even these preliminary findings point out the significance of financial commitments and the need to align implementation with existing financial realities.
4. Understand the Existing Human Landscape
As social construction of technology (SCOT) proponents argue, the reasons for acceptance or rejection of a particular technology could be revealed by examining the socio-cultural aspects of work. Institutional settings involve a significant amount of human-to-human communication and interaction. Understanding the organizational infrastructure (culture, policies, governance issues, politics, goals, etc.) is as important as articulating the technical infrastructure, which is composed of systems and applications. For example, a recent study by the Center for Studies in Higher Education concludes that approaches that try to "move" faculty and their deeply embedded value systems directly toward new forms of archival systems are destined to fail [King et al.]. Traditions and work practices of different communities shape their acceptance and use of technologies. There is not yet a tool or metric to facilitate the evaluation of existing institutional policies, disciplinary cultures, and organizational infrastructures. This is partly because initial efforts focused mainly on technical matters and emphasized building interdisciplinary institutional repositories.
Factors in Decision Making
In addition to the information gathered during the assessment stage, there are many other issues in selecting a repository system or a model that should be considered. Table 3 highlights some of these that are appropriate to explore during the decision-making process.
One of the challenges in providing an inclusive methodology for selecting and evaluating repository models is the heterogeneous nature of the content types and user communities. For example, the Digital Image Database Standards Checklist (DIDSC) is specifically designed as a tool to assess visual image repositories for deployment. Han provides another useful tool in support of content management requirements analysis. His matrix provides a structure for comparing organizational, presentation, access, preservation, performance, and cost requirements. The RLG/NARA Audit Checklist for Certifying a Trusted Digital Repository has proven to be a useful instrument in assessing certain preservation characteristics of a given institution.2
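A comparison matrix along these lines can be turned into a simple weighted score. The following is only a minimal sketch, assuming a 0-5 rating per category and institution-chosen weights. The six category names follow the requirement areas listed above, but the weights, the candidate names ("System A", "System B"), and every score are hypothetical values invented for illustration. They are not drawn from Han's article or from any real evaluation.

```python
# Hypothetical illustration: the weights, candidate names, and scores below
# are invented for this sketch and do not come from any real evaluation.

CATEGORIES = [
    "organizational", "presentation", "access",
    "preservation", "performance", "cost",
]


def weighted_score(weights, scores):
    """Return the weighted average of per-category scores."""
    missing = [c for c in CATEGORIES if c not in weights or c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    total_weight = sum(weights[c] for c in CATEGORIES)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[c] * scores[c] for c in CATEGORIES) / total_weight


def rank_candidates(weights, candidates):
    """Sort candidate systems from highest to lowest weighted score."""
    scored = [(name, weighted_score(weights, s)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Assumed priorities: this imaginary institution values preservation most.
weights = {
    "organizational": 2, "presentation": 1, "access": 2,
    "preservation": 3, "performance": 1, "cost": 1,
}

# Assumed ratings on a 0-5 scale for two imaginary candidate systems.
candidates = {
    "System A": {
        "organizational": 3, "presentation": 4, "access": 4,
        "preservation": 2, "performance": 4, "cost": 3,
    },
    "System B": {
        "organizational": 4, "presentation": 2, "access": 3,
        "preservation": 5, "performance": 3, "cost": 2,
    },
}

for name, score in rank_candidates(weights, candidates):
    print(f"{name}: {score:.2f}")
```

The weights are where stakeholder priorities and needs assessment findings would enter. Two institutions rating the same systems identically could still reach different rankings, which is consistent with the argument that selection should reflect local characteristics rather than a universal checklist.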
As web services become ubiquitous in our digital lives, designing flexible and interoperable repositories is essential in order to meet the evolving needs of users, from both end-user and system/service manager perspectives. A crucial repository feature in today's technology landscape is scalability, so that the repository can accommodate expanding volume, content, and service types. Examples of web services include file format migration, social tagging, peer review, plagiarism detection, and citation analysis. There is also a growing number of content-specific tools in support of visual image manipulation, text annotation, and statistical data analysis.3 Another important factor is building a system that enables repurposing of content and services. For example, Cornell University Library recently started working with BookSurge (Amazon) to make its digital collections available via print-on-demand. Current programs, such as the cyberinfrastructure initiative, require that we examine information deposited in repositories through different lenses. Understanding the information chain and the relationships among various forms of related information objects is becoming critical in order to create a usable and effective tool. In their report, Choudhury and Martino state that at Johns Hopkins, they are "promoting the idea that applications should access repositories through an abstract, repository agnostic layer, rather than through custom application to repository integrations."
Selecting a repository system should be a holistic process taking into consideration institutional policies, user characteristics, factors in recruiting content, existing technical infrastructure and skills, and current and future goals. For example, several institutions are using more than one system in support of their curatorial processes, and this continues to pose a problem in streamlining technical operations and providing integrated services and discovery experiences.
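The abstract, repository-agnostic layer that Choudhury and Martino describe, and its relevance to institutions running more than one system, can be pictured with a small sketch. Everything here is an assumption made for illustration: the `Repository` interface, its `deposit` and `retrieve` methods, the `InMemoryRepository` stand-in, and the `copy_object` service are hypothetical names. They do not represent the Johns Hopkins design or the API of any actual repository platform.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class Repository(ABC):
    """Hypothetical minimal interface that applications program against."""

    @abstractmethod
    def deposit(self, object_id: str, content: bytes, metadata: Dict[str, str]) -> None:
        """Store content and its descriptive metadata under object_id."""

    @abstractmethod
    def retrieve(self, object_id: str) -> Tuple[bytes, Dict[str, str]]:
        """Return the content and metadata stored under object_id."""


class InMemoryRepository(Repository):
    """Stand-in backend; a real adapter would wrap a specific repository system."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[bytes, Dict[str, str]]] = {}

    def deposit(self, object_id: str, content: bytes, metadata: Dict[str, str]) -> None:
        # Copy the metadata so later changes by the caller do not leak in.
        self._store[object_id] = (content, dict(metadata))

    def retrieve(self, object_id: str) -> Tuple[bytes, Dict[str, str]]:
        content, metadata = self._store[object_id]
        return content, dict(metadata)


def copy_object(object_id: str, source: Repository, target: Repository) -> None:
    """Example service written once against the interface, not a specific system."""
    content, metadata = source.retrieve(object_id)
    target.deposit(object_id, content, metadata)


# Usage: two backends behind the same interface, served by one piece of code.
archive = InMemoryRepository()
access_copy = InMemoryRepository()
archive.deposit("img-001", b"example bytes", {"title": "Sample image"})
copy_object("img-001", archive, access_copy)
```

The design point is that a service such as `copy_object` depends only on the interface. Under that assumption, adding or replacing a repository system means writing one new adapter rather than revising every application-to-repository integration.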
Repository assessment and selection activities need to be included within a broader context considering technical, organizational, and socio-cultural issues. Equally important is implementing repositories that will enrich the existing information environment through the creation of new services and collaboration models. We do not yet have a good handle on what the future holds regarding repository services and work practices in support of curatorial services. As the curatorial community gains more experience in selecting and deploying repositories, it is crucial that we share both success and failure stories, as we have much to learn from these experiences.
Art Libraries Society of North America. Digital Image Database Standards Checklist: Technical, Functional, Content, & Access Recommendations. <http://www.arlisna.org/organization/com/standards/didsc.pdf>.
Bailey, Charles W., Karen Coombs, Jill Emery, Anne Mitchell, Chris Morris, Spencer Simons, Robert Wright. Institutional Repositories. SPEC Kit 292. Association of Research Libraries, Washington, DC, 2006. <http://www.arl.org/bm~doc/spec292web.pdf>.
Cervone, Frank H. "Some Considerations When Selecting Digital Library Software." OCLC Systems and Services, 2006, Vol.22, Issue.2, pp.107-110.
Chavez, Robert, Gregory Crane, Anne Sauer. "Services Make the Repository." Digital Curation and Trusted Repositories: Seeking Success Workshop. June 2006, Chapel Hill, NC <http://sils.unc.edu/events/2006jcdl/digitalcuration/>.
Choudhury, Sayeed and Jim Martino. A Technology Analysis of Repositories and Services. April 5, 2005. <http://ldp.library.jhu.edu/projects/repository/documents/CNI-writeup.pdf>.
Han, Yan. "Digital Content Management: The Search for a Content Management System." Library Hi Tech. Volume 22, Number 2, 2004, pp. 355-365.
Kaczmarek, Joanne, Patricia Hswe, Janet Eke, Thomas G. Habing. "Using the Audit Checklist for the Certification of a Trusted Digital Repository as a Framework for Evaluating Repository Software Applications: A Progress Report." D-Lib Magazine. December 2006, Volume 12 Number 12. <doi:10.1045/december2006-kaczmarek>.
King, Judson, Diane Harley, Sarah Earl-Novell, Jennifer Arter, Shannon Lawrence, Irene Perciali. Scholarly Communication: Academic Values and Sustainable Models. Center for Studies in Higher Education, University of California: Berkeley, July 27, 2006. <http://cshe.berkeley.edu/publications/publications.php?id=23>.
Markey, Karen, Soo Young Rieh, Beth St. Jean, Jihyun Kim, and Elizabeth Yakel. Census of Institutional Repositories in the United States: MIRACLE Project Research Findings. CLIR Pub 140, February 2007. <http://www.clir.org/pubs/abstract/pub140abst.html>.
Research Libraries Group and OCLC. Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist. February 2007. <http://www.crl.edu/content.asp?l1=13&l2=58&l3=162&l4=91>.
Stewart, Heather. Technology Assessment: Making Sure We Get It Right. Boulder, Educause Center for Applied Research Bulletin, 2002, Volume 2002, Issue 21. <http://www.educause.edu/ir/library/pdf/ERB0221.pdf>.
Copyright © 2007 Oya Y. Rieger