Technologies Employed to Control Access to or Use of Digital Cultural Collections: Controlled Online Collections
Kristin R. Eschenfelder
This article describes the results of a survey investigating the use of technological protection measure (TPM) tools to control patron access to or use of digital cultural materials made accessible by U.S. archives, libraries and museums. Libraries reported using a broader range of systems than archives or museums including repository software, streaming media servers, digital library software and courseware. In terms of controlling access to collections, most respondents reported using IP range restrictions and network-ID based authorization systems. Some reported restricting access to approved terminals or individual user registration systems. In terms of controlling use of collection items, respondents reported reliance on resolution limits, clips and thumbnails, and visible watermarking. A lower percentage reported use of click-through license agreements. Few institutions reported using new technologies to control access or use such as pop-ups, disabling right click copy and save functionalities, invisible watermarks, viewers or cross-institutional authentication systems.
Technological protection measures or "TPM" refer to computer hardware and software based systems or tools that seek to limit access to a work or use of a work. Many people use the term "DRM" or digital rights management system to refer to this set of tools; however, the term "digital rights management" refers to the much broader set of concerns and practices associated with managing rights from both a licensor and a licensee perspective. In order to avoid confusion, this article employs the narrower term TPM. TPM are defined in international legal documents as tools that prohibit uses not approved by owners of copyrighted works.(2) But while TPM are typically associated with controlling copyrighted works, they can also be employed to control works not protected by copyright. This article investigates TPM use regardless of the copyright status of the underlying works.
Given the new possibilities afforded by technology, to what extent are U.S. cultural institutions using TPM to create controlled online collections, that is, network accessible collections where access to the collection is controlled or use of the materials is controlled? We currently have little knowledge about how U.S. cultural institutions use technologies to control access to collections and use of collection materials.
Few published studies of TPM use at academic or cultural institutions exist. One challenge is the difficulty of distinguishing between the various means of controlling access or use when writing survey questions. In this article we distinguish between control "systems" and control "tools." Systems are branded (often commercial) software packages involving numerous interrelated functionalities, including functionalities to control access to and use of content. They should be distinguished from individual control tools or techniques that might appear across many different types of systems. Past studies have examined both without considering the relationship between them. For example, the Maryland Center for Intellectual Property's study of copyright control system use at universities investigated how universities seek to control use of digital content through a variety of system types, including course management software, digital library systems, streaming media servers, and client-side digital viewers or players (Kelley, Bonner, Lynch, & Park, 2006). Other studies have examined specific tools. For example, Dryden's study of Canadian archives found that display of low resolution materials was the primary means of limiting further uses, while watermarks were employed by 20% of respondents (Dryden, 2008).
Another challenge is making the distinction between access control and use control. As Agnew (2008) notes, controlling access and controlling use are often conflated but they are distinct, as they tend to involve different entities in a rights transaction (see Figure 1); moreover, they likely employ different systems or tools. Access control involves technologies that interact primarily with the user, and they control users' access to a networked system. For example, userID and password authentication systems interact directly with a user by requiring them to enter a username and password to log in to a resource.
Figure 1: Image-DRM Model (from Agnew 2008)
Usage controls tend to involve technologies that manipulate or mediate the resource itself. Usage controls can differ by user, such as time frame for usage that varies based on the payment of a fee, but the usage controls are applied to the resource itself. Further, they only affect users after the user has gained access to the resource. For example, display of low resolution images or streaming audio to deter copying requires manipulation of the content itself. Usage controls are sometimes simple manipulations of content (e.g., creation of a thumbnail) and sometimes require complex applications or scripts that act on content: for example, scripts that limit the number of uses, number of downloads, or hardware options for download (e.g., download only allowed to a specific brand of player), etc. Usage controls also include "moral claims," such as the publication of click-through licenses prior to access, acceptable use policy statements, or Creative Commons licenses. These attach legal, but often unenforceable, restrictions to a piece of content (Agnew 2008). All types of use controls mediate how users can interact with content. For example, an acceptable use statement might deter a user from making a copy without seeking permission, and a watermark might preclude a user from printing out an unmarked copy of a work.
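As an illustration of a script-based usage control of the kind described above, the following minimal Python sketch caps the number of downloads per user and item. It is not drawn from any surveyed institution or product; the class name, method name and in-memory counter are all assumptions made for the example.

```python
from collections import defaultdict


class DownloadLimiter:
    """Hypothetical use control: refuse downloads past a per-user, per-item cap."""

    def __init__(self, max_downloads: int):
        self.max_downloads = max_downloads
        # Maps (user_id, item_id) to the number of downloads granted so far.
        self._counts = defaultdict(int)

    def request(self, user_id: str, item_id: str) -> bool:
        """Return True and record the download if the user is under the cap."""
        key = (user_id, item_id)
        if self._counts[key] >= self.max_downloads:
            return False
        self._counts[key] += 1
        return True
```

Note that the check runs only after the user has already reached the item, which is what distinguishes a use control from an access control in the sense used here.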
Given the possibilities available to control access to online collections and to control how users make use of online collections, to what degree do U.S. cultural institutions employ access or use controls? This article reports the results of a study that was conducted to find what technologies "innovative" U.S. cultural institutions are using to control access to or use of digital cultural collections.
The study employed a paper-based survey to gather information about the technologies used by a purposeful sample of archives, libraries and museums actually engaged in the practice of controlling access to or use of collections. We defined this set of institutions as Controlled Online Collections Institutions or "COC Institutions."
There is no easily identifiable COC institution population. Accordingly, a good deal of time and effort went into developing the survey sample. The goal was to send the survey to organizations most likely to host COC (with a few caveats(3)). We assumed that "innovative" organizations were more likely to be experimenting with COC. Innovative institutions were defined as institutions that met the following criteria:
We sought to send the survey to individuals who had titles similar to the following: manager of digital collections or digital projects, IP manager, rights manager or head curator. We gathered contact information from institutional websites, from conference proceedings, and from cold-calling the institution's management offices. When no contact information was available, or when instructed to do so, we sent the survey to the director of the archive, library or museum.
We excluded all dark archives. We also excluded commercial providers of digital cultural collections such as ARTstor and libraries and archives housed in commercial organizations.
The survey was sent to a total of 343 institutions and 234 were completed and returned, for a 68% response rate. But the total response includes institutions that did not have COC and did not plan to create COC. This paper only describes the subset of "COC institutions" that reported either having COC or planning development of COC in the near future. The findings reported in this article are therefore based on 154 reporting COC institutions, including 53 archives, 60 libraries and 41 museums.(5)
Of the 53 COC archive respondents, 22 respondents were state government archives (41.5%), 18 were college/university archives (34%), 6 were historical society archives (11.3%), four were museum archives (7.5%), 2 were independent archives (3.8%) and one was a federal government archives (1.9%).
Of the 60 COC libraries, the vast majority (49) were academic libraries (81.7%). The next largest group (6) described themselves as "other" types of libraries (10%). Few public (2), state (1), museum (1) or archive libraries (1) fell in the COC group.
Of the 41 COC museums, 16 were art museums (39%), 7 were "other" museum types (17.1%), 6 were natural history/anthropology museums (14.6%), and 3 were history museums (7.3%). Three science/technology centers, two arboretums/botanical gardens, two historic house/sites, one nature center and one general museum also reported.
The survey included a set of questions about systems and a set of questions about tools. The systems question asked "Does your institution employ the protection measure features in the following systems to control access to or use of your controlled online collections?" Because a given institution might have different COC that employ different systems, respondents could choose multiple responses. The tools question asked "Indicate whether your institution employs any of the following technological tools within your controlled online collections." Each tool had a check box next to it. Respondents were instructed to mark the check box to indicate that they employed the tool in any of their COC. Tools were listed in six roughly interrelated sets to ease scanning of choices.
It is important to note that the response rate for the technology questions was lower than for non-technology questions that are not reported in this article (e.g., about motivations for controlling). Some respondents who completed other parts of the survey, indicating that they did control access to or use of collections material, then skipped the entire systems or tools question sets. For this reason, the response rate for each technology question set is reported.
The lower response rate for the technology questions may stem from respondents' lack of technical knowledge to easily answer these question sets. Or, it may be that respondents are not using the systems and tools listed in the survey. The response rates might have been higher if we had been able to send the technology questions directly to the technical staff, or if we had included other types of systems or tools in our survey list. Alternatively, it may be that respondents really are not using any technologies to control access or use.
System Use: Comparison among archives, libraries and museums
Table 1 reports the results of the systems question across archive, library and museum respondents. We define systems as branded (often commercial) software packages involving numerous interrelated functionalities, including functionalities to control access to and use of content. Data show that a greater percentage of libraries report making use of each of the systems to control access and use. All data is rounded to the nearest whole number.
Table 1: Comparison of Systems Use among Archives, Libraries and Museums
Response rates to the systems question varied by institution type with 41 of the 53 COC archives responding, 56 out of 60 COC libraries responding, and only 29 out of the 41 COC museums responding.
Interestingly, 27% of library respondents reported using 3rd-party licensed platforms for their COC. This may stem from respondent error; while the survey instructed respondents not to include collections they licensed from commercial vendors (e.g., ARTstor, Naxos), write-in responses suggest that at least some respondents did include these types of resources. But an alternative explanation is that digital libraries are including their own works in 3rd-party licensed platforms such as ARTstor.
Tool Use: Comparison among archives, libraries and museums
Table 2 compares the reported tool use among the reporting COC archives, libraries and museums. We defined tools as techniques for control that might appear across many different types of systems. Response rates for the tools question were slightly higher than the response rates for the systems question, with 48 of the 53 COC archives responding, 57 of 60 COC libraries responding, and 36 of the 41 COC museums responding. All data is rounded to the nearest whole number. We focus on the tools chosen by at least 25% of respondents.
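The response rates for the two technology question sets can be recomputed from the counts reported in the text. The short Python sketch below does so; the variable and function names are illustrative only and are not part of the survey instrument.

```python
# Counts reported in the text: COC institutions by type, and how many of
# each answered the systems and tools question sets.
COC_TOTALS = {"archives": 53, "libraries": 60, "museums": 41}
SYSTEMS_RESPONSES = {"archives": 41, "libraries": 56, "museums": 29}
TOOLS_RESPONSES = {"archives": 48, "libraries": 57, "museums": 36}


def response_rates(responses, totals):
    """Return the whole-number percentage of each institution type that responded."""
    return {kind: round(100 * responses[kind] / totals[kind]) for kind in totals}


systems_rates = response_rates(SYSTEMS_RESPONSES, COC_TOTALS)
tools_rates = response_rates(TOOLS_RESPONSES, COC_TOTALS)
```

On these counts, the systems question was answered by roughly 77% of archives, 93% of libraries and 71% of museums, and the tools question by roughly 91%, 95% and 88% respectively.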
Table 2: Comparison of Tool Use among Archives, Libraries and Museums
"Other" tools written in by COC respondents included:
Data show that different institution types reported use of different types of systems to control access and use, but all system types were used across institution types to some extent.
Libraries reported employing a wider variety of systems to control access or use than archives or museums. In considering why a greater percentage of library respondents reported use of various systems to control access or use, it is important to recall that most library respondents were academic libraries (81.7%), while the largest group of archive respondents were state government archives (41.5%) and the largest group of museum respondents were art museums (39%). It may be that the position of academic libraries within larger campuses allows them to draw on resources to acquire and experiment with a variety of technological systems that may be unavailable to state government archives or art museums.
An alternative explanation is that the systems listed did not represent those recognized by the archive or museum respondents. For example, we should have included "Collection Management System" as an option instead of relying solely on the choice "Digital Asset Management System." Moreover, it is clear that many respondents felt that username and password authentication systems should be considered a system rather than a tool.
While this study attempted to distinguish between systems and tools, the line we drew distinguishing the two is conceptually unsatisfactory. Systems may contain many different tools, while a given tool may be part of a larger system or may function independently. For example, one respondent described the image viewer contained in their larger collections management system, "Access is controlled by our collections management system and its embedded image viewing software (www.opencollection.org)."
This situation is complicated by the difficulty of knowing whether tools that are parts of larger systems may be optional to use or required. For example, although one keeps documents in an institutional repository (IR), one does not have to use the embargo function within the IR to restrict access. Alternatively, the tool may be a required function that would need significant programming to "turn off." The distinction is less important in the commercial TPM space where a system is an end-to-end implementation, from the licensing of a resource for download through playback. iTunes is a classic example of a system that is an enclosed model for end-to-end delivery of restricted resources from selection to playback and storage. In the digital library space, the classic commercial end-to-end TPM system seldom applies because of the need to integrate COC with existing collections, services and technological infrastructure.
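The optional embargo function mentioned above can be pictured as a simple date check that applies only when an embargo date has been set. The Python sketch below is a hypothetical illustration under that assumption; the function and parameter names are invented here and do not describe any particular repository product.

```python
from datetime import date
from typing import Optional


def is_accessible(embargo_until: Optional[date], today: date) -> bool:
    """An item with no embargo date is open; otherwise it opens on that date."""
    if embargo_until is None:
        return True  # the optional embargo tool was never switched on
    return today >= embargo_until
```

An item deposited without an embargo date is simply open, which mirrors the point that keeping documents in a repository does not oblige an institution to use its restriction features.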
The distinction between systems and tools is important in order to report to what extent and how institutions are restricting access or use. A respondent may know they use a particular system, but not know all the tools available or activated within that system.
As vendors build tools such as viewers and streaming into larger systems, more cultural institutions will acquire more access and use control capabilities and will be faced with the question of to what degree they wish to enact those capabilities.
Access and Use Control Tools
In terms of controlling access to collections, respondents continued to rely on relatively well-known techniques. Authentication and authorization systems and IP range restrictions were commonly reported. High use of Internet Protocol (IP) range restrictions (29% archives, 65% libraries, 22% museums) and network ID-based authorization systems (29% archives, 44% libraries, 24% museums) is not surprising, as these systems are standard network infrastructure in contemporary networked cultural institutions. By limiting access to "authorized users" or in-house patrons, these systems solve many access and use control concerns without application of further technologies (Agnew, 2008). However, these systems are less useful when the majority of users have no affiliation with that institution and therefore no network ID credentials. Restricting access to LAN workstations or other approved terminals was less common, but is still prevalent in archives and libraries.
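To make the idea of an IP range restriction concrete, the following minimal Python sketch tests whether a client address falls inside a list of approved network ranges, using the standard library's ipaddress module. The ranges shown are placeholders chosen for illustration (a reserved documentation block and a private block), and the function name is an assumption; a real deployment would more likely configure this in a web server or proxy.

```python
import ipaddress

# Hypothetical approved ranges; placeholders, not any institution's real networks.
ALLOWED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("10.20.0.0/16"),
]


def is_on_site(client_ip: str) -> bool:
    """Return True if the client address falls inside an approved range."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed addresses are treated as unauthorized
    return any(address in network for network in ALLOWED_RANGES)
```

A check of this kind says nothing about who the user is, only where the request comes from, which is why it serves poorly when most users are unaffiliated with the institution.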
One relatively new technique reported by a sizable percentage of respondents was requiring registration. A substantial percentage of institutions reported requiring users to register for specialized accounts or to ask permission for use (33% archives, 37% libraries, 29% museums reporting). User accounts are very useful if one does not have a user population that shares network ID credentials. Furthermore, institutions might employ registration of user accounts to collect better data about users of the collection, such as institutional affiliation details and purpose of use, than is easily provided by network credentials. Arguably, the information provided by user accounts could allow institutions to more easily contact users about misuse concerns. Institutions might also employ user accounts in order to limit use of the collection to those who meet certain criteria, such as student status or scholarly credentials. For example, if one has a collection of in-copyright materials, it might be useful to be able to claim that all users of the account are using the content for research purposes. Finally, institutions licensing images often create FTP server accounts for patrons who have purchased an image license in order to facilitate downloading that image from the secure FTP server.
One can begin to see uptake of cross-institutional authentication and authorization systems such as Shibboleth in the survey results (8% archives, 19% libraries, 2% museums). The higher percentage uptake among libraries is not surprising given that systems like Shibboleth are primarily aimed at higher educational institutions and given the high numbers of academic libraries among the respondents. Implementation of Shibboleth would allow cultural institutions to set finer-grained access rules than those typically supported by college and university authentication and authorization systems; for example, resource access could be restricted to students and faculty from particular academic departments (Zhu & Eschenfelder, in press).
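The finer-grained rule described above, access limited to students and faculty from particular departments, amounts to a check against attributes asserted by the user's home institution. The Python sketch below illustrates that logic only. It does not use or represent Shibboleth's actual interfaces; the attribute names, rule structure and department values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserAttributes:
    """Hypothetical attributes asserted by a user's home institution."""
    affiliation: str  # e.g. "student", "faculty", "staff"
    department: str   # e.g. "music"


# Hypothetical rule for one collection: who may open it.
COLLECTION_RULE = {
    "affiliations": {"student", "faculty"},
    "departments": {"music", "ethnomusicology"},
}


def may_access(user: UserAttributes, rule: dict = COLLECTION_RULE) -> bool:
    """Grant access only when both affiliation and department match the rule."""
    return (
        user.affiliation in rule["affiliations"]
        and user.department in rule["departments"]
    )
```

The collection host never needs to hold a password for the user; it only evaluates the attributes it receives against its own rule.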
The strong use of lower resolution files and thumbnails or clips is not surprising, as these techniques have been a digitization best practice for some time, both for usability reasons and as a means of controlling reuse (Seegar, 2001; Western States Digital Standards Group, 2003). The finding also agrees with Dryden's finding that Canadian archives relied heavily on low resolution to control usage (Dryden, 2008).
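A resolution limit is, at bottom, a cap on the pixel dimensions of the copy placed online. The Python sketch below computes the dimensions of such an access copy while preserving aspect ratio. The 600-pixel default is an arbitrary assumption for illustration, not a value taken from the survey or the cited best practice guides, and the function name is invented here.

```python
def limited_size(width: int, height: int, max_long_side: int = 600):
    """Scale (width, height) down so the longer side is at most max_long_side.

    Images already within the limit are returned unchanged (no upscaling).
    """
    long_side = max(width, height)
    if long_side <= max_long_side:
        return width, height
    scale = max_long_side / long_side
    # Keep each side at least one pixel so extreme aspect ratios stay valid.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 3000 by 2000 pixel master would yield a 600 by 400 pixel access copy under this default, while the full-resolution master stays offline.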
However, these use percentages are not overwhelming; a large percentage of respondents did not report using resolution and clips to control use. It may be that many respondents only used them to improve download times and server loads, not to control use. Another explanation is that the collections of respondents who did not report these tools did not include materials for which use would normally be controlled through resolutions or clips. Finally, it could be that some respondents are purposefully posting higher resolution materials in response to calls by critics to make higher resolution materials freely available on the Web (Hamma, 2005; Max Planck Institute for the History of Science, Jan 11 2009).
Survey results suggest that a significant portion of archives and libraries employ visible watermarks to control use. Best practice sources caution that use of visible watermarking as a control technology can interfere with legitimate viewing and use of works. Visible watermarks must be used cautiously; as Agnew argues, "the user's experience of the content should not be affected by the presence of a watermark" (Agnew, 2008). The percent of institutions reporting use of forensic, invisible or bit stream watermarking was very low (2% archives, 2% libraries, 9% museums). Forensic watermarks are generally placed in hidden areas of digital files, such as low-level noise areas of an audiovisual file, where they will not impact the user experience but still enable both control of digital file use and tracking of downstream uses on the web. They may uniquely identify the content and sometimes the licensor of that content. These systems may identify unauthorized reusers via automated searches for unauthorized copies of the redistributed watermark code, or they may interfere with downstream uses such as playing, printing or further copying. Forensic watermarks are increasingly used by large content hosting sites, such as YouTube, often for an additional fee to a corporate subscriber (Agnew, 2008).
Only about 20% of institutions reported making use of "moral claim" controls such as click-through end user license agreements, or pop-up copyright warnings or captions. While these percentages are modest, comparison of these results with Dryden's study of Canadian archives suggests that U.S. institutions are using click-through agreements more than Canadian institutions. Dryden found that only 3% of Canadian archives reported using click-through agreements (Dryden, 2008).
The low percentage of institutions reporting use of streaming as a means of controlling content is curious, especially since a slightly greater percentage claimed to use streaming media servers. One explanation is that respondents did not see their use of streaming as a use control mechanism, but rather as a usability method to limit download waiting times. Similarly, respondents might have chosen streaming servers primarily as a well-established solution for the delivery of audio and video content rather than as a control device.
Only a very small number of institutions reported designing user interfaces specifically to deter copying, or using scripting to disable browser right-click or "save as" functionalities. This was somewhat surprising since these techniques are commonly employed by scholarly commercial publishers to make certain uses of online content less convenient (Eschenfelder 2008).
The results of this survey of a purposeful sample of innovative archives, libraries and museums suggest that the majority are not making use of newer technologies to create controlled online collections or digital networked collections where access to the collection is controlled or use of collections material is controlled. Rather, most institutions that have COC make use of technological systems and tools that have been available for some time.
The tools that were most commonly employed include resolution limits, clips, and authentication and authorization systems. The data show a smaller percentage of institutions experimenting with user accounts requiring registration or permission and streaming. Only a very small percentage of institutions reported use of less common tools such as pop-ups, disabling right click copy and save functionalities, invisible watermarks, viewers or cross-institutional authentication systems.
It is important to clarify that different collections within an institution may employ different systems and tools. It is impossible to say if the systems and tools reported by a respondent were limited to one small collection or used across many collections in the responding institution. Further, the results cannot say what types of collections (in terms of subject matter, media type or copyright status) are likely to employ access or use controls.
(1) This study was funded by Institute of Museum and Library Services Laura Bush 21st Century Research Grant RE-04-06-0029-06.
(2) For example the WIPO Handbook on Intellectual Property refers to technological measures: "5.229 No rights in respect of digital uses of works, particularly uses on the Internet, may be applied efficiently without the support of technological measures of protection and rights management information necessary to license and monitor uses." "Technological measures ... are used by authors in connection with the exercise of their rights under this Treaty or the Berne Convention and that restrict acts, in respect of their works, which are not authorized by the authors concerned or permitted by law." (pg. 273)
(3) In addition to the "innovative institution" sample development strategy described above, preliminary interviews also led us to believe that institutions with large audio and video collections might be most likely to experiment with COC; therefore, we purposefully included conferences addressing audio and video issues. Moreover, the end sample included two further classes of institution that were underrepresented in the initial drafts of the sample created using the above methodology: state archives and public libraries. Anecdotal evidence also suggested our sample list was missing archives and public libraries that we knew had COC. Based on this, we added all remaining state archives and the 12 largest public library systems, based on budget, to the sample list.
(4) We examined the programs of the following conferences for presentations on digital collections: ICHIM (05, 07), Museums and the Web (06, 07), WebWise (06, 07), Society of American Archivists Annual Meeting (05, 06), Open Repositories (06, 07), Museum Computer Network (06, 07), Computers and Libraries (06, 07), ARSC Association of Recorded Sound Collections (06, 07), Association of Moving Image Archivists (06, 07), Mid-Atlantic Regional Archives Conference (06, 07), Joint Conference on Digital Libraries (06, 07).
Agnew, G. (2008). Digital rights management: A librarian's guide to technology and practise. Oxford UK: Chandos.
Dryden, J. (2008). Copyright in the real world: Making archival material available on the internet. Unpublished PhD, University of Toronto, Toronto.
Eschenfelder, K.R. (2008). "Every Library's Nightmare? Digital Rights Management and Licensed Scholarly Digital Resources." College and Research Libraries 69(3).
Eschenfelder, K.R. (2009). Controlling Access to and Use of Online Cultural Collections: A Survey of U.S. Archives, Libraries and Museums for IMLS. University of Wisconsin-Madison School of Library and Information Studies: Madison, Wisconsin.
Hamma, K. (2005). "Public domain art in an age of easier mechanical reproducibility." First Monday, 11(11)
Kelley, K. B., Bonner, K. M., Lynch, C. A., & Park, J. (2006). Digital rights management (DRM) and higher education: Opportunities and challenges. In K. Bonner (Ed.), The center for intellectual property handbook (pp. 107-121). New York: Neal-Schuman.
Max Planck Institute for the History of Science. (Jan 5 2009). Scholarly publishing and the issues of cultural heritage, fair use, reproduction fees and copyrights. Berlin: Max Planck Institute.
Seegar, A. (2001). Intellectual property and audiovisual archives and collections. In V. Danielson, E. Cohen & A. Seegar (Eds.), Folk heritage collections in crisis (pp. 32-50). Washington DC: Council on Library and Information Resources.
Western States Digital Standards Group. (2003). Western states digital imaging best practices v. 1. Boulder Colorado: Western States Digital Standards Group.
Zhu, X., & Eschenfelder, K. R. (in press). "The Social Construction of Authorized Users in the Digital Age?" College and Research Libraries.
About the Authors