Inspired by the Open Archives Initiative, the United Kingdom (UK) Joint Information Systems Committee (JISC) established the FAIR (Focus on Access to Institutional Repositories) programme in 2002. One of the programme's objectives was to "explore the challenges associated with disclosure and sharing [of content], including IPR and the role of institutional repositories". To this end, the JISC funded a one-year project called RoMEO (Rights Metadata for Open archiving). RoMEO, which took place between 2002 and 2003, specifically looked at the self-archiving of academic research papers, and the subsequent disclosure and harvesting of metadata about those papers using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) by OAI Data and Service Providers [Open Archives Initiative, 2002a].
The RoMEO project aimed to develop simple rights metadata by which academics could protect their research papers in an open-access environment and also to develop a means by which OAI Data and Service Providers could protect their open-access metadata. RoMEO proposed to show how such rights solutions might be disclosed and harvested under OAI-PMH.
The RoMEO project was divided into two phases: a data-gathering phase and a development phase. The project team produced a series of six studies based on their work [Gadd, Oppenheim, and Probets, 2003a, 2003b, 2003c, 2003d, 2003e, 2003f]. (In the remainder of this article, these studies will be referred to as RoMEO Studies 1-6.) This article aims to provide an overview of all the activities of the RoMEO project and to report on its key findings and recommendations.
Understanding stakeholder requirements
The goal of the first phase of the RoMEO project was to understand the intellectual property rights (IPR) issues facing the key stakeholders in the self-archiving process. Online questionnaire surveys were performed of academic authors, scholarly journal publishers, and OAI Data Providers (DPs) and Service Providers (SPs). As the response from journal publishers was poor, it was fortuitous that the project had also planned to perform an analysis of journal publishers' author copyright agreements. Such agreements provide a good overview of the contractual relationship between author and publisher, and the analysis proved very enlightening.
The full results of the academic author survey are reported in RoMEO Studies 1, 2 and 3 [Gadd, Oppenheim, and Probets, 2003a, 2003b, 2003c]. In total, 542 academics responded to the survey. They were based in 57 countries and represented a wide variety of subject disciplines. The main aims of the survey were to find out how academics wanted to protect their actual or potential open-access research papers and also how they expected to use such papers. This data would inform the development of appropriate rights metadata for this purpose. Authors were also asked for their views on the copyright ownership of research papers.
In total, 61% of respondents thought that academics owned the copyright in such papers, although 32% admitted that they did not know who owned the copyright. When it came to assigning copyright, however, 90% of respondents reported having done so, which must include many of those that were not sure whether in fact they owned such rights. Fifty per cent of respondents indicated that 71-100% of their papers were multi-authored. This could leave room for disagreement amongst co-authors as to if, when, and where self-archiving took place. An unexpected finding was that 25% of respondents had previously had to clear third-party materials before publishing a paper. Again, this would affect an author's ability to self-archive as the third-party or parties would have to agree not only to publication in a peer-reviewed journal, but also to have their work made freely available on the web.
The questionnaire provided academics with a list of possible permissions (e.g., print, save, excerpt, etc.), restrictions and conditions using terms from the Open Digital Rights Language (ODRL) [ODRL, 2003]. A restriction is a limit on the extent of a permission (e.g., you may print, but only four times), and a condition is a prerequisite that must be met before the permission is granted (e.g., you may only print if you have first paid a fee). The questionnaire asked respondents to specify which they should like to apply to their own open-access works. It then asked which of the same list respondents expected to apply to their use of others' open-access works.
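The distinction between a permission, a restriction and a condition can be illustrated with a small data model. The following Python sketch is an illustration only: the class, its field names and the `fee_paid` label are hypothetical and are not ODRL vocabulary or part of the RoMEO questionnaire.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set


@dataclass(frozen=True)
class Permission:
    """A permitted action with an optional restriction and optional conditions."""

    action: str  # e.g. "print", "save", "excerpt"
    # Restriction: a limit on the extent of the permission (here, a usage cap).
    max_uses: Optional[int] = None
    # Conditions: prerequisites that must be met before the permission applies.
    conditions: FrozenSet[str] = frozenset()

    def allows(self, uses_so_far: int, satisfied: Set[str]) -> bool:
        """Return True if the action may be performed once more."""
        if not self.conditions <= satisfied:
            return False  # at least one prerequisite has not been met
        if self.max_uses is not None and uses_so_far >= self.max_uses:
            return False  # the usage cap has been reached
        return True


# "You may print, but only four times, and only if you have first paid a fee."
print_permission = Permission(
    action="print",
    max_uses=4,
    conditions=frozenset({"fee_paid"}),  # hypothetical condition label
)
```

Calling `print_permission.allows(0, {"fee_paid"})` checks both the condition and the restriction before granting the action.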
The majority of respondents (60% or more) were happy for others to display, print, save, excerpt from, and give away their research papers as long as the respondents were attributed as the authors, and that all copies were exact (verbatim) copies of the original work. Most respondents wanted to prohibit sales of their works and 55% wanted to limit usage of their works to certain purposes, e.g., educational or non-commercial. A comparison between these usage limits and those provided by UK copyright law and many electronic journal licences showed that the academics' conditions were far more liberal.
Interestingly, the subsequent comparison between how academics-as-authors wished to protect their open-access research papers, and how academics-as-users expected to use such research papers, showed that academics did not expect to make use of all the permissions they were prepared to grant others.
The academic author survey showed that many academics feared either breaking existing publisher agreements, or not being published, if they self-archived their research papers. Obviously, there would be little benefit in developing a means of protecting open-access works through rights metadata if academics were too anxious to self-archive those papers in the first place. The analysis of 80 scholarly journal publisher copyright agreements (CAs) aimed to see if such fears were founded.
The full results of this analysis have been written up in RoMEO Studies 4 [Gadd, Oppenheim, and Probets, 2003d]. The analysis showed that 90% of publishers did ask for copyright transfer, with 6% asking for exclusive licences, and 4% asking for non-exclusive licences. However, we found that exclusive licences could be just as restrictive (in terms of the rights retained by the author) as copyright transfer agreements. The move towards exclusive licences may be an increasing trend as publishers respond to authors' desires to retain copyright [ALPSP, 1999]. Indeed, the UK Association of Learned and Professional Society Publishers (ALPSP) has developed a model exclusive licence that it has made available to its members.
Despite the fact that most publishers asked for copyright, authors could still self-archive their works before assigning copyright if they did not fear publishers asking them to warrant that the work had not been previously published: the so-called Ingelfinger rule [Relman, 1981; Angell, 1991]. Our analysis showed that 68.7% of agreements asked for copyright prior to the refereeing process, meaning that only the preprint could be self-archived in this way if the Ingelfinger rule was in place. That would not stop authors employing the Harnad-Oppenheim proposal and self-archiving a list of corrigenda along with the original preprint, whilst not breaking the terms of the agreement [Harnad, 2001]. In 75% of agreements authors were asked to warrant that their work had not been previously published. Interestingly, in only two agreements was it specifically stated that self-archiving constitutes prior publication. However, one could assume that publishers using other CAs that prohibit prior publication viewed self-archiving as prior publication even when this was not explicitly stated in the CA.
Of course, the matter of copyright transfer would be of less concern to the open-access movement if publishers would grant back to authors the right to self-archive. We found that although 28.5% of CAs did not give authors any rights to use their own works, just under 50% allowed authors to self-archive. However, there were no standard terms and conditions under which authors could do so. Some publishers allowed self-archiving of the preprint only, some the postprint only, some asked for preprints to be removed on formal publication, and others specified the type of site on which the self-archiving must take place.
RoMEO recommends that author copyright agreements be revisited by a committee representing the interests of all parties, perhaps with a view to developing a model agreement that all stakeholders could support. In the meantime, however, open access proponents are pleased that at least 50% of journals allow author self-archiving of some kind. RoMEO has compiled a directory of publishers' self-archiving policies. The directory is available at the RoMEO web site [Project RoMEO, 2003].
OAI Data and Service Providers
The aims of the Data and Service Provider (DP and SP) surveys were to ascertain their views on the necessity of metadata protection and the form it might take. In addition, DPs were asked a number of questions about their relationship with depositing authors, in particular whether the relationship was governed by a licence agreement. The full methodology and results have been written up in RoMEO Studies 5 [Gadd, Oppenheim, and Probets, 2003e]. Thirteen SPs and 22 DPs responded.
Only one-quarter of responding DPs had licence agreements with their depositing authors. Half of all respondents either trusted their depositors only to mount documents to which they had rights, or the DPs just provided a general warning statement to depositors on this issue. In some cases, this may have been because the DP had only recently been established and had yet to finalise its policies and procedures. However, it is important that DPs protect their own interests, even in an open-access environment. A DP making available a copyright-infringing work will be responsible (under UK law at least) for secondary infringement.
With regard to the subject of metadata protection, an interesting picture emerged from the DP and SP responses. Initially, 50% of responding DPs thought that metadata records were facts and, as such, there was no copyright in them. In addition, 68% of responding DPs believed that whilst there was database right in their collection of metadata records [Council Directive, 1996], this right was "implicitly waived" within the OAI community. One-third of SPs also thought that metadata was implicitly free under the OAI, and another third had never thought about it. Only a third of SPs checked a DP's policy prior to harvesting the DP's metadata. However, when asked about how they expected their metadata to be used, 90% of DPs selected conditions they should like to apply to their metadata, indicating that there were rights that the DPs wanted to have protected. Similarly, no SP stated that it was happy to have its enhanced metadata harvested unconditionally.
The majority of DPs (68.4%) wanted their metadata to be attributed to their organisation. Fifty-eight per cent wanted the metadata both to continue to be made freely available, and for non-commercial purposes. A surprising 52.6% wanted to specify that their metadata remain unaltered. Of course, were this to be implemented, it would inhibit the function of Service Providers, many of whom need to enhance the metadata (e.g., provide subject indexing or authority control) in order to provide services. Just three main conditions of use were listed by SPs. One condition was 'by prior agreement', not a condition that could be easily automated. Another was attribution of the Provider, and the third condition was that subsequent harvesters disclose the metadata under the same conditions as it was harvested.
Promisingly, 77% of DPs and 77% of SPs thought a standard means of expressing the rights status of metadata would be beneficial.
Rights metadata and metadata rights solution
Having ascertained the needs of academic authors and OAI Data and Service Providers with regard to the protection of their open-access research papers and metadata, the second phase of the project set about considering the best way to express and disclose those rights. Much of this work has been documented in RoMEO Studies 6 [Gadd, Oppenheim, and Probets, 2003f].
Developing rights expressions
To develop a set of rights expressions that met the requirements of academic research papers and metadata, the RoMEO project team had three main options. Firstly, they could have developed their own expression language for the purpose. Secondly, they could have utilised an existing Digital Rights Expression Language (DREL). As Renato Iannella has pointed out [Iannella, 2001], such languages are concerned with the "'digital management of rights' and not the 'management of digital rights'". There are currently two main DREL players: XrML (eXtensible Rights Mark-up Language) [XrML, 2001], and ODRL (Open Digital Rights Language) [ODRL, 2003]. A third option was to turn to the Creative Commons Initiative that was developing a complete rights solution for open-access works [Creative Commons, 2002]. The Creative Commons Initiative provides creators with a series of 11 licences under which creators may make their open-access work available. The licences have three incarnations: a simple "human-readable" version, a "lawyer-readable" licence document, and machine-readable rights metadata.
In the interests of standardisation, the option of developing a RoMEO expression language was quickly dismissed. XrML was also dismissed on the grounds that it was a commercial, patented product with unclear licensing terms, and at the time of RoMEO project development, XrML did not have a Data Dictionary component. Thus, although the grammar of the language was available (how rights expressions would fit together) XrML had no generally agreed upon words or terms to give those expressions meaning. By contrast, ODRL was an open source language with a form of Data Dictionary. That is, the ODRL Dictionary provides a list of terms, but no generally agreed upon meanings for those terms. ODRL was the language used in the academic author survey.
The Creative Commons (CC) solution went beyond the communication of rights through metadata, to their expression through simple human-readable "Commons Deeds" with associated symbols, and detailed "Licence Codes". However, as a result of this three-pronged approach to rights expression, the actual rights metadata records were not very descriptive of the permissions and restrictions granted by the licences. For example, each licence allows the 'licensee' to aggregate the work into a collection of works. However, the rights metadata instances do not specify that "aggregation" is permitted.
A comparison of the ODRL and CC solutions showed that either would meet the basic requirements of academics and Data and Service Providers as found by the RoMEO surveys, although a RoMEO application profile of ODRL would provide a higher level of granularity of expression than the simple CC metadata. As the RoMEO Project progressed, the CC initiative increased in momentum, as did the level of support from open access proponents. The Open Archives Initiative developed a keen interest in adopting the CC solution, as did the Dublin Core Metadata Initiative [Powell et al., 2003]. DSpace, the open-source institutional repository software developed at MIT [Bass, 2002], also expressed its intention to adopt the CC licences [Smith, 2003].
The one key technical problem with adopting the CC solution was that their metadata was expressed in RDF/XML and did not have an associated XML schema, a prerequisite for any metadata disclosed under the OAI-PMH. The RoMEO project therefore proposed a two-fold solution. It would work with the CC licences for expressing rights over research papers, as they looked set to become an emerging standard. However, in addition to approaching the CC to encourage them to develop an XML schema for their RDF, the project would also develop ODRL versions (XML instances) of the CC licences that would conform to the ODRL XML schema. The ODRL versions should provide a more accurate description of the content of the eleven CC licences than the CC's own RDF.
Disclosing rights expressions under the OAI-PMH
The next step was to consider how best to disclose the rights expressions under the OAI-PMH. After discussions with the OAI, the RoMEO team proposed that rights expressions for both individual and collections of metadata records, and individual and collections of resources, be disclosed. However, this work is to continue through the formation of an OAI/RoMEO Technical Committee, OAI-RIGHTS, which hopes to report in Spring 2004. What follows are the proposals that were reached by Project RoMEO by the end of the project timeframe (Summer 2003).
The expression of rights and permissions relating to an individual resource (such as a research paper) would be expressed by the use of a separate rights metadata record. This record would consist of the XML instance (either ODRL or CC/RDF) relating to the chosen CC licence and would be accessible through an OAI-PMH GetRecord request with a specific metadataPrefix parameter, e.g., oai_cc (see Figure 1). This instance could also be referenced by the mandatory Dublin Core metadata record relating to that document. Thus within the <dc:rights> element, an OAI GetRecord verb URL could be included, that, if followed, would retrieve the rights metadata record.
Figure 1: CC/RDF metadata record.
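The request described above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's implementation: the base URL and item identifier are hypothetical, and `oai_cc` is the example prefix suggested in the text, not a registered one. The `verb`, `identifier` and `metadataPrefix` arguments are those defined for GetRecord by the OAI-PMH.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def get_record_url(base_url: str, identifier: str,
                   metadata_prefix: str = "oai_cc") -> str:
    """Build a GetRecord URL that asks for the rights metadata record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


def dc_rights_element(rights_url: str) -> ET.Element:
    """Wrap the GetRecord URL in a dc:rights element for the Dublin Core record."""
    element = ET.Element(f"{{{DC_NS}}}rights")
    element.text = rights_url
    return element


# Hypothetical repository and item identifier, for illustration only.
url = get_record_url("http://repository.example.org/oai",
                     "oai:repository.example.org:1234")
rights_xml = ET.tostring(dc_rights_element(url), encoding="unicode")
```

A harvester that finds this `<dc:rights>` value in the Dublin Core record can follow the URL to retrieve the separate rights metadata record for the same item.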
Rights expressions for individual metadata records would be contained within a record's optional <about> container. Again, either CC's RDF or an ODRL version of the CC licences could be used. Figure 2 shows an ODRL/XML version of the CC's Attribution Licence within the <about> container of a record. Obviously, the RoMEO team would recommend consistency in the choice of either the CC's RDF/XML or ODRL/XML, and the use of one or the other to express rights over both resources and metadata.
Figure 2: ODRL version of CC Attribution Licence
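The placement of a rights expression inside a record's `<about>` container can be sketched as follows. The OAI-PMH namespace and the `record`, `header`, `metadata` and `about` elements are those of the protocol; the rights element itself is a placeholder in a hypothetical namespace (`http://example.org/rights/`) and stands in for a real ODRL/XML or CC RDF/XML instance, which this sketch does not attempt to reproduce. The identifier and datestamp are also invented for illustration.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
EX_NS = "http://example.org/rights/"  # hypothetical placeholder namespace


def build_record(identifier: str, datestamp: str) -> ET.Element:
    """Create a skeletal OAI-PMH record (the Dublin Core payload is omitted)."""
    record = ET.Element(f"{{{OAI_NS}}}record")
    header = ET.SubElement(record, f"{{{OAI_NS}}}header")
    ET.SubElement(header, f"{{{OAI_NS}}}identifier").text = identifier
    ET.SubElement(header, f"{{{OAI_NS}}}datestamp").text = datestamp
    ET.SubElement(record, f"{{{OAI_NS}}}metadata")
    return record


def attach_metadata_rights(record: ET.Element, licence_uri: str) -> ET.Element:
    """Append an <about> container holding a placeholder rights expression."""
    about = ET.SubElement(record, f"{{{OAI_NS}}}about")
    rights = ET.SubElement(about, f"{{{EX_NS}}}rights")
    ET.SubElement(rights, f"{{{EX_NS}}}licence").text = licence_uri
    return about


record = build_record("oai:repository.example.org:1234", "2003-06-01")
attach_metadata_rights(record, "http://creativecommons.org/licenses/by/1.0/")
```

Because `<about>` is appended after `<metadata>`, the rights statement travels with the metadata record itself and applies to that record, not to the resource it describes.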
Whole collections of metadata and resources would be protected by the optional <description> response to the Identify verb. The OAI's XML schema to describe content and policies of repositories in the e-print community [Open Archives Initiative, 2002b] recommends that e-print repository descriptions use a <metadataPolicy> and a <dataPolicy> element, each containing optional <text> and <uri> elements. Thus, <metadataPolicy> could provide a default statement describing the permissions status of metadata, and <dataPolicy> could provide a default statement describing the permissions status of resources (see Figure 3). The default data policy would, in most cases, have to be a simple copyright statement, unless the repository only accepts resources meeting a minimum set of CC licence terms.
Figure 3: Repository-wide rights expressions
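A repository-wide default of this kind could be assembled as in the sketch below. The `<metadataPolicy>`, `<dataPolicy>`, `<text>` and `<uri>` element names follow the description in the text above; the eprints namespace URI and the exact element names are assumptions that should be checked against the OAI schema before use, and the policy wording is invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
# Assumed namespace for the e-print community description schema.
EPRINTS_NS = "http://www.openarchives.org/OAI/1.1/eprints"


def policy_element(tag: str, text: Optional[str] = None,
                   uri: Optional[str] = None) -> ET.Element:
    """Build a policy element with optional <text> and <uri> children."""
    element = ET.Element(f"{{{EPRINTS_NS}}}{tag}")
    if text is not None:
        ET.SubElement(element, f"{{{EPRINTS_NS}}}text").text = text
    if uri is not None:
        ET.SubElement(element, f"{{{EPRINTS_NS}}}uri").text = uri
    return element


def repository_description(metadata_licence_uri: str,
                           data_statement: str) -> ET.Element:
    """Build a <description> for the Identify response with default policies."""
    description = ET.Element(f"{{{OAI_NS}}}description")
    eprints = ET.SubElement(description, f"{{{EPRINTS_NS}}}eprints")
    eprints.append(policy_element(
        "metadataPolicy",
        text="Metadata may be harvested under the linked licence.",
        uri=metadata_licence_uri,
    ))
    # Resources carry different licences, so the default is a plain statement.
    eprints.append(policy_element("dataPolicy", text=data_statement))
    return description


description = repository_description(
    "http://creativecommons.org/licenses/by/1.0/",
    "Copyright in each paper remains with its rights-holder.",
)
```

The data policy here is a simple copyright statement, reflecting the point that a repository can only name a single default licence for resources if every deposit meets a minimum set of licence terms.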
Conclusions and future developments
The RoMEO Project has been a most interesting exercise. The findings, particularly the Directory of journal publishers' self-archiving policies, should reassure academics that self-archiving is a realistic approach. Nevertheless, the project has also highlighted a number of concerns about publishers' copyright agreements, which, if dealt with, could greatly improve an author's rights under the current journal publishing system. The RoMEO project has shown that academics do not require the level of copy protection currently provided by (UK) copyright law and/or publishers' e-journal licences. Therefore, the provision of an alternative means of protecting academics' works through rights metadata, such as that proposed by the project's development phase, should be a welcome one. RoMEO has also demonstrated that whilst most Data and Service Providers are happy to share metadata in the spirit of open-access, they too are interested in protecting some of their interests as rights-holders. It is hoped that the metadata protection solution proposed by the RoMEO project team will protect those rights. In this vein, the RoMEO team is delighted to be working with the OAI in establishing a dedicated Technical Committee, OAI-RIGHTS, to further develop their technical proposals into generic guidelines for disclosing rights expressions under the OAI-PMH. It is hoped that the work of the committee will be available for general comment in the spring of 2004.
The RoMEO project team gratefully acknowledge the Joint Information Systems Committee for funding this project. They are also indebted to Herbert Van de Sompel of the OAI, Renato Iannella of the ODRL, and Aaron Swartz of the CC, for the invaluable advice they have freely given during the developmental phase of the project.
[ALPSP] ALPSP (1999). What authors want: the ALPSP research study on the motivations and concerns of contributors to learn. West Sussex, Association of Learned and Professional Society Publishers.
[Angell] Angell, M., J.P. Kassirer (1991). "The Ingelfinger rule revisited." New England Journal of Medicine 325(Nov.7, 1991): 1371-1373.
[Bass] Bass, M., et al. (2002). DSpace: a sustainable solution for institution digital asset services: spanning the information asset value chain: ingest, manage, preserve, disseminate: internal reference specification: functionality. Cambridge MA, Hewlett Packard Company: 10 <http://dspace.org/technology/functionality.pdf>.
[Council Directive] Council of the European Union Directive No. 96/9/EC of 11 March 1996 on the legal protection of databases. (1996) URL: <http://europa.eu.int/ISPO/infosoc/legreg/docs/969ec.html>.
[Gadd et al., 2003a] Gadd, E., C. Oppenheim and S. Probets (2003a). "RoMEO Studies 1: The impact of copyright ownership on academic author self-archiving." Journal of Documentation 59(3): 243-277.
[Gadd et al., 2003b] Gadd, E., C. Oppenheim and S. Probets (2003b). "RoMEO Studies 2: How academics want to protect their open-access research papers." Journal of Information Science 29(5): [In Press].
[Gadd et al., 2003c] Gadd, E., C. Oppenheim and S. Probets (2003c). "RoMEO Studies 3: How academics expect to use open-access research papers." Journal of Librarianship and Information Science 35(3): [In Press].
[Gadd et al., 2003d] Gadd, E., C. Oppenheim and S. Probets (2003d). "RoMEO Studies 4: The author-publisher bargain: an analysis of journal publisher copyright transfer agreements." Learned Publishing 16(4): [In Press].
[Gadd et al., 2003e] Gadd, E., C. Oppenheim and S. Probets (2003e). "RoMEO Studies 5: The IPR issues facing OAI Data and Service Providers." Submitted to Journal of Information Law and Technology.
[Gadd et al., 2003f] Gadd, E., C. Oppenheim and S. Probets (2003f). "RoMEO Studies 6: Rights metadata for Open Archiving - the RoMEO Solutions." Submitted to Program.
[Harnad] Harnad, S. (2001). "For Whom the Gate Tolls? How and Why to Free the Refereed Research Literature Online Through Author/Institution Self-Archiving, Now." <http://cogprints.soton.ac.uk/documents/disk0/00/00/16/39/index.html>.
[Open Archives Initiative (2002a)] Open Archives Initiative (2002). The Open Archives Initiative Protocol for Metadata Harvesting <http://www.openarchives.org/OAI/openarchivesprotocol.html>.
[Open Archives Initiative (2002b)] Open Archives Initiative (2002). XML Schema to describe content and policies of repositories in the e-print community, OAI Executive. <http://www.openarchives.org/OAI/2.0/guidelines-eprints.htm>.
[Powell] Powell, A., M. Day and P. Cliff (2003). Using simple Dublin Core to describe eprints. Bath, UKOLN. URL: <http://www.rdn.ac.uk/projects/eprints-uk/docs/simpledc-guidelines/>.
[Project RoMEO] Project RoMEO (2003). URL: <http://www.lboro.ac.uk/departments/ls/disresearch/romeo/index.html>.
[Relman] Relman, A. S. (1981). "The Ingelfinger Rule." New England Journal of Medicine 305(Oct 1, 1981): 824-826.
[Smith] Smith, Mackenzie, [DSpace Project Director] (2003) to Elizabeth Gadd. Personal correspondence. 27 May 2003.
Copyright © Elizabeth Gadd, Charles Oppenheim, and Steve Probets