
D-Lib Magazine
December 2004

Volume 10 Number 12

ISSN 1082-9873

Metadata Development in China

Research and Practice

 

Jia Liu
Department of Information Management
Peking University, China
<jialiulj@hotmail.com>


The most common definition of metadata is data about data or, more particularly, structured data about data. Metadata has existed for a long time, but following the dramatic and rapid growth of the Internet, the term began to garner more attention.

It was in the mid-1990s that Chinese researchers and practitioners began to focus on the issue of metadata for both traditional and digital resources, and for the next several years they felt perplexed, and sometimes anxious, about how to create and manage metadata most effectively to enable networked resource sharing. For example, they worried that Dublin Core (DC) records might entirely replace traditional cataloguing records. However, Chinese researchers and practitioners have now reached the point where metadata development and use have matured and become stable throughout the country's institutions.

1. Metadata research in China

Outside the computer science arena, the first paper published in China about metadata was written by Long Xiao, whose paper dealt with the metadata structure of the American Memory project in the US.

According to bibliographic statistics, in 1997 and 1998 combined there were only three computer science papers that primarily focused on the subject of metadata. The number rose to eight articles in 1999 (see Table 1), and these were primarily about metadata related to geographical information systems, data warehouses, and other computer software. In 2002, the number of articles on metadata jumped to 68, and most of these articles dealt with research on metadata related to digital libraries. Table 2 shows the results of a content analysis of the metadata-related papers published in China in 2002, classified according to subject categories and subcategories, including up to three sub-categories for each paper [1].

Table 1. Number of research papers related to metadata in China [2]

Year              1997  1998  1999  2000  2001  2002
Number of papers     1     2     8    11     8    68

 

Table 2. Papers classified according to the subject categories [3]

Subject                                   Number of Papers
General introduction of metadata          17
Metadata formats                           3
MARC and cataloguing                       3
DC related                                10
Metadata models                            5
Interoperability                           2
Multimedia metadata                        1
Administrative metadata                    1
XML/RDF                                    3
Metadata capturing                         2
Metadata applications                     21
Archiving                                  3
Preservation                               2
Computer science related                  14
GIS related                                7
Educational information system related     5

With respect to monographs, so far three have been written in China about metadata. The following is a list of the monographs arranged in order of publication date.

  • Wu, Jianzhong. DC metadata. Shanghai: Shanghai Scientific and Technological Literature Publishing House, 2000.
  • Liu, Jia. Introduction to metadata. Beijing: Huayi Publishing Company, 2001.
  • Zhang, Xiaolin. The research and application of metadata. Beijing: National Library Publishing House, 2002.

As its title suggests, the first monograph focuses on the Dublin Core Metadata Element Set (DCMES). The second deals not only with basic knowledge about metadata and metadata standards in the library world but also with metadata applications in digital libraries. The third monograph is about metadata research and applications.

In addition to the above monographs, some conference proceedings have included papers on the subject of metadata development in China. An example of such proceedings is:

Management Center of the Agenda in the 21st century of China. Research on the metadata standard for the Chinese geographic information. Beijing: Science Publishing House, 1999.

There are also other types of Chinese publications in which the outcomes of research related to metadata have been discussed. For example, in 2000 Dr. Jia Liu completed her Ph.D. dissertation on the subject of metadata. Subsequently, graduate students at the Institute of Library and Information Science, Wuhan University, also chose metadata as the subject of their theses or dissertations. Meanwhile, a number of electronic documents on the topic of metadata have been published on the Internet. Most of the latter are to be found on the websites of the Chinese Network of DC Metadata at the Shanghai Library and the Institute of the Digital Library at Peking University, as well as on other academic websites. As is the case in other countries throughout the world, the Internet has become an increasingly important medium used by Chinese researchers and professionals to publicize their achievements with regard to metadata-related research.

With respect to the content of papers about metadata research published in China, formerly the papers were primarily about basic metadata theory and work being done outside China. More recently, papers have focused on metadata implementations, challenges faced by implementers, and practical use cases within China. In other words, papers on metadata research in China have become more practical and less theoretical.

2. Metadata practice in China

The term metadata has become a familiar one in China. Metadata is widely used in library science, computer science, meteorology, geology, electronics, government and other domains for scientific, industrial and commercial purposes. For instance, the website of the Climate Data Center Online, which is supported by the National Meteorological Center of China, provides not only a metadata-based search service but also a brief explanation of the metadata it uses. In the area of meteorology, metadata consists of information used to describe the data sets of meteorological materials. Professionals working in other disciplines use metadata to describe the attributes of different types of content. Each organization decides what kind of metadata to use on the basis of its own needs and the forms of content held in its collections.

2.1 Projects with metadata components

In China there are no independent projects that focus solely and completely on metadata. Rather, metadata is one part of a project implementation within a particular field.

Besides the use of metadata (such as catalogue records) in traditional libraries, digital libraries are at this time one of the application areas in mainland China where the focus on metadata is most intense. In the Internet era, digital libraries represent completely new information infrastructures and knowledge environments. Integrating and utilizing the newest computer and communication technologies and digital content, the digital library builds huge extendable and interoperable collections. In order to manage the massive scale of these digital collections effectively, establishing a metadata model and application profile has become a fundamental part of any digital library project.

The China Pilot Digital Library Project (CPDLP) was both the first digital library project in China and a milestone for metadata practice there. It was a national scientific and technological project approved by the National Committee of Planning of China in 1997, and the project ran from July 1997 until December 2000 [4]. The National Library of China, Shanghai Library, Shenzhen Library, Zhongshan Library, Nanjing Library and Liaoning Province Library were involved in the project. Patterned after the Digital Library Initiative in the United States, the project's leaders focused on technology development as well as digitizing resources.

In April 1998 the project task force proposed a draft metadata profile to define a core (minimum) set of metadata elements and provide refinement rules and markup guidelines. The task force suggested that the minimum metadata element set adopted should be DCMES. HTML 4.0 and XML/RDF were suggested for markup. A two-layer metadata application model was recommended in which the first layer would be DCMES and the second layer would be MARC, TEI Header or another rich metadata format. The DCMES layer would not have to be created separately. It could be generated dynamically from the second layer using a mapping or bridge mechanism. The draft metadata profile called for the implementation of the two-layer model in the resources development phase of CPDLP [5].
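The idea of deriving the DCMES layer from a richer record through a mapping can be illustrated with a minimal sketch. The crosswalk table, the record layout (a list of tag and subfield pairs) and the sample subject headings below are assumptions made for illustration only; they are not the mapping actually used by the CPDLP.

```python
from collections import defaultdict

# Illustrative crosswalk: (MARC tag, subfield code) -> DCMES element.
# This table is an assumption for demonstration, not the CPDLP mapping.
MARC_TO_DC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("650", "a"): "subject",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("020", "a"): "identifier",
    ("041", "a"): "language",
}


def marc_to_dc(marc_record):
    """Derive a DCMES view from a rich, MARC-like second-layer record.

    marc_record is a list of (tag, {subfield_code: value}) tuples.
    Returns a dict mapping DC element names to lists of values.
    Fields without an entry in the crosswalk are simply skipped.
    """
    dc = defaultdict(list)
    for tag, subfields in marc_record:
        for code, value in subfields.items():
            element = MARC_TO_DC.get((tag, code))
            if element is not None:
                dc[element].append(value)
    return dict(dc)


# Hypothetical second-layer record (sample data only).
record = [
    ("245", {"a": "Introduction to metadata"}),
    ("100", {"a": "Liu, Jia"}),
    ("260", {"b": "Huayi Publishing Company", "c": "2001"}),
    ("650", {"a": "Metadata"}),
    ("650", {"a": "Digital libraries"}),
]

dc_view = marc_to_dc(record)
```

Because the DCMES view is computed on demand, only the rich record needs to be catalogued and maintained, which is the practical appeal of a two-layer model of this kind.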

The metadata profile used by the CPDLP had a significant effect on subsequent metadata research and practice in China. Following CPDLP a number of other digital library projects have been initiated in the country, such as the Peking University Rare Book Digital Library, Tsinghua University Architecture Digital Library, Shanghai Digital Library, and many others. The metadata application profile of each of these has been established on the basis of DCMES.

2.2 National or domain-specific metadata standards and specifications

Establishment of national or domain-specific metadata standards reflects the other aspect of the development of metadata practice in China. Following are descriptions of some of these standards:

  • Standard for the Metadata of Information Sharing for Sustainable Development of China: This standard is contained in the Set of the Standards for Information Sharing for Sustainable Development of China, completed under the Ninth and Tenth Five-Year Plans of China. The standard was delivered by the implementers of the pilot project within those plans, Information Sharing for Sustainable Development (97-925). The standard stipulates the content of the metadata, including the identification, content, quality, and status, as well as any other related characteristics, of the data needed for sustainable development. The standard mandates that the metadata structure for sustainable development data be divided into three layers: metadata section, metadata entity and metadata element. There are two levels of metadata. The metadata at the first level includes the minimum metadata entities and elements necessary for identifying a data set (data set, series of data sets, element and characteristic) uniquely. The metadata at the second level includes the core metadata entities and elements necessary for identifying a data set file (data set, series of data sets, element and characteristic) completely [6].
  • Specification for Learning Object Metadata (CELTS-3): On April 20, 2001, the China E-Learning Technology Standardization Committee released the Specification for Learning Object Metadata (SLOM). It prescribes a conceptual data model used to define the metadata structure of a learning object. In this specification the learning objects include both digital and non-digital objects used for learning, education and training. The characteristics of the metadata for the learning object are divided into several categories, including general information, technical information, educational information, classification information, etc. The conceptual data model supports various languages and defines the data elements of which the learning object metadata is composed [7]. The SLOM is in accordance with the relevant international standard. About a year later, in July 2002, the Metadata Specification for the Teaching Resources for Primary Education (MSTRPE) was developed, which put forward a group of basic metadata elements for the teaching resources used in primary and middle schools [8]. SLOM was one of the most important references when the MSTRPE was created.
  • Standard for the National Fundamental Geographic Information System (NFGIS) Metadata: This standard was released in 2003 and is maintained by the National Geomatics Center of China, which took as references ISO 15046-15 (Geographic Information: Metadata) (CD 2.0) and the Content Standard for the Digital Geospatial Metadata (version 2.0) released by Federal Geographic Data Committee of the United States. The standard prescribes the content of the NFGIS metadata, which includes identification, content, quality, and status, as well as other characteristics of the NFGIS data. The standard can be used for complete description of the NFGIS data set, cataloguing of the data set and network service for information interchange [9].
  • Metadata specifications within the project Establishment of Standards and Specifications for the Chinese Digital Library (ESSCDL): The Institute of Scientific & Technical Information of China, the Library of Chinese Academy of Sciences and the National Library of China initiated the ESSCDL project in October 2002. The project ended in September 2004. Eighteen large and important Chinese libraries and institutions were engaged in the project. The main focuses of the project included: construction and service of the digital resources in the China Digital Library (CDL) system; establishment of the development strategies and the standards and specifications framework of the CDL; formulation of the CDL core standard and specification system; and setup of the open development and application mechanism for the standards and specifications of the CDL. This major project in the field of digital libraries in China encompassed ten sub-projects, four of which were directly related to metadata specifications. They were:
    1. Metadata Specification for the Basic Digital Object
    2. Metadata Specification for the Specific Digital Object
    3. Metadata Specification for Description of the Resource Set
    4. Open Register System of Metadata Standards and Specifications [10]
    Within the sub-projects a significant number of metadata standards and specifications have been created that are necessary for the successful development of the CDL.
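The three-layer structure (section, entity, element) and the two metadata levels described above for the sustainable development metadata standard can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the sample section, entity and element names, and the rule that level 2 includes everything required at level 1 are hypothetical and are not taken from the standard itself.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    level: int  # 1 = minimum (unique identification), 2 = core (complete)


@dataclass
class Entity:
    name: str
    elements: list = field(default_factory=list)


@dataclass
class Section:
    name: str
    entities: list = field(default_factory=list)


def required_paths(sections, level):
    """Collect 'section.entity.element' paths required at the given level.

    Assumption: a level-2 record must also carry every level-1 element.
    """
    paths = set()
    for section in sections:
        for entity in section.entities:
            for element in entity.elements:
                if element.level <= level:
                    paths.add(f"{section.name}.{entity.name}.{element.name}")
    return paths


def missing_paths(sections, record, level):
    """Return the sorted required paths that the record does not supply."""
    return sorted(required_paths(sections, level) - set(record))


# Hypothetical schema; the names are placeholders, not the standard's own.
schema = [
    Section("identification", [
        Entity("citation", [Element("title", 1), Element("abstract", 2)]),
    ]),
    Section("quality", [
        Entity("lineage", [Element("source", 2)]),
    ]),
]

record = {"identification.citation.title": "Sample data set"}

missing_level_1 = missing_paths(schema, record, 1)
missing_level_2 = missing_paths(schema, record, 2)
```

In this sketch a record that supplies only the minimum elements passes the level-1 check but is reported as incomplete at level 2.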

Additionally, the Institute of Digital Library at Peking University released the Framework of the Chinese Metadata Standards in January 2001. On the basis of analysis and research on a variety of existing metadata standards, the developer of the Framework summarized a set of specifications and principles for creating the Chinese metadata standard [11]. The Framework has become a very important reference for current and future digital library projects.

3. Comparison of metadata activities in China with those in European countries

In China, the emphasis has been on following international models in nearly every metadata-related area. As the concept of metadata flourished along with the growth of the Internet, China entered into the age of reform and opening up. Thus the metadata activities in China and European countries are now quite similar. For instance, in both China and Europe the understanding of metadata principles, the process of establishing metadata models and the implementation of metadata are much the same. In addition, an increasing number of application profiles in both China and European countries are created on the basis of a common metadata scheme, and in most cases that scheme includes DCMES. At the same time, some metadata elements suitable to the local context can also be added to the application profiles.
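An application profile of the kind just described, DCMES plus a few locally defined elements, can be sketched as follows. The fifteen DCMES element names are the standard ones; the local element names (`seal`, `edition_note`), the helper functions and the sample record are hypothetical and serve only to illustrate the idea.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DCMES_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def make_profile(local_elements):
    """Build an application profile: DCMES plus local extensions.

    Raises ValueError if a local element reuses a DCMES element name.
    """
    local = frozenset(local_elements)
    clash = DCMES_ELEMENTS & local
    if clash:
        raise ValueError(f"local elements clash with DCMES: {sorted(clash)}")
    return DCMES_ELEMENTS | local


def unknown_elements(profile, record):
    """Return the sorted element names in record that the profile lacks."""
    return sorted(set(record) - profile)


# Hypothetical profile for a rare book collection.
rare_book_profile = make_profile({"seal", "edition_note"})

# Hypothetical record; "binding" is deliberately not in the profile.
record = {
    "title": "Sample rare book",
    "creator": "Unknown",
    "seal": "Collector's seal on first leaf",
    "binding": "Thread-bound",
}

unknown = unknown_elements(rare_book_profile, record)
```

Since every such profile shares the DCMES core, records from different institutions remain comparable on those elements even when their local extensions differ.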

Though the metadata activities in China and Europe are very similar, there are certainly some differences, such as the following:

  • Cooperative range. The difference in cooperative range for China results from simple geography. In China cooperative projects take place mainly among mainland China, Hong Kong, Macau and Taiwan. Meanwhile, European countries have a long tradition of cooperation not only among themselves but also with the United States, Australia and other non-European countries. As time has gone on, the European tradition of cooperation has been extended from the western world to the eastern world so that many more resources might be globally shared. Such cooperation has been demonstrated to be feasible. The Electronic Mathematical Archiving Network Initiative (EMANI), which aims at providing long-term preservation of and access to mathematics literature, is a good example of cooperation among China, European countries and the United States. The application profiles of all the institutions cooperating in the EMANI project are DCMES-based, so that the profiles can be harmonized and made interoperable with each other and resources can be shared successfully.
  • Mechanisms. Firstly, there is the difference in administrative mechanisms. At this time in China, the administrative mechanism is a centralized one, and that means standardization is easier, in a sense, than with decentralized administration. After investigation and negotiation, different Chinese institutions seek agreement, and then, based on that agreement, they create a unified standard to be used in institutions throughout the country. The agreed-upon standard is popularised not only because of the power of the standard itself but also because of the centralized government administration mechanism. This mechanism has indeed proved useful for popularising metadata standards and supports the premise that the correct metadata standard was created or chosen from the very beginning. A comparatively different administrative mechanism is implemented in European countries, where the mechanism is more decentralized. The adoption of a particular kind of metadata standard by a specific institution is decided during the practical implementation of a metadata project. The institution is free to choose whichever metadata scheme it feels is best for its purposes. If the institution does not feel that cooperation and interoperability with the collections at other institutions are important to its mission, or if the institution has its own method for dealing with the challenges related to interoperation, it may choose not to comply with a particular standard. Secondly, there is the difference in the personnel mechanisms. In European countries staff are frequently employed just for the duration of a project. Therefore, the personnel mechanism for recruitment is quite flexible. After being employed, the project staff can concentrate entirely on the project at hand. In China, on the other hand, that kind of personnel mechanism has not been popularized. Generally when a project is initiated in China, professionals from different institutions might arrange to work on the project while remaining in their original positions. In other words, seldom is a person recruited just for one project. The mechanism implemented in European countries makes for easier project implementation, which may set the stage for a more successful project completion. Under such a mechanism, more of the professionals who are crucial to the project might be absorbed into the task force, and fresh new ideas might be brought to the project.

As mentioned earlier, there is still no project in China that concentrates entirely on metadata, as some projects in European countries do. For example, the Nordic Metadata Project is a well-known metadata project not only in Scandinavia but also all over the world. The project focused so much on the Dublin Core Metadata Element Set that it was regarded as "the first international Dublin Core project" [12]. Though China has not engaged in that kind of metadata project up till now, China did set up metadata registry services like those set up in the western world, and in 2004 a mirror site of the Dublin Core Metadata Registry was established within the Library of Chinese Academy of Sciences [13].

4. Final words

Several years ago, the author took metadata as the subject for her Ph.D. dissertation. She presumes that to be the first dissertation on the subject of metadata in China. When she defended her dissertation, almost all the professors in attendance asked the author to declare whether or not the DCMES record would completely take the place of the bibliographic record used in China. At a later date, a professor working in another institution also approached the author to emphasize that the MARC format was not replaceable. These questions revealed a concern about the future of metadata use in China and how it would affect metadata as it is used in the library world at present. At the time the author was questioned, she was not quite sure how to answer these concerns, but today her answer would be that although DCMES seems to dominate the metadata world, a variety of metadata schemes for different purposes will co-exist harmoniously for the foreseeable future.

This October the International Conference on Dublin Core and Metadata Applications (DC 2004) was held in Shanghai. This is a further demonstration that Chinese professionals have become increasingly involved in cooperative metadata development projects with their international colleagues.

Notes and References

[1] Because the related Chinese databases were inaccessible from Europe, the author was unable to provide data more recent than 2002. The statistics here were created by Wei Liu on the basis of published papers collected from China National Knowledge Infrastructure (CNKI) and the Chongqing VIP database.

[2] Created based on the following document: Liu, Wei. Chinese metadata review. Online at <http://www.library.sh.cn/libnet/sztsg/fulltext/reports/2003/ChineseMetadataReview.pdf>. (Accessed December 2004).

[3] Ibid.

[4] Zhuang, Jing. The China Pilot Digital Library Project. 1 October 1998. Online at <http://www.nlc.gov.cn/dlib/dlc1.htm>. (Accessed December 2004).

[5] Liu, Jia. Introduction to metadata. Beijing: Huayi Publishing Company, 2001.

[6] National Geomatics Center of China. Standard for the Metadata of the Information Sharing for Sustainable Development of China. Revised edition. Online at <http://www.sdinfo.net.cn/ngcc/sdinfo/document/stdmeta.htm>. (Accessed December 2004).

[7] The China E-Learning Technology Standardization Committee. Specification for Learning Object Metadata: CELTS-3. Released on 20 April 2001. Online at <http://www.celtsc.edu.cn/>. (Accessed December 2004).

[8] Ministry of Education. Center for Developing Teaching Material for Primary Education. Specification for Teaching Resource Metadata for Primary Education: CELTS-40. July 2002. Online at <http://www.celtsc.edu.cn/>. (Accessed December 2004).

[9] National Geomatics Center of China. National Fundamental Geographic Information System (NFGIS) Metadata: first draft. Last modified on 12 May 2003. Online at <http://nfgis.nsdi.gov.cn/nfgis/chinese/bz/mt0.htm>. (Accessed December 2004).

[10] The Institute of Scientific & Technical Information of China. Establishment of Standards and Specifications for the Chinese Digital Library. Online at <http://cdls.nstl.gov.cn/cdls2/w3c/>. (Accessed December 2004).

[11] Xiao, Long et al. Framework of the Chinese Metadata Standards and its applications. Online at <http://www.lib.sjtu.edu.cn/chinese/virtual_reference_desk/metadata_framework.pdf>. (Accessed December 2004).

[12] Hakala, Juha. The Nordic Metadata II project: cataloguing, indexing and retrieval of network documents. Online at <http://www.lib.helsinki.fi/meta/nm2plan.html>. (Accessed December 2004).

[13] Library of Chinese Academy of Sciences. The Dublin Core Metadata Registry.

Copyright © 2004 Jia Liu

doi:10.1045/december2004-liu