Lawrence M. Rudner
More than 100 organizations obtain the Educational Resources Information Center (ERIC) database and provide access through CD-ROM products, university intranets, regional servers, on-line gateways, library cooperatives, and a few public access web sites. Key advantages of this distributed system are that there are a variety of search engines to choose from and that there are few bottlenecks in searching the database. A key disadvantage in this era of accountability is that it can be difficult to obtain usage statistics. Some organizations do not keep records; others do not want to reveal proprietary market data.
This brief article applies a procedure for estimating ERIC usage despite these difficulties. The article has two goals: 1) to publicly document some aspects of ERIC usage and 2) to present a procedure that may be applicable to other databases.
The Educational Resources Information Center (ERIC) is a national information system designed to provide ready access to an extensive body of education-related literature. Established in 1966, ERIC is a program of the U.S. Department of Education's Institute of Education Sciences (formerly the Office of Educational Research and Improvement) and is administered by the National Library of Education (NLE). At the heart of ERIC is the largest education database in the world, containing more than one million records of journal articles, research reports, curriculum and teaching guides, conference papers, and books.
Federal funds have traditionally paid for the development, but not the dissemination, of the database. Database development has occurred through the acquisition and cataloging activities of a decentralized network of subject-specific clearinghouses. Dissemination, it was felt, could best be handled by the private sector, which could conduct research into user interfaces, mount the databases on high-speed computers, develop search engines, and host on-line searching. While the government could take pride in the investment and value-added services made by the private sector, the model precluded the attainment of accurate usage data. Multiple vendors held their usage data proprietary, so there was no way to aggregate it. With the advent of ERIC on CD-ROM and the expansion of computing capabilities, it became even harder to obtain meaningful numbers.
Fortunately, for the purpose of obtaining a reasonable estimate, many of the web-based servers are linked to the ERIC Document Reproduction Service (EDRS). These links provide on-line access to the full text of more than 86,000 recent documents through the EDRS e*subscribe program, along with a system for ordering print and microfiche copies of the non-journal literature in the ERIC database. Records from EDRS and two volunteer public-access points to the ERIC database supply the long-awaited data needed to make a reasonable estimate.
The estimated total number of searches is obtained from a sampling fraction. The 5,566 referrals from the ERIC Clearinghouse on Information and Technology (ERIC/IT) and the ERIC Clearinghouse on Assessment and Evaluation (ERIC/AE) accounted for 10.02% of all the referrals to EDRS in September 2001 [Dagutis, personal communication]. Other frequent referrers were EBSCO, OVID, Cambridge Scientific Abstracts, and OCLC. During the same time period, the logs for ERIC/IT and ERIC/AE showed 724,000 searches of the ERIC database.
Assuming that the mean search/referral rate across linked sites is equal to the search/referral rate for ERIC/IT and ERIC/AE, the projection from this sample is 724,000/.1002 = 7,225,000 searches per month. The 95% confidence interval for the sampling fraction is $p \pm 1.96\sqrt{p(1-p)/n} = .1002 \pm 1.96\sqrt{(.1002)(.8998)/5566} = .1002 \pm .0079$.
Thus, if we drew different samples, we would expect the sampling fraction to be .1002 +/- .0079, 95% of the time. We would expect the total number of searches to be 7,225,000 +/- 57,000 per month.
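The interval and projection can be reproduced with a few lines of Python. This is a minimal sketch, not the author's own calculation: it assumes the normal-approximation interval for a proportion and takes n = 5,566, the value that reproduces the .0079 half-width. The function and variable names are illustrative. The sketch computes the spread of the total in two ways: scaling the total by the half-width, which corresponds to the +/- 57,000 figure, and dividing the search count by the interval endpoints, which gives a wider range.

```python
import math


def proportion_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


p = 0.1002                  # share of EDRS referrals from ERIC/IT and ERIC/AE
n = 5566                    # assumed n: the count that reproduces the .0079 half-width
sample_searches = 724_000   # searches logged by ERIC/IT and ERIC/AE in the same month

half_width = proportion_half_width(p, n)
estimate = sample_searches / p

# Spread obtained by scaling the total by the half-width.
scaled_spread = estimate * half_width

# Range obtained by dividing by the endpoints of the interval for p.
# A larger fraction implies a smaller total, so the upper endpoint gives the lower bound.
estimate_low = sample_searches / (p + half_width)
estimate_high = sample_searches / (p - half_width)

print(f"half-width of p: {half_width:.4f}")
print(f"projected total: {estimate:,.0f} searches per month")
print(f"scaled spread:   +/- {scaled_spread:,.0f}")
print(f"endpoint range:  {estimate_low:,.0f} to {estimate_high:,.0f}")
```

The endpoint range is asymmetric because the total is inversely proportional to the sampling fraction.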
Projecting from the monthly rate, at least 86 million searches of the ERIC database will be conducted over the next year. This compares quite favorably with PubMed, which has 250 million searches per year from the National Library of Medicine web site [Lindberg]. ERIC is searched more than seven million times per month and more than 230,000 times per day. Looked at another way, this is an average of more than 18 searches per year for each of the nation's 4.7 million school teachers, college of education professors, and college of education students [Market Data Retrieval; USDE, 2001].
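The annual, daily, and per-person figures follow from simple arithmetic on the monthly estimate. The sketch below assumes a 30-day month (the data are from September) and uses illustrative variable names.

```python
monthly_searches = 7_225_000     # projected searches per month
days_in_month = 30               # assumption: September
population = 4_700_000           # teachers, education professors, education students

annual_searches = monthly_searches * 12
daily_searches = monthly_searches / days_in_month
searches_per_person = annual_searches / population

print(f"annual:     {annual_searches:,} searches")
print(f"daily:      {daily_searches:,.0f} searches")
print(f"per person: {searches_per_person:.1f} searches per year")
```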
Basic sampling methodology was used to obtain estimated usage and a confidence interval. This methodology is applicable when there is detailed information from a known fraction of the user base. If the unknown portion of the user base is similar to the known portion, then the resultant population estimate would be the best available estimate. If not, as was the case in this paper, then the estimate should be viewed as either a liberal or a conservative estimate.
The estimate assumes that the mean search/referral rate for all sites is equal to the search/referral rate for the sample. If the search/referral rates for other sites are higher, which is likely for EBSCO, OVID and other university-based providers, then the 10.02% fraction for the sample referral rate overestimates the sampling fraction for the number of searches, and the estimated total number of searches is low. In other words, the 7,225,000 figure is a conservative estimate.
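The direction of this bias can be illustrated numerically. The sketch below is a hypothetical what-if, not data from any provider: `rate_ratio` is an assumed multiplier expressing how many searches per referral the other sites generate relative to ERIC/IT and ERIC/AE. A ratio of 1 reproduces the projection, and any ratio above 1 yields a larger total.

```python
def total_searches(sample_searches, sample_referrals, referral_share, rate_ratio):
    """Total searches if other sites' search/referral rate is rate_ratio times the sample's."""
    total_referrals = sample_referrals / referral_share
    other_referrals = total_referrals - sample_referrals
    sample_rate = sample_searches / sample_referrals   # searches per referral in the sample
    return sample_searches + other_referrals * sample_rate * rate_ratio


baseline = total_searches(724_000, 5566, 0.1002, rate_ratio=1.0)
higher = total_searches(724_000, 5566, 0.1002, rate_ratio=1.5)   # hypothetical ratio

print(f"equal rates:         {baseline:,.0f} searches per month")
print(f"other sites at 1.5x: {higher:,.0f} searches per month")
```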
The estimate also does not include searches conducted using CD-ROMs or searches conducted at sites not linked to EDRS. Searches that result in full-text resources, or in documents with links to full-text resources, are also undercounted. The trend data from EDRS, ERIC/IT, and ERIC/AE consistently show large increases in usage over the past year, with every indication that usage will continue to increase. The estimate, therefore, is very conservative.
The growth of the Internet was a blessing for the ERIC system. ERIC system activities other than archiving and database building became more visible and more valued, especially among practitioners. In recent years, usage has soared for ERIC Clearinghouse products and services such as pathfinders, responses to frequently asked questions, interactive web-based products, question-answering services, selection and identification of top resources, and syntheses. While use of the ERIC database is impressive, use of these other products and services is far more impressive. From April to June 2001, ERIC Clearinghouse web sites averaged more than 70 million hits per month. While the database is ERIC's core, its impressive usage data must be kept in perspective. ERIC, like other information providers, is much more than its database.
[Dagutis] Dagutis, P. (2001). Personal communication between the Director of the ERIC Document Reproduction Service and the author of this paper.
[Lindberg] Lindberg, D.A. (2000). Internet Access to the National Library of Medicine. Effective Clinical Practice, 3(5), 256-260.
[Market Data Retrieval] Market Data Retrieval (2001). Interactive Catalog - College Faculty by discipline. Available online: <http://www.schooldata.com/scripts/wsisa.dll/WService=mdrwsb1/
[USDE, 2001] U.S. Department of Education (2001). National Center for Education Statistics, Education Statistics Quarterly "Early Estimates of Public Elementary/Secondary Education Statistics: School Year 2000-2001." by Lena McDowell. Available online: <http://nces.ed.gov/pubs2001/quarterly/spring/q4_3.html>. (This article was originally published as an Early Estimates report. The universe data are from the NCES Common Core of Data (CCD).)
Copyright © Lawrence M. Rudner