To arrive at relevant and reliable conclusions concerning the usability of a hypermedia educational e-book, developers have to apply a well-defined evaluation procedure as well as a set of clear, concrete and measurable quality criteria. Evaluating an educational tool involves testing not only the user interface but also the didactic method, the instructional materials and the interaction mechanisms, to determine whether they help users reach their learning goals. This article presents a number of evaluation criteria for hypermedia educational e-books and describes how they are embedded into an evaluation procedure. This work is chiefly aimed at helping educational developers evaluate their systems and at providing them with guidance for addressing educational requirements during the design process.
1. Introduction
In recent years, more and more educational e-books have been created, whether by academics trying to keep pace with the advanced requirements of the virtual university or by publishers seeking to meet the increasing demand for educational resources that can be accessed anywhere and anytime and that include multimedia information, hypertext links and powerful search and annotation mechanisms. Developing a useful educational e-book requires considering many things, such as the reading patterns of users, accessibility for different types of users and computer platforms, copyright and legal issues, the development of new business models and so on. Addressing usability is very important since e-books are interactive systems and, consequently, have to be designed with the needs of their users in mind. Evaluating usability involves analyzing whether systems are effective, efficient and safe to use; easy to learn and remember; and have good utility [1].
Any interactive system, e-books included, has to be assessed to determine whether it is really usable as well as useful. Such an evaluation is not only concerned with assessing the user interface but is also aimed at analyzing whether the system can be used efficiently to meet the needs of its users [2], who in the case of educational e-books are learners and teachers. Evaluation provides the opportunity to gather valuable information about design decisions. However, to be successful, the evaluation has to be carefully planned and prepared so that developers collect appropriate and reliable data from which to draw relevant conclusions.
This article presents a number of evaluation criteria for hypermedia educational e-books and describes how they are embedded into an evaluation procedure.
2. Evaluation Criteria for Educational e-Books
In this section, a number of evaluation criteria are described for hypermedia educational e-books that help educational developers in two ways:
Developers must assess the interface quality in order to detect usability problems or misunderstandings that need to be resolved to improve the interaction process. Moreover, evaluation must also focus on the e-book's utility, to analyze whether the e-book can be used in an efficient way to meet the needs of its users. Taking these premises into account, we propose a number of evaluation criteria for hypermedia educational systems [3] that can be applied to e-books. These criteria are based both on earlier work listed in the References section of this article [2, 4, 5, 6, 7, 8, 9] and on the author's experience in the development of educational systems, such as CESAR [10], Now-Graduado [11] or CIPP [12].
2.1 Criteria to evaluate educational usefulness
Even more important than the quality of the user interface is the educational usefulness of a particular educational e-book. Developers of educational materials must conduct tests to determine whether their e-books enable users to reach their learning and teaching goals. The following criteria can be applied to evaluate usefulness.
Richness. Garzotto, Mainetti and Paolini introduced the concept of richness [8], and the evaluation of educational e-books discussed in this article extends the concept in order to assess the richness of an e-book, taking into account parameters such as the following:
Completeness. Measuring for completeness involves determining whether the system has an adequate number of content and interaction mechanisms to cope with the goals of different kinds of users. Some aspects to test when analyzing the system for completeness include:
Motivation. It is also important to assess how students are motivated, not only to use the system but also to learn more about the subject being addressed. Aspects to take into account to improve the system with regard to motivating students are:
Hypertext structure. This criterion is oriented towards the analysis of structural properties such as those proposed by Botafogo et al., Hatzimanikatis et al., and Yamada et al. [5, 6, 7]. It is fairly obvious that properties such as the reachability of nodes or the modularity of the hypertext matter. Other features such as depth, imbalance, tree impurity and sequencing are also important and should be considered, although there is no empirical evidence regarding their influence on system usability. Additional aspects of hypertext worthy of consideration include:
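To make two of the structural properties named in this criterion concrete, the following minimal sketch checks reachability and depth on a small link graph. It is an illustration only: the node names, the function names and the choice of measuring depth as the fewest link traversals from a start node are assumptions made here, and they are not necessarily the metric definitions used in [5, 6, 7].

```python
from collections import deque


def depths_from(links, start):
    """Breadth-first search over the link graph.

    Returns a dict mapping each node reachable from `start` to the
    fewest number of link traversals needed to get there.
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in depth:
                depth[target] = depth[node] + 1
                queue.append(target)
    return depth


def unreachable_nodes(links, start):
    """Nodes that appear in the graph but cannot be reached from `start`."""
    all_nodes = set(links) | {t for targets in links.values() for t in targets}
    return all_nodes - set(depths_from(links, start))


# Hypothetical e-book structure: every node name below is invented.
ebook = {
    "cover": ["toc"],
    "toc": ["chapter1", "chapter2"],
    "chapter1": ["exercise1", "toc"],
    "chapter2": ["toc"],
    "exercise1": [],
    "glossary": ["toc"],  # links out, but nothing links to it
}

depth = depths_from(ebook, "cover")
print("maximum depth:", max(depth.values()))
print("unreachable:", sorted(unreachable_nodes(ebook, "cover")))
```

A node reported as unreachable points to a likely structural defect, while a large maximum depth only flags content that takes many steps to reach; whether that harms usability still has to be judged by evaluators.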
Autonomy. With respect to multimedia components, Ficarra first defined autonomy as the degree of navigation freedom offered to the user [9]. Autonomy can also be extended to include the degree of interaction freedom. Some aspects to analyze with regard to autonomy include:
Competence. Competence is related to the ability to navigate through the system and to reach a particular goal (see Ficarra [9]). Some aspects to consider concerning competence are:
Flexibility. The ease with which the system can be used and maintained is evidence of its flexibility. Parameters for flexibility analysis include:
2.2 Criteria to evaluate user interface usability
As the communication channel through which the user comes into contact with the computer, the user interface has to make possible the performance of tasks by users. Criteria to evaluate the usability of hypermedia educational e-books are described below.
Aesthetic. How the inclusion of multimedia information is harmonized and used to enhance the comprehension of concepts is called its aesthetic. Analyzing the aesthetic takes into account parameters such as:
Consistency. Consistency refers to the extent to which elements that are conceptually similar are treated equally by the application, while those that are different are treated differently (see Garzotto et al. [8]). Consistent educational applications are easier to use and remember. Therefore, users can pay more attention to performing their tasks than to learning how to use the system. Consistency is analyzed with regard to the following:
Self-evidence. Self-evidence determines how easily users can guess the meaning and purpose of things with which they are presented (see Garzotto et al. [8]). Self-evidence is mainly analyzed by checking to see how tangible the system structure and functions are. Some techniques that can be used to increase self-evidence include:
Naturalness of metaphors. It is important to evaluate metaphors used in the e-book to see whether they improve communication with the user (see Ficarra [9]) or, conversely, whether they fail to convey all the features of the domain or communicate the features in different ways, both of which constrain and mislead users. For example, one of the most relevant conclusions of the evaluation of CESAR was that the use of book and story metaphors was a good choice, not only for usability purposes but for helping to socialize children (e.g., teaching them to share books or providing them with the opportunity of being told stories). Some aspects to be considered regarding the naturalness of metaphors are:
Predictability. The extent to which users can anticipate a system outcome may be thought of as its predictability (see Garzotto et al. [8]); that is, predictability is measured by the degree to which users know the kind of result they will get from a specific interaction. Predictability is different from self-evidence: with self-evidence, users can identify the purpose and function of each object they are presented with, yet they cannot be sure what will happen if they perform a particular action. The best way to increase predictability is to perform a task analysis with users in order to understand what they expect from each interaction and how results are supposed to be presented.
3. An Evaluation Framework for Hypermedia Learning e-Books
For evaluation criteria to be really useful, they should be integrated into a procedure that will guide developers during the assessment process. To this end, this article extends the procedure defined in Catenazzi et al. [13] (see Figure 1), as discussed below.
Defining the evaluation objective. The evaluation of hypermedia educational e-books should be addressed in two different ways that emerge as two valid objectives:
These objectives have to be put into more concrete terms in order to be measured, so designers should specify what they mean by educational usefulness and interface usability. That is, they need to determine the specific learning goals the e-book is expected to meet, as well as to assess the intended audience in terms of requirements such as learning styles, disabilities, background, age, software and hardware platform, and so on.
Selecting the evaluation technique. Several evaluation methods for interactive systems have been proposed in the literature, including analytic, expert, empirical and experimental procedures. The decision regarding which method to use depends primarily on such factors as what resources are available or what stage of development has been reached.
Preparing the evaluation. In this step, developers must do the following: decide what data to collect, select the evaluators, establish the tasks evaluators will carry out, and prepare mechanisms to record information on the evaluation process.
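As an illustration of the preparation step, the four decisions listed above could be captured in a small record structure so that nothing is left implicit before the sessions start. This is only a sketch under assumptions: the class names, field names and sample values are hypothetical and do not come from the procedure in the article.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """One recorded note made by an evaluator during a task."""
    evaluator: str
    task: str
    criterion: str
    note: str


@dataclass
class EvaluationPlan:
    """Hypothetical container for the decisions taken while preparing."""
    data_to_collect: list[str]   # what data to collect
    evaluators: list[str]        # who evaluates
    tasks: list[str]             # what the evaluators will do
    log: list[Observation] = field(default_factory=list)  # recording mechanism

    def record(self, evaluator: str, task: str, criterion: str, note: str) -> None:
        """Store an observation, rejecting evaluators or tasks not in the plan."""
        if evaluator not in self.evaluators:
            raise ValueError(f"unknown evaluator: {evaluator}")
        if task not in self.tasks:
            raise ValueError(f"unknown task: {task}")
        self.log.append(Observation(evaluator, task, criterion, note))


# Sample values are invented for illustration.
plan = EvaluationPlan(
    data_to_collect=["task completion time", "criterion ratings"],
    evaluators=["teacher_1", "student_1"],
    tasks=["find a definition", "complete an exercise"],
)
plan.record("student_1", "find a definition", "self-evidence",
            "did not recognize the search icon")
print(len(plan.log))
```

Rejecting unplanned evaluators and tasks at recording time is one possible design choice; it keeps the collected data aligned with what was decided during preparation.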
Conducting the evaluation. During this step, evaluation is carried out in one or more sessions, in a centralized or distributed manner, depending on objectives and available resources.
Elaborating data. The purpose of this step is to transform the data collected into findings and recommendations for improvement, in order to deliver a useful and usable e-book. The elaboration of data is usually based on statistical measures, such as the average and the standard deviation, from which conclusions related to the objectives of the evaluation are drawn.
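The statistical elaboration mentioned in this step can be sketched as follows, assuming that each evaluator rates each criterion on a numeric scale. The 1 to 5 scale, the sample ratings and the function name are assumptions made for illustration; they are not data from any of the evaluations discussed in this article.

```python
from statistics import mean, stdev


def summarize(ratings):
    """Return the mean and sample standard deviation for each criterion.

    `ratings` maps a criterion name to the list of scores given by the
    evaluators. With fewer than two scores the deviation is reported as 0.0.
    """
    summary = {}
    for criterion, scores in ratings.items():
        summary[criterion] = {
            "mean": mean(scores),
            "stdev": stdev(scores) if len(scores) > 1 else 0.0,
        }
    return summary


# Invented ratings on an assumed 1 (poor) to 5 (good) scale.
ratings = {
    "consistency": [4, 5, 4, 3],
    "predictability": [2, 3, 2, 5],
}

for criterion, stats in summarize(ratings).items():
    print(f"{criterion}: mean={stats['mean']:.2f} stdev={stats['stdev']:.2f}")
```

A low mean suggests a criterion that needs attention, while a high standard deviation suggests that evaluators disagree and that the underlying observations deserve a closer look before conclusions are drawn.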
4. Conclusions
Usability is a key concern when developing a hypermedia educational e-book. The e-book has to be analyzed in terms of both its educational usefulness and the usability of its user interface. For this purpose, an evaluation process has to be carried out. Evaluation will provide data to derive relevant findings concerning the e-book's usability. This article describes an evaluation framework for hypermedia educational e-books that can be used to test the user interface for the e-book as well as its educational usefulness. The framework proposes a procedure whereby a number of criteria and parameters are used to assess the e-book (see Table 1).
The set of criteria proposed in this article is also intended to help designers of other educational materials during the analysis and design stages. However, the proposed criteria should be viewed as an incomplete list, since evaluating utility and usability is a complicated and multifaceted undertaking, with many aspects remaining open to discussion.
The author thanks Ignacio Aedo for his cooperation in this work.
References
[1] Preece, J., Rogers, Y. and Sharp, H. Interaction Design: beyond human-computer interaction. John Wiley & Sons, Inc. New York, 2000.
[2] Lee, S. H. Usability testing for Developing Effective Interactive Multimedia Software: Concepts, Dimensions and Procedures. Educational Technology & Society, 2(2), 1999.
[3] Díaz, P., Sicilia, M.A. and Aedo, I. Evaluation of Hypermedia Educational Systems: Criteria and Imperfect Measures. Proc. of the International Conference on Computers in Education, Auckland, 3-6 December, 2002. 621-626.
[4] Mendes, M.E.X., Harrison, R. and Hall, W. Applying Metrics to the Evaluation of Educational Hypermedia Applications. Journal of Universal Computer Science, April 1998.
[5] Botafogo, R.A., Rivlin, E. and Shneiderman, B. Structural Analysis of Hypertexts: Identifying Hierarchies and Useful Metrics. ACM Transactions on Information Systems, 10(2), 1992. 142-180.
[6] Hatzimanikatis, A.E., Tsalidis, C.T. and Christodoulakis, D. Measuring the readability and maintainability of hyperdocuments. Journal of Software Maintenance, Research and Practice, 7, 1995. 77-90.
[7] Yamada, S., Hong, J. and Sugita, S. Development and Evaluation of Hypermedia for Museum Education: Validation of Metrics. ACM Transactions on Computer-Human Interaction, 2(4), 1995. 284-307.
[8] Garzotto, F., Mainetti, L. and Paolini, P. Hypermedia Design, Analysis and Evaluation Issues. Communications of the ACM, 38(8), 1995. 74-86.
[9] Ficarra, F.V.C. Evaluation of multimedia components. Proceedings of the International Conference on Multimedia Computing and Systems, Ottawa, 1997. 557-564.
[10] Díaz, P., Aedo, I., Torra, N., Miranda, P. and Martín, M. Meeting the needs of teachers and students within the CESAR Training System. British Journal of Educational Technology, 29(1), 1998. 35-46.
[11] Aedo, I., Díaz, P., Panetsos, F., Carmona, M., Ortega, S. and Huete, E. A hypermedia Tool for Teaching Primary School Concepts to Adults. IFIP WG 3.3 Working Conference Human Computer Interaction and Educational Tools. Sozopol (Bulgaria), May 27-28, 1997. 180-188.
[12] Aedo, I., Díaz, P., Fernández, C., Muñoz, G. and Berlanga, A. Assessing the utility of an interactive electronic book for learning the Pascal programming language. IEEE Transactions on Education, 43(4), 2000. 403-413.
[13] Catenazzi, N., Aedo, I., Díaz, P. and Sommaruga, L. The evaluation of electronic books: Guidelines from two practical experiences. Journal of Educational Multimedia and Hypermedia, 6(1), 1997. 91-114.
Copyright © Paloma Díaz