Almost every American research university and library has made significant investments in digitizing its intellectual and cultural resources and making them publicly available. There is, however, little empirical data about how these resources are actually used or who is using them (Harley, 2007). Those who fund and develop digital resources have identified the general lack of knowledge about the level and quality of their use in educational settings as a pressing concern. As part of a larger investigation into the use and users of digital resources (Harley et al., 2006),1 we conducted an experimental analysis of two commonly used methods for exploring the use of university-based Web resources: transaction log analysis (TLA) and online site surveys. In this article, we first provide an overview of these two methods, including their key challenges and limitations. We then describe an implementation of TLA and online surveys in combination on two local sites and the results of that test, including an exploration of the surveys' response rates and bias. From that test, we draw conclusions about the utility of these two methods and the particular analytic approaches that may provide the most valuable and efficient results.
TLA and online surveys explore slightly different aspects of a site's use and users; they can be complementary tools, and the combination of the two may allow a deeper understanding of a site's use than either alone. For example, many Web sites use online surveys to learn more about their users. Among their strengths, surveys can be used to develop a profile of the site's visitors and their attitudes, behavior, and motivations. In particular, sites often employ surveys to determine personal information about their users, to discover users' reasons and motivations for visiting the site, and to explore user satisfaction levels. Transaction log analysis (TLA), on the other hand, can describe the actual usage of the site, including the relative usage volume of different resources, the details of users' navigation paths, the referring pages that led users to the site, and the search terms used to locate or navigate the site. It is a particularly valuable method, either alone or in combination with online surveys, because the usage data are collected automatically and passively; the method records actual user behavior on a site rather than relying on self-reports.
Although these two methods are widely used, there seems to be some ambiguity about the best way to implement them and to report the results, particularly for educational resources (Troll Covey, 2002; Mento and Rapple, 2003). This lack of consensus makes it difficult to interpret statistics for different sites and to compare one site with another (Bishop, 1998). Both TLA and online surveys can be time-consuming and labor-intensive and, unless research and analytic methods are sound, the results may be ambiguous or even misleading. Online surveys often suffer from disappointingly low response rates and biased samples, resulting in potentially misleading interpretations.
Transaction log analysis
Transaction log analysis (TLA) takes advantage of the computerized log files that automatically record online access to any Web site. By analyzing these logs, one can determine a number of characteristics of the site's users and summarize total site use.
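As a concrete illustration, the sketch below parses one such log entry. It assumes the server writes logs in the Apache Combined Log Format; the sample line and field names are ours, not taken from the study sites:

```python
import re

# Apache Combined Log Format: IP, identity, user, [timestamp], "request",
# status, bytes, "referrer", "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log entry, or None if malformed."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

# Fabricated sample entry for illustration only
sample = ('192.0.2.1 - - [10/Oct/2006:13:55:36 -0700] '
          '"GET /spiro/search HTTP/1.0" 200 2326 '
          '"http://www.google.com/search?q=slides" "Mozilla/4.08"')
entry = parse_line(sample)
# entry["referrer"] reveals the page that led the user to the site;
# its query string holds the search terms used to locate it.
```

From fields like these, an analyst can tally usage volume per resource, reconstruct navigation paths, and extract referring pages and search terms.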
There are significant challenges to assessing the use and usability of digital collections through transaction log analysis (Troll Covey, 2002). Bishop's (1998) previous research suggested many of the same issues.
Despite these challenges and limitations, transaction log analysis still has two major advantages over most other user research methods. First, it captures the actual behavior of real users in their own real-use environments; it does not rely on biased self-reports or artificial, laboratory-based use scenarios. Second, because TLA records behavior passively, without requiring users' active participation, it can capture a much broader spectrum of uses and users than can surveys, focus groups, or other methods.
An online survey can be a valuable complement to transaction log analysis for studying the use and users of a Web site; while TLA can reveal users' actual online behavior and usage patterns, surveys can reveal users' motivations, goals, attitudes, and satisfaction levels (Evans and Mathur, 2005). In the past decade, online surveys have become more widespread for a variety of reasons (Fricker and Schonlau, 2002; Gunn, 2002). Online surveys provide some cost and convenience advantages over other survey modes, but they also raise some problems that warrant careful consideration (Evans and Mathur, 2005).
Online surveys can take a variety of forms. Surveys can be administered online as part of a traditional, well-developed survey methodology involving a defined population of interest; an explicit sampling method for generating a representative sample; a well-thought-out recruitment strategy; carefully calculated response rates; and statistical estimates of the likelihood of response bias. Increasingly, however, online surveys are posted on a Web site and made available to anyone who happens upon them. These surveys rarely have a defined population or sampling method; with no way of tracking those who do or don't complete the survey, it is often impossible to report a response rate or estimate response bias.
When one designs a survey instrument for online administration, a variety of new options are available for question structure, layout, and design (Gunn, 2002; Schonlau et al., 2002; Faas, 2004). Important issues in instrument design include question wording, survey navigation and flow, skip patterns, survey length, and the graphical layout of the instrument. Computerization allows the design of more complicated skip patterns and question randomization. Additionally, it is possible to program automatic data checks and verification to disallow the entry of inconsistent responses.
The automation of data collection and analysis can result in an economy of scale, making online surveys much more cost efficient, especially for large sample sizes. Automation can also mean that data (and basic analyses) are available in a much shorter timeframe, sometimes even instantaneously. [A more detailed exploration of techniques for survey design, administration, and analysis can be found in Rossi, Wright, and Anderson (1983) and Fowler (2002).]
Survey response rates
Survey response rates are of some concern to researchers, as rates for all types of surveys have been on the decline since the 1990s (Johnson and Owens, 2003; Baruch, 1999). Evidence suggests that response rates for online surveys are lower than for other media and continue to shrink (Fricker and Schonlau, 2002). In traditional social science survey research, sampling methods are designed to ensure that the survey respondents are representative of the population of interest. If the sample is representative and the response rate is high, the survey results can shed light on the characteristics of the population. If, on the other hand, response rates are low or the sample is known to be non-representative, it is possible, even likely, that the survey results will be misleading. (A large response rate alone is no guarantee that the respondents are representative.)
Sampling techniques and the measurement of response rates, however, are a particular challenge when a survey is posted online and made available to any Web user anonymously, without active recruitment or sampling. In such an environment, the population of users and the characteristics of the respondents are essentially unknown, making it difficult to report response rates and even more difficult to estimate the survey's response bias. The lack of knowledge of the complete population also makes it difficult to design appropriate sampling frames.

Measuring response rates is a particular challenge for online surveys, partly because of the tricky definition of "response." Bosnjak and Tuten (2001) identify distinct response types, including lurkers (who view a survey without responding), drop-outs (who complete the beginning of a survey without continuing), item non-responders (who omit individual questions), and complete non-responders. Complicating the picture is the common practice of offering various rewards to increase participant motivation. The use of rewards and incentives can introduce response bias, however. Individuals who are motivated to respond by a specific reward may not be representative of the whole study population.
Methods and Results: Testing response bias in online surveys
We conducted a test on two local sites, using a combination of TLA and online surveys, to explore the effectiveness of these two methods for elucidating patterns of use and to explore survey response rates and bias. We selected two sites for our analysis: SPIRO, which provides online access to the UC Berkeley Architecture Department slide library, and The Jack London Collection, which features a wide variety of resources about the early-twentieth-century American author.
We placed short surveys on the homepages of both sites for a two-month period and collected the sites' transaction logs from the same period. After analyzing the logs and the survey responses individually, we combined the two by matching each survey response with the logs from the same Web user. (We identified individual users by the combination of IP address and user agent.) We then used this combined dataset to estimate each survey's response rate and to attempt to quantify the self-selection bias among the respondents. More information about the tests, including the survey instruments and analyses, can be found on our project Web site.
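The matching step described above can be sketched in a few lines, assuming each survey response and log entry has already been reduced to a record with IP address and user agent fields; the field names and sample data here are hypothetical and would depend on the survey tool and log parser used:

```python
def link_surveys_to_logs(survey_rows, log_rows):
    """Match each survey response to the log entries from the same user,
    identifying users by the (IP address, user agent) pair."""
    logs_by_user = {}
    for row in log_rows:
        key = (row["ip"], row["agent"])
        logs_by_user.setdefault(key, []).append(row)
    linked = []
    for resp in survey_rows:
        key = (resp["ip"], resp["agent"])
        linked.append((resp, logs_by_user.get(key, [])))
    return linked

# Illustrative records (not actual study data)
surveys = [{"ip": "192.0.2.1", "agent": "Mozilla/4.08", "first_visit": True}]
logs = [
    {"ip": "192.0.2.1", "agent": "Mozilla/4.08", "url": "/spiro/"},
    {"ip": "192.0.2.2", "agent": "Mozilla/4.08", "url": "/spiro/search"},
]
resp, hits = link_surveys_to_logs(surveys, logs)[0]
# hits now holds only the log entries from the responding user.
```

Note that the (IP, user agent) pair is an imperfect key: shared proxies and dynamic addresses can conflate or split users, which is one reason a purpose-built identifier (discussed later) is preferable when linking is planned in advance.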
Table 1: Transaction log analysis: Selected results
Table 2: Online surveys: Selected responses
A summary of the test results can be found in Tables 1 and 2. Table 1 summarizes the usage of the two sites, based on TLA. The Jack London Collection had nearly three times as many usage sessions and unique users as SPIRO (145,959 vs. 54,375 sessions; 97,284 vs. 38,962 unique IP addresses). Overall, the usage patterns were similar, with a few exceptions. SPIRO received twice as much traffic as Jack London from international top-level domains (22% vs. 11%) and four times as much from .edu domains (9% vs. 2%). The Jack London Collection had approximately fifty percent more repeat visitors than SPIRO (17% vs. 11%).
Table 2 summarizes the responses to each site's online survey. Both surveys were designed with a single question on the site's home page leading to a second page with the remainder of the survey. The initial question was completed by 433 visitors for the Jack London Collection and 106 for SPIRO; fewer than half of these completed the remainder of the survey (196 for Jack London; 45 for SPIRO). In both cases, the number of survey responses was less than one percent of the number of unique IP addresses logged through TLA. Among the survey responders, SPIRO had more visitors from higher education than from K-12 schools (72% vs. 12%) while Jack London had the reverse (25% vs. 47%). For both sites, the majority of responders reported that it was their first time on the site, although the trend was more extreme for Jack London (74% vs. 67%). Only a minority of responders reported using each site at least monthly (24% for SPIRO; 13% for Jack London).
Online survey representativeness
In both tests, fewer than two site visitors in a thousand completed the online survey. Because of the low number of responses relative to site visitors, we had serious concerns about the respondents' representativeness and the value of the survey results. Since users could freely choose whether to answer the survey, it seems reasonable to assume that certain types of people were more or less likely to respond. But could we test that assumption quantitatively?
Combining online surveys with transaction log analysis of the same site during the same time period allows new techniques for measuring the survey's response rate and for estimating response bias. The transaction logs enable us to measure the full population of site users during the study period: every user who viewed the site's homepage and therefore had the opportunity to take the survey. The transaction logs also allow us to describe everyone in the target population according to a few behavioral measures, based on their actual browsing patterns on the site. (Additional analyses would be required to see if site usage during the study period was typical of site usage at other times.)

To assess whether the survey respondents were representative, we identified three behavioral measures from the transaction logs that could be calculated for both survey responders and non-responders: the number of browsing sessions each person had during the logging period, the number of files accessed per session, and the average session length. The first of these measures the frequency of site usage; the second and third estimate the depth of that usage, or the user's level of engagement with the particular site. We compared these measures for the survey responders and the survey non-responders.
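Once log entries have been grouped into per-user sessions, the three measures are straightforward to compute; a minimal sketch, with hypothetical session keys and illustrative values:

```python
def behavioral_measures(sessions):
    """Summarize one user's sessions: session count, mean files per
    session, and mean session length in seconds.
    Each session is a dict with 'files' and 'seconds' (hypothetical keys).
    """
    n = len(sessions)
    files = sum(s["files"] for s in sessions) / n
    seconds = sum(s["seconds"] for s in sessions) / n
    return {"sessions": n, "files_per_session": files, "mean_seconds": seconds}

# One user's (invented) sessions during the logging period
user_sessions = [{"files": 12, "seconds": 300}, {"files": 4, "seconds": 60}]
m = behavioral_measures(user_sessions)
# m == {"sessions": 2, "files_per_session": 8.0, "mean_seconds": 180.0}
```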
To assess the likelihood and magnitude of response bias, we performed a series of t-tests, comparing the two groups on the three behavioral measures above. The t-test focuses on the observed means and provides an estimate of the likelihood that the difference between the means of the respondents and the non-respondents is due to chance (Steel and Torrie, 1980). A low p-value indicates that the survey responders are unlikely to be a representative sample of the population. We performed this analysis for both test sites.
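Such a comparison can be sketched as follows, using Welch's two-sample t statistic with a normal approximation to the p-value (adequate when, as here, the groups number in the thousands); the toy samples below are illustrative, not our actual measurements:

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic, with a two-sided p-value from the
    normal approximation (reasonable for large groups)."""
    na, nb = len(sample_a), len(sample_b)
    se = math.sqrt(variance(sample_a) / na + variance(sample_b) / nb)
    t = (mean(sample_a) - mean(sample_b)) / se
    p = math.erfc(abs(t) / math.sqrt(2))  # P(|Z| > |t|), two-sided
    return t, p

# Toy data: sessions per user for responders vs. non-responders
responders = [3, 4, 5, 4, 6, 5, 4]
non_responders = [1, 1, 2, 1, 3, 1, 2, 2, 1, 1]
t, p = welch_t(responders, non_responders)
# A small p suggests the difference in group means is unlikely
# to be due to chance alone.
```

For small samples, the p-value should instead come from the t distribution with the Welch-Satterthwaite degrees of freedom, as in a standard statistics package.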
Table 3: Representativeness t-tests
Survey responders (who submitted both pages of the survey) for whom log data are also available
For both sites, these results indicate that the users who responded to the survey were noticeably different from the typical site user: they used each site more frequently, and each session was longer and more in-depth, involving more files per session. The p-values indicate that these differences are highly statistically significant. The survey clearly suffers from response bias, and the respondents are a non-representative sample on the three measures we compared.
These findings confirmed our fears about survey response bias; the few users who bothered to respond to the surveys are demonstrably different from the average site visitor. Since the results show that the respondents are non-representative on these three behavioral measures, we determined that it would be unwise to draw any conclusions from the survey about the characteristics of the site's visitors overall.
Online surveys and TLA, when properly implemented, can be useful and reliable methods for understanding site use; however, low response rates for online surveys may make it difficult to obtain a representative sample of users (Bishop, 1998; Evans and Mathur, 2005). In this case, a survey may actually yield ambiguous or even misleading results. Based on our tests, we have several observations and suggestions for the use of online surveys and TLA.
First, survey responses will probably not be representative of all site users. Therefore conclusions drawn from survey results can be erroneous if applied to the whole population of a site's users. As a basic-level check, we suggest that researchers at least estimate the survey's response rate by tracking the total number of site users during the survey time period. This analysis should be relatively straightforward, and will not require actually merging the two datasets.
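Using the figures reported in Tables 1 and 2 (completed surveys vs. unique IP addresses logged via TLA), the basic-level check is one line of arithmetic per site:

```python
# Rough response-rate estimates from the figures reported in the text.
# Unique IP addresses only approximate the true number of visitors.
sites = {
    "Jack London": {"completed": 196, "unique_ips": 97_284},
    "SPIRO": {"completed": 45, "unique_ips": 38_962},
}
for name, s in sites.items():
    rate = s["completed"] / s["unique_ips"]
    print(f"{name}: {rate:.2%}")  # a small fraction of one percent
```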
Estimating the survey's response bias quantitatively, as we did in our tests, is more involved; nonetheless, this may be a valuable analysis to conduct before drawing any global conclusions from a potentially biased survey, especially if expensive planning decisions might be drawn from the results. In order to estimate the survey's response bias, we linked survey responses to transaction log data from the same usage session. We found that this process required a high level of expertise and a great deal of time. In addition, we did not discover any readily available software tools to facilitate these analyses. For sites that have the time and expertise, this high-level analysis can allow the calculation of survey response rates and estimation of survey response bias.
When planning to link survey responses with transaction logs, both the site and the survey should be designed to support linking, with unique identifiers visible in both the usage logs and in the survey results. The data manipulations should certainly be part of a pilot test, before the full-scale survey is launched.
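One way to make such an identifier visible in both datasets is to embed a token in the survey link itself: because the token is part of the requested URL, it appears in the transaction log, and if the survey form also stores it, responses and log sessions can be joined on the token rather than the fragile (IP, user agent) pair. This sketch is illustrative only; the URL and parameter name are hypothetical:

```python
import secrets

def survey_link(base_url="/survey"):
    """Build a survey link carrying a one-time token (hypothetical
    'sid' parameter). The same token later appears in both the access
    log and the stored survey response, enabling an exact join."""
    token = secrets.token_hex(8)  # 16 random hex characters
    return f"{base_url}?sid={token}", token

url, token = survey_link()
# e.g. "/survey?sid=3f9a..." is what the server logs when the link is followed.
```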
Our experiment involving the complementary power of online surveys and TLA suggests both benefits and drawbacks to Web site owners in using one or both of these tools.
Understanding the usage and users of an educational Web site can provide valuable insights to facilitate better decision-making, improve site design, and support the site's target users in making better use of the available materials. Though there are many types of sites and models for understanding users, TLA and online surveys have become ubiquitous practice. Supplementing survey results with information from transaction logs can help to reveal survey bias, to better understand users' goals and objectives, and to clarify user behavior patterns. While a comprehensive picture of users is not generally possible even when combining these tools, the extra effort of linking the two methods may provide a more robust understanding of users than if using either surveys or TLA alone.
This work was made possible by generous funding from the William and Flora Hewlett Foundation and the Andrew W. Mellon Foundation. Additional support was provided by the Hewlett-Packard Company, the Center for Information Technology Research in the Interest of Society (CITRIS), the California Digital Library (CDL), and the Vice Chancellor of Research, UC Berkeley. We are very grateful to the staff of SPIRO and the Jack London Collection for being so generous with their time and allowing us access to their sites and records. Invaluable contributions to data collection and analysis were made by Ian Miller, Charis Kaskiris, and David Nasatir. Shannon Lawrence assisted with editing.
1. The final project report and other materials are available on our project Web site at <http://cshe.berkeley.edu/research/digitalresourcestudy/index.htm>.
2. See for example the Web TLA Software Package Comparison in Appendix N: Harley, Diane, Jonathan Henke, Shannon Lawrence, Ian Miller, Irene Perciali, and David Nasatir. 2006. Use and Users of Digital Resources: A Focus on Undergraduate Education in the Humanities and Social Sciences. University of California, Berkeley. Available at: <http://cshe.berkeley.edu/research/digitalresourcestudy/report/>.
Bosnjak, Michael, and Tracy L. Tuten. 2001. Classifying Response Behaviors in Web-based Surveys. Journal of Computer-Mediated Communication (JCMC) 6(3) (April). Available at <http://jcmc.indiana.edu/vol6/issue3/boznjak.html>.
Carson, Stephen E. 2004. MIT Opencourseware Program Evaluation Findings Report. Massachusetts Institute of Technology, Cambridge, Mass. Available at <http://ocw.mit.edu/OcwWeb/Global/AboutOCW/evaluation.htm>.
Chambers, Ray L., and Chris J. Skinner. 2003. Analysis of Survey Data. Chichester, England: Wiley.
Evans, Joel, and Anil Mathur. 2005. The value of online surveys. Internet Research 15(2): 195-219.
Faas, Thorsten. 2004. Online or Not Online? A Comparison of Offline and Online Surveys Conducted in the Context of 2002 German Federal Election. Bulletin de Méthodologie Sociologique 82: 42-57.
Fowler, Floyd J. 2002. Survey research methods. Thousand Oaks, Calif.: Sage Publications.
Fricker, Ronald D., and Matthias Schonlau. 2002. Advantages and Disadvantages of Internet Research Surveys: Evidence From the Literature. Field Methods 14(4): 347-367 (November 1). Available at <http://fmx.sagepub.com/cgi/reprint/14/4/347.pdf>.
Harley, Diane, Jonathan Henke, Shannon Lawrence, Ian Miller, Irene Perciali, and David Nasatir. 2006. Use and Users of Digital Resources: A Focus on Undergraduate Education in the Humanities and Social Sciences. University of California, Berkeley. Available at: <http://cshe.berkeley.edu/research/digitalresourcestudy/report/>.
Harley, Diane. 2007. Why Study Users? An Environmental Scan of Use and Users of Digital Resources in Humanities and Social Sciences Undergraduate Education. First Monday, volume 12, number 1 (January 2007) Available at <http://firstmonday.org/issues/issue12_1/harley/index.html>.
Johnson, Timothy, and Linda Owens. 2003. Survey Response Rate Reporting in the Professional Literature. Paper presented at Annual Conference of the American Association for Public Opinion Research, Nashville, Tenn., May 15.
Rossi, Peter H., James D. Wright, and Andy B. Anderson, eds. 1983. Handbook of Survey Research. New York: Academic Press.
Schonlau, Matthias, Ronald D. Fricker, and Marc N. Elliott. 2002. Conducting research surveys via e-mail and the web. Santa Monica, Calif.: Rand. Available at <http://www.rand.org/publications/MR/MR1480/>.
Steel, Robert George Douglas, and James H. Torrie. 1980. Principles and Procedures of Statistics: A Biometrical Approach. New York: McGraw-Hill.
Troll Covey, Denise. 2002. Usage and Usability Assessment: Library Practices and Concerns. Digital Library Federation and Council on Library and Information Resources (CLIR), Washington, D.C. Available at <http://www.clir.org/pubs/reports/pub105/contents.html>.
Copyright © 2007 Diane Harley and Jonathan Henke