Interview with Mark Meiss
Mark Meiss recently led a study on web users' behavior and found that most URL requests did not originate from links but from the user's favorites or from direct typing into the address bar. He clarifies the results in this interview.
Analysis of the research can be found on the Online Journalism Blog.
Looking at visit statistics from some websites (ranging from micro-niche sites to timesonline.co.uk), search engines typically account for 50-80% of visits, referrals for about 20-30%, and very little is left to direct traffic. How do you explain such a difference between your results and these figures? Do students at Indiana behave in very particular ways?
We are familiar with a number of sources that report far higher percentages of visits attributable to search engines (see especially the paper by Cho cited in our paper), and the disparity we measured was surprising to us as well. We’re not able to give a firm explanation, but we have discussed a number of factors that may contribute to the disparity:
- Our classification of a URL as representing a page or not is based on file extensions, so we may incorrectly identify some URLs as being pages when they are actually other file types that would not be returned by a search engine.
- Pages with frames (whether based on FRAMESETs or IFRAMEs) will generate multiple HTTP requests for pages, but a search engine will be the referrer for only the top-level document.
- Pages with some form of dynamic refresh, especially AJAX-based applications, may generate many HTTP requests even though the page has only been explicitly loaded a single time.
- Traditional websites that report these sorts of statistics may be a shrinking proportion of overall traffic. For example, the most visited site in our data is Facebook.
- Many of the most visited sites are so familiar that nobody needs a search engine to find them — Microsoft, Apple, CNN, and so forth are so well-known that much less of their traffic may come from search engines. Because of the scale-free distribution of the traffic, this could be a large effect.
This is a matter of ongoing interest for us, and we’re investigating ways of testing these hypotheses in our future research. I should also note that while Indiana University users are drawn to some sites disproportionately compared with the general population (Facebook being an example), our experience with other data sets implies that this does little to change the overall distribution of traffic. For example, a longitudinal study we’ve done on traffic on the Internet2/Abilene network found that traffic patterns changed very little when the network began carrying commercial traffic in addition to academic traffic.
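To make the first few factors above concrete, here is a minimal sketch of how a request log might be split into page and non-page requests by file extension and then tallied by referrer type. It is not the study's actual code: the extension list, the search-engine host list, and the function names are all assumptions chosen for illustration.

```python
from collections import Counter
from urllib.parse import urlparse
import posixpath

# Hypothetical list of "page" extensions; the study's real list is not given here.
PAGE_EXTENSIONS = {"", ".html", ".htm", ".php", ".asp", ".aspx", ".jsp"}
# Hypothetical set of search-engine hosts, for illustration only.
SEARCH_HOSTS = {"www.google.com", "search.yahoo.com"}


def looks_like_page(url):
    """Guess from the file extension alone whether a URL is a page."""
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in PAGE_EXTENSIONS


def referrer_class(referrer):
    """Label a referrer as 'empty', 'search', or 'link'."""
    if not referrer:
        return "empty"
    host = urlparse(referrer).netloc.lower()
    return "search" if host in SEARCH_HOSTS else "link"


def tally(requests):
    """Count page requests by referrer class; non-page requests are skipped."""
    counts = Counter()
    for url, referrer in requests:
        if looks_like_page(url):
            counts[referrer_class(referrer)] += 1
    return counts


# One search-driven visit to a framed page, plus two referrer-less requests.
sample = [
    ("http://example.com/index.html", "http://www.google.com/search?q=x"),
    ("http://example.com/frame.html", "http://example.com/index.html"),
    ("http://example.com/logo.png", "http://example.com/index.html"),
    ("http://example.com/feed.php", None),
    ("http://example.com/", ""),
]
print(tally(sample))
```

In this toy log a single search-driven visit accounts for only one of four page requests: the frame is counted as a link, and the extension heuristic treats `feed.php` as a page even though a feed reader, not a person clicking a link, may have fetched it.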
Similarly, a graph indicates that the share of empty referrers is growing. Would that mean that users visit fewer websites on a regular basis, or that they memorize or bookmark more URLs?
Both of these are possibilities, but we suspect that a larger factor may be the rise of applications that are well-integrated with the Web and generate empty-referrer requests — especially e-mail, RSS readers, and PDF files with embedded links. Browser home pages and bookmark visits are also likely to generate many such requests.
Do you have data for the end of 2007? How do you think the rise of RSS and the personalized home page will impact the behavior of web users?
Our data collection is ongoing, and we will definitely be performing more analysis of these sorts of trends. RSS and personalized portals are certainly becoming quite popular, but it’s quite challenging to measure their effects, since they may generate many requests without referrer information. It would be nice to have some solid data on their popularity as a proportion of users (and know how typical those users are in terms of traffic load).
