Evidence Summary


Perceived and Actual Search Behaviors May Provide Markers for Healthcare Utilization and Severity of Illness


A Review of:

White, R. W., & Horvitz, E. (2014). From health search to healthcare: Explorations of intention and utilization via query logs and user surveys. Journal of the American Medical Informatics Association, 21(1), 49-55. http://dx.doi.org/10.1136/amiajnl-2012-001473


Reviewed by:

Lindsay Alcock

Head, Public Services

Health Sciences Library

Memorial University of Newfoundland

St. John’s, Newfoundland, Canada

Email: lalcock@mun.ca


Received: 12 June 2014  Accepted: 16 Oct. 2014



© 2014 Alcock. This is an Open Access article distributed under the terms of the Creative Commons Attribution-Noncommercial-Share Alike License 4.0 International (http://creativecommons.org/licenses/by-nc-sa/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly attributed, not used for commercial purposes, and, if transformed, the resulting work is redistributed under the same or similar license to this one.




Objective – To gain an understanding of the relationship between online health information searching behaviour and healthcare utilization.


Design – Survey and log data analysis.


Setting – A software development campus and health information websites with servers in the United States of America.


Subjects – Two separate subject groups were used for this study. For the search log analysis, participants were randomly selected English-speaking users of a Microsoft toolbar who had consented to provide their anonymous log data. For the survey, 489 volunteers who indicated they could recall their last visit to a medical facility were invited to participate.


Methods – To determine search behaviour, four months of data from 2011 were collected from search engine logs and analyzed. A unique user identifier allowed for analysis of individual search behaviour across multiple sessions, which in turn made it possible to identify changes in search behaviour over time. Search queries were labelled and annotated as symptoms, serious illnesses, or benign explanations based on curated lists identified in a related study. Erroneous synonymous entries were removed to increase labelling precision (e.g., astrology-related terms were removed for “cancer”). The researchers specifically noted searches signifying healthcare utilization intent (HUI). The initial query indicating HUI was identified for each user to determine whether search behaviour changed before and after searches indicating HUI.
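The log analysis steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the term lists, the substring matching, the `(user_id, day, query)` log format, and the function names are all hypothetical, chosen only to show how queries might be labelled and then counted before and after each user's first HUI query.

```python
from collections import defaultdict

# Hypothetical term lists. The study used curated lists from a related
# study; these few entries are placeholders for illustration only.
SYMPTOM_TERMS = {"headache", "chest pain", "nausea"}
HUI_TERMS = {"urgent care near me", "doctor appointment", "emergency room"}


def label_query(query):
    """Label a query as 'hui', 'symptom', or 'other' by substring match."""
    q = query.lower()
    if any(term in q for term in HUI_TERMS):
        return "hui"
    if any(term in q for term in SYMPTOM_TERMS):
        return "symptom"
    return "other"


def symptom_counts_around_first_hui(log):
    """Count symptom queries before and after each user's first HUI query.

    `log` is an iterable of (user_id, day, query) tuples, where `day` is an
    integer day index. Users with no HUI query are skipped. Symptom queries
    issued on the same day as the first HUI query are not counted.
    Returns {user_id: (count_before, count_after)}.
    """
    by_user = defaultdict(list)
    for user_id, day, query in log:
        by_user[user_id].append((day, label_query(query)))

    result = {}
    for user_id, events in by_user.items():
        hui_days = [day for day, label in events if label == "hui"]
        if not hui_days:
            continue
        first_hui_day = min(hui_days)
        before = sum(1 for day, label in events
                     if label == "symptom" and day < first_hui_day)
        after = sum(1 for day, label in events
                    if label == "symptom" and day > first_hui_day)
        result[user_id] = (before, after)
    return result


example_log = [
    ("u1", 1, "headache causes"),
    ("u1", 2, "nausea and headache"),
    ("u1", 3, "urgent care near me"),
    ("u1", 5, "headache remedy"),
    ("u2", 1, "weather tomorrow"),
]
example_counts = symptom_counts_around_first_hui(example_log)
```

Grouping by a per-user identifier mirrors the role of the unique user identifier in the study, which allowed behaviour to be followed across sessions.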


Perceptions of motivators related to healthcare utilization (HU) were gathered through a validated, anonymous electronic survey. Through fifty open and closed questions, participants were asked how they search for medical information online, how they locate medical facilities and schedule appointments, and how their search behaviour might differ before and after HU. Survey results were compared with search log data to identify and explain trends.


Main Results – From log data, search queries focusing on symptoms increased prior to the first indication of HUI and decreased afterwards. The authors suggest that this increase may reflect a “heightened state of concern or uncertainty” (p. 51). In addition, searches on relatively benign symptoms were observed to spike dramatically three weeks after the first identified HUI search, which the authors suggest may reflect users having been reassured through a visit with a health professional. The increase in benign symptom searching is supported by survey data. The number of symptom-related searches correlated with the number of HUI searches, as measured by Pearson’s correlation coefficient (r = 0.64, t(78) = 14.43, p < 0.001).
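For readers less familiar with the statistic, the following sketch shows how Pearson’s correlation coefficient is computed from paired counts. The function name and the sample counts are hypothetical and are not drawn from the study data; the sketch illustrates the calculation only.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired counts: symptom searches and HUI searches per period.
symptom_searches = [3, 5, 2, 8, 6]
hui_searches = [1, 2, 1, 4, 2]
r = pearson_r(symptom_searches, hui_searches)
```

A value of r near 1 indicates that the two counts rise and fall together, a value near -1 that one rises as the other falls, and a value near 0 that there is little linear association.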


Nearly 40% of survey participants searched online for information about a medical facility prior to a visit, and facility visits normally occurred within one (78%) or two (94%) weeks of the HUI search. Those visiting a facility for the first time were more likely to search for information related to the facility prior to the visit than those who had visited the facility previously. Knowledge level also contributed to the results: searchers with self-reported low domain knowledge were not only more likely to search for a type of facility rather than a specific facility, but were also more likely to visit the facility sooner after an HUI search than those with high domain knowledge. Participants with low domain knowledge were also more likely to self-diagnose, more prone to alarmist behaviour related to symptom severity, and more concerned with medical insurance.


Survey respondents indicated that the focus of their searches prior to HU was primarily on symptom checking and potential diagnosis. Following facility visits, participants’ searches focused more on specific conditions or treatments. In addition, respondents noted that the frequency of their medical-related searching for serious conditions decreased after they had been to see their physician, indicating that the initial perceived severity of illness was potentially alarmist.


Conclusion – Search activity, both perceived and actual, may act as a marker of HUI, as an indication that HU has occurred, and as an indication of the severity of the HU outcome. Information gleaned from user logs could be used to adapt and model search engine output for users both before and after HU. Further analysis of potential search engine output and geolocation is suggested to determine the full application of such data analysis.


Commentary





Aside from a few studies (Shuyler & Knight, 2003; White & Horvitz, 2010; White & Horvitz, 2013), little research has been done to determine how to tailor search results more effectively to a user based on web searching behaviour. While a literature review is provided, the lack of disclosure regarding search strategies or resources consulted raises questions regarding the comprehensiveness of the review. This article attempts to fill the gap between perceived and actual searching behaviour and how it relates to HU. Using the critical appraisal checklist (Glynn, 2006), the study is determined to be valid.


This complex study is clearly written. The subjects for the log data analysis and survey are different, and the two methodologies provide separate but related results. Therefore, it is important to note that trends can only be described. That said, similar trends did indeed emerge, namely the spike in benign symptom checking after HU. The authors acknowledge that HUI and HU cannot be determined with certainty in user logs, which raises questions about the validity of the inferences made.


The sample size appears reasonable and the participants were randomly selected, although little information is provided regarding randomization procedures and sample size power. Consent was obtained from both populations. It is unclear whether both groups were similar, as demographics were not obtained for all participants. Therefore, a precise comparison between the two groups’ behaviours is not possible. There may be some inherent bias in the log data population due to users’ knowledge that their log data were being analyzed, which may have affected their search behaviour. The authors recognize that the survey participants may not be representative of the broader population given that they were all drawn from Microsoft and were, therefore, likely Microsoft employees.


Data collection methods were clearly described and could be replicated. The user log data collection was based on similar validated studies and the survey was tested on volunteers.


Given the study limitations, the results from each data set were clearly described and reflected in the accompanying figures and tables. Especially interesting were the suggested explanations provided for the fluctuations in search logs/queries and the possible correlations observed between user logs and survey responses. That two different user groups providing two different data sets are shown to exhibit similar online behaviours with respect to HU is intriguing and fodder for future research. The addition of inferential data analysis would have added insight to the study results, particularly with the addition of demographic data as independent variables.


Health searching behaviour and healthcare utilization are inextricably linked. Harnessing searching behaviour in order to provide more relevant and tailored information to users is a logical step for providers of healthcare and health information. This study provides the link between perceived and actual behaviour and also the initial groundwork for further research.


References




Glynn, L. (2006). A critical appraisal tool for library and information research. Library Hi Tech, 24(3), 387-399. http://dx.doi.org/10.1108/07378830610692154


Shuyler, K. S., & Knight, K. M. (2003). What are patients seeking when they turn to the internet? Qualitative content analysis of questions asked by visitors to an orthopaedics web site. Journal of Medical Internet Research, 5(4), e24. http://dx.doi.org/10.2196/jmir.5.4.e24  


White, R. W., & Horvitz, E. (2010). Web to world: Predicting transitions from self-diagnosis to the pursuit of local medical assistance in web search. AMIA Annual Symposium Proceedings, 2010, 882-886.


White, R., & Horvitz, E. (2013). From web search to healthcare utilization: Privacy-sensitive studies from mobile data. Journal of the American Medical Informatics Association, 20(1), 61-68. http://dx.doi.org/10.1136/amiajnl-2011-000765