When do people worry about hurricanes?
I recently read an article in the December issue of Significance titled, “Does Christmas
really come earlier every year?” by Nathan Cunningham of University
College Dublin. His premise was that,
by using cluster analysis of Google Trends data, we can see how people have
begun thinking about the holidays earlier and earlier each year. It’s a good read: http://www.statslife.org.uk/significance/1892. I should note that Nathan graciously
answered my emails asking for clarification and saw real value in this
technique for emergency management work.
I decided to replicate his results using a FEMA-related
search term: “hurricane”.
Google Trends
Google Trends (http://www.google.com/trends/)
allows you to view the volume of searches on particular terms. The values are not raw counts: they are scaled from 0 to 100
relative to the peak search interest in the period shown. For example, the week that
Hurricane Katrina made landfall, “hurricane” scored almost 100, meaning search interest
was at or near its peak for the period. If you sign in with your Google ID, you can
also download the data to CSV. Cunningham
used Google Trends to analyze search volumes on holiday-related terms
(“Christmas”, “Santa Claus”, etc.). Here
I’ve compared the search terms “hurricane” and “tornado”. You can see a roughly
seasonal pattern, with searches rising mid-year.
I wanted to explore this pattern.
Cluster Analysis
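Before any clustering, the downloaded CSV has to be shaped into the data frame that the R code at the end of this post calls gtrends, with year, week and hurricane columns. The sketch below shows one way that might be done. The CSV layout, the column names and every value in it are made up for illustration; a real Google Trends export may be laid out differently and need extra cleanup.

```r
# Hypothetical stand-in for a weekly Google Trends CSV export.
# Column names and values are invented for illustration only.
csv_text <- "Week,hurricane,tornado
2007-01-07,4,6
2007-01-14,5,5
2007-08-19,22,7
2008-08-31,61,6
2008-09-07,78,5"

raw   <- read.csv(text = csv_text, stringsAsFactors = FALSE)
dates <- as.Date(raw$Week)

# Build the data frame the clustering code expects: one row per week,
# tagged with its calendar year and week-of-year number.
gtrends <- data.frame(
  year      = as.integer(format(dates, "%Y")),
  week      = as.integer(format(dates, "%U")),  # week of year, weeks start on Sunday
  hurricane = raw$hurricane,
  tornado   = raw$tornado
)

# One year's worth of (week, hurricane) pairs, as used for clustering
print(subset(gtrends, year == 2008, select = c("week", "hurricane")))
```

With a real export you would replace the text = csv_text argument with the path to the downloaded file.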

Further Investigation
This simple example shows how cluster analysis can illustrate the behavior of data that have more
than one pattern. This could find
application in data that vary from Region to Region or JFO to JFO, or that change
with disaster type.
R Code used in this example
## Crow's nest Clustering example - Tim Allen
# Adapted from http://www.statslife.org.uk/significance/1892
# Nathan Cunningham - Does Christmas really come earlier every year?
# Significance Magazine 11 November 2014

# Assumes gtrends is a data frame with columns year, week and hurricane
# (the Google Trends CSV export, loaded beforehand)

# Allow multiple plots (2 rows x 6 columns)
par(mfrow=c(2,6))

# You have to install and load the mclust package
library(mclust)

# Calculate clusters for each year
for (yr in 2007:2013) {
  # 1) load this year's data in a matrix
  observations <- as.matrix(subset(gtrends, year==yr, select=c("week","hurricane")))
  # 2) fit a two-cluster model; the covariance structure is chosen by BIC
  fit <- Mclust(observations, 2)
  # 3) Plot the clusters and print the model summary
  plot(fit, what="classification", xlab=yr)
  print(summary(fit))
}
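
To answer the title question from a fit, one option is to look at which weeks land in the cluster with the higher search score and take the earliest of them. The sketch below tries this on synthetic data only. The scores are randomly generated, not Google Trends values, and the "onset week" rule is an illustrative assumption rather than anything from Cunningham's article. It assumes the mclust package is installed.

```r
library(mclust)  # assumed to be installed

set.seed(42)

# Synthetic, made-up scores for one year: low interest most of the year,
# elevated interest during weeks 31 to 42, plus some noise.
week  <- 1:52
score <- ifelse(week >= 31 & week <= 42, 60, 8) + rnorm(52, sd = 3)
observations <- cbind(week = week, hurricane = score)

# Two-cluster Gaussian mixture, as in the loop above
fit <- Mclust(observations, G = 2)

# parameters$mean has one column per cluster; row 2 is the score dimension
means <- fit$parameters$mean
high  <- which.max(means[2, ])

# Earliest week assigned to the higher-scoring cluster (illustrative rule)
onset <- min(week[fit$classification == high])
cat("First week in the high-interest cluster:", onset, "\n")
```

Running the same steps inside the yearly loop and comparing onset across years would show whether the high-interest period starts earlier over time.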