Appendix A. Detailed description of data and covariates used for wombling.
County-level spread records - We derived the dynamics of HWA's spread for the years 1951 through 2009 using county-level records compiled by Forest Health Protection personnel of the USDA Forest Service (http://na.fs.fed.us/fhp/hwa/maps/distribution.shtm). We supplemented these county-level records with more localized records drawn from multiple sources, including the National Entomological Collection at the Smithsonian Institution (G. Miller), the Pennsylvania General Hemlock Survey executed by the Pennsylvania Department of Conservation and Natural Resources (B. Regester), township-level records for Massachusetts (C. Burnham) and New York (J. Denham), surveys performed by the Georgia Forestry Commission (J. Johnson), stand-level surveys for southwestern Virginia (T. McAvoy), surveys in southern Vermont by the Vermont Department of Forests, Parks, & Recreation (B. Burns), and stand-level surveys in Connecticut and Massachusetts (D. Orwig). When a local survey indicated an earlier date of first infestation than the county-level record, we revised the county-level date accordingly. Finally, to simplify coding of the models, we removed 12 island counties (i.e., counties with no infested neighbors, which were possibly infested by long-distance jump dispersal). The final dataset comprised 322 counties with dates of first infestation ranging from 1951 to 2009.
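The two record-cleaning steps described above (keeping the earlier of the county-level and local dates, then dropping island counties) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy county labels, the years, and the adjacency table are all hypothetical, and the treatment of a county that appears only in local records is an assumption.

```python
def earliest_first_infestation(county_years, local_records):
    """Return county -> year of first infestation, keeping the earlier date.

    county_years: dict of county -> year from the county-level records.
    local_records: iterable of (county, year) pairs from local surveys.
    """
    updated = dict(county_years)
    for county, year in local_records:
        # Assumption: a county seen only in local records is added as new.
        if county not in updated or year < updated[county]:
            updated[county] = year
    return updated


def drop_island_counties(first_year, neighbors):
    """Keep only infested counties that have at least one infested neighbor."""
    return {
        county: year
        for county, year in first_year.items()
        if any(n in first_year for n in neighbors.get(county, ()))
    }


# Hypothetical toy data: labels, years, and adjacency are invented.
county_years = {"A": 1990, "B": 1995, "C": 2001, "D": 2005}
local_records = [("B", 1992), ("C", 2003)]  # only the record for B is earlier
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": ["E"]}  # E is uninfested

updated = earliest_first_infestation(county_years, local_records)
retained = drop_island_counties(updated, neighbors)
print(updated)
print(retained)
```

In this toy case, county B moves from 1995 to 1992, county C keeps 2001 because its local record is later, and county D is dropped because its only neighbor was never infested.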
Estimates of hemlock abundance - To produce a map of hemlock abundance, we used the random forest algorithm (randomForest package; Liaw and Wiener 2002) in R 2.9.1 (R Development Core Team 2009) to relate observed hemlock abundance (basal area, m² ha⁻¹) from the USDA Forest Inventory and Analysis (FIA) database (comprising 16,084 occurrences) to 26 environmental predictor variables. Environmental predictors included 23 bioclimatic variables describing minima, maxima, and seasonality of temperature, precipitation, and water balance (Hijmans et al. 2005, Svenning and Skov 2005), two topographic variables (slope and compound topographic index) from the USGS HYDRO1k dataset (http://eros.usgs.gov/#/Find_Data/Products_and_Data_Available/gtopo30/hydro), and an index of net primary productivity (Zhao et al. 2005). All variables were processed in ArcGIS 9.3 so that they were spatially congruent, had a common resolution of 1 km, and were projected using an equidistant conic projection to preserve distances between locations. We used the resulting model to predict hemlock abundance across eastern North America. Although Carolina hemlock (Tsuga caroliniana) is also susceptible to HWA, we did not model its distribution because it is relatively rare and narrowly distributed, and its distribution falls entirely within the range of eastern hemlock. To account for the fact that most cells were not 100% forested, we multiplied the map of hemlock abundance by a corresponding remotely sensed estimate of percent forest cover. The result was a map of hemlock abundance adjusted for forest cover that corresponds well with the known distribution and abundance of eastern hemlock.
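The forest-cover adjustment described above is a cell-by-cell multiplication, sketched below for two small grids. This is an illustration only: it assumes forest cover is expressed as a percentage from 0 to 100 and that both grids are already aligned on the same 1-km cells. The function name and the toy values are hypothetical, and the random forest prediction step itself is not reproduced here.

```python
def adjust_for_forest_cover(basal_area, percent_forest):
    """Scale predicted basal area (m2/ha) in each cell by its forest-cover fraction.

    Both arguments are lists of rows with identical shape; percent_forest
    is assumed to hold values between 0 and 100.
    """
    if len(basal_area) != len(percent_forest):
        raise ValueError("grids must have the same number of rows")
    adjusted = []
    for ba_row, pf_row in zip(basal_area, percent_forest):
        if len(ba_row) != len(pf_row):
            raise ValueError("grid rows must have the same length")
        adjusted.append([ba * pf / 100.0 for ba, pf in zip(ba_row, pf_row)])
    return adjusted


# Hypothetical toy grids (2 x 2 cells); the values are invented.
predicted = [[10.0, 4.0], [0.0, 8.0]]   # modeled basal area, m2/ha
forest_cover = [[100, 50], [75, 25]]    # percent forest cover per cell

adjusted = adjust_for_forest_cover(predicted, forest_cover)
print(adjusted)
```

A cell that is fully forested keeps its predicted value, whereas a cell with 25% forest cover retains one quarter of it.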
Estimates of human population density and mean winter temperature - Estimates of human population density were derived from 2000 U.S. census data (http://www.census.gov/main/www/cen2000.html). Estimates of mean winter temperature (December through March) at 1-km spatial resolution were downloaded from the WorldClim database (http://www.worldclim.org/, Hijmans et al. 2005). For all covariates, we used the Zonal Statistics tool in ArcGIS 9.3 to summarize covariate values within each county.
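A zonal summary assigns each raster cell to a zone (here, a county) and aggregates the cell values within each zone. The sketch below illustrates the idea in plain Python rather than ArcGIS. It assumes the summary statistic is the mean, which the text above does not specify; the function name, zone labels, and cell values are hypothetical.

```python
from collections import defaultdict


def zonal_mean(values, zones):
    """Return zone -> mean of the cell values that fall inside that zone.

    values and zones are lists of rows with identical shape. Cells whose
    value or zone is None (no data) are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value is None or zone is None:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical toy raster of mean winter temperature (degrees C) and a
# matching grid of county labels; all values are invented.
winter_temp = [[-2.0, -4.0, 1.0],
               [0.0, None, 3.0]]
county_grid = [["X", "X", "Y"],
               ["X", "Y", "Y"]]

county_means = zonal_mean(winter_temp, county_grid)
print(county_means)
```

The same function could be applied to any of the gridded covariates, provided the covariate grid and the county grid share the same cell alignment.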
LITERATURE CITED
Hijmans, R. J., S. E. Cameron, J. L. Parra, P. G. Jones, and A. Jarvis. 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25:1965–1978.
Liaw, A., and M. Wiener. 2002. Classification and regression by randomForest. R News 2:18–22.
R Development Core Team. 2009. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Svenning, J. C., and F. Skov. 2005. The relative roles of environment and history as controls of tree species composition and richness in Europe. Journal of Biogeography 32:1019–1033.
Zhao, M., F. A. Heinsch, R. R. Nemani, and S. W. Running. 2005. Improvements of the MODIS terrestrial gross and net primary production global data set. Remote Sensing of Environment 95:164–176.