Whether the goal is to improve air quality, reduce storm water runoff, conserve energy, or increase property values, urban forests inevitably become a critical part of municipal resource management. But while it’s relatively straightforward to explain why trees are valuable, it’s often more difficult to figure out how to manage them in accordance with strategic goals to realize as much of that value as possible. Where should you plant trees to have the greatest impact on storm water runoff? Which neighborhoods meet your biodiversity goals and which don’t? Which neighborhoods should you prioritize for future plantings?
This article explores how data analysis methods can critically inform tree inventory management decisions. We will use tree inventory data from yegTreeMap, which is maintained by the City of Edmonton in Alberta, Canada and contains over 260,000 trees across 375 neighborhoods. We will also source data from Edmonton's Open Data Catalogue, a free repository of over 500 city-level data sets on topics ranging from transportation, to demographics, to environmental services. While strictly analyzing Edmonton's tree inventory in isolation is a valuable exercise, using complementary data sets allows us to model the urban forest in novel ways, and understand it through different community and socio-economic lenses. In some instances, it can even provide opportunities to explore topics like political advocacy and environmental justice in quantifiable terms.
Many tree inventories built with OpenTreeMap follow the same path: a community decides to create a map, its trees are plotted by the thousands, and eco benefits are automatically calculated using regional and species-specific standards set by the USDA Forest Service (iTree). Simply calculating eco benefits can be informative: Edmonton's trees alone account for more than $30 million in annual benefits. Such a valuable resource justifies dedicated maintenance and investment, but there aren't many tools that help urban foresters model how management decisions will impact their community in scientific terms, let alone figure out which decisions they should be prioritizing in the first place.
In early 2016, OpenTreeMap will release a Forestry Modeling and Prioritization Toolkit capable of optimizing a planting strategy in a matter of minutes. By simply adjusting factors like air quality, storm water management, and biodiversity, users will be able to generate heat maps of optimal planting sites, calculate ecosystem impacts over a 30-year period (accounting for growth and mortality rates), and cross-reference territories maintained by tree tending organizations and volunteer groups. In the meantime, there are steps you can take today to manage your urban forest more effectively.
Behind the Map
Eco benefits are perhaps the flashiest part of an OpenTreeMap inventory—they are integral to communicating the value of urban forests in real terms and useful for setting specific and quantifiable goals. But even before eco benefits are accounted for, OpenTreeMap inventories can reveal surprising insights. In this blog post we look at a few observations that can be deduced by simply knowing the location, diameter, and species of your trees.
How are trees distributed throughout the City of Edmonton?
Looking at Edmonton’s tree inventory data, the first question to answer is deceptively simple: how are the trees distributed throughout the city? One approach to this question is to break down Edmonton into its constituent neighborhoods and sort them, first by the total number of trees and subsequently by tree density to control for geographic area.
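The two rankings described above can be sketched in a few lines of Python. This is a minimal illustration, not the queries used for the article: the neighborhood names, tree records, and areas below are made-up sample data, and the record layout (neighborhood, species, diameter) is an assumption about how an inventory export might look.

```python
from collections import Counter

# Hypothetical sample records: (neighborhood, species, diameter_cm).
trees = [
    ("Parkside", "American Elm", 40.0),
    ("Parkside", "Green Ash", 25.0),
    ("Parkside", "Bur Oak", 30.0),
    ("Parkside", "Green Ash", 18.0),
    ("Millrow", "American Elm", 35.0),
    ("Millrow", "American Elm", 22.0),
    ("Millrow", "Blue Spruce", 15.0),
]

# Hypothetical neighborhood areas in square kilometers.
area_km2 = {"Parkside": 4.0, "Millrow": 1.0}


def rank_neighborhoods(trees, area_km2):
    """Rank neighborhoods by raw tree count and by trees per square kilometer."""
    counts = Counter(neighborhood for neighborhood, _, _ in trees)
    density = {name: counts[name] / area_km2[name] for name in counts}
    by_count = sorted(counts, key=counts.get, reverse=True)
    by_density = sorted(density, key=density.get, reverse=True)
    return counts, density, by_count, by_density


counts, density, by_count, by_density = rank_neighborhoods(trees, area_km2)
print("By tree count:", by_count)
print("By tree density:", by_density)
```

In this sample the two rankings disagree: the larger neighborhood has more trees, but the smaller one is denser, which is why controlling for area matters.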
Unsurprisingly, neighborhoods containing large public parks generally dominate the lists for both raw number of trees and tree density per square kilometer (while industrial parks fall toward the bottom of the list). But sorting tree inventories with simple queries can be helpful in ways other than neighborhood rankings, potentially revealing more sophisticated insights.
For instance, we used a similar sorting technique to find the largest tree of each species within every neighborhood, and plotted the results in an interactive map (Figure 1). If you don’t have access to tree diameter, you can substitute a single representative of each species chosen at random (within each neighborhood) and achieve similar results. This visualization can indirectly illustrate biodiversity (or lack thereof) when compared with the tree density ranking we created earlier: if biodiversity is uniform, two or more neighborhoods with a similar tree density should look similar on the map in Figure 1. But if one of those neighborhoods has only a few species (even if it has many trees), its map will look comparatively sparse.
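As a rough sketch of the "largest tree of each species within every neighborhood" query, the snippet below keeps one record per (neighborhood, species) pair. The sample records and field layout are hypothetical; a real inventory export would have its own column names.

```python
# Hypothetical sample records: (neighborhood, species, diameter_cm).
trees = [
    ("Parkside", "American Elm", 40.0),
    ("Parkside", "Green Ash", 25.0),
    ("Parkside", "Bur Oak", 30.0),
    ("Parkside", "Green Ash", 18.0),
    ("Millrow", "American Elm", 35.0),
    ("Millrow", "American Elm", 22.0),
    ("Millrow", "Blue Spruce", 15.0),
]


def largest_per_species(trees):
    """Return the largest diameter seen for each (neighborhood, species) pair."""
    largest = {}
    for neighborhood, species, diameter_cm in trees:
        key = (neighborhood, species)
        if key not in largest or diameter_cm > largest[key]:
            largest[key] = diameter_cm
    return largest


largest = largest_per_species(trees)
for (neighborhood, species), diameter_cm in sorted(largest.items()):
    print(f"{neighborhood}: {species} ({diameter_cm} cm)")
```

Each entry in the result corresponds to one dot on a map like Figure 1, so the number of entries per neighborhood equals its number of distinct species.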
Take, for instance, the neighborhoods of Ottewell and Clareview Town Centre, which have nearly identical tree densities of roughly 225 and 224 trees per square kilometer, respectively (Figure 2). Ottewell’s map is speckled with dozens of dots, each representing a unique species. Clareview Town Centre, on the other hand, shows just nine distinct species (roughly 28% of Ottewell’s species count).
Why the discrepancy?
Hundreds of American Elm trees were planted in Clareview, almost to the exclusion of other species, creating a concentrated monoculture. The result is that this community is at a greater risk of losing its trees to Dutch Elm Disease if it reaches Alberta, which has the largest number of unaffected elms remaining in the world. Identifying neighborhoods with lower biodiversity and prioritizing plantings in those areas can avert disaster decades later when a disease or invasive pest attacks one species aggressively.
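One simple way to flag neighborhoods like this is to compute, for each neighborhood, the number of distinct species and the share of trees belonging to the most common species. The sketch below uses made-up neighborhoods and counts; the two metrics are our choice for illustration, not a measure the article itself reports.

```python
from collections import Counter, defaultdict

# Hypothetical sample records: (neighborhood, species).
trees = (
    [("Millrow", "American Elm")] * 8
    + [("Millrow", "Blue Spruce"), ("Millrow", "Green Ash")]
    + [("Parkside", "American Elm")] * 4
    + [("Parkside", "Green Ash")] * 3
    + [("Parkside", "Bur Oak")] * 2
    + [("Parkside", "Blue Spruce")]
)


def diversity_summary(trees):
    """Summarize species richness and dominance for each neighborhood."""
    species_counts = defaultdict(Counter)
    for neighborhood, species in trees:
        species_counts[neighborhood][species] += 1

    summary = {}
    for neighborhood, counts in species_counts.items():
        total = sum(counts.values())
        dominant, dominant_count = counts.most_common(1)[0]
        summary[neighborhood] = {
            "richness": len(counts),                   # distinct species
            "dominant_species": dominant,              # most common species
            "dominant_share": dominant_count / total,  # fraction of all trees
        }
    return summary


summary = diversity_summary(trees)
for neighborhood, stats in summary.items():
    print(neighborhood, stats)
```

Both sample neighborhoods have ten trees, but one is dominated by a single species. Sorting by dominant share would put that neighborhood at the top of a planting priority list.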
Early in the process of using analytical methods to better understand the composition and distribution of a tree inventory, the natural impulse is to look for complementary data sets that can provide a fuller context for the impact of an urban forest on its community. While a causal connection between tree counts and economic or social variables is hard to establish, correlational data can still help to explain some of the variation in tree counts across neighborhoods that seem similar geographically.
For these types of insights, we use a process called regression analysis to identify the social and economic factors most closely related to the number of trees in each neighborhood. For Edmonton, we looked at neighborhood-level statistics for variables ranging from employment, to real estate, to school enrollment in order to find those topics that bore the closest relationship to the number of trees in each neighborhood.
The first level of analysis is looking at which variables correlate strongly with higher tree count (i.e. positive correlation) and which correlate with lower tree count (i.e. negative correlation). Distinctly positive variables included the number of duplex homes, single detached homes, and employed people in a given neighborhood, while distinctly negative variables included the number of row homes and whether or not the neighborhood was designated as a manufacturing zone.
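To make the idea of positive and negative correlation concrete, here is a minimal Pearson correlation sketch. The article does not say which correlation measure was used, so Pearson's r is an assumption, and the neighborhood figures are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-neighborhood figures (one entry per neighborhood).
tree_count = [120, 340, 560, 610, 900]
detached_homes = [50, 150, 260, 300, 420]
row_homes = [200, 160, 90, 110, 20]

r_detached = pearson(tree_count, detached_homes)
r_row = pearson(tree_count, row_homes)
print(f"tree count vs detached homes: r = {r_detached:.2f}")
print(f"tree count vs row homes:      r = {r_row:.2f}")
```

A value near +1 means the variable rises with tree count, a value near -1 means it falls as tree count rises, and a value near 0 means there is no linear relationship.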
Those relationships are relatively intuitive, but a second level of analysis, looking at the statistical significance of those relationships, provides slightly more insight. For instance, the number of homemakers in a neighborhood is only 50% significant for predicting the number of trees: a neighborhood with a high number of homemakers is just as likely to have a low number of trees as a high number of trees. Similarly, knowing the number of retirees and unemployed residents is not useful for predicting the number of trees in a given neighborhood.
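The article does not specify which significance test was used. As one possible approach, a permutation test estimates how often a correlation at least as strong as the observed one would arise if the variable had no relationship to tree count. The function names, sample data, and parameters below are illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the correlation between xs and ys."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point rounding.
        if abs(pearson(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: one predictor tracks tree count, the other does not.
tree_count = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
related = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
unrelated = [5, 1, 4, 2, 3, 3, 2, 4, 1, 5]

p_related = permutation_p_value(tree_count, related)
p_unrelated = permutation_p_value(tree_count, unrelated)
print(f"related predictor:   p = {p_related:.4f}")
print(f"unrelated predictor: p = {p_unrelated:.4f}")
```

A small p-value suggests the relationship is unlikely to be a coincidence. A large one means the variable tells you little about tree count, which is the situation described for homemakers and retirees.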
Most variables are poor indicators on their own, but we can combine them to get more descriptive models. In Edmonton’s case 40% of the variance in tree count across neighborhoods can be explained by just two variables: the number of employed people under the age of 30 and the number of row homes. The former is positively correlated with tree count, while the latter is negatively correlated.
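The sketch below shows how a two-variable model and its share of explained variance (R squared) can be computed with ordinary least squares. It is a plain-Python illustration under stated assumptions: the neighborhood figures are invented, and the article does not describe the software or exact model specification that was used.

```python
def fit_ols(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... using the normal equations."""
    X = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[a] * x[b] for x in X) for b in range(k)]
        + [sum(x[a] * yi for x, yi in zip(X, y))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(rows, y, coefs):
    """Fraction of the variance in y explained by the fitted model."""
    preds = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical per-neighborhood figures.
young_employed = [200, 450, 300, 800, 650, 500, 900, 350]
row_homes = [120, 60, 150, 20, 80, 40, 10, 100]
tree_count = [400, 700, 380, 1200, 820, 900, 1300, 560]

one_var = [(x,) for x in young_employed]
two_var = list(zip(young_employed, row_homes))

r2_one = r_squared(one_var, tree_count, fit_ols(one_var, tree_count))
r2_two = r_squared(two_var, tree_count, fit_ols(two_var, tree_count))
print(f"R^2 with one variable:  {r2_one:.3f}")
print(f"R^2 with two variables: {r2_two:.3f}")
```

An R squared of 0.40 corresponds to the article's statement that 40% of the variance in tree count is explained. For larger data sets, an established statistics library would be a more robust choice than hand-rolled elimination.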
Looking at more variables provides diminishing returns: our models that used three variables could account for 45% of the variation, four variables for 47%, five variables for 48%, and so on. The extremely complex systems interacting with a large tree inventory like Edmonton’s can’t be satisfactorily explained by the relationship of a few variables, but regression analysis can identify the auxiliary data sets that are most closely related to tree counts across the urban forest. In Edmonton’s case, employment numbers and real estate characteristics were closely tied to the number of trees in each neighborhood, while school enrollment data and counts of residents not in the labor force (homemakers, retirees, etc.) bore little relation to tree count.
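The diminishing returns can be read directly off the reported figures. The snippet below computes the gain from each added variable and, as an optional extra, the adjusted R squared, a standard correction that penalizes extra predictors. Adjusted R squared is not reported in the article, and the sample size of 375 is an assumption based on the number of neighborhoods in the inventory.

```python
# Explained variance reported in the article, keyed by number of variables.
reported_r2 = {2: 0.40, 3: 0.45, 4: 0.47, 5: 0.48}

# ASSUMPTION: all 375 neighborhoods were used in the regression.
# The article does not state the actual sample size.
n_neighborhoods = 375


def marginal_gains(r2_by_size):
    """Gain in R^2 from each additional variable."""
    sizes = sorted(r2_by_size)
    return {
        larger: round(r2_by_size[larger] - r2_by_size[smaller], 4)
        for smaller, larger in zip(sizes, sizes[1:])
    }


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


gains = marginal_gains(reported_r2)
for size, gain in gains.items():
    print(f"adding variable {size}: +{gain:.2f}")

for size, r2 in reported_r2.items():
    print(f"{size} variables: adjusted R^2 = {adjusted_r2(r2, n_neighborhoods, size):.3f}")
```

Each added variable contributes less than the one before it (5, then 2, then 1 percentage points), which is the pattern the paragraph above describes.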
Simply knowing the location and species of trees within an inventory can yield worthwhile insights into the strengths and weaknesses of an urban forest. Understanding more about the composition and distribution of trees throughout a city can influence the way resources are allocated, projects are prioritized, and overall risk is assessed. And by combining tree inventory data with other open data sets, analysis can yield richer, more illustrative aspects of the impact an urban forest has on its community.
Tyler Dahlberg, Geospatial Solution Specialist with the Data Analytics Team, performed all of the analysis for this article. If you’re interested in conducting your own tree inventory analysis, please do not hesitate to reach out to the OpenTreeMap team.
Note: Edmonton’s neighborhoods are standardized geographical territories recognized by the city and used as census tracts.
Featured image courtesy of WinterforceMedia