Making good predictions

Authors: Poliana Mendes, Amanda Schwantes

Introduction

We often assume that the relationship between two variables is linear. However, a relationship that looks linear at one spatial extent may turn out to be non-linear when the extent of the study is increased or reduced. This matters because we often use data from one area to make predictions for other areas. If we assume the relationship is linear when it is not, our predictions can be badly off.

Case study

In this case study, we explore the relationship between soil carbon and plant cover in vacant lots in Quebec City. The vacant lots were carefully chosen to cover the full range of plant cover. However, a scientist who sampled only a few of these lots (like the green dots on the two left-hand figures) might mistakenly conclude that the relationship between soil carbon and plant cover is linear (as shown by the green line). This could lead to incorrect predictions for the entire area: when we look at data from the whole study region (shown on the right), we see that the relationship is actually non-linear (the gray line). This non-linear relationship could arise because, in areas with high plant cover, other factors such as soil type or land use history may play a role in explaining the carbon levels.

Locations of sampled vacant lots.
Linear relationship observed when only a subset of the vacant lots is considered (green dots).
Non-linear relationship when all vacant lots are considered. Plant cover (%) can be greater than 100% when there is more than one layer of vegetation (for example trees and herbs).
Data source: Mendes P, Bourgeois B, Pellerin S, Ziter CD, Cimon-Morin J, Poulin M. Linkages between plant functional diversity and soil-based ecosystem services in urban and peri-urban vacant lots. Urban Ecosystems. doi:10.1007/s11252-023-01470-5
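
The effect described in this case study can be reproduced with a minimal sketch. It uses synthetic data only, not the data from the study: the saturating curve, the noise level, the 60% cut-off for the subset and the choice of a quadratic as the non-linear model are all assumptions made for illustration. The sketch fits a straight line to the low-cover subset, fits a curve to the full range, and compares how well each predicts the lots with the highest plant cover.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the case study (NOT the published data):
# a saturating curve plus noise. All numbers here are assumptions.
cover = np.linspace(0, 200, 60)  # plant cover (%), may exceed 100
carbon = 8 * (1 - np.exp(-cover / 50)) + rng.normal(0, 0.3, cover.size)

# A "scientist" who only samples lots with low plant cover.
subset = cover <= 60
linear_fit = np.polyfit(cover[subset], carbon[subset], deg=1)

# A non-linear (here: quadratic) fit using the full range of plant cover.
quadratic_fit = np.polyfit(cover, carbon, deg=2)

# Compare predictions for the lots with the highest plant cover.
high = cover > 150
err_lin = np.abs(np.polyval(linear_fit, cover[high]) - carbon[high]).mean()
err_quad = np.abs(np.polyval(quadratic_fit, cover[high]) - carbon[high]).mean()

print(f"Mean absolute error, line fitted to subset:  {err_lin:.2f}")
print(f"Mean absolute error, curve fitted to all:    {err_quad:.2f}")
```

In this synthetic setting the line fitted to the subset keeps rising beyond the range it was fitted on, so its error at high plant cover is larger than that of the curve fitted to all lots. With real data, the appropriate non-linear model would need to be chosen and checked case by case.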

Best practices and opportunities

  1. Always choose the spatial extent that makes the most sense for the question you are asking.

  2. When making predictions for a new area, check whether its environmental conditions are similar to those of the original study area. If they are not, consider whether the relationship between variables might be non-linear.

  3. Seasonal changes can also create non-linear relationships (for example, between temperature and time). To handle this, it may be necessary to collect data at the same time each year, or more often throughout the year.
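
One simple way to act on the second point is to check how much of the new area falls outside the range of conditions sampled in the original study. The sketch below is only one possible check, and a deliberately crude one: the function name `fraction_outside_range` and the example values are hypothetical, and it looks at a single variable at a time.

```python
import numpy as np

def fraction_outside_range(original_values, new_values):
    """Share of new-area values lying outside the min-max range
    observed in the original study area (hypothetical helper)."""
    original = np.asarray(original_values, dtype=float)
    new = np.asarray(new_values, dtype=float)
    low, high = original.min(), original.max()
    return float(np.mean((new < low) | (new > high)))

# Made-up plant cover values (%), for illustration only.
original_cover = [5, 20, 35, 50, 60]
new_area_cover = [10, 40, 80, 120]

share = fraction_outside_range(original_cover, new_area_cover)
print(f"Share of new sites outside the sampled range: {share:.0%}")
```

A large share suggests that predictions for the new area would rely on extrapolation, which is where an assumed linear relationship is most likely to fail.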