Testing the ability of species distribution models to infer variable importance

Adam B. Smith, Maria J. Santos

2019

Identifying biophysical factors that define species' niches and influence geographical ranges is a fundamental pursuit of ecology. Frequently, models of species' distributions or niches are used to infer the importance of range- and niche-defining variables. However, very few studies, if any, examine how reliably distribution and niche models can be used for inference. Here we use a simulation approach to understand the conditions under which species distribution models reliably measure variable importance. Using a set of scenarios of increasing complexity, we explore how well models can be used to 1) discriminate between variables that vary in importance and 2) calibrate the effect of variables relative to an "omniscient" model used to simulate the species. Variable importance was assessed using a sensitivity test in which each predictor was permuted in turn. Importance was inferred by comparing model performance between permuted and unpermuted predictions and by calculating the correlation between permuted and unpermuted predictions. Of five metrics of importance (correlation statistic and AUC each calculated with presences/absences or presences/background sites, plus the Continuous Boyce Index), only the Continuous Boyce Index was capable of indicating absolute (versus relative) variable importance. In simple scenarios with one influential environmental variable with a linear spatial gradient and one uninfluential randomly-distributed variable, models were unable to discriminate reliably between variables under conditions that are typically challenging (low sample size, high prevalence, small spatial extent, coarse spatial data resolution with low spatial autocorrelation, and high collinearity between variables).
In more complex scenarios with two influential environmental variables, models successfully discriminated between variables when they acted unequally, but overestimated the importance of the stronger variable and underestimated the importance of the weaker variable. When variables had equal influence, models underestimated importance when niche breadth was narrow. Generalized additive models and Maxent had better discrimination accuracy than boosted regression trees. Our work demonstrates that permutation tests can reliably discriminate between variables with different levels of influence but cannot accurately measure the magnitude of influence. The frequency with which distribution and niche models are used to identify influential variables calls for further research into methods for assessing variable importance.
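
The permutation sensitivity test described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a hypothetical simple scenario: one influential variable on a linear gradient (`x1`), one uninfluential random variable (`x2`), and a known logistic response standing in for the "omniscient" model. The function names, the slope of 4, the sample size, and the number of permutations are all illustrative assumptions. Only two of the abstract's metrics are shown (the correlation statistic and AUC, both with presences/absences); the Continuous Boyce Index is omitted.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def auc(pos_scores, neg_scores):
    """AUC as the chance a presence outscores an absence (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def permutation_importance(predict, X, y, n_perm=10, seed=0):
    """Permute each predictor in turn and compare with unpermuted predictions.

    Returns one dict per predictor with the mean correlation between
    permuted and unpermuted predictions and the mean drop in AUC.
    """
    rng = random.Random(seed)
    base = [predict(row) for row in X]
    pres = [i for i, v in enumerate(y) if v == 1]
    absn = [i for i, v in enumerate(y) if v == 0]
    base_auc = auc([base[i] for i in pres], [base[i] for i in absn])

    results = []
    for j in range(len(X[0])):
        cors, aucs = [], []
        for _ in range(n_perm):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between predictor j and the sites
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            pred = [predict(row) for row in permuted]
            cors.append(pearson(base, pred))
            aucs.append(auc([pred[i] for i in pres], [pred[i] for i in absn]))
        results.append({
            "cor": sum(cors) / n_perm,
            "auc_drop": base_auc - sum(aucs) / n_perm,
        })
    return results


# Hypothetical simulated landscape: x1 drives occurrence, x2 is noise.
rng = random.Random(42)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(300)]


def omniscient(row):
    # Assumed known response: depends on x1 only (slope 4 is arbitrary).
    return logistic(4.0 * row[0])


# Presence (1) or absence (0) drawn from the known probability of occurrence.
y = [1 if rng.random() < omniscient(row) else 0 for row in X]

importance = permutation_importance(omniscient, X, y)
for name, res in zip(["x1", "x2"], importance):
    print(name, res)
```

Under this reading, a lower correlation or a larger AUC drop indicates a more important variable. Permuting `x2` leaves the predictions unchanged because the assumed model ignores it. In practice `predict` would be a fitted distribution model (for example a GAM, Maxent, or boosted regression trees) rather than the known response.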
