Distribution modelling

Development of the habitat suitability models

We used the geographic range and habitat preference information from the IUCN Red List of Threatened Species as a baseline for developing habitat suitability models for 5029 (94.32%) of the 5332 known non-extinct (sensu IUCN Red List Category) terrestrial mammals, including those of coastal and flooded habitats. For 286 species (5.36%) we did not develop a habitat model, either because the literature lacked information on the species’ habitat preferences or because the species’ geographic range was critically small (below 100 km²). For the remaining 17 species (0.32%), no spatial information was available on the Extent of Occurrence (EOO). For species without a defined habitat model, we considered the EOO to be entirely suitable.

For each species we developed a habitat suitability model at 300 m resolution, limited to the species’ known geographic range to avoid predicting presence beyond its distribution limits. We focused on the following environmental variables: type of land cover, human impact, elevation, and hydrological features. Land cover type and human impact were both mapped using GlobCover version 2.1, a global 300 m resolution map whose legend is based on the standard UN Land Cover Classification System (LCCS). The advantage of using an LCCS-based land cover map is that habitat preferences assessed against its legend can easily be applied to other LCCS-based maps in the future. The elevation map was produced by resampling the SRTM elevation data, at 3 arc-second resolution (approximately 90 m at the equator), to 300 m. The map of water bodies was produced by merging two sources: a 300 m wide buffer around class 210 (water) of GlobCover for polygonal water bodies (lakes and large rivers), and the Vmap0 linear permanent waters map (converted to a raster at 300 m resolution) for linear water bodies.

When known and recorded in the Red List, the elevation range where a species is found is expressed as a minimum-maximum range in meters and is therefore readily available for the analysis. The rest of the information on habitat preferences, including the preferred types of land cover, tolerance to modified habitat, and relationship to aquatic habitats, is in the form of a textual description. To make it usable for the analysis, we extracted this information in two steps. First, we assigned each species to one or more broad habitat types (forest, shrubland, grassland, bare, artificial, flooded) and intersected this information with the species’ level of tolerance to human-modified (degraded or mosaic) habitat types, to generate an automated classification of the land cover map. Second, where detailed information on habitat preference was available, we manually modified the suitability of individual land cover classes if necessary. In addition, we recorded whether the species’ distribution should be restricted to within a small distance of water bodies.
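The two-step extraction can be illustrated with a minimal Python sketch. It is not the procedure actually implemented for the models: the land cover class names are placeholders rather than the GlobCover legend, and the rule that modified classes become "medium" for tolerant species and "unsuitable" otherwise is an assumption made for illustration. The function names `classify_land_cover` and `apply_manual_overrides` are likewise hypothetical.

```python
# Hypothetical lookup: land cover class -> (broad habitat type, is_modified).
# The class names are placeholders, NOT the actual GlobCover legend.
LAND_COVER_CLASSES = {
    "closed_forest": ("forest", False),
    "forest_cropland_mosaic": ("forest", True),
    "shrubland": ("shrubland", False),
    "grassland": ("grassland", False),
    "grassland_cropland_mosaic": ("grassland", True),
    "bare_areas": ("bare", False),
    "urban": ("artificial", False),
    "flooded_forest": ("flooded", False),
}


def classify_land_cover(habitat_types, tolerates_modified):
    """Step 1: automated suitability of every land cover class for one species.

    Assumed rule (for illustration only): unmodified classes of a preferred
    habitat type are 'high'; modified classes are 'medium' if the species
    tolerates modified habitat, otherwise 'unsuitable'.
    """
    suitability = {}
    for lc_class, (habitat, modified) in LAND_COVER_CLASSES.items():
        if habitat not in habitat_types:
            suitability[lc_class] = "unsuitable"
        elif not modified:
            suitability[lc_class] = "high"
        elif tolerates_modified:
            suitability[lc_class] = "medium"
        else:
            suitability[lc_class] = "unsuitable"
    return suitability


def apply_manual_overrides(suitability, overrides):
    """Step 2: manual adjustments based on detailed habitat information."""
    adjusted = dict(suitability)
    adjusted.update(overrides)
    return adjusted


# Example: a forest species that tolerates modified habitat, with one
# manual correction for shrubland.
auto = classify_land_cover({"forest"}, tolerates_modified=True)
final = apply_manual_overrides(auto, {"shrubland": "medium"})
```

Keeping the automated step separate from the manual overrides means the per-species corrections remain visible as a small, reviewable set of exceptions.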

We defined three levels of suitability for the land cover: high, corresponding to the primary habitat of a species, i.e. habitat where a species can live; medium, corresponding to secondary habitat, i.e. habitat where a species can be found (e.g. if primary habitat is present nearby); and unsuitable, where the species is not expected to be found. All cells in the model inside the elevation range of the species retained the suitability score assigned to their land cover, while all other cells were classified as unsuitable. In addition, for species whose distribution is restricted to within a small distance of water bodies, all cells outside the map of water bodies were classified as unsuitable. Models were developed using the free software GRASS GIS.
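The per-cell rule described above can be written as a short Python sketch. It is an illustration only, not the GRASS GIS workflow used for the models. The function name `cell_suitability` and its parameters are hypothetical. Treating the elevation bounds as inclusive, and skipping the elevation filter when no range is recorded, are both assumptions.

```python
def cell_suitability(land_cover_score, elevation_m, elev_range,
                     in_water_map, water_restricted):
    """Combine land cover, elevation and water constraints for one cell.

    land_cover_score: 'high', 'medium' or 'unsuitable' (from the land cover map)
    elev_range: (min_m, max_m), or None when no range is recorded (assumption:
                no elevation filter is applied in that case)
    in_water_map: True if the cell lies inside the map of water bodies
    water_restricted: True if the species is restricted to the vicinity of water
    """
    if elev_range is not None:
        min_m, max_m = elev_range
        # Assumed inclusive bounds.
        if not (min_m <= elevation_m <= max_m):
            return "unsuitable"
    if water_restricted and not in_water_map:
        return "unsuitable"
    return land_cover_score


# Example: a high-suitability land cover cell above the species' upper limit.
score = cell_suitability("high", 2500, (0, 2000),
                         in_water_map=False, water_restricted=False)
```

The elevation and water constraints act only as masks: they can downgrade a cell to unsuitable but never raise the score assigned by the land cover.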

Model evaluation

For a subset of 265 species, point locality data were available to evaluate the predictive power of the models. These data were derived from three datasets. (1) The validation data for the African Mammals Databank were collected at 100 random points in each of four countries (Morocco, Cameroon, Uganda, Botswana), for a total of 400 points, and consist of lists of species known to be present within a 1 km radius of each point (either by direct observation or through interviews with residents and local wildlife professionals). (2) The validation data for the Southeast Asian Mammals Databank include lists of species with the same characteristics as the African Mammals Databank data, collected in Thailand, Vietnam, Borneo, and the Philippines, and a set of occasional (non-random) occurrences derived from various sources. (3) Data extracted from the Global Biodiversity Information Facility. This third dataset differs substantially from the first two because it contains occasional data of varying provenance and age. The subset chosen for the evaluation of mammal models included data collected in the last 20 years (1989-2009) with a nominal positional error smaller than 1 km (in the subsequent analysis, the positional error for these points was set to 1 km). For all three datasets, only species with at least five separate occurrences and five separate non-occurrences (i.e. in different 1 km² cells) were considered for model evaluation.
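The final selection criterion can be sketched in Python as follows. The record layout (`species`, `cell_id`, `present`) and the function name `eligible_species` are hypothetical, and the sketch assumes every record has already been assigned to a 1 km² cell. Records falling in the same cell are counted once.

```python
def eligible_species(records, min_cells=5):
    """Return species with enough distinct presence and absence cells.

    records: iterable of (species, cell_id, present) tuples, where cell_id
    identifies a 1 km² cell and present is True for an occurrence and
    False for a non-occurrence.
    """
    presence_cells = {}
    absence_cells = {}
    for species, cell_id, present in records:
        target = presence_cells if present else absence_cells
        # A set ensures that repeated records in one cell count once.
        target.setdefault(species, set()).add(cell_id)
    return sorted(
        sp for sp in presence_cells
        if len(presence_cells[sp]) >= min_cells
        and len(absence_cells.get(sp, ())) >= min_cells
    )


# Example: species "A" has five distinct cells of each kind; species "B"
# has five presence records, but all in the same cell.
example_records = (
    [("A", i, True) for i in range(5)]
    + [("A", i, False) for i in range(5, 10)]
    + [("B", 0, True)] * 5
    + [("B", i, False) for i in range(1, 6)]
)
selected = eligible_species(example_records)
```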

Data structure and availability

Read the full publication