Understanding the suitability modeling workflow—Analytics

Suitability modeling is the most common application of the ArcGIS Spatial Analyst extension and can solve a variety of problems:

Where to site a new housing development
Where to locate a new ski area
Where to deploy troops in a military operation
Where to locate firefighting crews to best fight fires in the dry season
For this case study, which sites are best for bobcat habitat

This case study demonstrates the workflow for creating a suitability model by identifying the best areas to conserve for bobcat in Vermont, USA. The focus is the workflow itself: while bobcat habitat is used to demonstrate it, the steps can be applied to most other suitability modeling scenarios. The data and the workflow discussed in this case study are real, but the actual problem and the analysis were developed for demonstration purposes. We encourage you to refine the general principles and criteria presented in the model with your specific knowledge.

Your problem statement

You are a director in the Vermont Department of Natural Resources and you need to conserve wildlife populations that are being threatened by habitat fragmentation. New roads, housing developments, and timber harvesting have broken large tracts of contiguous forest into isolated patches that are too small for many forest-dwelling animals. To slow the rate of fragmentation, the department of natural resources needs to prioritize which lands to protect and design wildlife corridors to connect them.

Connecting isolated patches of habitat is important to maintain biodiversity and to allow individuals from different populations to create a robust metapopulation (a group of populations of the same species separated by space but with some interaction between the populations). This connectivity allows spatially discrete subpopulations to interact with each other, which encourages genetic diversity, and allows the species to recolonize a patch if the local population goes extinct.

Patches of populations making a metapopulation

Since you cannot individually consider each wildlife species, you opt to design the conservation plan around an umbrella species. An umbrella species is a species generally high on the food chain which requires large areas of intact habitat. In this particular area, bobcat (Lynx rufus) will be your umbrella species.

By focusing your conservation efforts on protecting an umbrella species, you will indirectly protect other species with similar habitat needs. As the name implies, the bobcat acts as an umbrella for protecting other species with similar requirements.

The study site

Vermont is a small, rural state, with the Green Mountains running north-south through the center. Forest vegetation, mostly maple/hickory and spruce/fir tree types, dominates the land cover, followed by agricultural fields.

The suitability modeling workflow

To identify the best bobcat patches to conserve, you will use a suitability model. The suitability modeling workflow comprises six steps:

Define the problem
Identify and derive the criteria
Transform values to a common scale
Weight the criteria relative to one another and combine
Locate the phenomenon
Analyze the results

This case study walks you through each step, and the steps are implemented in a ModelBuilder model.

The bobcat ModelBuilder model — The bobcat ModelBuilder suitability model that you will create in this case study.

Step 1: Define the problem

The following steps can help with defining the problem the suitability model will address:

Identify the goal
Establish how to evaluate the model
Create submodels

Identify the goal

The goal of the model should be specific, overarching, and measurable:

The goal of locating a ski area may be to make money.
The goal of deploying troops may be to keep the troops safe while placing them in the most strategic locations for military advantage.
For this bobcat model, the goal is to maintain a viable population of bobcats for the next 100 years.

Establish how to evaluate the model

Suitability models should be evaluated relative to the overarching goal to determine if the model is successful:

Does a ski area make money?
Do military troops successfully complete a mission with few casualties?
Is a minimum viable population of 20 mating pairs of bobcat maintained?

Create submodels

Submodels help simplify the problem and clarify the relationships of the criteria within the model. Each submodel contributes to the overarching goal of the suitability model:

A ski resort may include Terrain, Accessibility, and Cost of Development as submodels.
Troop deployment could be simplified into Safety, Accessibility, and Proximity to Hostile Forces.
This bobcat suitability model identifies three submodels:
- Habitat - Identifies the most preferred habitat for bobcat to live within.
- Food - Identifies the most likely areas bobcat may find suitable food.
- Security - Since the bobcat is an interior species and generally avoids human activity, this submodel identifies the least human-impacted areas.

Submodels for the bobcat ModelBuilder model — The three submodels for the bobcat suitability model.

Step 2: Identify and derive the criteria

Each submodel contains criteria relevant to the goal of the submodel. For our bobcat model:

The Habitat submodel has three criteria that bobcat respond to when selecting habitat: shelter (with forestland being the most preferred), access to water, and terrain features (with steeper slopes being preferred).
The Food submodel has a single criterion, access to the maximum amount of food: forestland and grassland are preferred because they provide suitable habitat for prey, and access to the prey itself is also preferred.
The Security submodel focuses on distance from houses, roads, and human development.

These criteria can be summarized and organized in a table before use in an ArcGIS ModelBuilder model.


Submodel	Goal	Preferred criteria	Datasets
Habitat	Maximum shelter	Forestland	Land use
Habitat	Proximal to streams	Access to water	Streams
Habitat	Terrain features	Ledges and cliffs	Slope
Food	Maximum prey	Prey has suitable habitat, access to prey	Land use, deer yards
Security	Minimize human interaction	The farther away from roads and buildings the better	Roads, building locations

Tables help outline how datasets, criteria, and submodel goals are connected. These relationships can be realized in a ModelBuilder model.
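Before building the model, the same relationships can be written down as a small data structure. The following is a minimal sketch in Python; the dictionary layout and the helper name datasets_needed are assumptions made for illustration and are not part of ModelBuilder.

```python
# Submodels -> goals -> datasets, mirroring the criteria table.
# The dictionary layout and key names are illustrative assumptions.
CRITERIA = {
    "Habitat": {
        "Maximum shelter": ["Land use"],
        "Proximal to streams": ["Streams"],
        "Terrain features": ["Slope"],
    },
    "Food": {
        "Maximum prey": ["Land use", "Deer yards"],
    },
    "Security": {
        "Minimize human interaction": ["Roads", "Building locations"],
    },
}


def datasets_needed(criteria):
    """Return the sorted, de-duplicated list of datasets the model needs."""
    return sorted({dataset
                   for goals in criteria.values()
                   for datasets in goals.values()
                   for dataset in datasets})


print(datasets_needed(CRITERIA))
```

Listing the datasets this way makes it easy to see that land use feeds more than one submodel.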

Bobcat suitability ModelBuilder model — Criteria are sorted into submodels within ModelBuilder following the organization of the table.

Some of the criteria can be used directly in the model (for example, the land use for the shelter criterion for the Habitat submodel). Other criteria must be derived from the base data (for example, the Euclidean distance to streams must be determined from the streams base data for the Habitat submodel). Each criterion should be relevant to the goal of its submodel; irrelevant criteria should not be included.

The Euclidean Distance tool dialog box to derive the distance from streams criterion.

Streams on the resulting distance from streams layer — The Euclidean distance from streams layer will be used to capture the bobcats' preference for locations closer to streams.
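To make the idea of a derived criterion concrete, here is a minimal sketch of a straight-line distance surface on a toy grid. It is a brute-force illustration only, not how the Euclidean Distance tool is implemented; the function name, the stream cells, and the 30-meter cell size are assumptions.

```python
import math


def euclidean_distance_grid(sources, rows, cols, cell_size=1.0):
    """Distance from each cell to the nearest source cell.

    sources: list of (row, col) cells that contain a stream.
    Brute force, suitable for a toy grid only.
    """
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            nearest = min(math.hypot(r - sr, c - sc) for sr, sc in sources)
            row.append(nearest * cell_size)
        grid.append(row)
    return grid


# Hypothetical stream running down the first column, 30 m cells.
stream_cells = [(0, 0), (1, 0), (2, 0)]
dist = euclidean_distance_grid(stream_cells, rows=3, cols=4, cell_size=30.0)
```

Each value in dist is a distance in meters, which is why it still has to be transformed before it can be combined with land use or slope.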

Step 3: Transform values

The values of the input criteria, whether base data or derived data, are relative to the criteria they represent and not relative to one another. For example, a 3 may indicate single-family residential housing on a land use map, 100 may represent meters from an existing stream, and 10 may represent a slope at a particular location. Adding these three criteria together results in 113 for the location, a meaningless value especially with respect to the phenomenon being modeled.

The original base or derived input values must be transformed to a common ratio or interval preference scale so the criteria can be compared. That is, the input land use categories, distance from streams, and slope values must be transformed to represent the phenomenon's preference (in our case, the bobcat's) for each criterion value, relative to its contribution to the goal of the submodel (in this case, the Habitat submodel).

We will use a 1 to 10 scale, but other common scales include 0 to 1, 1 to 9, and 1 to 100. In the bobcat Habitat submodel, steeper slopes are assigned a ten (most preferred on a scale of one to ten) and flatter land is assigned a lesser value (for example, a one or two).
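As a minimal sketch of placing slope on the 1 to 10 scale, the following uses a simple linear mapping. The function name, the sample slope values, and the 0 to 45 degree input range are assumptions for illustration; the actual transformation may use a different function.

```python
def linear_rescale(value, in_min, in_max, out_min=1.0, out_max=10.0):
    """Map value linearly from [in_min, in_max] onto [out_min, out_max]."""
    if in_max == in_min:
        raise ValueError("input range must not be empty")
    fraction = (value - in_min) / (in_max - in_min)
    fraction = min(1.0, max(0.0, fraction))  # clamp values outside the range
    return out_min + fraction * (out_max - out_min)


slopes = [0.0, 15.0, 30.0, 45.0]  # hypothetical slope values in degrees
prefs = [linear_rescale(s, 0.0, 45.0) for s in slopes]
# Flat land maps to 1 and the steepest slope maps to 10.
```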

Transformed slope layer — Slope values, derived from the elevation dataset, are transformed to a scale from 1 (red) to 10 (green) with 10 being the most preferred.

The other two criteria for the Habitat submodel, Euclidean distance from streams and land use, are transformed onto the same preference scale. This process continues for each criterion in each of the Habitat, Food, and Security submodels. For additional information on types of numbers and preference scales see the Transforming data onto a common preference scale story map.

The two main tools that are designed to transform base and derived data to a common scale are Reclassify and Rescale by Function.

Tools to transform values: Reclassify versus Rescale by Function

The Reclassify tool is generally used to transform categorical input data to a common preference or suitability scale (for example, 1 to 10).

Reclassify dialog box for weighting land use — The Reclassify tool dialog box weighting the different land use types onto a common 1 to 10 preference scale.

Transformed land use layer — Reclassified land use for the Habitat submodel. The green areas are more preferred, with the red areas being least preferred.
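Conceptually, reclassifying categorical data is a lookup from category to preference value. The sketch below shows that idea in plain Python; the land use categories and the preference values assigned to them are hypothetical and are not the values used in the bobcat model.

```python
# Hypothetical land use categories and preference values
# (1 = least preferred, 10 = most preferred).
LAND_USE_PREFERENCE = {
    "forest": 10,
    "grassland": 6,
    "agriculture": 4,
    "residential": 1,
}


def reclassify(cells, remap, nodata=None):
    """Replace each category with its preference; unknown categories become nodata."""
    return [[remap.get(cell, nodata) for cell in row] for row in cells]


land_use = [
    ["forest", "forest", "grassland"],
    ["agriculture", "residential", "water"],
]
habitat_land_use = reclassify(land_use, LAND_USE_PREFERENCE)
```

Categories without an entry in the lookup (here, water) are left as nodata rather than given a guessed preference.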

The Rescale by Function tool is generally used to transform continuous input data to the common preference scale using a continuous mathematical function. The mathematical functions can be either linear or nonlinear (for example, Exponential decay).

The Rescale by Function tool dialog box being used to transform the distance from streams onto a common scale. The closer locations are much more preferred.

Transformed Euclidean distance from streams layer — Rescaled Euclidean distance from streams for the Habitat submodel. The green areas are more preferred with the red areas being least preferred.
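The following is a minimal sketch of a nonlinear, continuous transformation in which preference decays exponentially with distance from a stream. It does not reproduce the exact formula used by the Rescale by Function tool; the function name and the 200-meter decay length are assumptions.

```python
import math


def decay_preference(distance_m, decay_length_m=200.0, low=1.0, high=10.0):
    """Preference falls from `high` at the stream toward `low` far away."""
    return low + (high - low) * math.exp(-distance_m / decay_length_m)


for d in (0, 100, 500, 2000):
    print(d, round(decay_preference(d), 2))
```

Because the function is continuous, every additional meter from the stream lowers the preference slightly, and the result is a floating-point value rather than one of a few integer classes.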

The Reclassify tool produces integer output (for example, 1, 2, …, 10), while the Rescale by Function tool produces floating-point output (for example, 1.1, 1.2, …, 9.9). With Rescale by Function, the preferences change continuously with each change of the input value. For your bobcat Habitat submodel, this means that the locations become less preferred with each step the bobcat takes away from a stream. Other examples of using Rescale by Function include:

Within a ski resort Terrain submodel, Rescale by Function may be used on the slope criterion to emphasize preference for steeper slopes.
The Safety submodel for troop deployment may place much greater preference on areas with low visibility.
In the case of the bobcat Security submodel, farther distances from roads may receive a much higher preference.

For additional information see the Reclassify and Rescale by Function story map.

Many times, when transforming continuous input data (for example, distance from streams), the lowest to the highest values within the study site are transformed to the lowest to highest values in the preference scale (or vice versa). This type of transformation is data dependent. With a greater understanding of the phenomenon, you may implement a data independent transformation, basing the transformation on criteria values whether or not they occur in the study area.
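The difference between the two approaches can be sketched with a single rescaling function that is given two different input ranges. The distances and the 0 to 2000 meter thresholds below are assumptions chosen for illustration.

```python
def rescale_closer_is_better(value, low_in, high_in, low_out=1.0, high_out=10.0):
    """Linear rescale with clamping; smaller distances score higher."""
    fraction = (value - low_in) / (high_in - low_in)
    fraction = min(1.0, max(0.0, fraction))
    return high_out - fraction * (high_out - low_out)


distances = [50.0, 400.0, 900.0]  # hypothetical distances found in the study area

# Data dependent: the input range comes from the values present in the study area.
dependent = [rescale_closer_is_better(d, min(distances), max(distances))
             for d in distances]

# Data independent: the input range comes from knowledge of the species.
# The 0 m and 2000 m thresholds are assumptions for illustration.
independent = [rescale_closer_is_better(d, 0.0, 2000.0) for d in distances]
```

In the data dependent version, the farthest cell in the study area always receives a 1, even if 900 meters is still acceptable to the species. The data independent version scores the same cell by what the distance means to the phenomenon.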

For additional information see the Data dependent and data independent suitability modeling story map.

At this point in the workflow, each input criterion has been identified and placed on a common scale and the criteria and submodels can now be combined.

Step 4: Weight and combine

Certain criteria in a submodel, or a submodel itself, may be more significant than others. They should therefore receive higher weights before the criteria within a submodel, or the submodels, are combined. For example:

For a ski area model, slope may be weighted higher than aspect within the Terrain submodel.
For a troop deployment model, the Safety submodel may be more important than the other submodels. Therefore, it should receive a higher weight.
The criteria in the bobcat Habitat submodel have the following weights:
- Land use: 2
- Slope: 1
- Distance from streams: 1
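Using the Habitat weights listed above (land use 2, slope 1, distance from streams 1), the combination can be sketched as a cell-by-cell weighted sum. This is a plain Python illustration, not the Weighted Sum tool itself, and the 2 by 2 layers of transformed values are hypothetical.

```python
def weighted_sum(layers, weights):
    """Cell-by-cell sum of layers, each multiplied by its weight."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [[sum(w * layer[r][c] for layer, w in zip(layers, weights))
             for c in range(cols)]
            for r in range(rows)]


# Hypothetical transformed layers on the 1 to 10 scale (2 x 2 cells).
land_use = [[10, 6], [4, 1]]
slope = [[8, 3], [5, 2]]
streams = [[9.5, 7.0], [2.5, 1.0]]

habitat = weighted_sum([land_use, slope, streams], weights=[2, 1, 1])
```

With these weights, a one-point change in the land use preference moves the Habitat score twice as much as a one-point change in slope or distance from streams.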

Weighted Sum tool in ModelBuilder to combine the habitat criteria — The Weighted Sum tool in the ModelBuilder model to combine the criteria into a final Habitat submodel surface.

Weighted Sum tool dialog box to combine the habitat submodel criteria — The Weighted Sum tool dialog box weighting the different input criteria within the Habitat submodel.

Habitat submodel output layer — The resulting Habitat submodel surface (green indicates favorable habitat whereas red indicates less suitable habitat).

Two tools can be used to weight and combine input criteria within submodels and submodels within the overall suitability model: Weighted Overlay and Weighted Sum.

Tools to weight criteria and submodels: Weighted Overlay and Weighted Sum

The Weighted Overlay tool accepts integer values and allows you to reclassify and weight the input base data, derived data, or submodels.

The Weighted Sum tool requires the input base and derived criteria or submodels to be previously transformed onto the common preference scale. The input can then be weighted and combined by the tool.
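The contrast between the two approaches can be sketched for a single cell. This sketch assumes that Weighted Overlay expresses weights as percentages that total 100 and returns an integer result; the rounding step, the function names, and the cell values are assumptions for illustration.

```python
def weighted_overlay_cell(values, percent_influence):
    """Overlay-style combination: percentages must total 100, result is rounded."""
    if sum(percent_influence) != 100:
        raise ValueError("percent influence must total 100")
    total = sum(v * p / 100.0 for v, p in zip(values, percent_influence))
    return int(round(total))


def weighted_sum_cell(values, weights):
    """Sum-style combination: any weights, result kept without rounding."""
    return sum(v * w for v, w in zip(values, weights))


cell = [10, 8, 9]  # hypothetical land use, slope, and stream preferences
overlay = weighted_overlay_cell(cell, [50, 25, 25])
summed = weighted_sum_cell(cell, [2, 1, 1])
```

The overlay-style result stays on the original preference scale, while the sum-style result grows with the weights and keeps any fractional detail in the inputs.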

Weighted Sum tool in ModelBuilder to combine the submodels — The Weighted Sum tool was first used to combine the criteria for each of the submodels and is used here to combine the three submodels to create the final bobcat suitability surface.

Weighted Sum tool dialog box to combine the submodels — The Weighted Sum tool dialog box to weight and combine the three submodels. In this landscape, Security is deemed more important than the Habitat and Food submodels.

At this stage of the workflow, each of the criteria has been transformed to a common scale, which allows meaningful values to be produced when the criteria are combined. The transformed criteria within each submodel were weighted and combined using the Weighted Sum tool. The three submodels (Habitat, Food, and Security) were then combined, also using the Weighted Sum tool, to produce the final suitability surface.

By adding the criteria and submodels together, the higher values on the resulting surface represent more preferable locations. The final suitability for each location is based on the tradeoff of the preferences of the goals represented by each submodel. In your case, locations with the highest values have the best habitat, the most food, and the greatest security. The final suitability surface identifies the preference for each location relative to the others.

Final suitability surface — The final suitability surface. The green areas are more preferred with the red areas being least preferred.

Step 5: Locate

The point of the suitability model is to identify the best locations for siting or conserving the phenomenon (in your case, bobcat). Generally, the most suitable locations (locations with the highest preference values) are not contiguous.

Not only does a phenomenon have preferences for the attributes at each location, it also has spatial requirements in which to function most effectively. In many cases, you will have knowledge of the size, number, and spatial relationships of the regions that you are looking for. The phenomenon often needs the resulting regions to be contiguous. For example:

A ski area may need 1000 hectares and those hectares need to be contiguous and contained in the most compact shape possible.
Troops may need to be distributed between five locations with no two regions closer than 2 kilometers and not farther than 10 kilometers.
The bobcat model seeks 6 habitat patches. Each patch must be 4 contiguous square miles and the habitat patches must be spread throughout the study area.

In most cases, there is a tradeoff between the suitability values, maintaining contiguity, and a desired shape.

Tool to locate phenomena: Locate Regions

The Locate Regions tool identifies the contiguous areas with the highest overall preference given the specified area and number of regions, allowing for control over the shape and orientation of the regions. Spatial constraints such as the minimum and maximum distance between regions can also be controlled. The final locations will be the best contiguous regions of a specified size meeting the desired spatial constraints.
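To illustrate why contiguity matters, the following sketch searches a toy suitability surface for the square block of cells with the highest total suitability. It is a deliberate simplification and not the algorithm used by the Locate Regions tool: it fixes the shape to a square, finds only one region, and ignores distance constraints. The function name and grid values are assumptions.

```python
def best_square_window(surface, size):
    """Return (total, top_row, left_col) of the size x size window with the highest sum."""
    rows, cols = len(surface), len(surface[0])
    best = None
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            total = sum(surface[r + i][c + j]
                        for i in range(size) for j in range(size))
            if best is None or total > best[0]:
                best = (total, r, c)
    return best


suitability = [
    [3, 4, 9, 9],
    [2, 5, 9, 8],
    [1, 2, 3, 4],
]
total, row, col = best_square_window(suitability, size=2)
```

The selected block includes a cell valued 8 because a contiguous region of the required size rarely consists only of the single best cells.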

Six best bobcat habitat patches displayed over land use — The six best bobcat patches (displayed in purple) derived from the final suitability surface using the Locate Regions tool.

For additional information see the Locate Regions story map.

At this stage of the workflow, you have identified the best locations for your phenomenon - bobcat. You must now analyze the results to gain further insight into your decision.

Step 6: Analyze

The following steps will be used to analyze your results:

Evaluate the proposed plan to make sure it meets your objectives.
Perform sensitivity analysis to explore the input parameters.
Perform error analysis to understand the effects of error on the results.
Create alternative scenarios with different assumptions.

Evaluate proposed plan

Before the proposed plan is acted upon, it should be validated to make sure the suitability model is correct. Validation may come from a variety of sources:

Asking experts from different disciplines to review the proposed plan.
Visiting the site to make sure certain criteria have not changed (for example, a new building was constructed destroying the views for a housing development) or certain criteria were missed (for example, a landfill is near the proposed site for a shopping center).

In the bobcat model, wildlife biologists went to the sites and found fewer bobcat tracks in grassy areas than expected. The model was altered by reducing the suitability of grassland in the Habitat submodel.

Perform sensitivity and error analysis

It is useful to explore a model by testing the sensitivity of the specified parameters to gain an understanding of just how much the data transformations and weights for each criterion and submodel affect the overall model. This is done by systematically changing one parameter slightly and observing how it affects the output plan. If a small alteration causes a dramatic change to the final result, the model is sensitive to that parameter. The following parameters can be changed:

Preference ranges within Reclassify (for example, 0 to 105 meters from roads being assigned a preference of 10 instead of 0 to 100 meters).
Transformed values within Reclassify (for example, single family housing is assigned a 5 instead of a 6 on the suitability scale).
Transformed values within Rescale by Function (for example, the base factor for the exponential function is slightly increased).
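The procedure of changing one parameter slightly and observing the result can be sketched as a loop over candidate weights. The cells, their preference values, and the range of weights tested are hypothetical; only the baseline Habitat weights of 2, 1, and 1 come from this case study.

```python
def weighted_score(cell, weights):
    """Weighted sum of one cell's preference values."""
    return sum(v * w for v, w in zip(cell, weights))


def best_cell(cells, weights):
    """Index of the highest-scoring cell."""
    scores = [weighted_score(cell, weights) for cell in cells]
    return scores.index(max(scores))


# Hypothetical cells: (land use, slope, distance from streams) preferences.
cells = [(10, 2, 3), (6, 9, 9), (9, 7, 6)]
baseline = best_cell(cells, [2, 1, 1])

# Vary the land use weight slightly and record whether the best cell changes.
changes = {}
for land_use_weight in (1.5, 1.75, 2.0, 2.25, 2.5):
    changed = best_cell(cells, [land_use_weight, 1, 1]) != baseline
    changes[land_use_weight] = changed
```

In this toy example the best cell switches only when the land use weight drops to 1.5, which suggests the result is stable for small changes around the baseline weight of 2.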

This same logic can be applied to explore how error may affect the resulting plan. For instance, you may know the elevation values have an accuracy of plus or minus one meter, and you might have used slope as one criterion in your suitability model. To explore how this error may affect the output, you can run your model multiple times, each time adding random values between -1 and 1 meters (with more of the random values centered around zero) to the elevation values, and then compare the results to see how much the error can change them.
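A minimal sketch of this kind of error simulation follows, reduced to the slope between two neighboring elevation samples. The triangular distribution is one way to draw values between -1 and 1 that cluster around zero; the function names, the elevations, the 30-meter spacing, and the number of runs are assumptions for illustration.

```python
import math
import random


def slope_degrees(z_left, z_right, run_m):
    """Slope between two neighboring elevation samples, in degrees."""
    return math.degrees(math.atan(abs(z_right - z_left) / run_m))


def simulate_slopes(z_left, z_right, run_m, runs=1000, seed=42):
    """Re-derive slope after adding elevation error of up to +/- 1 m to each sample."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        # Triangular noise in [-1, 1] peaks at 0, so small errors are most likely.
        noisy_left = z_left + rng.triangular(-1.0, 1.0, 0.0)
        noisy_right = z_right + rng.triangular(-1.0, 1.0, 0.0)
        results.append(slope_degrees(noisy_left, noisy_right, run_m))
    return results


# Hypothetical neighboring cells 30 m apart with a 10 m elevation difference.
slopes = simulate_slopes(300.0, 310.0, run_m=30.0)
spread = max(slopes) - min(slopes)
```

The spread of the simulated slopes indicates how much of the variation in the slope criterion could be due to elevation error alone.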

If a model is sensitive to a particular criterion or a criterion has a great deal of error, you can collect additional data, or you can proceed with the decision while understanding the uncertainty associated with it.

Create scenarios

You may want to run different what-if scenarios to challenge the assumptions you have made in your initial model. What happens if changes occur in the input data (for example, a new road is built or the average temperature increases by 5° Fahrenheit)? Changes can also be made to preference values and weights.

Conclusion

Suitability modeling comprises six steps:

Define the problem
Identify and derive the criteria
Transform values to a common scale
Weight the criteria relative to one another and combine
Locate the phenomenon
Analyze the results

Suitability modeling can be the starting point for additional modeling and analysis. For instance, conservation principles make it important not only to identify the best habitat patches for bobcat but also to connect those patches through corridors.

Using the Cost Connectivity tool, the habitat patches can easily be connected over a cost (or preference) surface identifying the network of patches and corridors that will allow for the most viable metapopulation.
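One way to think about choosing a network of corridors is as a minimum spanning tree over the patches, where each possible corridor has an accumulated cost. The sketch below shows only that network selection step using Kruskal's algorithm. It does not compute least-cost paths over a surface, and treating the network as a minimum spanning tree is an assumption made for illustration; the patch numbers and costs are hypothetical.

```python
def minimum_spanning_tree(n_patches, costs):
    """Kruskal's algorithm. costs maps (patch_a, patch_b) to a corridor cost."""
    parent = list(range(n_patches))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    tree = []
    for (a, b), cost in sorted(costs.items(), key=lambda item: item[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # skip corridors that would form a loop
            parent[root_a] = root_b
            tree.append((a, b, cost))
    return tree


# Hypothetical accumulated costs between four patches.
corridor_costs = {
    (0, 1): 12.0,
    (0, 2): 30.0,
    (1, 2): 9.0,
    (1, 3): 22.0,
    (2, 3): 15.0,
}
network = minimum_spanning_tree(4, corridor_costs)
```

The resulting network connects every patch with the lowest total cost and no redundant corridors. A conservation plan may still add extra corridors for resilience.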

Six bobcat habitat patches connected with the optimum network of wildlife corridors

For additional information on connecting regions (in this case, habitat patches), see Understanding cost distance analysis.

Lessons and resources

The following lessons walk you through the suitability modeling workflow. The lessons will allow you to create the bobcat model described in this case study.

Lessons for ArcGIS Desktop

Lessons for ArcGIS Pro

Acknowledgements

We thank Steven Lamonde of Johnson State College and the Vermont Center of Geographic Information for their contributions to both the case study and the accompanying lessons.