Summary
Generate an Esri classifier definition (.ecd) file using the Random Trees classification method.
The Random Trees classifier is a powerful technique for image classification that is resistant to overfitting and can work with segmented images and other ancillary raster datasets. For standard image inputs, the tool accepts multiple-band imagery with any bit depth, and it performs the Random Trees classification on a per-pixel or per-segment basis, depending on the input training feature file.
Usage
Random Trees is a collection of individual decision trees, where each tree is generated from different samples and subsets of the training data. They are called decision trees because, for every pixel that is classified, a number of decisions are made in rank order of importance. When you graph these decisions for a pixel, the result looks like a branch. When you classify the entire dataset, the branches form a tree. The method is called random trees because the dataset is classified a number of times, each time based on a random subselection of training pixels, which results in many decision trees. To make a final decision, each tree has a vote. This process works to mitigate overfitting.
Random Trees is a supervised machine-learning classifier based on constructing a multitude of decision trees, choosing random subsets of variables for each tree, and using the most frequent tree output as the overall classification. Random Trees corrects for the decision trees' propensity for overfitting to their training sample data. In this method, a number of trees are grown (by analogy, a forest), and variation among the trees is introduced by projecting the training data into a randomly chosen subspace before fitting each tree. The decision at each node is optimized by a randomized procedure.
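The voting idea described above can be illustrated with a small, self-contained sketch. This is not Esri's implementation: to keep it short, each "tree" is reduced to a single split (a depth-1 stump), and the names `train_stump`, `train_forest`, and `predict`, as well as the sample pixel values, are hypothetical. It only shows the three ingredients named in the text: a random resample of the training data per tree, a random subset of variables per tree, and a majority vote.

```python
import random
from collections import Counter

def train_stump(samples, feature_ids):
    """Fit a one-split tree using only the given subset of features."""
    best = None
    for f in feature_ids:
        values = sorted({x[f] for x, _ in samples})
        # Candidate thresholds are midpoints between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in samples if x[f] <= t]
            right = [y for x, y in samples if x[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(y != l_lab for y in left)
                      + sum(y != r_lab for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No split possible (all values identical): always predict one label.
        label = Counter(y for _, y in samples).most_common(1)[0][0]
        return (feature_ids[0], 0.0, label, label)
    return best[1:]

def train_forest(samples, n_trees, n_features_per_tree, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and a feature subset."""
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    forest = []
    for _ in range(n_trees):
        boot = [rng.choice(samples) for _ in samples]      # random resample
        feats = rng.sample(range(n_features), n_features_per_tree)
        forest.append(train_stump(boot, feats))
    return forest

def predict(forest, x):
    """Each tree votes; the most frequent label is the classification."""
    votes = Counter(l if x[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]

# Hypothetical two-band training pixels: (band values, class label).
samples = [((10, 12), "water"), ((12, 9), "water"), ((11, 14), "water"),
           ((80, 90), "forest"), ((85, 95), "forest"), ((90, 88), "forest")]
forest = train_forest(samples, n_trees=25, n_features_per_tree=1, seed=42)
print(predict(forest, (11, 13)), predict(forest, (84, 92)))
```

A tree trained on an unlucky resample may be wrong on its own, but it is outvoted by the others, which is the mechanism the text credits with reducing overfitting.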
For segmented rasters that have their key property set to Segmented, the tool computes the index image and associated segment attributes from the RGB segmented raster. The attributes are computed to generate the classifier definition file to be used in a separate classification tool. The attributes for each segment can be computed from any Esri-supported image.
Any Esri-supported raster is accepted as input, including raster products, segmented rasters, mosaics, image services, or generic raster datasets. Segmented rasters must be 8-bit rasters with 3 bands.
To create the training sample file, use the Training Sample Manager from the Image Classification toolbar. For information on how to use the Image Classification toolbar, see What is image classification?
The Segment Attributes parameter is enabled only if one of the raster layer inputs is a segmented image.
Syntax
TrainRandomTreesClassifier (in_raster, in_training_features, out_classifier_definition, {in_additional_raster}, {max_num_trees}, {max_tree_depth}, {max_samples_per_class}, {used_attributes})
Parameter | Explanation | Data Type |
in_raster | Select the raster dataset you want to classify. You can use any Esri-supported raster dataset. Options include a 3-band, 8-bit segmented raster dataset, where all the pixels in the same segment have the same color. The input can also be a 1-band, 8-bit, grayscale segmented raster. | Raster Layer; Mosaic Layer; Image Service; String |
in_training_features | Select the training sample file or layer that delineates your training sites. These can be either shapefiles or feature classes, which contain your training samples. | Feature Layer; Raster Catalog Layer |
out_classifier_definition | A JSON file that contains attribute information, statistics, or other information needed for the classifier. A file with an .ecd extension is created. | File |
in_additional_raster (Optional) | Optionally incorporate ancillary raster datasets, such as a multispectral image or a DEM, to generate attributes and other required information for classification. | Raster Layer; Mosaic Layer; Image Service; String |
max_num_trees (Optional) | The maximum number of trees in the forest. Increasing the number of trees leads to higher accuracy rates, although this improvement levels off eventually. Processing time increases linearly with the number of trees. | Long |
max_tree_depth (Optional) | The maximum depth of each tree in the forest. Depth is another way of saying the number of rules each tree is allowed to create to come to a decision. Trees will not grow any deeper than this setting. | Long |
max_samples_per_class (Optional) | The maximum number of samples to use for defining each class. The default value of 1000 is recommended when the inputs are nonsegmented rasters. A value that is less than or equal to 0 means that the system will use all the samples from the training sites to train the classifier. | Long |
used_attributes [used_attributes;used_attributes,...] (Optional) | Specify the attributes to be included in the attribute table associated with the output raster. This parameter is only enabled if the Segmented key property is set to true on the input raster. If the only input into the tool is a segmented image, the default attributes are COLOR, COUNT, COMPACTNESS, and RECTANGULARITY. If an in_additional_raster is also included as an input along with a segmented image, then MEAN and STD are available as options. | String |
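The rule for used_attributes can be checked before the tool is called. The following is a minimal sketch based on one reading of the parameter description above; the helper `build_used_attributes` and its arguments are hypothetical and are not part of arcpy. It assumes that MEAN and STD are valid only when an additional raster is supplied, and it falls back to the four documented defaults when nothing is requested.

```python
# Attribute names taken from the used_attributes description.
SEGMENT_ONLY = ("COLOR", "COUNT", "COMPACTNESS", "RECTANGULARITY")
WITH_ADDITIONAL = SEGMENT_ONLY + ("MEAN", "STD")

def build_used_attributes(requested=None, has_additional_raster=False):
    """Return a semicolon-delimited attribute string, or raise ValueError.

    Hypothetical helper: it only mirrors the rule in the parameter table.
    """
    allowed = WITH_ADDITIONAL if has_additional_raster else SEGMENT_ONLY
    if requested is None:
        # Assumption: reuse the documented segmented-only defaults.
        return ";".join(SEGMENT_ONLY)
    cleaned = [name.strip().upper() for name in requested]
    unknown = [name for name in cleaned if name not in allowed]
    if unknown:
        raise ValueError("attributes not available for these inputs: "
                         + ", ".join(unknown))
    return ";".join(cleaned)

print(build_used_attributes(["color", "mean"], has_additional_raster=True))
```

The returned string has the same semicolon-delimited form as the used_attributes argument in the code samples.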
Code sample
TrainRandomTreesClassifier example 1 (Python window)
This is a Python sample for the TrainRandomTreesClassifier tool.
import arcpy
from arcpy.sa import *
# Train from a segmented raster plus an additional raster
TrainRandomTreesClassifier("c:/test/moncton_seg.tif",
                           "c:/test/train.gdb/train_features",
                           "c:/output/moncton_sig_SVM.ecd",
                           "c:/test/moncton.tif", "50", "30", "1000",
                           "COLOR;MEAN;STD;COUNT;COMPACTNESS;RECTANGULARITY")
TrainRandomTreesClassifier example 2 (stand-alone script)
This is a Python script sample for the TrainRandomTreesClassifier tool.
# Import system modules
import arcpy
from arcpy.sa import *
# Set local variables
inSegRaster = "c:/test/cities_seg.tif"
train_features = "c:/test/train.gdb/train_features"
out_definition = "c:/output/cities_sig.ecd"
in_additional_raster = "c:/cities.tif"
maxNumTrees = "50"
maxTreeDepth = "30"
maxSampleClass = "1000"
attributes = "COLOR;MEAN;STD;COUNT;COMPACTNESS;RECTANGULARITY"
# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")
# Train the classifier and write the .ecd file
TrainRandomTreesClassifier(inSegRaster, train_features,
                           out_definition, in_additional_raster, maxNumTrees,
                           maxTreeDepth, maxSampleClass, attributes)
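Because the output classifier definition is described above as a JSON file, it can be inspected with Python's standard json module after training. The sketch below is only an illustration: the helper name `summarize_ecd` is hypothetical, and it assumes the file is UTF-8 text with a JSON object at the top level. The key names inside a real .ecd file are not documented here, so the helper only lists whatever top-level keys it finds.

```python
import json

def summarize_ecd(path):
    """Return the sorted top-level keys of a classifier definition file."""
    with open(path, "r", encoding="utf-8") as f:
        definition = json.load(f)
    if not isinstance(definition, dict):
        # Assumption: a classifier definition is a JSON object.
        raise ValueError("expected a JSON object at the top level")
    return sorted(definition)
```

For example, `summarize_ecd("c:/output/cities_sig.ecd")` would list the sections written by the stand-alone script above.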
Licensing information
- ArcGIS Desktop Basic: Requires Spatial Analyst
- ArcGIS Desktop Standard: Requires Spatial Analyst
- ArcGIS Desktop Advanced: Requires Spatial Analyst