Celebrating linguistic diversity automation—Analytics

Workflow automation

Building custom model and script tools allows you to not only automate common workflows but also to document them. Presented below are the model pieces and a final composite model for the aggregation component of the Celebrating Linguistic Diversity case study. The final model presented, and also provided in the data package above, splits and aggregates neighborhoods with suppressed data, combining them with neighborhoods that do have data. The steps are summarized graphically below using only five neighborhoods for simplicity.

Graphical overview of the aggregation process

If you need to perform this type of aggregation regularly, the workflow outlined below should get you started with building your own custom geoprocessing model tool. The focus of this workflow is a model tool designed to work specifically with the case study data. Documentation is also provided to help you build and share more generic tools that are robust to a variety of data sources.

Access the model tool and data

If you haven't done so already, download and unzip the data package provided at the top of this workflow.
To work with ModelBuilder in ArcGIS Pro, open ArcGIS Pro and browse to the LingualDiversity.ppkx project package. When it opens, expand Folders in the Project pane until you find the LinguisticDiversity.tbx toolbox. Expand the toolbox, right-click the Aggregate Neighborhoods model tool, and select Edit.
To work with ModelBuilder in ArcMap, double-click the LanguageData.mpk map package and use the Catalog window to browse to the AnalysisModels.tbx toolbox. Expand the toolbox, right-click the Aggregate Neighborhoods model tool, and select Edit.

Each component of the model is described below.

Create working layers

You will use the Select Layer By Attribute and Copy Features tools to create working layers. One layer will contain the neighborhoods with data; the other will contain the neighborhoods where data has been suppressed.

The AllNeighborhoods data set contains a field (HasDataMT) indicating whether the mother tongue data has been suppressed or not for each neighborhood. The model selects the neighborhoods where HasDataMT is equal to one, and passes the selected features to the Copy Features tool. The result is a new data set (DataNeighborhoods), containing 328 neighborhoods with mother tongue data. The model then selects the neighborhoods where HasDataMT is equal to zero and, again, passes those selected features to the Copy Features tool. The result is a new data set (NoDataNeighborhoods), containing the 60 neighborhoods with suppressed data. Finally, the model uses the Select Layer By Attribute tool to clear all selections.

The model components to create working layers

The dashed arrows in the model above represent preconditions. By right-clicking the final Select Layer By Attribute element in the model, you can set preconditions that ensure selections are not cleared until both the DataNeighborhoods and NoDataNeighborhoods layers have been created.
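For readers who prefer scripting, the same step can be sketched with arcpy. This is only an illustration, not the model shipped in the data package: the function name, the layer name, and the output workspace are hypothetical, and HasDataMT is assumed to be a numeric field. Because a script runs its calls in order, the final clear-selection call naturally waits for both copies, which is the job the preconditions do in the model.

```python
def create_working_layers(all_neighborhoods, out_workspace):
    """Copy neighborhoods with and without mother tongue data to separate outputs."""
    import arcpy  # imported here so the sketch loads without an ArcGIS installation

    layer = "AllNeighborhoods_lyr"  # hypothetical in-memory layer name
    arcpy.management.MakeFeatureLayer(all_neighborhoods, layer)

    outputs = {}
    # HasDataMT = 1 means data is present, 0 means it was suppressed
    # (assumes the field is numeric).
    for flag, name in ((1, "DataNeighborhoods"), (0, "NoDataNeighborhoods")):
        arcpy.management.SelectLayerByAttribute(
            layer, "NEW_SELECTION", f"HasDataMT = {flag}"
        )
        out_fc = f"{out_workspace}/{name}"
        # Copy Features honors the current selection on the layer.
        arcpy.management.CopyFeatures(layer, out_fc)
        outputs[name] = out_fc

    # Clear the selection only after both copies exist.
    arcpy.management.SelectLayerByAttribute(layer, "CLEAR_SELECTION")
    return outputs
```

With the case study data, the two outputs would be expected to hold 328 and 60 neighborhoods respectively.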

Create random points

The second workflow component creates random points

Based on neighborhood size, you will compute the number of random points to generate within each neighborhood in DataNeighborhoods. The model first adds a new field (NumPoints) to the DataNeighborhoods data set to hold the number of points to generate within each neighborhood polygon. To populate the new field, the model divides each neighborhood's area by 100,000, yielding 6 points for the smallest neighborhood and 1,514 points for the largest. Using these values, the Create Random Points tool generates points in each neighborhood polygon. Each random point is given an ID, stored in a field named CID, indicating which DataNeighborhoods polygon it is associated with.

The model components to generate random points
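The NumPoints arithmetic can be illustrated in a few lines of plain Python. This is a minimal sketch: the two areas are hypothetical values chosen to reproduce the counts quoted above, the area units are whatever the layer uses (the text does not state them), and truncating to a whole number is an assumption, since the model may round instead.

```python
POINT_DIVISOR = 100_000  # from the workflow: neighborhood area divided by 100,000


def num_points(area, divisor=POINT_DIVISOR):
    """Return the NumPoints value for a neighborhood of the given area.

    Truncation to a whole number is an assumption; the model may round.
    """
    return int(area // divisor)


# Hypothetical areas, expressed in the layer's (unstated) square units.
example_areas = {"smallest": 650_000, "largest": 151_400_000}
counts = {name: num_points(area) for name, area in example_areas.items()}
print(counts)  # {'smallest': 6, 'largest': 1514}
```

Tying the count to area keeps the point density roughly constant across neighborhoods of very different sizes.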

Add the CID field to DataNeighborhoods

This step adds a new field called CID to DataNeighborhoods and sets its values to each feature's ObjectID. This ensures every working layer has the CID field, which will be important for the last step in the model.

The model components to add and calculate the CID field
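A scripted equivalent of this step might look like the sketch below. The function name is hypothetical, and the LONG field type is an assumption. The "PYTHON3" expression type applies to ArcGIS Pro; ArcMap uses "PYTHON_9.3" instead.

```python
def add_cid_field(data_neighborhoods):
    """Add a CID field and fill it with each feature's ObjectID."""
    import arcpy  # imported here so the sketch loads without an ArcGIS installation

    # Look up the ObjectID field name rather than hard-coding it.
    oid_field = arcpy.Describe(data_neighborhoods).OIDFieldName

    # LONG is an assumed field type for holding ObjectID values.
    arcpy.management.AddField(data_neighborhoods, "CID", "LONG")
    arcpy.management.CalculateField(
        data_neighborhoods, "CID", f"!{oid_field}!", "PYTHON3"
    )
```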

Create, snap, and dissolve Thiessen polygons

For each random point, this part of the model creates a Thiessen polygon (Create Thiessen Polygons). It then snaps (Snap) the edges of the polygons to the edges of the NoDataNeighborhoods polygons to reduce the number of sliver polygons created in the final step of the model. Finally, it creates overlay areas by using the Dissolve tool to combine Thiessen polygons that have the same CID values.

The model components that create the overlay areas.
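The three tools in this part of the model can be chained in arcpy roughly as follows. Treat it as a sketch under stated assumptions: the function name and output names are hypothetical, and the snap distance is a placeholder, since the text does not give the tolerance the model uses. The "ALL" option is passed so that the CID field on the random points carries over to the Thiessen polygons.

```python
def build_overlay_areas(random_points, no_data_neighborhoods, out_workspace,
                        snap_distance="10 Meters"):
    """Create Thiessen polygons, snap them to NoDataNeighborhoods, dissolve on CID."""
    import arcpy  # imported here so the sketch loads without an ArcGIS installation

    thiessen = f"{out_workspace}/ThiessenPolygons"  # hypothetical output name
    overlay = f"{out_workspace}/OverlayAreas"       # hypothetical output name

    # "ALL" copies the point attributes, including CID, to the polygons.
    arcpy.analysis.CreateThiessenPolygons(random_points, thiessen, "ALL")

    # Snap edits the Thiessen polygons in place; the distance is an assumed value.
    arcpy.edit.Snap(thiessen, [[no_data_neighborhoods, "EDGE", snap_distance]])

    # Combine all polygons that share a CID into one overlay area.
    arcpy.management.Dissolve(thiessen, overlay, "CID")
    return overlay
```

Because Snap modifies its input rather than writing a new output, it is worth running this against a working copy of the data.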

Aggregate the NoDataNeighborhood pieces with the DataNeighborhoods

The Intersect tool uses the overlay areas to carve the NoDataNeighborhoods into pieces. These pieces are merged (Merge) with the DataNeighborhoods polygons. Finally, the Dissolve tool combines the carved pieces with the DataNeighborhoods. You will need to follow the steps for removing sliver polygons before you have your final geometry with 328 neighborhoods and complete mother tongue data.

The model components that split the no data neighborhoods, aggregating them to data neighborhoods
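The final aggregation can be sketched in arcpy as shown below. The function name and output names are hypothetical, and dissolving on CID is inferred from the text rather than read from the model itself. The sketch covers geometry only: it does not carry the mother tongue attributes through the dissolve, and it does not remove sliver polygons.

```python
def aggregate_neighborhoods(overlay_areas, no_data_neighborhoods,
                            data_neighborhoods, out_workspace):
    """Carve NoDataNeighborhoods by overlay area and fold the pieces into DataNeighborhoods."""
    import arcpy  # imported here so the sketch loads without an ArcGIS installation

    pieces = f"{out_workspace}/NoDataPieces"               # hypothetical output name
    merged = f"{out_workspace}/MergedNeighborhoods"        # hypothetical output name
    aggregated = f"{out_workspace}/AggregatedNeighborhoods"  # hypothetical output name

    # Each piece inherits the CID of the overlay area it falls in.
    arcpy.analysis.Intersect([overlay_areas, no_data_neighborhoods], pieces)

    # Put the carved pieces and the original data neighborhoods in one feature class.
    arcpy.management.Merge([pieces, data_neighborhoods], merged)

    # Dissolving on CID (an assumption) joins each piece to its data neighborhood.
    arcpy.management.Dissolve(merged, aggregated, "CID")
    return aggregated
```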

You can find the final model in both the ArcGIS Pro project package and the AnalysisModels toolbox included in the data package.