Most database design guidelines promote organizing your database into multiple tables—each focused on a specific topic—instead of one large table containing all the necessary fields. Having multiple tables prevents duplicating information in the database, because you store the information only once in one table. When you need information that isn't in the current table, you can link the two tables together.
For example, you might obtain data from other departments in your organization, purchase commercially available data, or download data from the Internet. If this information is stored in a table, such as a dBASE, INFO, or geodatabase table, you can associate it with your geographic features and display the data on your map.
ArcGIS allows you to associate records in one table with records in another table through a common field, known as a key. You can make these associations in several ways, including by joining or relating tables temporarily in your map or by creating relationship classes in your geodatabase that maintain more permanent associations. For example, you could associate a table of parcel ownership information with the parcels layer, since they share a parcel ID field.
When you join two tables, you append the attributes from one onto the other based on a field common to both. Relating tables defines a relationship between two tables—also based on a common field—but doesn't append the attributes of one to the other; instead, you can access the related data when necessary.
Joining the attributes from a table
Typically, you'll join a table of data to a layer based on the value of a field that can be found in both tables. The name of the field does not have to be the same, but the data type has to be the same; you join numbers to numbers, strings to strings, and so on. You can perform a join with either the Join Data dialog box, accessed by right-clicking a layer in ArcMap, or the Add Join tool.
Suppose you have obtained data that describes the percentage change in population by county and you want to generate some population growth maps based on this information. As long as the population data is stored in a table in your database and shares a common field with your layer, you can join it to your geographic features and use any of the additional fields to symbolize, label, query, or analyze the layer's features.
One-to-one and many-to-one relationships
When you join tables in ArcMap, you establish a one-to-one or many-to-one relationship between the layer's attribute table and the table containing the information you want to join. The example below illustrates a one-to-one relationship between each county and that county's population change data. In other words, there's one population change for each county.
Here's an example of a many-to-one relationship. Suppose you have a layer where each polygon is classified according to its land-use type. The layer's attribute table only stores a land-use code; a separate table stores the full description of each land-use type. Joining these two tables establishes a many-to-one relationship because many records in the layer's attribute table join to the same record in the table of land-use descriptions. You might then use the more descriptive text when generating the legend for your map.
One-to-many and many-to-many relationships
When using data where a one-to-many or many-to-many relationship exists, you should use a relate or relationship class to establish the relationship between the datasets. However, it is possible to create a join under these circumstances. When you create a join in such a case, there are differences between how tools and other layer-specific settings work depending on the data source. If you are using geodatabase data to create the join, all matching records are returned. If you are using nondatabase data, like shapefiles or dBASE tables, to create the join, only the first matching record is returned.
This means that if you have created a 1:M or M:M join with geodatabase data and you generate a report, you see multiple records in the report, one for each corresponding match. The multiple matches are also seen when using a join field while symbolizing a joined layer, labeling, identifying features, generating a graph, and using either the Find or Hyperlink tool. If you are using the joined layer as input to a geoprocessing tool or in an export operation, the multiple matching records are used.
Joining data by location (spatially)
When the layers on your map don't share a common attribute field, you can join them using a spatial join, which joins the attributes of two layers based on the location of the features in the layers.
You can perform a spatial join with either the Join Data dialog box, accessed by right-clicking a layer in ArcMap, or the Spatial Join tool.
With a spatial join, you can find any of the following:
- The closest feature to another feature
- What's inside a feature
- What intersects a feature
- How many points fall inside each polygon
Join by location, or spatial join, uses spatial associations between the layers involved to append fields from one layer to another. Spatial joins are different from attribute and relationship class joins in that they are not dynamic and require the results to be saved to a new output layer.
One of three types of associations, described below, can be used to perform a spatial join:
- Match each feature to the closest feature or features: In this association, you can append either the attributes of the nearest feature or an aggregate (min, max, and so on) of the numeric attributes of the closest features.
- Match each feature to the feature that it is within: In this case, the attributes of the feature that the current feature is within are appended. For example, some cases where this match will occur are a point that is within a polygon or a line segment that is completely within (in other words, overlapped by) another line segment.
- Match each feature to the feature or features that it intersects: Like with the closest feature or features association above, you can append either the attributes of a single intersecting feature or an aggregate of the numeric attributes of the intersecting features.
For each point, polygon, and line combination, only the most commonly used of these associations are available in the join dialog box. With VBA, however, it is possible to perform a join based on any association and with any combination of point, line, or polygon feature layers.
Relating tables
Unlike joining tables, relating tables simply defines a relationship between two tables. The associated data isn't appended to the layer's attribute table like it is with a join. Instead, you can access the related data when you work with the layer's attributes.
For example, if you select a building, you can find all the tenants that occupy that building. Similarly, if you select a tenant, you can find what building it resides in (or several buildings, in the case of a chain of stores in multiple shopping centers—a many-to-many relationship). However, if you performed a join on such data, ArcMap will only find the first tenant belonging to each building, ignoring additional tenants.
Relates defined in ArcMap are essentially the same as simple relationship classes defined in a geodatabase, except that they are saved with the map instead of in a geodatabase.
If your data is stored in a geodatabase and has relationship classes defined, you can use these directly without having to establish a relate in ArcMap. The relationship classes will automatically be available when you add a layer that participates in a relationship class to the map. Note that the many-to-many relationship is defined differently when your data is stored in a geodatabase. In general, if you have relationship classes defined in your geodatabase, you should use these instead of creating new ones in ArcMap.
Relationship classes in the geodatabase
A relationship class stores information about associations among features and records in a geodatabase and can help ensure your data's integrity.
Joins versus relates
Follow these general guidelines when choosing between joins and relates on your data:
- You'll want to join two tables when the data in the tables has a one-to-one or a many-to-one relationship.
- You'll want to relate two tables when the data in the tables has a one-to-many or many-to-many relationship.
Learn more about deciding between relationship classes, joins, and relates
Saving joins and relates
When you save a map containing joins and relates, ArcMap saves the definition of how the two attribute tables are linked rather than saving the linked data itself. The next time you open your map, ArcMap reestablishes the relationship (whether a join or relate) between the tables by reading the tables from the database. In this way, any changes to the source tables that have taken place since they were last viewed on the map are automatically included and reflected on the map.
Joins can be stored in an ArcMap document or in a layer file. If you plan on moving the data at some point, you should save your ArcMap documents with relative paths. If data is moved, you can repair tables and layers after opening the document, but unless the target table and the join tables are in the same directory or workspace, the joins are not repaired. If you save your document with relative paths, tables and layers are restored automatically with joins as long as the document has been moved relative to where the data has been moved.
You can make a permanent disk copy of a layer with joined data simply by exporting the layer. To export the layer, right-click it in the table of contents, point to Data, then click Export Data. This creates a new feature class with all the attributes, including the joined fields.
Using joins, relates, and relationship classes together
If your data is involved in both joins and relates, the order in which the joins and relates are created is significant. If your layer or table has a relate, it is removed once data is joined to it. If you perform a relate on a joined layer or table, the relate is removed when the join is removed. As a general rule of thumb, it is best to create your joins, then add your relates.
In a situation where you need to join tableA and relate tableB to layerC, two of the three possible ways of doing this will work. The following describes each case:
- Join tableA to layerC, then relate tableB to layerC: This scenario works. You end up with a joined layer that has a relate to tableB.
- Relate layerC to tableB, then join tableA to layerC: This scenario also works. Since a relate is bidirectional, both tables involved are able to use it regardless of which table owns the relate. In this case, tableB owns the relate, so when tableA is joined to layerC, the relate is not removed.
- Relate tableB to layerC, then join tableA to layerC: This scenario does not work. This is different from the previous scenario in that layerC owns the relate. Therefore, when tableA is joined to layerC, the relate is removed.
Relates owned by the join table are not affected by the join. These relates can't be accessed by the target table or layer, however.
Relationship classes are never removed as a result of a join or unjoin operation. They can be used regardless of whether the data has been involved in joins or relates.