In this topic
- About working with feature data
- Workspaces
- Datasets and schema
- Querying data
- Converting and transferring data
- Versioning
- Relationship classes
- Topologies
- Geometric networks and network datasets
- Distributed geodatabase
- Other data sources
About working with feature data
As mentioned in the summary, a geodatabase is a repository of geographic data built on industry standard relational database technologies. The geodatabase library gives developers fine-grained access to all the components of a geodatabase and can also be leveraged against other supported data sources, such as shapefiles, CAD layers, and coverages. This provides developers with a generic object model regardless of the underlying data source, which facilitates code reuse.
A geodatabase is a workspace (or container) of simple datasets, such as feature classes and tables, and of complex datasets, such as geometric networks, topologies, and terrains. The types of geodatabases are as follows:
- Personal
- File
- ArcSDE
For more information about the differences between geodatabase types, see Types of geodatabases in the ArcGIS Desktop Help system.
Prerequisites
Basic understanding of the geodatabase is assumed throughout this topic. Geodatabase objects, such as feature classes and fields, will be discussed from a development perspective using both theory and code examples, but minimal geographic information system (GIS) theory accompanies this discussion. Illustrations of class diagrams are utilized throughout this topic to help explain the relationships between the classes and interfaces of the geodatabase. It is also assumed that readers of this topic have basic programming skills and experience with object-oriented languages.
Workspaces
In many cases, the IWorkspaceFactory and IWorkspace interfaces will be the main entry point for developers using the geodatabase library. A workspace is a container of spatial and nonspatial datasets, such as feature classes, raster datasets, and tables. This can mean a personal geodatabase, file geodatabase, ArcSDE geodatabase, folder of shapefiles, folder containing CAD drawings, or numerous other data sources. The IWorkspace, IWorkspace2, and IFeatureWorkspace interfaces, among others, provide developers with the ability to create and delete datasets, and obtain references to the datasets to read or write to them.
Workspaces cannot be instantiated directly; instead, data source-specific workspace factories must be used to create them by utilizing the IWorkspaceFactory and IWorkspaceFactory2 interfaces. To connect to a workspace, users can provide IWorkspaceFactory with a connection string, a path to the geodatabase, or the geodatabase's properties using the IPropertySet interface.
Workspace factories are singleton objects, meaning that only one instance can be created per Component Object Model (COM) apartment, and further calls to a constructor return a reference to the existing object.
Workspaces are not singleton objects, but requesting a workspace with the same properties as an existing instance returns a reference to it (referred to as unique instancing). For example, if a file geodatabase factory is asked to create a workspace for Sample.gdb, then asked to create a workspace for Sample2.gdb, these objects will be different. However, if a subsequent request for Sample.gdb is made, a reference to the first object will be returned. See the following illustration:
For more information, see Creating geodatabases, Connecting to geodatabases and databases, and Querying workspace properties.
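The singleton and unique-instancing behavior described above can be sketched in plain Python. This is a conceptual model only, not the ArcObjects API: the class names `Workspace` and `FileGdbWorkspaceFactory` and the method `open_from_file` are hypothetical stand-ins chosen for illustration.

```python
class Workspace:
    """Hypothetical stand-in for a workspace; not the ArcObjects class."""

    def __init__(self, path):
        self.path = path


class FileGdbWorkspaceFactory:
    """Hypothetical factory: one shared instance, one workspace per path."""

    _instance = None

    def __new__(cls):
        # Singleton: later constructor calls return the existing factory.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._open_workspaces = {}
        return cls._instance

    def open_from_file(self, path):
        # Unique instancing: the same path yields the same workspace object.
        if path not in self._open_workspaces:
            self._open_workspaces[path] = Workspace(path)
        return self._open_workspaces[path]


factory = FileGdbWorkspaceFactory()
first = factory.open_from_file("Sample.gdb")
second = factory.open_from_file("Sample2.gdb")
again = FileGdbWorkspaceFactory().open_from_file("Sample.gdb")
```

In this sketch, `first` and `second` are different objects, while `again` is the same object as `first`, mirroring the Sample.gdb and Sample2.gdb example.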
Datasets and schema
The following models are used to access and create datasets in the geodatabase:
- Dataset model - Classic model for accessing and creating datasets in the geodatabase. It is used for simple datasets, such as tables and feature classes, as well as some complex datasets, such as geometric networks and topologies.
- Dataset extensibility model - Used for complex datasets introduced at ArcGIS 9.1, such as network datasets, terrains, representations, and cadastral fabrics.
Opening datasets
The dataset model can be utilized through several interfaces, including IFeatureWorkspace and IFeatureClassContainer. Existing datasets can be opened by calling open methods on these interfaces and providing the dataset name as a parameter. If you're using the IFeatureClassContainer interface, the dataset's class ID or index in the container can also be provided as a parameter.
To create datasets, the dataset's properties are passed as arguments to the dataset-specific creation method. For example, to create a feature class, use the IFeatureWorkspace.CreateFeatureClass method. The dataset properties include the name of the dataset, the fields, and how it should be stored in the geodatabase.
The IDatasetContainer2 interface can be used to open existing datasets with the dataset extensibility model. This interface is implemented with and accessed through workspace and feature dataset extensions. Some datasets exist at the workspace level (for example, representations) while others are contained in feature datasets (for example, terrains). The IDatasetContainer2 interface has several properties to return datasets based on the name of the dataset (much like the dataset model) or the index of the dataset within the container.
When creating a dataset in this model, the IDatasetContainer2 interface is used again but with a data element as a parameter. A data element is an object that can be created and populated to precisely configure the new complex dataset. Once the data element is configured, it can be passed to the IDatasetContainer2.CreateDataset method. For more information on data elements, see the Name objects and data elements section in this topic.
The following illustration shows the relationships between the types involved in the creation of a dataset in this model, specifically a network dataset (which is created in a feature dataset):
A feature dataset is a special type of dataset, usually accessed through the IFeatureDataset interface. Rather than storing data, it acts as a container for other datasets, and maintains the extent of its contained datasets and a common spatial reference. Many complex datasets, such as topologies and geometric networks, must be contained by a feature dataset.
For more information, see Creating feature datasets.
Tables and feature classes
Tables are a type of dataset containing zero or more rows (or objects) with one or more columns (or fields). All rows in a table have the same columns and a single value (or no value) associated with each column. The ITable interface provides developers with the ability to view and modify the table's schema, add or remove rows from the table, and perform queries on the table.
Tables are sometimes referred to as object classes, the distinction being that an object class's rows represent entities with special properties and behavior, such as validation and subtypes. Object classes must be registered with the geodatabase and require an ObjectID field, which is a sequential and unique identifier for each object in the class. From a database perspective, this could be viewed as a primary key. Numerous interfaces give object classes more functionality than a regular table (such as IObjectClass and ISubtypes), but basic table operations, such as queries and row creation, are still performed on an object class using the ITable interface.
A feature class is an extension of an object class that contains entities with a spatial attribute, known as features. Feature classes implement interfaces such as IFeatureClass and IGeoDataset to provide functionality and access to properties that nonspatial data does not require. Like an object class, an ObjectID field is required, but unlike an object class, an additional Shape field is required to store the geometries of the contained features. The following table compares the properties of these three types of datasets:
Property | Table | Object class | Feature class |
---|---|---|---|
Contains | Rows | Objects | Features |
Row creation method | ITable.CreateRow | ITable.CreateRow | IFeatureClass.CreateFeature |
Supports subtypes | False | True | True |
Supports domains | False | True | True |
Required fields | None | ObjectID | ObjectID, Shape |
The following illustration shows the hierarchical relationship between the datasets and their components:
For more information, see Table basics and Feature class basics in the ArcGIS Desktop Help system; Opening datasets, Creating tables, Creating feature classes, and Creating annotation and dimension feature classes.
Fields
Tables, object classes, and feature classes are composed of fields, which are synonymous with columns. The two main interfaces used to interact with fields are IField (to retrieve information about fields) and IFieldEdit (to modify the properties of a field). Fields are accessible through several interfaces, such as IFields and IIndex.
When creating fields for use in a new dataset, such as a feature class, use a fields collection. Similar to individual fields, fields collections have separate interfaces - IFields and IFieldsEdit - for reading and writing. Once a fields collection is populated, it can be passed as an argument to a dataset creation method, such as IFeatureWorkspace.CreateFeatureClass. Class description objects can be used to automatically generate the required fields for a dataset. For more information, see IObjectClassDescription and IFeatureClassDescription.
See the following illustration:
If fields need to be added to or removed from an existing dataset, the IClass interface (or an interface that extends IClass) must be used rather than IFieldsEdit. The IClass interface has methods for directly working with fields, such as AddField and DeleteField.
For more information, see Creating fields and Working with fields.
Domains
Object classes in a geodatabase can use rules to enforce different types of constraints on their objects. The rules include attribute, relationship, topology, and connectivity rules. The most common - attribute rules - are created by using domains. Domains are used to specify permissible values that can be assigned to a field. The following are the different types of domains in a geodatabase:
- Coded value - Coded value domains specify a list of valid values, each with a string representation.
- Range - Range domains specify a range of valid values through minimum and maximum numeric values.
Domains are defined at a workspace level and are accessed using the IWorkspaceDomains and IWorkspaceDomains2 interfaces. There are two domain classes - CodedValueDomain and RangeDomain - accessible through the IDomain interface (for common members) and through the ICodedValueDomain and IRangeDomain interfaces for type-specific functionality.
Domains can be assigned to fields at the time of field creation using the IFieldEdit interface or after the field is created, using the IClassSchemaEdit interface on the object class. Because domains exist at a workspace level, multiple fields across many object classes can share a single domain. This is especially powerful in ArcSDE geodatabases, where multiple users and schemas exist.
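As a rough illustration of how the two domain types constrain values, and of how one workspace-level domain can be shared by fields in several classes, consider the following Python sketch. The classes mirror the names CodedValueDomain and RangeDomain for readability only; the `is_valid` method, the dictionaries, and the sample domain and field names are all hypothetical and are not part of the geodatabase API.

```python
from dataclasses import dataclass


@dataclass
class CodedValueDomain:
    name: str
    codes: dict  # maps each stored value to its string description

    def is_valid(self, value):
        return value in self.codes


@dataclass
class RangeDomain:
    name: str
    minimum: float
    maximum: float

    def is_valid(self, value):
        return self.minimum <= value <= self.maximum


# Domains live at the workspace level, keyed by name (sample data).
workspace_domains = {
    "ZoningCodes": CodedValueDomain("ZoningCodes", {1: "Residential", 2: "Commercial"}),
    "PipeDiameter": RangeDomain("PipeDiameter", 0.5, 48.0),
}

# Fields in different classes can point at the same domain.
field_domains = {
    ("Parcels", "Zone"): "ZoningCodes",
    ("Buildings", "Zone"): "ZoningCodes",
    ("Pipes", "Diameter"): "PipeDiameter",
}


def validate(class_name, field_name, value):
    """Check a value against the domain assigned to the given field."""
    domain = workspace_domains[field_domains[(class_name, field_name)]]
    return domain.is_valid(value)
```

Here `validate("Parcels", "Zone", 1)` passes because 1 is a listed code, while `validate("Pipes", "Diameter", 100)` fails because the value lies outside the range.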
For more information, see A quick tour of attribute domains in the ArcGIS Desktop Help system, Creating and modifying domains, and Assigning domains to fields.
Subtypes
Subtypes are a way to partition objects in a single object class into groups with similar rules and behavior. Although subtypes share a common set of fields, each of these groups can have its own attribute, relationship, topology, and connectivity rules, as well as different default values at creation time. An example of this is parcels of land that are often divided into residential, commercial, and industrial subtypes. Subtypes are defined in a single object class, cannot be shared, and are accessible through the ISubtypes interface.
From a developer's standpoint, it is important to remember to apply each set of rules to every subtype of a class to match the intended business rules. This includes domains in particular, which are subtype specific if subtypes are used. Having a proper default subtype code is also important, and there are considerations when creating features as well.
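The interplay between a default subtype code, per-subtype default values, and subtype-specific domains can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions: the `SUBTYPES` dictionary, the `MaxFloors` field, and the helper functions are invented for illustration and do not correspond to ISubtypes members.

```python
# Each subtype has its own default values and its own range domain (sample data).
SUBTYPES = {
    1: {"name": "Residential", "defaults": {"MaxFloors": 3}, "domains": {"MaxFloors": (1, 4)}},
    2: {"name": "Commercial", "defaults": {"MaxFloors": 10}, "domains": {"MaxFloors": (1, 60)}},
}
DEFAULT_SUBTYPE_CODE = 1


def create_parcel(subtype_code=None, **attributes):
    """Build a row, applying the subtype's defaults before explicit values."""
    code = DEFAULT_SUBTYPE_CODE if subtype_code is None else subtype_code
    row = {"SubtypeCode": code, **SUBTYPES[code]["defaults"]}
    row.update(attributes)
    return row


def is_valid(row):
    """Validate a row against the domains of its own subtype only."""
    domains = SUBTYPES[row["SubtypeCode"]]["domains"]
    return all(low <= row[field] <= high for field, (low, high) in domains.items())
```

The same attribute value can be valid for one subtype and invalid for another: ten floors passes for the commercial subtype but fails for the residential one.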
For more information, see A quick tour of subtypes in the ArcGIS Desktop Help system and Creating subtypes.
Name objects and data elements
Name objects are pointers to geodatabase objects. They provide a way to retrieve information about objects without opening them. All name classes extend the abstract Name class, which can be accessed using the IName interface to get an object's name (as a string) and to open the object.
An example of when name objects are useful is when programmatically browsing a workspace. The IWorkspace.DatasetNames property returns an enumerator over IDatasetName objects. The enumerator can be iterated through and each object inspected until the applicable dataset is found. When it is found, the dataset name object can be cast to the IName interface, and the IName.Open method can be called to open the dataset.
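The browsing pattern above, in which lightweight name objects are inspected and only the matching one is opened, might look like this in a plain Python model. The `DatasetName` class, the `make_loader` helper, and the sample dataset names are hypothetical; they only imitate the idea behind IDatasetName and IName.Open.

```python
class DatasetName:
    """Hypothetical lightweight pointer; holds metadata, not the data."""

    def __init__(self, name, dataset_type, loader):
        self.name = name
        self.dataset_type = dataset_type
        self._loader = loader  # invoked only when open() is requested

    def open(self):
        return self._loader()


opened = []  # records which datasets were actually opened


def make_loader(name):
    def load():
        opened.append(name)
        return {"name": name, "rows": []}
    return load


dataset_names = [
    DatasetName("Roads", "FeatureClass", make_loader("Roads")),
    DatasetName("Owners", "Table", make_loader("Owners")),
    DatasetName("Parcels", "FeatureClass", make_loader("Parcels")),
]


def open_first(names, wanted_name):
    """Inspect name objects one by one and open only the match."""
    for candidate in names:
        if candidate.name == wanted_name:
            return candidate.open()
    return None


parcels = open_first(dataset_names, "Parcels")
```

After this runs, `opened` contains only "Parcels", which is the point of the pattern: the other datasets were examined by name without the cost of opening them.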
Data elements are similar but used for different purposes. They are commonly utilized in geoprocessing, Web services, and the dataset extensibility model. Data elements completely describe the structure of a dataset and each type of dataset in a geodatabase has a corresponding data element. For example, a DEWorkspace instance describes a workspace and an instance of DETable describes a table.
Unlike name objects, data elements cannot be opened directly but are generally passed to methods that use them to create or open datasets indirectly, such as IDatasetContainer2.CreateDataset. Also, unlike name objects, data elements implement the IPersistStream and IXMLSerialize interfaces, meaning they can be serialized into binary or Extensible Markup Language (XML) format.
For more information, see Using the schema creator.
Schema locks
Schema locks are used to prevent clashes between users when changing the geodatabase structure. The following types of schema locks exist:
- Shared - Shared locks are applied automatically by the geodatabase when users are accessing an object, for example, opening a feature class.
- Exclusive - Exclusive locks are applied by promoting a shared lock to an exclusive lock through the ISchemaLock interface. As the name implies, only one exclusive lock can be applied to an object at one time. If a developer applies an exclusive lock, it is the developer's responsibility to release or demote that lock. Use exclusive locks when the schema of a dataset is being changed. This includes adding or deleting fields, building or rebuilding indexes, and applying attribute domains.
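A minimal Python model of the shared and exclusive lock lifecycle is shown below. It assumes that promotion to an exclusive lock fails while any other user holds a shared lock; the `LockableDataset` class and its method names are hypothetical and are not ISchemaLock members.

```python
class SchemaLockError(Exception):
    """Raised when a lock cannot be acquired in this model."""


class LockableDataset:
    def __init__(self, name):
        self.name = name
        self.shared_holders = set()
        self.exclusive_holder = None

    def open(self, user):
        # Opening a dataset takes a shared lock automatically.
        if self.exclusive_holder not in (None, user):
            raise SchemaLockError(f"{self.name} is exclusively locked")
        self.shared_holders.add(user)

    def promote_to_exclusive(self, user):
        # Assumption: other shared locks block the promotion.
        others = self.shared_holders - {user}
        if others or self.exclusive_holder not in (None, user):
            raise SchemaLockError(f"{self.name} is in use by another user")
        self.exclusive_holder = user

    def demote_to_shared(self, user):
        if self.exclusive_holder == user:
            self.exclusive_holder = None


parcels = LockableDataset("Parcels")
parcels.open("alice")
parcels.promote_to_exclusive("alice")
try:
    schema_changed = True  # placeholder for adding a field, building an index, etc.
finally:
    # The developer who took the exclusive lock is responsible for releasing it.
    parcels.demote_to_shared("alice")
```

Wrapping the schema change in `try`/`finally` is one way to make sure the lock is demoted even if the change fails.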
For more information, see Using schema locks.
Querying data
The geodatabase API provides several approaches to querying and retrieving feature data. Which approach developers take depends on what information the application is starting with - for example, a single known ObjectID or a set of attribute and spatial constraints - and what kind of information should be returned.
For more information, see Querying geodatabase tables and Sorting tables.
Query filters
There are several ways to query tabular datasets in a geodatabase, and deciding which to use depends on the data type, query type, and how the results will be utilized. The simplest way to query a table is to pass a query filter to the ITable.Search or ITable.Select methods, which return search cursors and selection sets, respectively. Query filters, which are accessed through the IQueryFilter interface, allow a WHERE clause to be specified, which defines attribute conditions that rows must meet to be included in the results. They also allow developers to select a subset of the table's fields to be returned, through the IQueryFilter.SubFields property. If the search, update, or select methods are called with a null value in place of the query filter, the resulting cursor or selection set contains every row in the table.
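The roles of the WHERE clause and the SubFields property can be illustrated by analogy with Python's built-in `sqlite3` module. This is not the geodatabase API: the `QueryFilter` dataclass and `search` function below are hypothetical stand-ins, and the table and its rows are sample data.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class QueryFilter:
    """Hypothetical stand-in for the two properties discussed above."""
    where_clause: str = ""
    sub_fields: str = "*"


def search(connection, table_name, query_filter=None):
    # A missing filter returns every row and every field.
    query_filter = query_filter or QueryFilter()
    # String formatting is acceptable for a sketch, not for untrusted input.
    sql = f"SELECT {query_filter.sub_fields} FROM {table_name}"
    if query_filter.where_clause:
        sql += f" WHERE {query_filter.where_clause}"
    return connection.execute(sql)


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE parcels (objectid INTEGER PRIMARY KEY, zone TEXT, area REAL)"
)
connection.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [(1, "RES", 450.0), (2, "COM", 1200.0), (3, "RES", 800.0)],
)

large_residential = search(
    connection, "parcels", QueryFilter("zone = 'RES' AND area > 500", "objectid, area")
).fetchall()
everything = search(connection, "parcels").fetchall()
```

Restricting the returned fields, as the second argument to `QueryFilter` does here, keeps each returned record small when only a few attributes are needed.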
The following illustration shows the types used to create and use search cursors:
To include spatial constraints on a query, an extension of the query filter, known as a spatial filter (accessed through ISpatialFilter), can be used. Like the query filter, it allows a WHERE clause to be defined, but it also has properties to assign a query geometry (such as a point or polygon) and a spatial relationship (for example, intersects, touches, or contains). Spatial queries are ideal for tasks such as finding all the buildings in a certain area or finding which parcels a stream flows through.
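To make the combination of a query geometry, a spatial relationship, and an attribute condition concrete, here is a simplified Python sketch that uses axis-aligned rectangles in place of real geometries. The `Envelope` class, the `spatial_search` function, and the building data are hypothetical; real spatial relationships operate on full geometries rather than bounding boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other):
        # True when the two boxes share at least one point.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)

    def contains(self, other):
        # True when the other box lies entirely inside this one.
        return (self.xmin <= other.xmin and other.xmax <= self.xmax
                and self.ymin <= other.ymin and other.ymax <= self.ymax)


def spatial_search(features, query_geometry, relation, where=None):
    """Yield features whose shape satisfies the relation and the attribute test."""
    test = getattr(query_geometry, relation)
    for feature in features:
        if test(feature["shape"]) and (where is None or where(feature)):
            yield feature


BUILDINGS = [
    {"oid": 1, "use": "school", "shape": Envelope(1, 1, 3, 3)},
    {"oid": 2, "use": "retail", "shape": Envelope(8, 8, 12, 12)},
    {"oid": 3, "use": "school", "shape": Envelope(20, 20, 22, 22)},
]
study_area = Envelope(0, 0, 10, 10)

intersecting = [f["oid"] for f in spatial_search(BUILDINGS, study_area, "intersects")]
inside = [f["oid"] for f in spatial_search(BUILDINGS, study_area, "contains")]
schools = [
    f["oid"]
    for f in spatial_search(
        BUILDINGS, study_area, "intersects", where=lambda f: f["use"] == "school"
    )
]
```

Building 2 straddles the edge of the study area, so it is returned by the intersects query but not by the contains query.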
QueryDefs are temporary objects that can be created at the workspace level and allow a WHERE clause to be applied to one or more tables to generate a cursor. Typically, these are used when more than one table is used in a query.
For more information, see the Joining data section in this topic and Executing spatial queries.
Cursors and selection sets
Cursors provide a way to sequentially step through a series of records. The following are the types of class cursors:
- Search cursors - Inspect and edit rows.
- Update cursors - Edit and delete rows.
- Insert cursors - Add new rows to the table.
To determine whether to use a search cursor or an update cursor during editing, see Updating features. Cursors can be created from many interfaces, including ITable, IFeatureClass, ISelectionSet, and IQueryDef.
Cursors allow records to be iterated through only once, in one direction. They do not support behavior such as resetting, moving backwards, or making multiple passes; to make another pass, create a new cursor.
All cursors are accessed through the ICursor and IFeatureCursor interfaces. These interfaces have similar methods, the main difference being that features are used instead of rows as arguments and return values when a feature cursor is used. When searching, updating, or inserting, it is the developer's responsibility to know the type of cursor being used and to make the appropriate calls based on that information. For example, ICursor.InsertRow should not be called on a search or an update cursor (doing so results in errors).
Selection sets provide functionality slightly different from cursors. They contain subsets of a table's records, stored as a list of ObjectIDs. They can be created in different ways, but within the geodatabase API, the most common is the ITable.Select method, which returns an ISelectionSet. Selection sets have certain advantages over cursors, such as iteration through the list of IDs multiple times without having to re-execute the query. It is also possible to create aggregated sets (ISelectionSet.Combine) for a single table, or manually add or remove known values from the set.
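The contrast between a forward-only cursor and a reusable selection set resembles the difference between a Python generator and a set of IDs. The following sketch is an analogy only; `search_cursor`, `select`, and the sample rows are hypothetical, and the set operators stand in for the idea behind ISelectionSet.Combine.

```python
ROWS = {1: "RES", 2: "COM", 3: "RES", 4: "IND"}  # ObjectID -> zone (sample data)


def search_cursor(predicate):
    """Forward-only: a generator can be consumed exactly once."""
    return (oid for oid, zone in ROWS.items() if predicate(zone))


def select(predicate):
    """Selection set: a stored collection of ObjectIDs."""
    return {oid for oid, zone in ROWS.items() if predicate(zone)}


cursor = search_cursor(lambda zone: zone == "RES")
first_pass = list(cursor)
second_pass = list(cursor)  # the cursor is exhausted, so nothing is returned

residential = select(lambda zone: zone == "RES")
commercial = select(lambda zone: zone == "COM")
combined = residential | commercial  # aggregate two sets for the same table
residential.add(4)                   # manually add a known ObjectID
```

The selection set can be iterated any number of times without re-running the query, whereas a second pass over the cursor requires creating a new one.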
Joining data
The following are the common methods for joining data using the geodatabase API:
- Create a QueryDef object to join multiple tables in a single workspace based on a list of tables and a WHERE clause.
- Create a virtual query table (or feature class) based on an existing QueryDef object.
- Create a relationship query table (RelQueryTable). The join conditions for a RelQueryTable can be specified by a relationship class or a memory relationship class.
For more information, see the Relationship classes section in this topic.
QueryDefs have the advantage of being able to join multiple tables but with the restriction that all tables must be contained in the same workspace. Similar to a query filter, two properties, IQueryDef.WhereClause and IQueryDef.SubFields, allow attribute constraints and a subset of fields to be defined. Any primary key-foreign key relationships used to join the tables should be expressed in the WHERE clause. The IQueryDef.Evaluate method can be used to return a cursor for the joined data, enabling iteration through the records.
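The way a QueryDef combines a table list, a field subset, and a WHERE clause that carries the join condition can be approximated with Python's `sqlite3` module. This is an analogy, not the IQueryDef interface: the `evaluate` function and the two sample tables are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE parcels (objectid INTEGER PRIMARY KEY, owner_id INTEGER, area REAL);
    CREATE TABLE owners (owner_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO owners VALUES (10, 'Ada'), (11, 'Grace');
    INSERT INTO parcels VALUES (1, 10, 450.0), (2, 11, 1200.0), (3, 10, 800.0);
    """
)


def evaluate(connection, tables, sub_fields, where_clause):
    """Return a cursor over the joined rows; all tables share one database."""
    sql = f"SELECT {sub_fields} FROM {tables} WHERE {where_clause}"
    return connection.execute(sql)


cursor = evaluate(
    connection,
    tables="parcels, owners",
    sub_fields="parcels.objectid, owners.name",
    # The key relationship and the attribute constraint share one WHERE clause.
    where_clause="parcels.owner_id = owners.owner_id AND parcels.area > 500",
)
joined = sorted(cursor)
```

Both tables live in the same in-memory database, which parallels the restriction that all tables in a QueryDef must come from the same workspace.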
Query tables are virtual tables or feature classes derived from an existing QueryDef. This is done by creating a Table Query Name object, providing it with the QueryDef through the IQueryName2 interface, then opening it. Query tables allow developers to join tables like QueryDefs, but also provide the functionality of the ITable and (if applicable) the IFeatureClass interfaces.
RelQueryTables can be created from relationship classes, which maintain relationships within the geodatabase or from memory relationship classes, which temporarily define relationships between two object classes regardless of their data source. This is important, because it means that RelQueryTables can join an ArcSDE feature class with a dBASE table or a feature class in an Access geodatabase with a feature class in a file geodatabase. QueryDefs, on the other hand, do not have this capability. RelQueryTables implement the ITable and (if a shape field is present) the IFeatureClass interfaces, meaning the tables can be treated like a tabular data structure after being created.
For more information, see Joining data.
The following table shows the benefits and drawbacks of each join method:
Property | QueryDef | Query table | RelQueryTable |
---|---|---|---|
Can span data sources | False | False | True |
Joins persisted as | Cursor | Table | Table |
Matches all 1:N candidates | True | True | False |
Converting and transferring data
Several interfaces exist in the geodatabase API to load data into and export data from a geodatabase. The IFeatureDataConverter interface has methods for loading simple data from a non-geodatabase source (such as a shapefile) into a geodatabase and allows a great deal of control over the operation, such as the application of a query or spatial filter. The downside of this control is that calling the interface's methods requires significant overhead.
To copy data from one geodatabase to another, the IGeoDBDataTransfer, IGdbXmlImport, and IGdbXmlExport interfaces can be used. The first of these is essentially the copy and paste functionality as it exists in ArcCatalog, while the latter two allow serialization to and from an XML document. As with loading from and exporting to non-geodatabase sources, geoprocessing tools exist that can be more suitable for some workflows.
Before implementing solutions using these interfaces, consider the geoprocessing framework as an alternative. Geoprocessing tools give coarser-grained control over the conversions, but implementation time can be reduced drastically. Several tools exist that relate to dataset conversions and loading, including FeatureClassToFeatureClass, TableToTable, and Copy.
For more information, see Converting and transferring data, Converting simple data, and Copying and pasting geodatabase datasets.
Versioning
Versioning allows multiple users to edit spatial and tabular data simultaneously in a long transactional environment. Users can directly modify the database without having to extract data or lock features in advance. The API provides functionality to create and administer versions, register and unregister classes as versioned, detect differences between versions, and reconcile and post versions.
A workspace that supports versioning, or a VersionedWorkspace, is a subclass of workspace. Its corresponding interface, IVersionedWorkspace, has properties and methods for returning specific versions of the workspace (such as DefaultVersion). Information about a specific geodatabase version (that is, name and description) can be obtained through the IVersion interface, and more fine-grained details related to the IVersion object can be reached through the VersionInfo property and its return type, the IVersionInfo interface. The IVersion interface can also be used to create a version; this requires an existing version, which becomes the parent of the new version. When the version is created, the parent and child versions are identical.
The IVersionEdit interface is used to reconcile and post a version with a target version. Once reconciled, the object provides the ability to work with different reconcile states, such as the pre-reconcile state and the common ancestor. You can only post a version that has first been reconciled with any of its ancestor versions.
To view the changes made in a version to a single table, the IVersionedTable interface can be applied to a table and a difference cursor generated.
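Conceptually, a difference cursor reports the rows that were inserted, updated, or deleted in one version relative to another. The following Python sketch models that idea with two dictionaries; it is a simplification, and the `version_differences` function and sample rows are hypothetical rather than part of IVersionedTable.

```python
def version_differences(parent_rows, child_rows):
    """Classify ObjectIDs by how the child version differs from the parent."""
    inserts = sorted(child_rows.keys() - parent_rows.keys())
    deletes = sorted(parent_rows.keys() - child_rows.keys())
    updates = sorted(
        oid
        for oid in parent_rows.keys() & child_rows.keys()
        if parent_rows[oid] != child_rows[oid]
    )
    return {"insert": inserts, "update": updates, "delete": deletes}


default_version = {1: "RES", 2: "COM", 3: "IND"}  # ObjectID -> zone (sample data)

# A newly created child version starts out identical to its parent.
child_version = dict(default_version)
no_changes = version_differences(default_version, child_version)

# Edits made in the child version.
child_version[2] = "RES"   # update
del child_version[3]       # delete
child_version[4] = "COM"   # insert
changes = version_differences(default_version, child_version)
```

In this model, one call per difference type would correspond to generating a separate difference cursor for inserts, updates, or deletes.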
For more information, see Understanding versioning in the ArcGIS Desktop Help system, Reconciling versions, Finding differences between versions, Listening to versioned events, and How to merge conflicting geometries during a reconcile.
Archiving
Archiving provides the functionality to record and access changes made to the data, or to a subset of the data, in a versioned geodatabase. It can be used to view a historical snapshot of a geodatabase (through the IHistoricalWorkspace interface) given a moment in time or to compare the contents of a geodatabase between two different moments.
Datasets can have archiving enabled through the IArchivableObject interface. Historical versions of workspaces can be accessed at connection time or through the IHistoricalWorkspace interface. A historical workspace, once connected to, can be accessed the same way as a regular workspace but is read-only.
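One way to picture a historical snapshot is as a filter over rows that each carry a validity interval. The Python sketch below assumes such an interval-based layout purely for illustration; the `ARCHIVE` list, its column order, and the `snapshot` function are hypothetical and do not describe the actual archive storage.

```python
from datetime import datetime

FOREVER = datetime.max

# (objectid, zone, valid_from, valid_to) -- sample archive records
ARCHIVE = [
    (1, "RES", datetime(2020, 1, 1), datetime(2021, 6, 1)),
    (1, "COM", datetime(2021, 6, 1), FOREVER),               # row 1 was rezoned
    (2, "IND", datetime(2020, 3, 1), datetime(2022, 1, 1)),  # row 2 was later deleted
]


def snapshot(archive, moment):
    """Return the read-only state of the table as of the given moment."""
    return {
        oid: zone
        for oid, zone, valid_from, valid_to in archive
        if valid_from <= moment < valid_to
    }


mid_2020 = snapshot(ARCHIVE, datetime(2020, 6, 1))
early_2023 = snapshot(ARCHIVE, datetime(2023, 1, 1))
```

Comparing `mid_2020` with `early_2023` shows both kinds of change the text mentions: an attribute edit on row 1 and the removal of row 2.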
For more information, see Geodatabase archiving in the ArcGIS Desktop Help system.
Relationship classes
Relationship classes are associations between two object classes in the geodatabase and are similar to relationships in a database management system. Relationship classes can specify one of the following three cardinalities between their classes (the origin and destination classes):
- One-to-one cardinality
- One-to-many cardinality
- Many-to-many cardinality
An extension of the relationship class - an attributed relationship class - allows geodatabase users to add fields to the relationship class.
Relationship classes are stored in workspaces or feature datasets, and can be accessed and created through the IFeatureWorkspace and IRelationshipClassContainer interfaces, respectively. To view the relationship classes associated with a specific object class, use the IObjectClass.RelationshipClasses property. The relationship class can be accessed through the IRelationshipClass and IRelationshipClass2 interfaces. Relationships can optionally contain relationship rules, which allow tighter control over cardinality (including subtype specific rules) through the IRelationshipRule interface.
Attributed relationship classes are relationship classes that allow attributes to be assigned to each specific relationship. Attributed relationship classes inherit the functionality of standard relationship classes but also act as a table, and as such, implement additional interfaces including ITable.
For more information, see Relationship class properties in the ArcGIS Desktop Help system and Creating relationship classes.
Topologies
Topologies are collections of one or more feature classes in a geodatabase, with rules that model how the features share coincident geometry. A typical example of this is a topology containing public roads, parcels of land, and buildings as polygons. Parcels should not be allowed to overlap - neither should public roads and parcels - and buildings should be contained within a parcel. Topologies allow these types of rules to be defined and provide validation tools to identify features that violate the rules, as well as features that are explicitly allowed to violate the rules.
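The parcel rule in the example above (parcels must not overlap, with some violations explicitly allowed) can be sketched as a small validation routine. This Python model uses rectangles as `(xmin, ymin, xmax, ymax)` tuples instead of real polygons; the function names and sample parcels are hypothetical and unrelated to the ITopology interfaces.

```python
from itertools import combinations


def overlaps(a, b):
    """True when two (xmin, ymin, xmax, ymax) boxes share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def validate_no_overlap(parcels, exceptions=frozenset()):
    """Return the pairs of parcel IDs that violate the rule, minus exceptions."""
    errors = []
    for (id_a, box_a), (id_b, box_b) in combinations(sorted(parcels.items()), 2):
        if overlaps(box_a, box_b) and frozenset((id_a, id_b)) not in exceptions:
            errors.append((id_a, id_b))
    return errors


PARCELS = {
    1: (0, 0, 10, 10),
    2: (10, 0, 20, 10),  # shares only a boundary with parcel 1
    3: (5, 5, 15, 15),   # overlaps both parcel 1 and parcel 2
}

errors = validate_no_overlap(PARCELS)
# Mark the overlap between parcels 2 and 3 as an accepted exception.
remaining = validate_no_overlap(PARCELS, exceptions={frozenset((2, 3))})
```

Parcels 1 and 2 share a boundary without sharing area, so they pass the rule; coincident boundaries are exactly what a topology is meant to model.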
Much like geometric networks, topologies have an associated graph that is returned through the ITopology.Cache property and used with the ITopologyGraph interface. This graph provides editing capabilities that take these rules into account when a coincident geometry (such as a boundary shared by two parcels) is modified.
Basic information about a topology can be accessed through the ITopology interface. Rules can be added and deleted, and error features can be promoted to or demoted from rule exceptions through the ITopologyRuleContainer interface. Individual error features can be accessed through the IErrorFeatureContainer interface.
For more information, see An overview of topology in ArcGIS in the ArcGIS Desktop Help system, Creating a topology in the geodatabase, Checking for topology error features in a geodatabase topology, and Listening to the OnValidate event for a geodatabase topology.
Geometric networks and network datasets
The geodatabase supports the following types of networks, each suited to a different purpose:
- Geometric networks - Primarily for the utility and natural resource industries.
- Network datasets - For transportation networks.
Geometric networks are built on the classic dataset model, whereas network datasets are built on the dataset extensibility model. Geometric networks are a type of graph used to represent network topology that consists of two or more feature classes. They store the following types of network features:
- Junctions
- Edges
A simplified example is an electrical grid consisting of transmission lines (edge and line features) and transformers (junction and point features). Geometric networks can be created and modified using the INetworkLoader interface in the Network Analysis library. The logical network API, which is recommended for traversal and connectivity analysis, includes the INetwork, IForwardStar, and INetTopology interfaces.
Network datasets are built on the dataset extensibility model, meaning they leverage data elements and the IDatasetContainer2 interface for creation and updates. Unlike geometric networks, which are composed of complex features, network datasets contain simple features, meaning they do not have custom behavior. To perform connectivity analysis, there are interfaces and methods that are analogous to those of the geometric network; specifically, INetworkForwardStarEx and INetworkForwardStarAdjacencies.
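The idea behind forward-star traversal, which asks for the edges adjacent to a junction and then moves to the junctions at their far ends, can be shown with a small graph in Python. This sketch is a generic breadth-first traversal, not the IForwardStar or INetworkForwardStarEx API; the edge table and function names are hypothetical.

```python
from collections import deque

# edge ID -> (junction at one end, junction at the other end); sample grid
EDGES = {
    "line_1": ("substation", "transformer_a"),
    "line_2": ("transformer_a", "transformer_b"),
    "line_3": ("transformer_c", "transformer_d"),  # disconnected from the rest
}


def forward_star(junction):
    """Yield each edge touching a junction with the junction at its far end."""
    for edge_id, (start, end) in EDGES.items():
        if start == junction:
            yield edge_id, end
        elif end == junction:
            yield edge_id, start


def connected_junctions(source):
    """Breadth-first traversal that collects every reachable junction."""
    visited = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for _edge_id, neighbor in forward_star(current):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited


reachable = connected_junctions("substation")
```

Junctions that never appear in `reachable`, such as `transformer_c`, are not connected to the source, which is the kind of question connectivity analysis answers.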
For more information, see What are geometric networks? and What is a network dataset? in the ArcGIS Desktop Help system; Creating geometric networks within a geodatabase, How to create a network dataset, How to create a multimodal network dataset, How to access source features referenced by a network dataset, and How to programmatically traverse a street network.
Distributed geodatabase
The distributed geodatabase objects are for working with check-out information for active check-outs in a geodatabase. They include the Replica, ReplicaDataset, and ReplicaDescription objects, which act as specifications for distribution. For information about the objects and interfaces used to create and synchronize distributed geodatabases, see GeoDatabaseDistributed.
For more information, see the following:
- Understanding distributed data (in the ArcGIS Desktop Help system)
- How to initialize a GeoDataServer object
- Adding and deleting GlobalIDs
- How to create a replica in a connected environment
- How to synchronize a replica in a connected environment
- How to create a replica in a disconnected environment
- How to synchronize a data change message in a disconnected environment
- How to synchronize an acknowledgement message in a disconnected environment
- How to export a dataset to XML
- How to import a dataset from XML
- Getting a list of schema differences between replicas
- How to add a feature class or table to an existing replica
- Using replica creation events to extend the replica creation process
Other data sources
The geodatabase API was designed around the capabilities of the geodatabase, but it is also used to access data sources other than geodatabases. For example, the geodatabase API can be used to access CAD data, shapefiles, and coverages. Developers working with these data sources should be aware that the behavior of the geodatabase API can vary between data sources. Some functionality might not be supported; for example, the classes and interfaces used to work with domains and subtypes will only work properly with geodatabases. Some data sources have functionality that is not supported by (or is radically different from) the geodatabase API and require developers to use additional types (this is the case for developers working with CAD data).
Query classes and query cursors
Query classes and query cursors are components, introduced at ArcGIS 10, that provide developers with a way to read data from spatial databases using just a Structured Query Language (SQL) query. Using a spatial database with a query class or query cursor does not require having ArcSDE installed on the database, and columns containing spatial types such as SQL Server geometry and geography types can be used like a typical shape field.
To connect to a spatial database so that query classes and query cursors can be used, utilize the workspace factory and workspace pattern in the same way as when connecting to other data sources. However, rather than using the IFeatureWorkspace interface to open datasets, use the ISqlWorkspace interface to create query classes and query cursors.
For more information, see Query classes and cursors, Working with query classes, and Working with query cursors.
See Also:
Workspaces
Opening datasets
Creating and modifying schema
Querying data
Converting and transferring data
Versioning
Relationship classes
Topologies
Geometric networks
Network datasets
Distributed geodatabases
Other data sources
Customizing the geodatabase
Development licensing | Deployment licensing |
---|---|
ArcGIS Desktop Basic | ArcGIS Desktop Basic |
ArcGIS Desktop Standard | ArcGIS Desktop Standard |
ArcGIS Desktop Advanced | ArcGIS Desktop Advanced |
Engine Developer Kit | Engine |