Transactions are packages of work that make changes to databases. Geographic information system (GIS) databases, like other database applications, must support update transactions that enforce data integrity and application behavior. In many cases, you can use the database management system's transaction framework for managing edits and updates to geodatabases.
However, GIS users universally also have some specialized transactional requirements. For example:
- Often, multiple records are updated in unison as a single transaction.
- Numerous transactions must span long periods of time (sometimes days and months, not just seconds or minutes).
Additionally, editors need to be able to undo and redo changes. Editing sessions can span several hours or even days. Often, the edits must be performed in a system that is disconnected from the central, shared database.
Because GIS workflow processes may span days or months, the GIS database must remain continuously available for daily operations, where each person connecting to the data might have a personal view or state of the shared GIS database. In a multiuser database, the GIS transactions must be managed on the database management system's short transaction framework. The geodatabase plays a key role during these operations by managing the high-level, complex GIS transactions on the database management system transaction framework.
In many cases, long transaction workflows are critical to a GIS. In most instances, these are made possible through the use of an enterprise geodatabase and ArcGIS to manage updates to the central GIS database using versioning, which you can read more about below.
The following are examples of GIS data compilation workflows that require a version-based transaction model:
- Multiple edit sessions—A single GIS update may require numerous changes that span multiple edit sessions occurring over a few days or weeks.
- Multiuser editing—Multiple editors often need to concurrently update the same spatially integrated features. Each editor needs to work with a personalized data state, viewing individual updates and ignoring updates by other editors. Eventually, each editor needs to post and reconcile updates with the other editors to identify and resolve any conflicts.
- Checkout/Check-in transactions—It is often necessary to check out a portion of a geodatabase for a particular area or district to a personal computer and update that information in a disconnected session that could last for days or weeks or, increasingly, to take into the field for mobile editing and updating. Those updates must be posted to the main database.
- History—Sometimes it is advantageous to maintain a historical version of each feature in a geodatabase, even after that particular version has been updated, to maintain a copy of the retired and changed features in an archive or to track an individual feature's history—for example, parcel lineage or feature update properties in a national mapping database.
- Transfer of change-only updates—Enterprise geodatabases and spatial data infrastructures in which information is shared across a range of organizations are collaborative efforts that require the sharing of updates across the Internet in a well-defined extensible markup language (XML) schema for sharing change-only updates between geodatabases.
- Distributed geographic database replicas—A regional geodatabase may be a partial copy of a main corporate geodatabase for a particular geographic region. Periodically, the two geodatabases must be synchronized by exchanging updates.
- Loosely coupled replication across database management systems—Often, GIS data must be synchronized among a series of geodatabase copies (replicas), where each site performs its own updates on its local geodatabase. Often, the geodatabases are only periodically connected via the Web. On a scheduled basis, the updates must be transferred from each geodatabase replica to the others and their contents synchronized. Many times, the database management systems are different—for example, replicating datasets between Microsoft SQL Server, Oracle, and IBM Db2.
The geodatabase transaction model: Traditional versioning
The geodatabase mechanism for managing these and many other critical GIS workflows is to maintain multiple states in the geodatabase and, most importantly, to do so while ensuring the integrity of the geographic information, rules, and behavior. This ability to manage, work with, and view multiple states is based on versioning. As the name implies, versioning explicitly records versions of individual features and objects as they are modified, added, and retired through various states. Each version explicitly records each state of a feature or object as a row in a table along with important transaction information. Any number of users can simultaneously work with and manage multiple versions.
Versions enable all transactions to be recorded as a series of changes to the database through time. This means that various people can work with multiple views or states of the geodatabase. The goal is open, high-performance, multiuser access. For example, the system must go fast and must productively support the use of datasets containing hundreds of millions of records accessed by thousands of clients simultaneously.
The geodatabase transaction model based on versions is relatively simple—updates are recorded in change (delta) tables.
Versions explicitly record the object states of a geodatabase in two delta tables:
- The Adds table
- The Deletes table
Simple queries are used to view and work with any desired state of the geodatabase, for example, to view the data state for a point in time or see a particular editor's current version with his or her edits.
Versioned geodatabases and datasets are used to manage long transactions in each database management system by leveraging the database's short transaction framework.
In the version table example, a parcel (number 45) is updated to become parcel number 47. Using versioning, the original parcel is saved in the Deletes table and the new parcel is saved in the Adds table. Other geodatabase system tables record version information about the transaction such as the time and sequence of each update, the version name, and the state ID of each update. Each version also has its own security and access privileges.
This enables most clients to work with the Default version while various editors simultaneously make updates to their own versions of the data over time.
Numerous updates can be made to each version, and editors connect to and work with the update version as they make additional edits to the data. When editors are ready to share the updates with the rest of the enterprise, reconcile and post operations are performed to commit the edits held in the update version to the parent version. A resolution process is used to identify and reconcile any potential conflicts during this process.
To learn more about versioning, see An overview of traditional versioning.