This topic describes how versioning can be applied within an organization and illustrates the version configurations that are available.
Workflows vary greatly among organizations. They often progress in discrete stages, with each stage requiring the allocation of a different set of resources and business rules. Typically, each stage in the overall process represents a discrete unit of work, such as a work order. To manage each work order, you can create a separate, isolated version and modify it. Once you're satisfied the work is complete, you can integrate the changes into the published version of the database. Working with versions this way gives you the ability to accommodate the most basic of workflow processes as well as the most complex.
You will most likely adopt either the concurrent editing of the published database, with many editors modifying the DEFAULT version, or some combination of the other configurations. An understanding of the organizational and business requirements and an appreciation of the pros and cons of each configuration will help you choose what's best for your organization.
For the sake of simplicity and geodatabase management considerations, a recommended best practice is to either maintain a flat version tree or have multiple editors concurrently editing the DEFAULT version.
Concurrent editing of the published database
Many users can edit the same version simultaneously, so the simplest way to support multiuser editing is for many editors to directly edit the DEFAULT version.
As each editor starts editing the DEFAULT version, an unnamed, temporary version is automatically created in the edit session. This temporary version is accessible only to the current editor. When the editor saves his or her work or ends the edit session, the changes recorded in the temporary version are posted to the DEFAULT version.
If another user has edited the DEFAULT version since you started editing, ArcGIS automatically reconciles and posts the changes when you save your work. You are notified that the version has changed and must save again to accept the changes made by other editors. You can bypass this warning message by enabling autoreconciliation on the ArcMap Options dialog box. Whether or not you bypass this message, if there are conflicts, you must resolve them in the conflict resolution dialog box before you can successfully save your edits.
Learn more about settings for saving data
Learn how to resolve data conflicts
Pros:
- This strategy supports simple database modifications well, because users do not have to create new versions to edit data. This is appropriate when the units of work are fairly small or when persistent design alternatives are not required.
- If there are no conflicts, saved edits are posted directly to the DEFAULT version.
Cons:
- The DEFAULT version is constantly changing and is vulnerable to accidental or unauthorized modification; therefore, the database administrator may need to create database backups more frequently.
- Long-duration transactions or the creation of alternative design versions that span many edit sessions are not supported.
- Only one reconcile operation per geodatabase can be active at any given time. If there are frequent reconcile and post operations from various edit sessions, editors saving their changes may have to wait for any active reconcile and post processes to complete. In a large, multiuser geodatabase, it is better to avoid situations where many users reconcile and post to a common version. Reconciling and posting exclusively locks the version; while this lock is in place, other users are prevented from completing their tasks.
Multiple projects
If you're managing multiple projects or work orders, you'll require a more structured approach to workflow management. Discrete work units involving many edit sessions and spanning a number of days, weeks, or months can be maintained without affecting the DEFAULT version. Examples of these discrete work units could be a highway improvement scheme, the installation of a new phone service, or an ongoing maintenance project for a gas pipeline.
When a work order or project is initiated, a version is created as a child of the DEFAULT version. One or more editors can work on this version until the work order or project is complete. When all the modifications to a version have been completed, the editor or ArcSDE administrator reconciles with the DEFAULT version, resolving any conflicts that arise. He or she then posts the modifications to the DEFAULT version, integrating the modifications into the published database. At this point, the child version can be removed.
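The work order life cycle described above (create a child version, edit, reconcile, post, remove) can be sketched with a toy in-memory model. This is a minimal illustration only, not how the geodatabase stores versions: the `ToyVersion` class, the `reconcile` and `post` functions, and the version name `WO_1001` are all hypothetical, and a conflict is assumed here to mean that the same feature was changed differently in both the child and its parent.

```python
_MISSING = object()  # sentinel so deleted rows compare as "changed"


class ToyVersion:
    """Hypothetical stand-in for a version: feature id -> attribute value."""

    def __init__(self, name, rows, parent=None):
        self.name = name
        self.rows = dict(rows)
        self.parent = parent
        # State of the parent when this version was created or last reconciled.
        self.base = dict(rows)

    def create_child(self, name):
        return ToyVersion(name, self.rows, parent=self)


def _changed(base, current):
    """Ids that were added, removed, or modified relative to base."""
    keys = set(base) | set(current)
    return {k for k in keys if base.get(k, _MISSING) != current.get(k, _MISSING)}


def reconcile(child):
    """Pull the parent's changes into the child; return conflicting ids."""
    parent = child.parent
    child_changes = _changed(child.base, child.rows)
    parent_changes = _changed(child.base, parent.rows)
    conflicts = {
        k for k in child_changes & parent_changes
        if child.rows.get(k, _MISSING) != parent.rows.get(k, _MISSING)
    }
    if conflicts:
        return sorted(conflicts)  # must be resolved before posting
    for k in parent_changes - child_changes:
        if k in parent.rows:
            child.rows[k] = parent.rows[k]
        else:
            child.rows.pop(k, None)
    child.base = dict(parent.rows)
    return []


def post(child):
    """Push the reconciled child state into its parent."""
    if child.base != child.parent.rows:
        raise RuntimeError("reconcile with the parent before posting")
    child.parent.rows = dict(child.rows)
    child.base = dict(child.rows)


# A work order edits its own version while DEFAULT changes independently.
default = ToyVersion("DEFAULT", {1: "pipe A", 2: "pipe B"})
work_order = default.create_child("WO_1001")
work_order.rows[2] = "pipe B (replaced)"
work_order.rows[3] = "pipe C"
default.rows[1] = "pipe A (relabeled)"  # another editor's change

conflicts = reconcile(work_order)
if not conflicts:
    post(work_order)  # the child version could now be removed
```

Because the two sets of edits touch different features, the reconcile step reports no conflicts and the post step integrates the work order into the toy DEFAULT version.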
User access permissions to the DEFAULT version may be restricted to enforce this workflow and ensure that the DEFAULT version is not modified. The ArcSDE administrator might set the permission of the DEFAULT version to protected; this allows users to continue to view the DEFAULT version but restricts their access level to read-only. Any editor wanting to modify the data must create a new version.
If your read-only users do not require the ability to see changes as soon as they are posted to the DEFAULT version, you could create a protected, static version from the DEFAULT version for them to use. This version should be created after the database has been compressed and the indexes and statistics rebuilt. Doing so ensures that all the rows required to represent the read-only version are stored in the base table and that the database is performing optimally. In this scenario, no changes are being made to the read-only users' version of the database (FastTrak in the illustration below), so version difference queries don't have to be performed and the database statistics and indexes do not become out of date or fragmented. After each scheduled database compression, this version would be re-created, allowing the read-only users access to changes made since the last database compression.
Pros:
- Simplicity: Each work unit is logically segregated by version.
- Long-duration transactions spanning many edit sessions are supported, as well as the creation of alternative designs, allowing editors to develop proposals without affecting the production database.
- Creating a new version from the DEFAULT version protects the production view of the database from unintentional modification. Individual work projects are integrated with the production database when completed.
- Batch reconcile/post processes are supported.
Cons:
- As with any multitier version configuration, the more rows that are maintained in the version delta tables, the greater the potential impact on version query performance. This overhead can be minimized by compressing the database regularly and updating the DBMS statistics.
Multiple projects with subprojects
Complex projects require a more elaborate workflow structure than that provided by either the concurrent editing or the multiple project approach. These projects may further divide into multiple functional or geographic units from which a more complex versioning hierarchy will develop. For example, a project to design and construct a new shopping mall might have distinct construction phases subdivided into east and west sections or be subdivided by construction activities, such as building, utility installations, or landscaping.
For large projects involving different teams and numerous discrete units of work, a multiple-tier version tree is an effective way to organize the workflow. The teams working on different aspects of the same project create their own version to maintain a private view of their updates. Once the project has been completed, the versions can be reconciled and posted back to the DEFAULT version and become an integral part of the published database.
Pros:
- Supports complex projects
- Supports long-duration transactions spanning many edit sessions
- Supports automated batch reconcile and post processes
Cons:
- You must reconcile and post versions in order, starting with the versions farthest removed from the DEFAULT version and moving backward. In other words, the third-level versions (great-grandchildren) of the DEFAULT version must be posted to their parents, which are second-level versions (grandchildren) of the DEFAULT version. These second-level versions can then be posted to the first-level versions (child versions) of the DEFAULT version. Finally, the first-level (child) versions can be posted to the DEFAULT version. After each child version is posted to its respective parent, the child version can be deleted.
- Reconciling and posting can only take place between versions in the direct path; it is not possible to reconcile and post across version paths.
- Maintaining a complex version tree has some associated performance costs: the more rows in the version delta tables, the greater the potential impact on query performance.
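The ordering and direct-path constraints listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the version tree is a plain dictionary mapping each parent to its children, the version names (borrowed loosely from the shopping mall example) are hypothetical, and "direct path" is taken to mean that one version is an ancestor of the other.

```python
def build_parent_map(tree):
    """Invert {parent: [children]} into {child: parent}."""
    return {child: parent for parent, children in tree.items() for child in children}


def ancestors(parent_of, version):
    """Versions on the path from `version` up to the root, nearest first."""
    chain = []
    while version in parent_of:
        version = parent_of[version]
        chain.append(version)
    return chain


def in_direct_path(parent_of, a, b):
    """True if one version is an ancestor of the other."""
    return a in ancestors(parent_of, b) or b in ancestors(parent_of, a)


def post_order(tree):
    """(child, parent) pairs, deepest versions first, ties sorted by name."""
    parent_of = build_parent_map(tree)
    ordered = sorted(
        parent_of,
        key=lambda v: (-len(ancestors(parent_of, v)), v),
    )
    return [(v, parent_of[v]) for v in ordered]


# Hypothetical three-level tree for the shopping mall example.
tree = {
    "DEFAULT": ["Mall"],
    "Mall": ["East", "West"],
    "East": ["East_Utilities"],
}
parent_of = build_parent_map(tree)
order = post_order(tree)
for child, parent in order:
    print(f"reconcile and post {child} -> {parent}")
```

Sorting by depth is one simple way to guarantee that every version is posted before its parent; any post-order traversal of the tree would satisfy the same constraint.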
Phased projects
Many projects evolve through a prescribed or regulated group of stages that require engineering, administrative, or legal approval before proceeding to the next stage. For example, within the utility domain, common project stages include working, proposed, accepted, construction, and as-built. This particular process is essentially cyclical: a work order is initially assigned to an engineer and modified over time as the project evolves through the various stages before full integration with the production database.
In this approach, a version is created to represent each stage of this process: an initial design or proposed version, an approved version, and a version for the construction phase. As the project advances through the various project milestones, each stage is reviewed and approved, then superseded by the next version until the last stage is reached and completed. The older versions can be maintained for historical reference or deleted as required.
Once the project is complete, the constructed version can be reconciled with, and posted directly to, the DEFAULT version without having to reconcile and post with the previous versions in the lineage.
Pros:
- This method is suitable for projects that evolve through a series of stages, where each stage must be isolated as a distinct unit of work.
- As with all other multiple-tier configurations, this workflow allows editors to develop proposals and design alternatives without affecting the production database.
- Changes can be posted directly to DEFAULT, which eliminates the overhead of progressively posting changes up the version tree to the DEFAULT version.
Cons:
- Not suitable for batch reconcile and post processes
Archiving
A key requirement for many projects is the preservation of various states of the database as it changes over time. Some of the typical queries a geodatabase may have to support include the following:
- What did the database look like at a given time?
- How has a particular feature changed over time?
- Given that a feature was removed from the database on a certain date, what features currently exist where the deleted feature used to be?
A common requirement for maintaining a historical record is to preserve an archive of the DEFAULT version, since it usually represents the published version of the database. Changes to the DEFAULT version can occur as a result of direct edits or of changes being reconciled and posted to it from other versions. A geodatabase can be set up to automatically record these changes. This functionality is built into the geodatabase; no additional data modeling or application customization is required to support automated archiving.
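To make the first two questions above concrete, the following sketch shows one common way such an archive can be queried: every recorded state of a feature carries the time span during which it was current. This is a toy model, not the geodatabase's actual archive schema; the field names `valid_from` and `valid_to`, the helper functions, and the sample features are all assumptions made for illustration.

```python
from datetime import datetime


def record_change(archive, oid, value, when):
    """Close the current row for `oid`, then store the new state.

    Passing value=None records a deletion (no new row is opened).
    """
    for row in archive:
        if row["oid"] == oid and row["valid_to"] is None:
            row["valid_to"] = when
    if value is not None:
        archive.append(
            {"oid": oid, "value": value, "valid_from": when, "valid_to": None}
        )


def as_of(archive, when):
    """What did the data look like at a given time?"""
    return {
        r["oid"]: r["value"]
        for r in archive
        if r["valid_from"] <= when
        and (r["valid_to"] is None or when < r["valid_to"])
    }


def history(archive, oid):
    """How has a particular feature changed over time?"""
    return [(r["valid_from"], r["value"]) for r in archive if r["oid"] == oid]


archive = []
record_change(archive, 1, "valve open", datetime(2020, 1, 1))
record_change(archive, 2, "hydrant", datetime(2020, 3, 1))
record_change(archive, 1, "valve closed", datetime(2020, 6, 1))
record_change(archive, 2, None, datetime(2020, 9, 1))  # feature 2 deleted

spring = as_of(archive, datetime(2020, 4, 1))
winter = as_of(archive, datetime(2020, 12, 1))
valve_history = history(archive, 1)
```

In this model, `spring` contains both features in their April state, while `winter` contains only the closed valve because the hydrant was deleted in September.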
Some projects require an archive of a version other than the DEFAULT. Since a version represents the state of its parent version at the time you create it, you can create a version with the sole purpose of recording what its parent version looked like at a particular point in time. As an example, a new historical version could be created from the design version. When the design version is reconciled and posted to its parent version, the historical version would remain as a record of the design at a particular point in time.
For more detailed information on archiving, see The archive process.
Distributed data management
Some projects require two or more distant offices to work on the same data. Each office requires local access to the database, and so each creates a copy of the database. When a change is made to the data in one location, the change must also be applied to the data in the other location. To keep the copies of the databases synchronized, the sites can transfer changes between each other on a scheduled basis. This capability is referred to as geodatabase replication.
Replication also allows you to take a subset of the geodatabase on the road and edit it offline, a common requirement for field maintenance crews. Once you return to the office, you can reconnect to the network and merge the changes back into the production database.
Replication is also helpful to anyone who would otherwise have to edit data over a slow network. In this case, replication allows you to extract a subset of the data to your local machine so that you can work on it without having to communicate over the network. Once you're finished editing, you can transfer the changes over the network, merging them back into the production database. For more information, see Scenarios using distributed data.