python •  • 

Multiple Field Key to Single Field Key for Relates using ArcPy

An issue with multi-field unique keys (composite key) is that we can’t use them to setup a join or relate between objects (Feature Classes or Tables) in ArcMap. In this article we will use Python and the ArcPy module to update a unique Multiple Field Key to Single Field Key for Relates using ArcPy.

Multiple Field Key to Single Field Key for Relates using ArcPy?

Python scripting isn’t the only way to do this as there are some OOTB tool combinations that can give us the same result. As well, there are some ArcMap/Pro tricks and database objects that can all be potential solutions. But Python is fun, fast, and can make quick work regardless of the data format used.

The Scenario

For this scenario lets assume we have a Feature Class (FC) and a Table with a 1:M relationship. The 4 fields that make each record unique in the parent FC are: F1, F2, F3, F4. These four fields also exist in the related table, although there can be zero to many related records here. Please note, in this scenario it is assumed the 4 field combination in the parent FC is unique with only once instance of the combination. If there is a chance your data has some duplicates, you can use the Summary Analysis tool with the case fields option to do some QC before you begin. This solution works on most editable formats including File/SDE Geodatabases and Shapefiles.

Multi-field relationship diagram Multi-field relationship diagram

There are two ways we can create the unique IDs for the objects:

  1. Use the ObjectID field as the unique key on the parent FC, and add a new field to related table and populate with the same IDs.
  2. Add a new field to both tables, allowing the field names to be identical. Then use a number sequence or the ObjectID to populate the new fields. We will use this approach as it gives more flexibility if needed.

And let’s try to do it in 30 lines of code or less.

The Approach

Since we are updating 2 objects at the same time we will need to use the ArcPy Editor class. If you’ve done simple ArcPy insert/update/delete statements before you may not have used this class - but it should be considered even if not required as it allows use of edit sessions and operations to manage transactions.

The Editor class can be used to start and stop edit sessions and edit operations for file, personal, and enterprise geodatabases and shapefiles. The Editor class can be used to start an edit session with versioned or nonversioned datasets. ~ Esri Online Help

Using the Editor class is pretty straight forward - pass the edit workspace, and start editing:

edit = arcpy.da.Editor(edit_ws)
edit.startEditing(False, False)

Once we are editing, we will use a cursor to update each of the parent FC’s linkID based on the ObjectID. Again, we could just use the ObjectID directly in this case, but the example also shows how to update 2 objects at the same time. As we pass over each parent record we will jump to the related table to find all matching items and update with the same linkID. The field LinkID is assumed to already exist with a Long Integer field type.

To loop and update the parent FC, we start with an UpdateCursor:

with arcpy.da.UpdateCursor(xs_fc, xs_att) as cursor:
  for row in cursor:
    #FYI: row[0] is LinkID field, row[5] is the ObjectID
    row[0] = row[5]

Now that the LinkID field has been updated with the current ObjectID in the parent, we need to find all of the features with the same multi-field combination in the related table to update with the same LinkID. For this, we create a cursor on the related table passing in an optional query parameter so we only loop the records matching the parent attributes (F1,F2,F3,F4). The code has this section separated into a function for readability purposes since it looks very similar to the parent FC cursor snippet. The query parameter is the part of the code making sure we have the multi-field relationship matching between the two objects.

xs_tbl_qry = xs_att[1] + " = '{0}' AND ".format(row[1]) 
  + xs_att[2] + " = '{0}' AND ".format(row[2]) 
  + xs_att[3] + " = '{0}' AND ".format(row[3]) 
  + xs_att[4] + " = {0}".format(row[4])

Pseudo query from above: F1 = val AND F2 = val AND F3 = val AND F4 = val.

Once we’ve looped through all parent/child relationships, we just need to save and stop our edit session to commit the changes.

The Code

If we remove all the commenting lines, this only takes 19 lines of code, so we did stay in our 30 line max.  The workspace is set to a File Geodatabase, however this also works on SDE Geodatabases and Shapefile/DBF workspaces.

#Name: Multiple Field Key to Single Field Key for Relates using Python and ArcPy
#Author: Bryan McIntosh
#Description: For updating a single LINK_ID field in both the parent and child table
# based on a natural 4 column composite index relationship.
import arcpy
edit_ws = r'C:\SpatialTimes.gdb' #Edit workspace
xs_fc = r'C:\SpatialTimes.gdb\CrossSection' #parent FC
xs_tbl = r'C:\SpatialTimes.gdb\CrossSection_RatingCurves' #related table
xs_att = ['Link_ID','F1','F2','F3','F4','OID@']
def fnUpdateXSTBL(qry, linkID):
with arcpy.da.UpdateCursor(xs_tbl,xs_att, qry) as cursor2:
for row in cursor2:
row[0] = linkID
edit = arcpy.da.Editor(edit_ws) #Set the edit workspace
edit.startEditing(False, False) #Note: change second param to true if using SDE WS
with arcpy.da.UpdateCursor(xs_fc, xs_att) as cursor:
for row in cursor:
row[0] = row[5] #Makes the LinkID equal to the current ObjectID of parent
cursor.updateRow(row) #Update the parent row
#Set the query for the multi-field natural key (fields are F1, F2, F3, F4)
xs_tbl_qry = xs_att[1] + " = '{0}' AND ".format(row[1]) + xs_att[2] + " = '{0}' AND ".format(row[2]) + xs_att[3] + " = '{0}' AND ".format(row[3]) + xs_att[4] + " = {0}".format(row[4])
fnUpdateXSTBL(xs_tbl_qry, row[5]) # Send the query and parent ObjectID to update the related table

Other Options You Say?

Not interested in using Python? Below are some additional approaches that can also work:

  1. Create a new field and concatenate the multi-fields into a big string: This solution works most of the time, but joining based on a long text based field isn’t desired nor efficient. If using a Shapefile/DBF with lots of rows, this can also bloat the size of the objects, possibly even hitting the 2GB limit. In any case, when updates are made to either table in the relationship, the fields will need to be recalculated (could automate very easily).
  2. Use database views or Query tables to create a virtual field on both tables. This is very similar to option 1 but is dynamically updated. Database views are server side while query tables are client side and not available to all users. Although query tables also work on non RDBMS formats.
  3. Use ModelBuilder. There are numerous ways this can be accomplished, for example: using option #1 above as the temporary join, then swapping out with a unique number ID. The unique number can be generated using the Frequency Tool or Summary Statistics Tool. Then create the temp concatenate field on the 1:M objects, and the output from the stats tool. Use the concate field to temporarily join, then update the final relate ID field using the OID from Stats. Delete the temp concatenate field when done. Lots of issues with this approach since you will need a schema lock on the objects, as well as permissions to add/delete fields (fine for local files, not so much for database users).

Related Links

If you found my writing entertaining or useful and want to say thanks, you can always buy me a coffee.