Map Service to Shapefile with Python Part 2 - Iteration

This post builds on the previous Extract a Map Service Layer to Shapefile using Python post. It seems as though many are interested in getting past the pesky limits ArcGIS Server has on querying/extracting data to a shapefile. The initial post does mention this limitation, which is typically 1000 or 5000 features by default depending on the version (and configurable by admins). Part 2 will focus on getting past that limitation to all the juicy goodness… errr, all the data.

Note 2016-10-31: Sample code updated to sort objectIDs before creating groups.

Before We Begin

It might be a good idea to refresh on the details of part 1 as we aren’t covering the same pieces again. And before we go too far, the same disclaimers apply: Make sure you have permission/rights; use this code for good; Don’t bombard servers with your requests; Donate to SpatialTimes (worth a try).

Map Service to Shapefile with Python, No Limits

The first thing to check is the maximum query results allowed by ArcGIS Server. Since this can be configured by the Admin, it might be worth performing a manual query to see what the limits are. For this example, I’m assuming the typical Max Results = 1000 (most Admins don’t bother changing the default).

Get All ObjectIDs, Yes All of Them

Once the maximum is set, we can extract all of the ObjectIDs from the layer and begin to divide this into chunks of 1000 (or other value). Although the maximum results are set, the &returnIdsOnly=true parameter doesn’t have this same restriction. You can return all IDs regardless of how many there are. This result sets up the rest of our queries since you can’t assume ObjectIDs are sequential with no gaps. Update 2016-10-31: The IDsJSON variable originally assumed the IDs were in order which isn’t always the case (added the sorted function).

IDsRequest = serviceURL + serviceMap + "/" + str(serviceLayerID) 
  + "/query?where=1%3D1&returnIdsOnly=true&f=pjson"
IDsResponse = urllib2.urlopen(IDsRequest)
IDsJSON = json.loads(IDsResponse.read())
IDsSorted = sorted(IDsJSON['objectIds'])

Divide and Conquer

Now that we have all the IDs, we need to turn these results into usable chunks, then setup our extraction queries - each chunk will run another query. A list will be created for the first 1000, then the next 1000, and the next. The last list will be any remainder (probably <1000).

args = [iter(AllOIDs)] * maxRequests
ListOfLists = ([e for e in t if e != None] for t in itertools.izip_longest(*args))

Putting it all Together

We are now ready to take the Map Service to Shapefile with Python. We have the Map Service, all IDs, and divided those IDs into n lists. To query each list, we just use a python loop and make the same request as in the original post.

for SingleList in GroupOfLists:
    Query the service with ObjectID query
    Make a shapefile

This will create a unique shapefile for each chunk/iteration. If you want to merge/append into a single output, you can add a few lines of code to the end, or just use the Merge Tool or Append tool.

Below is the full code with the only changes required to the URL, Service, and Layer ID variables.