I’ve had some questions about large sets of ortho imagery tiles and how to turn them into one seamless raster while maintaining performance. The new Mosaic Datasets are great, but what if you only have 9.3? Here is one way to merge the tiles using the Raster Mosaic tools available in both 9.x and 10.x. “Mosaic Large Images to a Single Raster in ArcGIS” is a case study based on ArcGIS Desktop 9.3 running on decent, yet older, hardware, comparing different compression options and their visual results. Please note: if you have 10.x, many newer options exist, and I would recommend taking a good look at them before jumping into this approach (although it is still a viable option).

The Scenario

We used 7,600 uncompressed 4-band TIFF orthos, each 1 km in size and 120 MB (approx. 1 TB total). You can store the output in a file Geodatabase (which can handle huge rasters and lots of viewers) or in SDE (which adds enterprise access and security). Before you start, it’s recommended to try your approach on a subset of the data to see how long it will take and whether the quality of the results meets your needs.
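
To put numbers on the scenario, here is a small back-of-the-envelope sketch. The tile count and tile size come from the figures above; the trial run (100 tiles in 1 hour) is purely hypothetical, and the linear scaling is an assumption that a real test on your own hardware should confirm.

```python
# Rough sizing for the scenario. Tile count and size are from the post;
# the trial timing below is a made-up illustration, not a measurement.
TILE_COUNT = 7600
TILE_SIZE_MB = 120


def total_size_tb(tile_count, tile_size_mb):
    """Total input size in decimal terabytes (1 TB = 1,000,000 MB)."""
    return tile_count * tile_size_mb / 1_000_000


def extrapolate_hours(subset_tiles, subset_hours, total_tiles):
    """Scale a subset's run time up to the full set (assumes linear scaling)."""
    return subset_hours * total_tiles / subset_tiles


print(total_size_tb(TILE_COUNT, TILE_SIZE_MB))   # 0.912

# Hypothetical trial: a 100-tile subset took 1 hour to mosaic.
print(extrapolate_hours(100, 1.0, TILE_COUNT))   # 76.0
```

Mosaic time may not grow linearly with tile count, so treat the extrapolation as a first guess only.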

The Steps [processing time]

  • Create a file Geodatabase [< 1 min.]

    • Even if you are planning to save your raster in SDE, you might want to consider processing the mosaic into a file Geodatabase first. This way you can limit the processing required and run locally for faster results. Copying afterwards does add to the total processing time - but also makes sure there aren’t any issues with an input raster and you don’t need to worry about indexes or logs getting bloated if you make any mistakes.
  • Create a Raster Catalog: Make it “unmanaged” to refer to images without loading/copying [< 1 min.]
  • Run the ‘Workspace To Raster Catalog’ tool (all TIFFs in one directory) [10 min. for all 7,600 files]
  • Set the spatial reference properties of the catalog to match the data source [< 1 min.]
  • Run the ‘Raster Catalog To Raster Dataset’ tool: The settings vary depending on the approach; we tested two. Note: Create pyramids afterwards, not as part of the process. [3 days (old hardware)]

    • Option 1 - LZ77 compression: Before running the tool, open the environment settings, select not to create pyramids, and set the compression option to LZ77. We also unchecked “raster statistics” since we didn’t want the default colour stretching. Raster stats are on by default, but they lead to the usual question “why does the colour look different now?” after processing imagery (yes, you can turn the stretch off in the ArcMap layer properties, but it resets each time).
    • Option 2 - JPEG compression: We decided on a compression quality of 80, for no specific reason, but the results were almost as good as LZ77. The compression quality and pixel size both affect the visual result - again, best to play with a small sample area to get a better idea of what works for you.
  • If your final destination is SDE, copy your raster at this point using best practices for loading large data to your enterprise database. When loading to SDE, set the environment setting to create pyramids. If you are keeping the raster in the file Geodatabase, you will still need to create the pyramids - we used the default options and number of levels. You can also try Bilinear vs Nearest Neighbour resampling to see which you like better.
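
The steps above can be sketched as a script. This is a minimal sketch, not the exact process we ran: it assumes the ArcGIS 10.x `arcpy` tool names and environment settings (9.3 scripts would go through the `arcgisscripting` geoprocessor instead), the names `ortho.gdb`, `ortho_catalog` and `ortho_mosaic` are hypothetical, and the catalog spatial reference step is left out. The geoprocessor is passed in as `gp`, so the `DryRun` stand-in lets you check the order of calls without ArcGIS installed.

```python
import os
from types import SimpleNamespace


def mosaic_tiles(gp, tiff_dir, out_folder, compression="LZ77"):
    """Run the mosaic steps with geoprocessor `gp` (e.g. the arcpy module)."""
    gdb = os.path.join(out_folder, "ortho.gdb")      # hypothetical names
    catalog = os.path.join(gdb, "ortho_catalog")
    mosaic = os.path.join(gdb, "ortho_mosaic")

    gp.CreateFileGDB_management(out_folder, "ortho.gdb")
    # Unmanaged: the catalog refers to the TIFFs instead of copying them.
    gp.CreateRasterCatalog_management(
        out_path=gdb, out_name="ortho_catalog",
        raster_management_type="UNMANAGED")
    gp.WorkspaceToRasterCatalog_management(tiff_dir, catalog)

    # No pyramids or statistics during the mosaic; set the compression.
    gp.env.pyramid = "NONE"
    gp.env.rasterStatistics = "NONE"
    gp.env.compression = compression                 # "LZ77" or "JPEG 80"
    gp.RasterCatalogToRasterDataset_management(catalog, mosaic)

    # Build pyramids once, at the very end (assumed: all levels, nearest).
    gp.env.pyramid = "PYRAMIDS -1 NEAREST"
    gp.BuildPyramids_management(mosaic)
    return mosaic


class DryRun:
    """Stand-in geoprocessor that records tool names instead of running them."""

    def __init__(self):
        self.env = SimpleNamespace()
        self.calls = []

    def __getattr__(self, tool):
        def record(*args, **kwargs):
            self.calls.append(tool)
        return record


gp = DryRun()
mosaic_tiles(gp, "D:/orthos", "D:/work", compression="JPEG 80")
print(gp.calls)
```

With ArcGIS installed you would pass `arcpy` itself as `gp`. Try it on a small directory of tiles first, and check the tool parameters against the help for your version.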

The Results

Using the method above, both LZ77 and JPG compression options generated outstanding results. The LZ77 (lossless) compression has a slightly better visual output at the cost of extra hard drive space. If your tests show little visual difference, I would recommend JPG compression to save space and network load - however, the clarity of LZ77 is hard to beat, especially if you have a decent network.

  • File size: When using LZ77, we found the output size was the same as the total input size (1TB). The compression helped, but building pyramids used up the savings. JPG at 80 was 350GB (including pyramids). Advantage JPG.
  • Load times: Load times were comparable for LZ77 and JPG, but if your network is slow or constantly under heavy load it might make a difference, since LZ77 uses more bandwidth because of its larger file size. Small advantage JPG.
  • Visual output: Most people don’t have an issue with JPG quality - just don’t show them LZ77 and they won’t know what they are missing. Depending on the landscape, there can be some differences that are hard to overlook. Overall, there is a difference when using lossless compression - your intended use of the data will help you determine which is more important. Advantage LZ77.

Overall, a small advantage to JPG compression - but depending on the raster, LZ77 definitely has its place.
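
As a rough check on the file sizes, the sketch below backs the pyramids out of the two measured totals using the 33% rule of thumb. Only the 1TB and 350GB totals were measured; the base sizes it prints are estimates, and it assumes 1TB = 1,000GB and that the rule of thumb holds exactly.

```python
# Measured totals from the case study (including pyramids), in GB.
INPUT_GB = 1000
LZ77_TOTAL_GB = 1000
JPEG_TOTAL_GB = 350

PYRAMID_OVERHEAD = 0.33   # rule of thumb, assumed to hold exactly


def base_size_gb(total_with_pyramids_gb, overhead=PYRAMID_OVERHEAD):
    """Estimate the size of the raster itself, without its pyramids."""
    return total_with_pyramids_gb / (1 + overhead)


lz77_base = base_size_gb(LZ77_TOTAL_GB)
jpeg_base = base_size_gb(JPEG_TOTAL_GB)

print(round(lz77_base))   # 752
print(round(jpeg_base))   # 263

# Share of the original input size taken by each base raster.
print(round(100 * lz77_base / INPUT_GB))   # 75
print(round(100 * jpeg_base / INPUT_GB))   # 26
```

Under these assumptions LZ77 on its own trimmed about a quarter off the input before the pyramids added it back, while the JPG raster came out at roughly a quarter of the input size.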

A few lessons learned

  1. Test, test, test. Pick a small representative area and run many different tests until you find the settings that give results that meet your specific needs.
  2. Overall, a good rule of thumb is that pyramids will add 33% more storage space.
  3. Ensure that you don’t build pyramids until the very end of processing (i.e. when uploading the last tile, or in a new process after the entire load is complete). By default ArcGIS will rebuild pyramids for the entire raster dataset with each new tile, which adds an excessive amount of processing time.
  4. Use JPG (75-80) compression unless you determine a requirement for LZ77. To me this is the best tradeoff between image quality and file size/performance. MrSID and JPEG2000 need to be decompressed on load, which adds to draw time. LZ77 is perfectly good, but it might be a bit slower than JPG because of the larger network load, and most users won’t see any difference.
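
Lessons 2 and 3 can both be checked with a little arithmetic. The sketch below assumes each pyramid level halves the resolution in both directions (so it holds a quarter of the pixels of the level before it) and ignores compression. It also uses a simplified cost model in which rebuilding the pyramids after every tile costs one unit of work per tile already loaded. Both are assumptions for illustration, not measurements.

```python
def pyramid_overhead(levels):
    """Extra storage as a fraction of the base raster.

    Assumes each level has 1/4 of the pixels of the previous one,
    so the overhead is 1/4 + 1/16 + ... which approaches 1/3.
    """
    return sum(0.25 ** k for k in range(1, levels + 1))


def rebuild_work(tiles):
    """Tile-units of pyramid work when rebuilding after every new tile.

    Simplified model: after tile k the whole dataset (k tiles) is
    reprocessed, so the total is 1 + 2 + ... + n.
    """
    return tiles * (tiles + 1) // 2


print(round(pyramid_overhead(8), 4))   # 0.3333

# 7,600 tiles: rebuild every time vs. build once at the end.
print(rebuild_work(7600))              # 28883800
print(rebuild_work(7600) / 7600)       # 3800.5
```

Under this model, rebuilding after every tile does thousands of times more pyramid work than a single build at the end, which is why the order of operations matters so much on a load of this size.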

Final Thoughts

The goal of this case study was to try various methods of creating raster mosaics and to review compression and visual output on a large scale. Both of the compression types listed worked great. Other approaches and settings were also tested, but they either didn’t meet our final requirements or used compression options that force applications to decompress the imagery, which can create some additional client load.

There are many other approaches, including Mosaic Datasets or using ArcGIS Server and creating a tiled service. This is one approach that works very well if you are using 9.x and can be leveraged by both server and desktop applications - and yes, even 10.x. This approach also gives you the option to store the data inside an Enterprise or file Geodatabase.