Monday, June 16, 2014

Validating a File GeoDatabase Function

I recently created a bunch of data loading scripts that pull data from a URL and copy it down locally.  Nothing too special, but I noticed an interesting issue.  The file geodatabase (fgdb) would sometimes randomly get corrupted, and instead of seeing the fgdb as a database, the ArcGIS software would see it as a folder.  I created a little function that uses arcpy.Describe() to check the workspace factory ProgID, and if it reports a folder instead of a fgdb, I erase and recreate the corrupted fgdb.  This seems to resolve the issue.
import os
import shutil  # needed for shutil.rmtree() below
import arcpy
from arcpy import env
def validate_fgdb(fgdb=env.scratchGDB):
    """
       Checks to see if the FGDB isn't corrupt.  If
       the FGDB is corrupt, it then erases the database
       and replaces it with a new FGDB of the same name.
       Input:
          fgdb - string - full path of the file geodatabase
       Output:
          Boolean
    """
    try:
        fgdb = str(fgdb)
        if os.path.isdir(fgdb) and \
           fgdb.lower().endswith(".gdb"):
            print('workspace possible fgdb')
            desc = arcpy.Describe(fgdb)
            if hasattr(desc, "workspaceFactoryProgID") and \
               desc.workspaceFactoryProgID != "esriDataSourcesGDB.FileGDBWorkspaceFactory.1":
                shutil.rmtree(fgdb, ignore_errors=True)
                arcpy.CreateFileGDB_management(out_folder_path=os.path.dirname(fgdb),
                                               out_name=os.path.basename(fgdb))
                return True
            elif hasattr(desc, "workspaceFactoryProgID") and \
                 desc.workspaceFactoryProgID == "esriDataSourcesGDB.FileGDBWorkspaceFactory.1":
                return True
            else:
                return False

        elif not os.path.isdir(fgdb) and \
             fgdb.lower().endswith(".gdb"):
            arcpy.CreateFileGDB_management(out_folder_path=os.path.dirname(fgdb),
                                           out_name=os.path.basename(fgdb))
            return True
        else:
            return False
    except Exception:
        # Report the geoprocessing errors and still return a Boolean.
        arcpy.AddError(arcpy.GetMessages(2))
        return False

Basically, all it does is take a path to anything and check whether (1) it exists and (2) it is a fgdb. I did add one extra benefit: if the path ends in ".gdb" but does not exist, the function creates the fgdb.
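
To make the branching easier to see, here is a minimal sketch of just the path checks, with the arcpy calls left out so it runs anywhere.  The helper name classify_gdb_path and the labels it returns are made up for this illustration; they are not part of the function above or of arcpy.

```python
import os
import tempfile


def classify_gdb_path(path):
    """Return which branch validate_fgdb would take for this path.

    Hypothetical helper for illustration only; it never calls arcpy.
    """
    path = str(path)
    if not path.lower().endswith(".gdb"):
        return "reject"    # not a .gdb path: validate_fgdb returns False
    if os.path.isdir(path):
        return "describe"  # folder exists: inspect it with arcpy.Describe
    return "create"        # nothing there yet: create a new fgdb


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    existing = os.path.join(root, "Existing.GDB")
    os.mkdir(existing)  # stands in for a fgdb folder on disk
    print(classify_gdb_path(existing))
    print(classify_gdb_path(os.path.join(root, "missing.gdb")))
    print(classify_gdb_path(os.path.join(root, "notes.txt")))
```

Only the "describe" case needs the ProgID comparison; the other two are decided from the path alone.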

I have a bunch of theories on why the fgdb gets corrupted, but the leading suspect is that erasing tables from the fgdb causes something to get hosed in the database.  Table overwrites tend to trigger the issue as well.


Anyway, enjoy, and hopefully you can now eliminate this issue with your scheduled tasks or services.