Tuesday, May 8, 2012

Finding Duplicates (3 Ways)

In this post I will show some methods of finding duplicate values in an enterprise SDE database.
The general query is quite simple.  We will perform a SELECT on the table and return the values and the number of times it repeats within that table
First I will use cx_Oracle module to query a table and return both the count and field value that has duplicates:
import cx_Oracle
#.... connection 
information
#.... create cursor
columnName = "ClientName" 
table = "clients" 
sql = "SELECT %s, count(*) from %s GROUP BY %s HAVING count(*) > 1" % (columnName, table, columnName) 
print cursor.execute(sql).fetchall()
That will return a 2 column list of tuples where as a developer you can then parse that information as needed.

Let's assume you want to do it a more ArcPy way.  ArcSDESQLExecute class can be used to perform the same SQL statement.  The ArcSDESQLExecute class provides a means of executing SQL statements via an ArcSDE connection (source: webhelp.esri.com).

# Two ways to create the object, which also creates the connection to ArcSDE. 
# Using the first method, pass a set of strings containing the connection properties: 
# sdeConn = arcpy.ArcSDESQLExecute("gpserver3","5151","#","toolbox","toolbox") 
# Using the second method pass the path to a valid ArcSDE connection file 
# sdeConn = arcpy.ArcSDESQLExecute("data\Connection to GPSERVER3.sde")
sdeConn = arcpy.ArcSDESQLExecute(r"c:\temp\connectionFile.sde"

columnName = "ClientName" 
table = "clients" 
sql = "SELECT %s, count(*) from %s GROUP BY %s HAVING count(*) > 1" % (columnName, table, columnName) 

results = sdeConn.execute(sql)
for result in results:
   print result # List with [column value, # of duplicates]


You can always turns to ArcToolbox for help if you do not like using SQL. Using Summary Statistics you can get the count of the values in a specific field. It most likely will be slower for large data sets compared to the other two method mentioned above, but it can be used for any supported data set in ArcGIS.

Happy Duplicate Finding

Monday, May 7, 2012

ArcPy - List Columns for Table

In my previous post, I showed you how to use pure SQL in oracle to list the fields of a table.  This can be done using ArcPy as well.
ArcPy has a function called: ListFields (dataset, {wild_card}, {field_type}) where all you are required to pass is the dataset. The wild card and field type are limiters that can help slim down the results.
sdeFile = r"c:\temp\connectionFile.sde" 
fc = sdeFile + os.sep + "myFeatClass" 
fields = arcpy.ListFields(fc) 
for field in fields:
   print field.name

The code returns a field object from which you can access a whole set of properties. This method is preferred when examining spatial data because it provides the data in a format that the Arc system can read and understand.
Enjoy

Tuesday, May 1, 2012

Listing Oracle Columns Using Python

I've started coding a set of helpful functions for oracle and python using the cx_oracle library for python.  Just download the library and install the one associated with your python install.

In this example, I will list a set of columns in a table using python. First we need to connect to the oracle database
import cx_Oracle
username = "user" 
password = "*****" 
ip = "111.11.11.11" 
port = 1521 
SID = 'ORCL' 
table = 'mytablename' 
dsn_tns = cx_Oracle.makedsn(ip, port, SID) 
db = cx_Oracle.connect(user,passwword, dsn_tns)

All I did was create the entry like it would appear in the tnsname.ora file then pass that information along with the username and password to the database.
Great, now we are connected to the oracle database.  Now the database can be queried.

cursor = db.cursor()
results = cursor.execute("select * from %s where 1=0" % table)
print results.description


[('OBJECTID', <type 'cx_oracle.number'="">, 39, 22, 38, 0, 0), ('NEWNAME', <type 'cx_oracle.unicode'="">, 50, 100, 0, 0, 1)]

Now you do not have to use arcpy to list the fields in a table.  Some other ideas you could do with your enterprise database is: rename fields, create views, etc...

Enjoy

Friday, April 27, 2012

Simple enum implementation in python

When using Arcpy, sometimes I always felts that the parameters that require specific entries, like workspace type for layer.replaceDataSource() would be best served if they were an enumerated value.
Python has rejected the enum type, but it does not mean you cannot create your own enum type. Let's get started:

 def __enum(**enums): 
    return type('Enum', (), enums)
Now we can define the enum values as needed.

WorkSpaceType = __enum(ACCESS_WORKSPACE="ACCESS_WORKSPACE",
                       ARCINFO_WORKSPACE="ARCINFO_WORKSPACE",
                       CAD_WORKSPACE="CAD_WORKSPACE",
                       EXCEL_WORKSPACE="EXCEL_WORKSPACE",
                       FILEGDB_WORKSPACE="FILEGDB_WORKSPACE",
                       NONE="NONE",
                       OLEDB_WORKSPACE="OLEDB_WORKSPACE",
                       PCCOVERAGE_WORKSPACE="PCCOVERAGE_WORKSPACE",
                       RASTER_WORKSPACE="RASTER_WORKSPACE",
                       SDE_WORKSPACE="SDE_WORKSPACE",
                       SHAPEFILE_WORKSPACE="SHAPEFILE_WORKSPACE",                     
                       TEXT_WORKSPACE="TEXT_WORKSPACE",
                       TIN_WORKSPACE="TIN_WORKSPACE",
                       VPF_WORKSPACE="VPF_WORKSPACE")

To use the newly created type, just call it like a property on any object.
>>> print WorkSpaceType.CAD_WORKSPACE
CAD_WORKSPACE

Enjoy

Friday, April 6, 2012

Snippet: Add a Spatial Index To All Feature Classes

Ever need to run a batch spatial index in a work space? Yes, well here is a snippet to help you out. import arcpy
from arcpy import env

env.workspace = arcpy.GetParameterAsText(0) # workspace parameter
fcs = arcpy.ListFeatureClasses()
for fc in fcs:
   arcpy.AddSpatialIndex_management(fc)
print 'fin'
Enjoy

Thursday, April 5, 2012

Python Snippet, Writing Text To A File

Let's say you have a program that runs a long process, and you want to write information to a text file. Using the file IO functions, you can easily do this:

data = ["a","b","c","d"]
myFile = open("info.dat", 'w',0)
for item in data:
    myFile.write("%s\n" % item)
myFile.flush()
myFile.close()
del myFile
del data


Now you have a text file where each item in the list is on a new line.

Enjoy

Wednesday, April 4, 2012

Adding Python Extensions to ArcCatalog File Types

When writing complex python processes, sometimes you do not want to put all the logic in a single python file.  This means if you need to make an edit during testing from ArcCatalog, you will have to have file explorer open and switch between screens.

As a programmer, this is unacceptable, so make life easier and add the PY extensions as a viewable file extension in ArcCatalog.
  1. To do this, bring up ArcCatalog and select Customize from the menu bar.
  2. Next, select 'ArcCatalog Options'
  3. Click on the 'File Types' tab
  4. For Standard Installations:
    • If you did a standard installation, the computer should already have python installed, so click on 'Import File Type From Registry'
    • Select the PY extension in the list
  5. For Non-Standard Installations, or Step 4 does not work:
    • For 'File Extension' type PY
    • 'Description Type' - Python File
    • Set the icon to: C:\Python26\ArcGIS10.0\DLLs\py.ic
  6. Press OK to save your changes
  7. Press Apply then OK one more time
  8. Now you can see your python files in ArcCatalog
Enjoy