Another GIS Blog
Python, .NET, C++, GIS, and Computer Mysticism
Thursday, February 8, 2018
See me live and in-person at the Esri DevSummit 2018!
Like Pokemon, you gotta catch them all.
Tomato's optional.
Monday, July 10, 2017
ArcGIS API for Python
Sorry for not posting, folks, but I've been busy working on the ArcGIS API for Python.
For those who don't know, I wrote ArcREST with a colleague of mine, Mike. Since the announcement of the ArcGIS API for Python, we have decided to retire ArcREST, though if pull requests are submitted to fix critical issues, we will do our best to merge them.
So what does that leave people to work with for the REST API? There is a great package from Esri: the ArcGIS API for Python. Get it here (https://developers.arcgis.com/python/guide/install-and-set-up/).
It has tons of great features for everyday users, content producers, and administrators. Another benefit is that you get the new Spatial DataFrame!
The Spatial DataFrame (SDF) is built on top of the pandas DataFrame to display two-dimensional data with labelled columns. The SDF can pull data from services and local feature classes. It is a quick way to merge your online content with local content, then republish it as a new service or save it as a feature class.
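As a rough sketch of that round trip (the credentials, item ID, and paths below are placeholders, and the calls assume the 1.x SpatialDataFrame methods; check the API docs for exact signatures):

from arcgis.gis import GIS
from arcgis.features import SpatialDataFrame

# placeholder credentials and item ID -- swap in your own
gis = GIS("https://www.arcgis.com", "my_username", "my_password")
layer = gis.content.get("your_feature_layer_item_id").layers[0]

# pull the hosted layer into a Spatial DataFrame
service_sdf = SpatialDataFrame.from_layer(layer)

# pull a local feature class into a Spatial DataFrame (needs arcpy)
local_sdf = SpatialDataFrame.from_featureclass(r"C:\data\example.gdb\parcels")

# after any pandas-style merging/cleanup, write the result back out
service_sdf.to_featureclass(out_location=r"C:\data\example.gdb",
                            out_name="from_service")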
At v1.2 of the API, we added support for ArcGIS Server and enhanced the ArcGIS Enterprise and ArcGIS Online support.
I highly recommend you start moving to the ArcGIS API for Python.
I look forward to seeing and hearing what you do with the Python API.
If you want to see me at Esri User Conference 2017, I will be presenting the following sessions:
Session | Location | Date/Time |
Administering ArcGIS Enterprise and ArcGIS Online with Python | SDCC - Ballroom 06 D | 07/11/2017 - 8:30 am - 9:45 am |
Automating Enterprise GIS Administration using Python | SDCC - Demo Theater 11 | 07/11/2017 - 4:30 pm - 5:15 pm |
Administering ArcGIS Enterprise and ArcGIS Online with Python | SDCC - Ballroom 06 C | 07/13/2017 - 8:30 am - 9:45 am |
Tomato throwing is optional.
See ya there
Labels:
2017,
ArcGIS API for Python,
Esri,
UC
Wednesday, December 14, 2016
Microsoft Compiler for Python 2.7
Doesn't everyone hate this message:
Microsoft Visual C++ 9.0 is required (Unable to find vcvarsall.bat).
I sure do, and I solved it by downloading a helpful program from Microsoft! Don't believe me? Just google it! Install Microsoft's compiler for Python 2.7 from here: https://www.microsoft.com/en-us/download/details.aspx?id=44266 and most of your pip installs should work!
Enjoy
Labels:
Python
Thursday, December 1, 2016
So long VBA and Thanks for all the Memories
Microsoft has stopped providing fixes or troubleshooting for VB and VBA. Esri just announced the same in the 10.5 release. It's time to update that old code.
Some options moving forward for your old desktop applications are:
- .NET
- Python
It IS time to re-evaluate the application and see how it fits into the other development frameworks.
VB/VBA is officially dead.
Check out the article here: (https://blogs.esri.com/esri/arcgis/2016/11/14/arcgis-desktop-and-vba-moving-forward/)
Labels:
ArcGIS 10.5
Monday, September 19, 2016
Configuring Jupyter Notebook Startup Folder
By default, when you install Jupyter Notebook (formerly IPython Notebook), the product points to Windows' My Documents folder. I find this less than optimal because My Documents can contain a mishmash of various documents. To change the startup directory, there is a runtime option where you can specify a folder, but that is not a permanent solution. A better solution is to create a configuration file.
After Jupyter is installed (I used Anaconda's distribution of Python 3.5), navigate to the folder containing jupyter.exe.
- Type the following:
jupyter notebook --generate-config
- This will generate an entry in your user profile:
~/.jupyter
- Edit the jupyter_notebook_config.py file and find
c.NotebookApp.notebook_dir
- Uncomment the entry and enter your path (see the sketch below)
- Save any file changes and start jupyter
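For reference, the edited line in jupyter_notebook_config.py ends up looking something like this (the folder path here is just a placeholder):

# in ~/.jupyter/jupyter_notebook_config.py
# placeholder path -- point this at whatever folder should hold your notebooks
c.NotebookApp.notebook_dir = r'D:\projects\notebooks'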
The notebooks should now be saved in your new directory.
Friday, August 19, 2016
Pandas DataFrame as a Process Tracker (Postgres Example)
Sometimes you need to keep track of the number of rows processed for a given table.
Let's assume you are working in Postgres and you want to do row-by-row operations for some sort of data manipulation. Your user requires you to keep track of each row's changes and wants to know the number of failed updates and the number of successful updates. The output must be in a text file with pretty formatting.
There are many ways to accomplish this task, but let's use pandas, the arcpy.da UpdateCursor, and some SQL.
#--------------------------------------------------------------------------
import arcpy
import pandas as pd

def create_tracking_table(sde, tables):
    """
    Creates a pandas DataFrame of row counts from a SQL statement.
    Input:
       sde - sde connection file
       tables - names of the tables to get the counts for
    Output:
       pandas DataFrame with columns: Table_Name, Total_Rows,
       Processed and Errors
    """
    # the schema name is the connected user's name
    desc = arcpy.Describe(sde)
    connectionProperties = desc.connectionProperties
    username = connectionProperties.user
    # pg_class.reltuples holds the (estimated) row count per table
    sql = """SELECT
       nspname AS schemaname, relname, reltuples
    FROM pg_class C
    LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
    WHERE
       nspname NOT IN ('pg_catalog', 'information_schema') AND
       relkind='r' AND
       nspname='{schema}' AND
       relname in ({tables})
    ORDER BY reltuples DESC;""".format(
        schema=username,
        tables=",".join(["'%s'" % t for t in tables])
    )
    columns = ['schemaname', 'Table_Name', 'Total_Rows']
    con = arcpy.ArcSDESQLExecute(sde)
    rows = con.execute(sql)
    count_df = pd.DataFrame.from_records(rows, columns=columns)
    del count_df['schemaname']
    # add the two tracking columns
    count_df['Processed'] = 0
    count_df['Errors'] = 0
    return count_df
Now we have a function that returns a DataFrame from a SQL statement. It contains four columns: Table_Name, Total_Rows, Processed, and Errors. Table_Name is the name of the table in the database. Total_Rows is the length of the table. Processed is incremented every time a row is updated successfully. Errors is incremented whenever an update fails.
So let's use what we just made:
count_df = create_tracking_table(sde, tables)
for table in tables:
    with arcpy.da.UpdateCursor(table, "*") as urows:
        for urow in urows:
            try:
                # example edit: bump the value in the fourth field
                urow[3] += 1
                urows.updateRow(urow)
                count_df.loc[count_df['Table_Name'] == table, 'Processed'] += 1
            except Exception:
                count_df.loc[count_df['Table_Name'] == table, 'Errors'] += 1
The pseudo code above shows that whenever an exception is raised, 'Errors' is incremented by one, and whenever a row updates successfully, 'Processed' is incremented.
The third part of the task was to output the count table to a text file, which can be done easily using the to_string() method.
# report_path is a placeholder for the path to your output text file
with open(report_path, 'w') as writer:
    writer.write(count_df.to_string(index=False, col_space=12, justify='left'))
    writer.flush()
So there you have it. We have a nice human-readable output table in a text file.
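If you're curious what that formatting looks like, here is a tiny self-contained sketch with made-up table names and counts, purely to show the layout to_string() produces:

import pandas as pd

# hypothetical tables and counts, only for illustrating the report layout
toy = pd.DataFrame(
    [('parcels', 1500, 1498, 2), ('roads', 320, 320, 0)],
    columns=['Table_Name', 'Total_Rows', 'Processed', 'Errors'])
print(toy.to_string(index=False, col_space=12, justify='left'))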
Enjoy
Labels:
ArcGIS 10.4,
ArcPy,
pandas,
Python
Wednesday, August 3, 2016
More on Pandas Data Loading with ArcGIS (Another Example)
Large datasets can be a major problem on systems running 32-bit Python because there is an upper limit on memory use: 2 GB. Most of the time, programs fail before they even hit the 2 GB mark, but there it is.
When working with large data that cannot fit into 2 GB of RAM, how can we push the data into DataFrames?
One way is to chunk it into groups:
#--------------------------------------------------------------------------
import itertools

def grouper_it(n, iterable):
    """
    Creates chunks of cursor row objects to make the memory
    footprint more manageable.
    """
    it = iter(iterable)
    while True:
        # islice lazily pulls the next n rows from the source iterator
        chunk_it = itertools.islice(it, n)
        try:
            first_el = next(chunk_it)
        except StopIteration:
            return
        yield itertools.chain((first_el,), chunk_it)
This code takes an iterable object (one with next() defined in Python 2.7, or __next__() in Python 3.4) and yields smaller iterators of size n, where n is a whole number (integer).
Example Usage:
import itertools
import os
import json
import arcpy
import pandas as pd

# fc is the path to the source feature class (placeholder)
with arcpy.da.SearchCursor(fc, ["Field1", "Field2"]) as rows:
    groups = grouper_it(n=50000, iterable=rows)
    for group in groups:
        df = pd.DataFrame.from_records(group, columns=rows.fields)
        df['Field1'] = "Another Value"
        # append each chunk to the same CSV
        df.to_csv(r"\\sever\test.csv", mode='a')
        del group
        del df
    del groups
This is one way to manage your memory footprint by loading records in smaller bits. Some considerations on 'n': I found the following affect the size of 'n' you can use: number of columns, field length, and data types.
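If you want a starting point for 'n', here is a rough back-of-the-envelope sketch; the byte-per-value and memory-budget numbers are assumptions to tune against your own data, not measured values:

def estimate_chunk_size(num_columns, avg_bytes_per_value=50, memory_budget_mb=256):
    """Rough heuristic for picking 'n': wider rows (more columns, longer
    fields, heavier data types) mean fewer rows per chunk. The defaults
    here are placeholder assumptions -- profile your own data to tune them."""
    bytes_per_row = num_columns * avg_bytes_per_value
    return max(1, int(memory_budget_mb * 1024 * 1024 // bytes_per_row))

# e.g. a 20-column table with ~50-byte values and a 256 MB budget
n = estimate_chunk_size(num_columns=20)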
Labels:
ArcGIS 10.4.x,
ArcPy,
pandas