In Feb. 2009 I had a fun and productive exchange with Jon Goodall about accessing and processing water data from CUAHSI HIS WaterOneFlow (WOF) and WaterML using Python. We did this openly via our blogs, so our exchange remained as a public resource; that was part of the fun, as I think these posts and comments advanced the use of Python with CUAHSI HIS cyberinfrastructure; even now, a Google search for “python waterml” returns our blog posts at the top! Neat. I’ve been largely out of touch with CUAHSI HIS for a while, but I’m very interested in catching up; more about that in a later post. I’ve wanted to contact Jon to ask what was new with Python and HIS, and was pleasantly surprised to see his very recent post on this topic. But enough background.
Jon updates a 2009 post by using the new HIS Central catalog web services. This is powerful and very cool, for the reasons he states. In trying to run his code, I ran into several small issues that I fixed, and I also came up with a few minor enhancements. I’d like to share those, to add to this public resource and see if I can motivate Jon to add another post 😉 My code is listed at the end; it’s really the same as Jon’s in function and structure. Note: I’m using Python 2.5 on Windows Vista, with versions of Numpy and matplotlib/pylab that are not very recent.
- In an if statement, timeSeries.values._count was being compared to an integer value; but _count is returned as a string, so that failed. I cast it as an int.
- The datetime values extracted from the time series responses were being passed to pylab.date2num() as is. However, val._dateTime is a string (in ISO format), so date2num() failed. Converting the returned string values to python datetime objects fixed it.
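The two fixes above can be sketched in isolation, without calling any web service. This is only an illustration: the sample strings below are made up to mimic what the post describes for _count and _dateTime, and it assumes the ISO strings carry no time zone or fractional seconds.

```python
from datetime import datetime

# Made-up sample values shaped like the strings described above
count_str = "25"                      # stands in for timeSeries.values._count
datetime_str = "2000-01-01T12:30:00"  # stands in for val._dateTime

# Cast the count to an int before comparing it with a number
count = int(count_str)
has_enough_values = count > 10

# Parse the ISO-format string into a datetime object before plotting
parsed_dt = datetime.strptime(datetime_str, "%Y-%m-%dT%H:%M:%S")

print(has_enough_values, parsed_dt)
```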
Updates, Enhancements and Clarifications
- The fnmatch module was no longer used, so I removed it.
- The bounding box arguments as defined in the HIS WSDL are supposed to be floating-point numbers. They were strings in Jon’s code, but they worked; I changed them to floating-point, and they still worked.
- pylab.plot_date() can now handle python datetime objects, so there’s no need to use pylab.date2num().
- I set up a simple lambda function, isodtstr2dt(), to apply the conversion from ISO datetime string to python datetime object and make the code clearer.
- Multiple references to “value” or “values” in different contexts made the code somewhat confusing. I renamed the object returned by GetValuesObject() to values_obj, and set up the assignment tsvalues_obj = values_obj.timeSeries.values to also clarify the subsequent list-comprehension Numpy array statements.
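To show how the renamed variables and the list comprehensions fit together, here is a minimal sketch that runs offline. The tsvalues_obj below is a hypothetical stand-in built with SimpleNamespace, not a real suds response, and the numbers are invented; it also assumes a current Python 3 rather than the Python 2.5 I used for the listing, and it uses plain lists where the listing uses Numpy arrays.

```python
from datetime import datetime
from types import SimpleNamespace


def isodtstr2dt(isodtstr):
    """Convert an ISO datetime string (no time zone, no fractional seconds)."""
    return datetime.strptime(isodtstr, "%Y-%m-%dT%H:%M:%S")


# Hypothetical stand-in for values_obj.timeSeries.values; a real response
# would come from client.service.GetValuesObject(...) on the data server.
tsvalues_obj = SimpleNamespace(
    _count="3",
    value=[
        SimpleNamespace(value="12.5", _dateTime="2000-01-01T00:00:00"),
        SimpleNamespace(value="13.0", _dateTime="2000-01-02T00:00:00"),
        SimpleNamespace(value="11.75", _dateTime="2000-01-03T00:00:00"),
    ],
)

min_count = 2  # the listing uses 10; lowered here so the tiny sample passes

values = []
datetimes = []
if int(tsvalues_obj._count) > min_count:
    # Each item holds the measurement as a string plus its ISO timestamp
    values = [float(val.value) for val in tsvalues_obj.value]
    datetimes = [isodtstr2dt(val._dateTime) for val in tsvalues_obj.value]

print(values)
print(datetimes)
```

Keeping tsvalues_obj as its own name means "value" only ever appears as the per-item attribute inside the comprehensions, which is the confusion the renaming was meant to remove.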
from suds.client import Client
import xml.etree.ElementTree as ET
import os
import pylab
import numpy as NY
import datetime as DT

# lambda function to convert ISO datetime string to python datetime object
isodtstr2dt = lambda isodtstr: DT.datetime.strptime(
    isodtstr, "%Y-%m-%dT%H:%M:%S")

#URL to the HIS Central API
HIS_Central_URL = "http://water.sdsc.edu/hiscentral/webservices/hiscentral.asmx?WSDL"

#Search parameters
xmin = -81.25
xmax = -80.84
ymin = 33.84
ymax = 34.24
keyword = 'Streamflow'
start_date = '2000-01-01'
end_date = '2000-12-31'

#query HIS Central for time series
client = Client(HIS_Central_URL)
response = client.service.GetSeriesCatalogForBox2(
    xmin, xmax, ymin, ymax, keyword, '', start_date, end_date)

# Returns list of dicts holding metadata for each matching series
series_array = response

#for each series, call GetValues on the data server
for series in series_array:
    client = Client(series.ServURL)
    values_obj = client.service.GetValuesObject(series.location,
        series.VarCode, start_date, end_date)
    tsvalues_obj = values_obj.timeSeries.values

    #if more than 10 values, then add to plot
    if int(tsvalues_obj._count) > 10:
        val_a = NY.asarray([float(val.value) for val in tsvalues_obj.value])
        dt_a = NY.asarray([isodtstr2dt(val._dateTime) for val in tsvalues_obj.value])
        pylab.plot_date(dt_a, val_a, '-o', markersize=2.5)

pylab.savefig("ts_plot.png")