A common situation in the spatial data world is having discrete measurements of a continuous variable. Every place in the world has a temperature, but there are only a finite number of thermometers: how should we reason about places without thermometers and how should we model temperature?

For many use cases, the right way to model a continuous spatial variable is a raster: a regularly spaced grid where each square in the grid contains a value of the variable. This works for temperature and precipitation; it works for elevation and slope; it even works for travel times and wayfinding.

For this blog post, we will build up a temperature surface for Washington State, using the discrete temperature measurements of a set of Department of Transportation (WSDoT) weather stations.

Getting the Data

In conceiving of this post, I envisioned finding a single page with a download of weather stations and associated weather data. Little did I know, few governments assemble that data in a convenient, easy-to-access form.

However, I did find this old page from the WSDoT that has the virtue of being really fast to access and actually containing real-time temperature data. Behind the map it turns out the temperature data live as HTML overlays. To turn the web data into a temperature table, I created a Rube Goldberg SQL query, that combines some of my favourite bits and pieces:

• The pgsql-http extension for direct access to HTTP pages inside the database.
• The regexp_matches() function to parse out the pixel coordinates and temperature values from the page HTML.
• Some dirty math to convert the pixel coordinates into spatial coordinates.
• Lots of WITH queries to chain it all together.

The final result looks like this:

-- Read the HTML page from the web server
WITH html AS (
SELECT content FROM http_get('https://wsdot.com/traffic/weather/default.aspx')
),
-- Parse the HTML to extract the data
data AS (
SELECT regexp_matches(
content,
'left: (\d+)px;top: (\d+)px;.*station=(\d+).* title=''(.*?)''.*>(\d+)º', 'gn') as data
FROM html
),
-- Cast the data into appropriate types
rows AS (
select
SELECT::integer AS i,
data::integer AS j,
data::text AS station,
data::text AS title,
data::integer AS degf
FROM data
)
-- Re-scale/translate the pixel coordinates to map coordinates
SELECT ST_SetSRID(ST_MakePoint(
station,
i * 1437 + 184630,
(349 - j) * 1437 - 40500,
degf),3691)::geometry(pointz, 3691) AS geom,
title,
degf FROM rows;

Because it's a materialized view, you can refresh the temperature data at any time by running REFRESH MATERIALIZED VIEW mv_wadot_temp.

Interpolate a Grid

We can confirm the data are properly located in space by overlaying the points on a standard base map. Not bad! To turn our points into an full raster surface, we will use the ST_InterpolateRaster() function.

ST_InterpolateRaster() is a binding to the GDAL library, and the functions behind the gdal_grid utility command. To keep things simple, ST_InterpolateRaster() uses the same format for controlling the interpolation algorithm and parameters: "algorithm:param1:value1:param2:value2"

The ST_InterpolateRaster() function needs the following inputs:

• A collection of points, with the value of interest ('degf' in our case) on the Z ordinate.
• A string with the interpolation algorithm and parameters.
• An empty raster (defining the raster location, cell size, width and height) into which to place the interpolation result.

The SQL to generate the raster looks like this:

-- Constants
DROP TABLE IF EXISTS wa_rast;
CREATE TABLE wa_rast AS
WITH inputs AS (
SELECT
500::float8 AS pixelsize,
'invdist:power:5.5:smoothing:2.0' AS algorithm,
ST_Collect(geom) AS geom,
ST_Expand(ST_Collect(geom), 100000) AS ext
),
-- Calculate output raster geometry
-- Use the expanded extent to take in areas beyond the limit of the
-- temperature stations
sizes AS (
SELECT
ceil((ST_XMax(ext) - ST_XMin(ext))/pixelsize)::integer AS width,
ceil((ST_YMax(ext) - ST_YMin(ext))/pixelsize)::integer AS height,
ST_XMin(ext) AS upperleftx,
ST_YMax(ext) AS upperlefty
FROM inputs
)
-- Feed it all into interpolation
SELECT 1 AS rid,
ST_InterpolateRaster(
geom,
algorithm,
ST_SetSRID(ST_AddBand(ST_MakeEmptyRaster(width, height, upperleftx, upperlefty, pixelsize), '16BSI'), ST_SRID(geom))
) AS rast
FROM sizes, inputs;

The final surface fit provides an interpolated temperature for every point in the raster space! Temperature models are obviously not as simple as inverse-weighted distance, but this example should show how we can take point measurements and turn them into a continuous surface in SQL.

Conclusions

• You can use HTTP and regexp and glue to bodge together a refreshable data table from a web page.
• Surface interpolation provides a new way to leverage sparse spatial observations into domain-wide estimates.
• With more raster tools (to be discussed in upcoming posts) doing raster/vector calculations will allow for even more interesting data processing.