Using TimescaleDB with PGO, the open source Postgres Operator

Jonathan S. Katz
PostgreSQL Operator

One of the many reasons "the answer is Postgres" is due to its extensibility.

The ability to extend Postgres has given rise to an ecosystem of Postgres extensions that change the behavior of the database to support a wide range of interesting capabilities. At Crunchy Data we are big fans of PostGIS, the geospatial extender for Postgres.

Another extension we are asked about often is TimescaleDB.

TimescaleDB is an open-source extension designed to make SQL scalable for time-series data. Timescale, Inc., the company behind TimescaleDB, provides an Apache 2 edition of TimescaleDB that is packaged as a Postgres extension that provides automated partitioning across time and space.

We are often asked about the potential to deploy the Apache 2 edition of TimescaleDB as an extension within our Crunchy PostgreSQL for Kubernetes using PGO, the open source Postgres Operator. We announced that we added the Apache 2 edition of TimescaleDB to PGO 4.7, and we have brought TimescaleDB into PGO v5.

Let us look at how you can deploy the TimescaleDB extension as part of an HA Postgres cluster native to Kubernetes using the PGO Postgres Operator.

Deploying TimescaleDB on Kubernetes with PGO

For brevity, we will assume that you have already deployed PGO, the Postgres Operator from Crunchy Data in your Kubernetes environment. Read more on how you can quickly get started with PGO:

https://access.crunchydata.com/documentation/postgres-operator/v5/quickstart/

Since TimescaleDB is packaged in the PostgreSQL container, you can customize your Postgres configuration to get TimescaleDB up and running. We can do this before starting up the Postgres container by creating a manifest that looks similar to this:

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-13.5-0
  postgresVersion: 13
  instances:
    - name: instance1
      dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-0
      repos:
      - name: repo1
        volume:
          volumeClaimSpec:
            accessModes:
            - "ReadWriteOnce"
            resources:
              requests:
                storage: 1Gi
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          shared_preload_libraries: timescaledb

The above loads the TimescaleDB shared library along with the other standard shared libraries loaded with PGO. You can apply the manifest with kubectl apply.

Once the cluster is initialized, connect to the Postgres cluster as a superuser. You may want to create your own account that can be a superuser, or configure access to the postgres user.

For the purposes of this exercise, we can connect to the database using kubectl exec:

kubectl exec -it -n postgres-operator -c database \
  $(kubectl get pods -n postgres-operator --selector='postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=master' -o name) \
  -- psql

First, confirm that the TimescaleDB library was loaded:

SHOW shared_preload_libraries ;

should yield something similar to:

  shared_preload_libraries   
-----------------------------
 pgaudit,pgaudit,timescaledb

This setup has the added benefit that the TimescaleDB library will be loaded on any replicas added to this Postgres cluster.

Now, add the TimescaleDB extension to the database:

CREATE EXTENSION timescaledb;

To test that TimescaleDB is successfully installed, go ahead and create a "hypertable" and start inserting data:

CREATE TABLE hippos (
  observed_at timestamptz NOT NULL,
  total int NOT NULL
);

SELECT create_hypertable('hippos', 'observed_at');

INSERT INTO hippos
SELECT ts, (random() * 100)::int
FROM generate_series(CURRENT_TIMESTAMP - '1 year'::interval, CURRENT_TIMESTAMP, '1 minute'::interval) ts;

This is just a small data set so you can observe that the TimescaleDB set up works. Try using a smaller increment (e.g. '5 seconds'::interval) to generate a larger data set!

Next Steps

At Crunchy Data, we want to provide users with the benefits of trusted open source Postgres, including the powerful ecosystem of Postgres extensions. TimescaleDB provides interesting functionality for managing time series data in Postgres. By enabling the easy deployment of TimeScaleDB through PGO, you can easily deploy TimescaleDB to manage your time series data in Postgres natively in Kubernetes.

(Interested in seeing PGO in action? Join us for a free webinar on Wednesday, Dec 15th. There is also a paid, in-depth training available on November 29th.)

Newsletter