Announcing Crunchy Bridge for Analytics—interact with your data lake using Postgres commands via extension, with a vectorized, parallel query engine. Learn more in our announcement.

Fun with Postgres ASCII Map and Cardinal Directions

Disclaimer

This article will contain spoilers both on how I solved 2022 Day 23's challenge "Unstable Diffusion" using SQL, as well as general ideas on how to approach the problem. I recommend trying to solve it yourself first, using your favorite language.

AOC Day 23

Tech used in this Day:

For this challenge, we received an ASCII map representing where elves are standing and where they are not:

....#..
..###.#
#...#.#
.#...##
#.###..
##.#.##
.#..#..

The octothorpes are the elves. The idea is that a number of rounds happen. During the first half of the round, the elves look around and decide on which direction they would like to move, as long as the area in that direction is not occupied. In the second half of the round, they try to move. If two or more elves try to move into the same spot, then neither of them moves. Our goal is to calculate how diffuse the grid becomes after ten rounds. To start, we do our usual setup of using a foreign data wrapper to import the text file. Not that the test file above is small: the actual one is a 72 x 72 character grid.

CREATE EXTENSION if not exists file_fdw;

CREATE SERVER if not exists aoc2022 foreign data wrapper file_fdw;

DROP SCHEMA if exists aoc2022_day23_grove CASCADE;
CREATE SCHEMA aoc2022_day23_grove;
SET search_path = aoc2022_day23_grove, public;

CREATE FOREIGN TABLE aoc_day23 (line text)
  SERVER aoc2022 options(filename '/tmp/aoc2022.day23.input'
-- SERVER aoc2022 options(filename '/tmp/aoc2022.day23.testinput'
);

The first step is to transform that text file into a SQL table. We will throw things into an unlogged table, with X and Y coordinates, plus extra columns to put our proposed movements each round. Two sequences help us to put characters into the correct coordinates. Also note the WHERE a = '#' which ensures we only add the positions in which there is an elf. If the spot is empty, we do not even add it to the table (which as we will see, will be helpful later on). We use string_to_table to break each line into separate characters, and the MATERIALIZED keyword to ensure that part of the CTE does not get combined with the rest, but runs standalone.

CREATE SEQUENCE aoc;
CREATE SEQUENCE aoc2;

CREATE UNLOGGED TABLE grove (
  y INT,
  x INT,
  propx INT,
  propy INT
);

WITH x AS (SELECT nextval('aoc') AS myrow, setval('aoc2',1,false), line
  FROM aoc_day23)
,y AS materialized (SELECT *, string_to_table(line, null) AS a FROM x)
,z AS (SELECT *, nextval('aoc2') AS mycol FROM y)
INSERT INTO grove (y,x)
SELECT myrow, mycol FROM z WHERE a = '#';

CREATE INDEX groveindex on grove(x,y);

Before we create the final function, we need to do a couple more things. While writing this solution, I found that a generic query plan was much faster than having Postgres use a custom one. Quick aside: when Postgres runs the same query over and over inside a function, the query becomes a prepared statement (see the recent post on doing this inside pgbouncer). Postgres has a choice of using a generic plan to cover all possible inputs, or a custom one, which generates a plan based on the specific inputs. By default, the choice is set to "auto", which means Postgres will use custom for five times, then try a generic and pick a winner for all future runs. This strategy works well, but sometimes forcing it to one or the other works best. In our case, we want to force it to always use a generic plan from the start. Not that this only affects the current session.

SET plan_cache_mode = force_generic_plan;

The other thing we need is a way to represent the exact coordinate of a single elf. We have two coordinates, so we need a custom type to bind them into a single object:

CREATE TYPE boi AS (x INT, y INT);

Here, "boi" stands for "bundle of ints" (but please pronounce it in the most fun way possible). While I could have used Postgres' built-in point type, it uses the float data type, and can be a little finicky in some of its casting and operators, so I opted to make my own type. The function below solves parts one and two of the puzzle, hence the single arguments it takes directing it which to solve.

CREATE or replace FUNCTION grove_walk(puzzle int)
  returns int language plpgsql AS $$
DECLARE
  q RECORD; myround INT = 0; r RECORD; myarray boi[];
  dirloop TEXT = 'NSWE'; mydir CHAR; mymoves int = 0;

BEGIN

/* For this main loop, let's bail if we hit 5000 rounds */
WHILE myround < 5000 LOOP
  myround = myround + 1;

/* Every once in a while, rebuild the whole table */
  IF 0=myround%10 THEN
    RAISE NOTICE 'At round %, we are going to rebuild the table', myround;
    DROP TABLE if exists temp_grove;
    CREATE UNLOGGED TABLE temp_grove AS SELECT * FROM grove;
    TRUNCATE TABLE grove;
    INSERT INTO grove SELECT * FROM temp_grove ;
    ANALYZE grove;
  END IF;

  raise debug 'Round % start; previous moves was %', myround, mymoves;
  mymoves = 0;

  /* Loop through every elf in random order and find its proposed movement */
  <<elf>> FOR q IN SELECT ctid,x,y FROM grove LOOP

    /* Grab from 1 to 9 nearby points and put into an array */
    SELECT INTO myarray array_agg((x,y)) FROM grove
        WHERE y BETWEEN q.y-1 AND q.y+1
          AND x BETWEEN q.x-1 AND q.x+1;

    /* If nobody is nearby, we stay put! */
    IF array_length(myarray,1) = 1 THEN
      CONTINUE elf;
    END IF;

    /* Check each direction - the order changes each time per the rules */
    FOR mydir IN SELECT unnest(string_to_array(dirloop, NULL)) LOOP


      IF mydir = 'N' AND NOT myarray &&
        ARRAY[(q.x-1,q.y-1)::boi, (q.x,q.y-1)::boi, (q.x+1,q.y-1)::boi] THEN
          UPDATE grove SET propx = q.x+0, propy=q.y+-1 WHERE ctid=q.ctid;
          CONTINUE elf;
      ELSEIF mydir = 'S' AND NOT myarray &&
        ARRAY[(q.x-1,q.y+1)::boi, (q.x, q.y+1)::boi, (q.x+1,q.y+1)::boi] THEN
          UPDATE grove SET propx = q.x+0, propy=q.y+1 WHERE ctid=q.ctid;
          CONTINUE elf;
      ELSEIF mydir = 'W' AND NOT myarray &&
        ARRAY[(q.x-1,q.y-1)::boi, (q.x-1, q.y)::boi, (q.x-1,q.y+1)::boi] THEN
          UPDATE grove SET propx = q.x+-1, propy=q.y+0 WHERE ctid=q.ctid;
          CONTINUE elf;
      ELSEIF mydir = 'E' AND NOT myarray &&
        ARRAY[(q.x+1,q.y-1)::boi, (q.x+1, q.y)::boi, (q.x+1,q.y+1)::boi] THEN
          UPDATE grove SET propx = q.x+1, propy=q.y+0 WHERE ctid=q.ctid;
          CONTINUE elf;
      END IF;

    END LOOP; /* end of each direction */

  END LOOP; /* end of each elf */

  /* Remove all proposals that bump into each other */
  UPDATE grove SET propx=null WHERE (propx,propy)
    = ANY(SELECT propx,propy FROM grove where propx IS NOT NULL GROUP BY 1,2 HAVING count(*) > 1);

  /* Move each elf that is not going to run into another one */
  FOR r IN SELECT * FROM grove WHERE propx IS NOT NULL LOOP
    mymoves = mymoves + 1;
    -- Cannot use ctid below!
     UPDATE grove g SET propx=null,x=r.propx,y=r.propy WHERE x=r.x AND y=r.y;
   END LOOP;

  /* Leave if puzzle 2 is solved */
  IF mymoves < 1 THEN RETURN myround; END IF;

  /* Leave if puzzle 1 is solved */
  IF puzzle=1 AND myround = 10 THEN
    return 1;
  END IF;

  /* Start in a new direction next round */
  dirloop = right(dirloop,-1) || left(dirloop,1);

  END LOOP;

  return 0;

END
$$;

Let's break this function down line by line, starting with the first two:

CREATE or replace FUNCTION grove_walk(puzzle int)
  returns int language plpgsql AS $$

This part is where we give the function a name, set the argument to the variable `puzzle', tell Postgres what language is being used, declare that it should return a single integer. Finally, we start the body of the function with the useful dollar-sign quoting technique.


DECLARE
  q RECORD; myround INT = 0; r RECORD; myarray boi[];
  dirloop TEXT = 'NSWE'; mydir CHAR; mymoves int = 0;

Next up is the declaration section, in which we tell Postgres what variables we are going to use in this function, and what types they are. We use the RECORD types when iterating through the results of a query. The INTEGER types for simple counting. The TEXT and CHAR types keep track of which direction we are currently trying. We also assigned a variable to our new data type boi - actually, an array of them.


BEGIN

WHILE myround < 5000 LOOP
  myround = myround + 1;

The keyword BEGIN marks the actual code that will run when the function begins. We immediately enter a WHILE loop, with a safety hatch of 5000 loops, and start counting the loops with the myround variable.


/* Every once in a while, rebuild the whole table */
  IF 0=myround%10 THEN
    RAISE NOTICE 'At round %, we are going to rebuild the table', myround;
    DROP TABLE if exists temp_grove;
    CREATE UNLOGGED TABLE temp_grove AS SELECT * FROM grove;
    TRUNCATE TABLE grove;
    INSERT INTO grove SELECT * FROM temp_grove ;
    ANALYZE grove;
  END IF;

This function is going to do a LOT of updates to the grove table. Each update actually does a DELETE and an INSERT behind the scenes, due to the way MVCC works in Postgres. Those deletes are a problem, as the table becomes more bloated each round, slowing down the time it takes a round to complete. To get around this, one could tune the table to make the autovacuum daemon more aggressive, and/or do some manual vacuum cleanup. Because no other connections are using it, we can rebuild it in place by creating a temporary copy of it, truncating the original, then re-populating it by copying back from the temp table. As one might imagine, this is a heavy lift, but in this case the performance boost of completely removing all table bloat every 10 rounds is well worth it.


  raise debug 'Round % start; previous moves was %', myround, mymoves;
  mymoves = 0;

  /* Loop through every elf in random order and find its proposed movement */
  <<elf>> FOR q IN SELECT ctid,x,y FROM grove LOOP

A little debugging line goes next. By setting the level to DEBUG, we ensure the output will not appear unless the caller forces their level to debug like this: SET client_min_messages - DEBUG. We also want to track how many times each round an elf has moved, so we reset the mymoves variable to 0. Then we run a SELECT statement and pull back each row (i.e. each elf) from the table. We are going to examine every elf and see which, if any, space it desires to move to. Because all the elves act independent of each other, we can run this LOOP without the use of an ORDER BY - but remember that a lack of ORDER BY is usually a red flag except in special circumstances like this one. We also gave this LOOP a name ("elf"). The use of a name is optional for loops inside of Pl/pgsql, but can be very handy when you have many loops and need to refer to a specific one. Rather than a SELECT * we only pull back the exact rows we need - in this case the x and y coordinates, as well as the special system column ctid which represents the actual location of the row.


    /* Grab from 1 to 9 nearby points and put into an array */
    SELECT INTO myarray array_agg((x,y)) FROM grove
        WHERE y BETWEEN q.y-1 AND q.y+1
          AND x BETWEEN q.x-1 AND q.x+1;

This next part took a little trial and error to find the best way to do this. And by "best", I mean fastest way via SQL. We are going to need to look all around each elf and see which nearby spots are free. Which spots to check depend on the direction. For example, if we want to see if an elf can go north, we need to look north (N), northeast (NE), and northwest (NW). While we could do this via four separate SELECT statements for each direction, and then act on it, it is far more efficient to make a single call, and then react to that. So our statement is going to pull back all rows representing the 9 squares surrounding the current location. So we are going to look North, South, East, West, and also NE, NW, SE, and SW. We need to store all that information into a single variable, even though it will get returned as a collection of 1 to 9 rows. We always return at least one row because of ourselves in the circle of 8. All the rows get collapsed into an array by use of the array_agg function. We also need to combine the x and y into a single value, so they get put into a boi array. Because myarray is already declared as an array of the data type boi, we do not need to cast the argument to array_agg, although one could for clarity.


    /* If nobody is nearby, we stay put! */
    IF array_length(myarray,1) = 1 THEN
      CONTINUE elf;
    END IF;

This part is simple, and reflects the first rule of the puzzle in part one: if there are no elves around us, we stay put and do not populate our proposed x and y coordinates. We do this by checking the length of the array we built. If there is only one nearby elf (us), we continue on to the next elf. Note that the array_length function has an annoying mandatory second argument, specifying which part of the array to check. As our array only has a single dimension, we set it as 1. It would be nice if there were a single argument array_length function that worked for simple arrays.


  /* Check each direction - the order changes each time per the rules */
  FOR mydir IN SELECT unnest(string_to_array(dirloop, NULL)) LOOP

The rules also dictate that the cardinal directions get checked in a different order each time. So the first time through, we check north first, and if we find a match, we don't bother checking the other directions. Each round, the order of directions checked changes. We track this by modifying the four-character string (e.g. "SNWE") inside the dirloop variable. That string gets separated into a four-element array with the string_to_array function. The second argument of NULL ensures we split on every character. We then feed that to unnest as a way to walk through the items in the array as part of a loop.


    IF mydir = 'N' AND NOT myarray &&
        /* Anything to the N, NE, or NW? */
        ARRAY[(q.x-1,q.y-1)::boi, (q.x,q.y-1)::boi, (q.x+1,q.y-1)::boi] THEN
          UPDATE grove SET propx = q.x+0, propy=q.y+-1 WHERE ctid=q.ctid;
          CONTINUE elf;
      ELSEIF mydir = 'S' AND NOT myarray &&
        ARRAY[(q.x-1,q.y+1)::boi, (q.x, q.y+1)::boi, (q.x+1,q.y+1)::boi] THEN
          UPDATE grove SET propx = q.x+0, propy=q.y+1 WHERE ctid=q.ctid;
          CONTINUE elf;
/* Not shown: West and East */

In this section, for the current direction of interest, we see if there are any matching entries by adjusting the x and y values, then using the && operator to see if any of the spots we care about are already inside (overlaps) the myarray variable. This variable is an array (or list) or x/y coordinates that we know are nearby. If there are no matches, we know that the three coordinates are empty and thus we can move in the current direction. Since we still have to account for bumping into another elf trying to move to the same place, we update the table and put our new x/y coordinates as proposed x and proposed y. Once we get a match, we do not need to check the other directions, so we continue to the next elf. Because we are already inside of a loop, a bare CONTINUE would go to the next direction, not the next elf, so we use a CONTINUE elf; instead.


    END LOOP; /* end of each direction */

  END LOOP; /* end of each elf */


  /* Remove all proposals that bump into each other */
  UPDATE grove SET propx=null WHERE (propx,propy)
    = ANY(SELECT propx,propy FROM grove where propx IS NOT NULL GROUP BY 1,2 HAVING count(*) > 1);

We finish going in each direction, then finish up each elf. At that point, we have walked through every point in the grid and populated the proposed x and proposed y for each elf that can move. We then build a list of all the proposed x,y coordinates, and use GROUP BY and HAVING to only list the ones used by more than one elf. That SELECT statement gets passed to ANY, which allows the outer UPDATE to void any proposed spots in which more than one elf is trying to use. We don't need to set both propx and propy.


  /* Move each elf that is not going to run into another one */
  FOR r IN SELECT * FROM grove WHERE propx IS NOT NULL LOOP
    mymoves = mymoves + 1;
    -- Cannot use ctid below!
     UPDATE grove g SET propx=null,x=r.propx,y=r.propy WHERE x=r.x AND y=r.y;
   END LOOP;

Now that we removed the conflicting propx values, we can walk through the remaining ones and perform the move, by setting their x/y to the proposed x/y. We also null out the propx while we are here. We also increment our move count, which is important for part two of the puzzle.


  /* Leave if puzzle 2 is solved */
  IF mymoves < 1 THEN RETURN myround; END IF;

  /* Leave if puzzle 1 is solved */
  IF puzzle=1 AND myround = 10 THEN
    return 1;
  END IF;

These are straightforward - we are looking for the exit conditions for each part of the puzzle. For part 1, we still have some further calculations to do, so we leave with an arbitrary and unimportant return value of "1"


  /* Start in a new direction next round */
  dirloop = right(dirloop,-1) || left(dirloop,1);

  END LOOP;
  return 0;

END
$$;

Finally, before we end our main outer loop, we switch the direction. The rules ask us to shift the order by one each turn, such that "NSWE" becomes "SWEN" and then becomes "WENS" etc. This is easy in SQL by using the left and right functions to grab the last item and then stick it in front of the other three. Finally we end the loop, and end the function. For part one, we also need to see how many elves there are in the final area. Before we do so, we make a copy of the grove table, as part two needs to start from a pristine state.

CREATE TEMP TABLE grove_backup AS SELECT * FROM grove;
SELECT grove_walk(1);

WITH m as (SELECT (max(x) - min(x)+1) * (max(y)-min(y)+1) AS total FROM grove)
,t as (SELECT count(*) AS used FROM grove)
SELECT total - used AS aoc_2022_day23_part1 FROM m, t;

The above runs about 4.8 seconds on my system. On to part two!

AOC Day 23 - Part Two

Part two of the puzzle asks us to go way beyond 10 rounds and find the point at which no more elves have moved. In other words, until the mymoves variable is zero. We can use the same function, although our periodic table rebuild becomes more important than ever, as the final answer (for my input) was 957 rounds. Before we run the function, we do need to start with a copy of the table that was not updated by part one, so we rollback the table to the state it was in right before we ran part one.

/* PITR */
TRUNCATE TABLE grove;
INSERT INTO grove SELECT * FROM grove_backup;
SELECT grove_walk(2) AS aoc_2022_day23_part2;

That's all for this puzzle. This part took the longest yet to run of any Day - about six minutes. There are ways to speed that up a lot - such as putting things into memory instead of constant updates to the table. But the UPDATEs and SELECTs feel more true to the goal of doing this all in SQL as much as possible.

AOC Day 23 - Bonus Round!

Earlier on, I made the choice to periodically rebuild the entire table we were using to track the location of the elves. By doing so, we got a "fresh", unbloated version of the table to appear every 10 turns. However, was I correct in thinking things slow down? And was 10 a decent default? As it turns out, analyzing data and finding trends is something databases are particularly good at! The first step was to create a simple table to collect how long each round took:

CREATE UNLOGGED TABLE public.elf_timing (
  freq int,
  rebuild bool,
  round int,
  mytime timestamptz
);

Next, we run an INSERT inside our function, once per loop, and an extra one any time that we rebuild the table:

myfreq INT = 10;
...

INSERT INTO public.elf_timing SELECT myfreq, true, myround, timeofday()::timestamptz;

  /* Every once in a while, rebuild the whole table */
  IF 0=myround % myfreq THEN
    DROP TABLE if exists temp_grove;
    CREATE UNLOGGED TABLE temp_grove AS SELECT * FROM grove;
    TRUNCATE TABLE grove;
    INSERT INTO grove SELECT * FROM temp_grove ;
    INSERT INTO public.elf_timing SELECT myfreq, false, myround, timeofday()::timestamptz;
  END IF;

The true/false lets us pick out the times when we are rebuilding the table, which will give us insight later as to how long it actually takes to rebuild this table. We use timeofday() to return the current time. If we were to use now(), it would return the same timestamp each round, as it only returns the time the current transaction started with. Once those new inserts are in place, we can rerun the function and adjust the freq variable each time to see exactly how long each round takes. Our timing table starts to look like this:

  freq | rebuild | round |            mytime
 ------+---------+-------+-------------------------------
    10 | t       |     1 | 2023-11-22 00:05:25.265515-05
    10 | t       |     2 | 2023-11-22 00:05:25.744037-05
    10 | t       |     3 | 2023-11-22 00:05:26.22159-05
    10 | t       |     4 | 2023-11-22 00:05:26.691277-05
    10 | t       |     5 | 2023-11-22 00:05:27.161819-05
    10 | t       |     6 | 2023-11-22 00:05:27.644129-05
    10 | t       |     7 | 2023-11-22 00:05:28.135452-05
    10 | t       |     8 | 2023-11-22 00:05:28.635376-05
    10 | t       |     9 | 2023-11-22 00:05:29.138946-05
    10 | t       |    10 | 2023-11-22 00:05:29.647686-05
    10 | f       |    10 | 2023-11-22 00:05:29.693706-05
    10 | t       |    11 | 2023-11-22 00:05:30.117274-05
    10 | t       |    12 | 2023-11-22 00:05:30.539393-05
    10 | t       |    13 | 2023-11-22 00:05:30.972813-05

I ran the function 20 times with the new tracking information: starting with a frequency of 5 (i.e. rebuilding the table every 5 rounds), then went up by 5 until I hit 100.

First order of business: how expensive is it to rebuild that table? It's a SELECT* plus a CREATE TABLE plus a TRUNCATE plus another SELECT* plus some index rebuilding. A good amount of work. On the other hand, the table is unlogged and very, very small. So, let's calculate some numbers. What we need to focus on is the border between rebuild false and rebuild true. Specifically, we need to see how much the mytime value changes from the final true rebuild call for each round, until the next false rebuild call. When we are trying to compare rows to nearby rows, we reach for windowing functions:

WITH x AS (SELECT *,
  CASE WHEN rebuild is false THEN mytime-lag(mytime) OVER(order by mytime) ELSE null END as cc
  FROM elf_timing)
, y AS (SELECT extract(epoch from cc) AS secs FROM x WHERE cc is not null)
SELECT min(secs),max(secs),avg(secs),stddev(secs) from y;

So the first thing the CTE above will do is calculate the difference at these borders by computing the current mytime versus the previous rows mytime (via the lag function) and do this over a simple window in which we order by mytime. If the rebuild value for the row is true, we throw away the result by setting it to null. In the next part of the CTE, labeled y, we throw away all the rows that have that null value, and change the interval generated by x into a number of seconds. Finally, we run some simple statistics on our final list of numbers.

   min    |   max    |          avg           |           stddev
----------+----------+------------------------+----------------------------
 0.010417 | 0.100946 | 0.03749472956782199401 | 0.007553743791384258077350

So, on average, this rebuild took about 37 milliseconds. How does this compare to not letting it run at all? Let's peek at the two rounds before the rebuild, by adding an extra argument to the lag function to have it go back an extra row:

WITH x AS (SELECT *,
  CASE WHEN rebuild is false THEN lag(mytime) OVER(order by mtime)
        - lag(mytime,2) OVER(order by mytime) ELSE null END as cc
  FROM elf_timing)
, y AS (SELECT extract(epoch from cc) AS secs FROM x WHERE cc is not null)
SELECT min(secs),max(secs),avg(secs),stddev(secs) from y;
   min    |   max    |          avg           |         stddev
----------+----------+------------------------+------------------------
 0.037867 | 1.944933 | 0.42882609798887462559 | 0.22408780679059776592

We can see from this that our table rebuild is a success, as the normal non-rebuild runs are much more expensive. But we still need confirmation that the longer we wait to rebuild, the worse the total time is. For that, we need to compare our final run (which we know was round 957) to the first run, for each of the frequencies. A slight tweak of the CTE above gives us a nice answer:

WITH
x AS (SELECT *, CASE WHEN round=957 THEN mytime-lag(mytime) OVER(order by mytime) ELSE null END as mylag
  FROM timing WHERE round=1 OR round=957 ORDER BY mytime),
y AS (SELECT * FROM x where mylag is not null ORDER BY freq),
z AS (SELECT date_trunc('minute', (min(mylag))) AS floor FROM y),
q AS (SELECT freq, mylag, round(extract(epoch from mylag-floor),0) AS stretch FROM y,z)
SELECT * FROM q ORDER BY freq;

As before, we use a lag and a IS NOT NULL to produce a list of timings. We also add in a new CTE named z to find the lowest and nearest minute, for use in an upcoming function. For now, let's see the result:

 freq |    mylag        | delta
------+-----------------+-------
    5 | 00:05:58.344671 |    58
   10 | 00:06:19.189716 |    79
   15 | 00:06:38.778634 |    99
   20 | 00:06:59.784113 |   120
   25 | 00:07:15.632686 |   136
   30 | 00:07:38.084029 |   158
   35 | 00:07:56.465293 |   176
   40 | 00:08:10.91232  |   191
   45 | 00:08:26.602071 |   207
   50 | 00:08:47.820099 |   228
   55 | 00:09:05.50898  |   246
   60 | 00:09:20.44909  |   260
   65 | 00:09:41.765944 |   282
   70 | 00:09:56.672234 |   297
   75 | 00:10:16.567748 |   317
   80 | 00:10:37.709436 |   338
   85 | 00:10:46.580778 |   347
   90 | 00:10:59.836356 |   360
   95 | 00:11:21.743917 |   382
  100 | 00:12:06.669316 |   427
(20 rows)

Okay, we can see the total time to run our function increases at a regular rate as the delay of rebuild grows. But can we do better? What fun are boring numbers when we have a terminal that supports ANSI color codes and (some) Unicode characters? Let's create a quick function to output a bar chart. One annotated function coming right up:

CREATE OR REPLACE FUNCTION elf_graph()
returns void language plpgsql as $$
DECLARE
  myrec RECORD; len INT;
  mytext TEXT = chr(10); /* The final string to output: start it with a newline */
  mycolor TEXT;
  green   TEXT = E'\x1b[38;5;77m';
  red     TEXT = E'\x1b[38;5;196m';
  orange  TEXT = E'\x1b[38;5;214m';
  purple  TEXT = E'\x1b[38;5;165m';
  reset   TEXT = E'\033[0m';
BEGIN

/* Alas, we cannot use a WITH and a LOOP, so we need a temp table.
   This table does not need to hang around, so we add ON COMMIT DROP        */
CREATE TEMP TABLE myinfo ON COMMIT DROP AS
WITH
x AS (SELECT *, CASE WHEN round=957 THEN mytime-lag(mytime) OVER(order by mytime) ELSE null END as mylag
  FROM timing WHERE round=1 OR round=957 ORDER BY mytime),
y AS (SELECT * FROM x where mylag is not null ORDER BY freq),
z AS (SELECT date_trunc('minute', (min(mylag))) AS floor FROM y),
q AS (SELECT freq, mylag, round(extract(epoch from mylag-floor),0) AS stretch FROM y,z)
SELECT * FROM q;

mycolor = red;

FOR myrec in SELECT * FROM myinfo ORDER BY freq LOOP
  /*
     My screen is not wide enough to represent a strict 1:1 relationship,
     so we cut the size in 2 for our stretch column
   */
  len = (myrec.stretch)/2;

  /*
    For contrast, we will alternate between red and green lines.
    Also, because elves = christmas = red and green
  */
  mycolor = CASE WHEN mycolor=red THEN green ELSE red END;

  /*
     Every new row of information, we start a new line, change the color to purple,
     and output the frequency for the row. We use to_char with 999 to make sure
     the numbers are the same width and right-justified. We also orange-output our
     current length, which is roughly the number of seconds to run.
  */

  mytext = mytext || chr(10) || purple || to_char(myrec.freq, '999');
  mytext = mytext || orange || to_char(len, ' 999  ');

  /* For each two values we find, output a Unicode "FULL BLOCK" */
  mytext = mytext || mycolor || repeat(U&'\2588', len/2);

  /*
    To help make things a little more accurate, we also output
    a Unicode "LEFT HALF BLOCK" if the number was odd
  */
  IF 0 != len %2 THEN mytext = mytext || U&'\258C'; END IF;

END LOOP;

/*
  We were lazy and did not reset the color codes above, relying instead
  on the new color to clobber the old one. However, we now need to
  reset any and all colors at the end of the string
*/
mytext = mytext || reset;

/* Output the entire graph */
RAISE NOTICE '%', mytext;

END;
$$;

Let's run it and see what happens:

graphit

Avatar for Greg Sabino Mullane

Written by

Greg Sabino Mullane

November 29, 2023 More by this author