Fun with SQL in Postgres: Finding Revenue Accrued Per Day

Jonathan S. Katz
SQL

I recently wrote an example of how you can project monthly recurring revenue (MRR) in Postgres. This is a helpful metric to understand how a subscription-based business is doing and can help inform all sorts of financial and operational decisions at the company.

Since writing that example, my same friend running their SaaS business on Crunchy Bridge wanted to find out how much revenue they were accruing per day over the course of a month. When a new month started, the accrued revenue would reset. Think of this metric, as a way to know what you are actually billing in that month, so you have a sense of what your short term cash flow would be.

So the data my friend is looking for would look something like this:

    date    | revenue (k)
------------+-------------
 2021-11-02 |     200
 2021-11-01 |     100
 2021-10-31 |    3100
 2021-10-30 |    3000

As before, my friend has a utility model model: while it is a subscription-based business, the subscriptions can start/stop at any time. Thus, trying to calculate the accrual is a bit tricky.

Fortunately, Postgres is well prepared to handle this. Let's see how we can do it.

Finding Revenue Accrued Per Day

Recall that our original query to calculate the projected MRR/ARR looks like this:

SELECT
  to_char(rates.day, 'Mon DD, YYYY') AS day,
  to_char(rates.mrr_run_rate, '$99,999,999,999D99') AS mrr_run_rate,
  to_char(12 * rates.mrr_run_rate, '$99,999,999,999D99') AS arr_run_rate,
  to_char(CASE
    WHEN COALESCE(lead(rates.mrr_run_rate, 30) OVER (ORDER BY rates.day DESC), 0) <> 0 THEN
      (rates.mrr_run_rate - lead(rates.mrr_run_rate, 30) OVER (ORDER BY rates.day DESC)) /
        lead(rates.mrr_run_rate, 30) OVER (ORDER BY rates.day DESC)
    ELSE 0
  END * 100, '9,999,999D99%') AS mrr_run_rate_mom
FROM (
  SELECT 
    dates.day,
    SUM(
    CASE 
      WHEN tstzrange(dates.day, dates.day + '1 day'::interval) && subscription.period THEN
        subscription.rate
      ELSE
        0
    END) / 100::numeric AS mrr_run_rate
  FROM subscription,
    LATERAL (
      SELECT *
      FROM generate_series('2021-01-01'::date, CURRENT_TIMESTAMP, '1 day'::interval) AS day
    ) dates
  WHERE subscription.org_id not in (
    SELECT id 
    FROM account
    WHERE billable
  )
  GROUP BY dates.day
) rates
ORDER BY rates.day DESC;

Let's modify the query to find revenue accrued per day over a given month:

SELECT
  to_char(rates.month_accrual, '$99,999,999,999D99') AS monthly_accrual_to_date
FROM (
  SELECT 
    dates.day,
    SUM(
      subscription.rate * CASE
         WHEN upper_inf(subscription.period) AND tstzrange(date_trunc('month', dates.day), dates.day + '1 day'::interval) && subscription.period
            THEN EXTRACT('epoch' FROM dates.day + '1 day'::interval - GREATEST(lower(period), date_trunc('month', dates.day)))::numeric
         WHEN tstzrange(date_trunc('month', dates.day), dates.day + '1 day'::interval) && subscription.period THEN
           EXTRACT('epoch' FROM LEAST(upper(period), dates.day + '1 day'::interval) - GREATEST(lower(period), date_trunc('month', dates.day)))::numeric
         ELSE 0
       END
    ) / (86400 * 30) / 100::numeric AS month_accrual
  FROM subscription,
    LATERAL (
      SELECT *
      FROM generate_series('2021-01-01'::date, CURRENT_TIMESTAMP, '1 day'::interval) AS day
    ) dates
  WHERE subscription.org_id not in (
    SELECT id 
    FROM account
    WHERE billable
  )
  GROUP BY dates.day
) rates
ORDER BY rates.day DESC;

So what's going on? The heart of the query is this portion here:

SUM(
  subscription.rate * CASE
     WHEN upper_inf(subscription.period) AND tstzrange(date_trunc('month', dates.day), dates.day + '1 day'::interval) && subscription.period
        THEN EXTRACT('epoch' FROM dates.day + '1 day'::interval - GREATEST(lower(period), date_trunc('month', dates.day)))::numeric
     WHEN tstzrange(date_trunc('month', dates.day), dates.day + '1 day'::interval) && subscription.period
     THEN
        EXTRACT('epoch' FROM LEAST(upper(period), dates.day + '1 day'::interval) - GREATEST(lower(period), date_trunc('month', dates.day)))::numeric
     ELSE 0
   END
) / (86400 * 30) / 100::numeric AS month_accrual

Let's break it down.

Recall that we are trying to get the amount of revenue accrued on a given day over the course of a month. In our utility model, we need to consider that a subscription can start and stop within a day, or may have not started yet on that day, or may have already stopped. That's a lot of cases to consider!

We go through each day in the query (e.g. 11/01/2021, 11/02/2021 etc.) and determine how much revenue was accrued on that day. Here is how the logic works in the CASE statement works:

  • First, the query checks to see if a subscription is currently active (i.e. the range it is active in has an "infinite" upper bound). If the subscription is active, let's then check to see if it is active on the current day that we are evaluating. If it is, calculate how much revenue was accrued.
  • Otherwise, if the subscription is no longer active, check to see if the subscription was active on the day in question. If it is, calculated how much revenue was accrued.

Let's go one step further into the code. We evaluate the first case with this statement:

WHEN upper_inf(subscription.period) AND tstzrange(date_trunc('month', dates.day), dates.day + '1 day'::interval) && subscription.period

This checks to see if the subscription is still active (upper_inf(subscription.period)) and the current day we're evaluating (e.g. November 2, 2021) overlaps with the subscription (tstzrange(date_trunc('month', dates.day), dates.day + '1 day'::interval) && subscription.period). If it does, we perform the revenue accrual calculation:

THEN EXTRACT('epoch' FROM dates.day + '1 day'::interval - GREATEST(lower(period), date_trunc('month', dates.day)))::numeric

What does this do? We're trying to get the number of seconds this subscription was for the month in question up until this day. Huh?

Let's break this down:

dates.day + '1 day'::interval - GREATEST(lower(period), date_trunc('month', dates.day))

The first thing we do is calculate the upper part of our day, i.e. we start from the end of the current day (dates.day + '1 day'::interval). At the other end, we want to subtract either from the beginning of the month or when the subscription started, whichever was more recent (GREATEST(lower(period), date_trunc('month', dates.day)).

Finally, we need to get the total number of seconds this subscription was active for. We can do that using the PostgreSQL EXTRACT function and doing it from the 'epoch'. Thus, we end up with:

EXTRACT('epoch' FROM dates.day + '1 day'::interval - GREATEST(lower(period), date_trunc('month', dates.day)))::numeric

Now let's look at the other condition: a subscription that is no longer active:

WHEN tstzrange(date_trunc('month', dates.day), dates.day + '1 day'::interval) && subscription.period
THEN
   EXTRACT('epoch' FROM LEAST(upper(period), dates.day + '1 day'::interval) - GREATEST(lower(period), date_trunc('month', dates.day)))::numeric

Our WHEN clause checks to see if the subscription overlapped on the day in question. The calculation is then similar to the one for the subscription that is still ongoing.

Now we have how much revenue was accrued for a subscription on a given day. We can multiply that by the value of the subscription and sum it up for the given day:

SUM(
  subscription.rate * CASE ...
) / (86400 * 30) / 100::numeric AS month_accrual AS month_accrual
...
GROUP BY days.day

However, we're not done yet. Notice this part:

SUM(...) / (86400 * 30) / 100::numeric AS month_accrual

Why are we dividing by these numbers?

Months have different numbers of days. For convenience, my friend's model normalizes each month to 30 days. Additionally, while we calculate the total number of seconds each subscription lasted, the rate is actually stored as a monthly value, so we need to extract that by divinding by the total number of seconds in a month (86400). Finally, the rate is stored in cents; dividing by 100 moves it to dollars.

Last but not least, we can pretty up the display of the monthly accrual using the to_char function:

SELECT to_char(rates.month_accrual, '$99,999,999,999D99') AS monthly_accrual_to_date

Conclusion

There is more than one way to calculate a running accrual of revenue using Postgres. This method made sense for my friend's utility model. It may make more sense to use window functions in some other models.

That said, this again showcases how powerful PostgreSQL is for all types of reporting. Instead of having to pull the raw data into a spreadsheet or an application to perform the calculations, you can write a query that does all the work and the only data you have to transfer are the results!

Newsletter