Rolling percentile in StandardSQL / BigQuery

How to make the most awesome graphs
2022-12-04

Doing percentile calculation over a rolling window in Google StandardSQL isn’t obvious, as one has to fully internalize the Window concept.

Understanding the Window concept is like being granted super powers.

Let’s say you have BigQuery rows with two fields:

and you’d like to calculate the p50, p75, p90, p95 of “value” over +3/-3 days rolling window. It’s not “hard” in hindsight but it was not obvious to me and found no example of this at the time.

SELECT
  day,
  PERCENTILE_DISC(value, 0.95) OVER (day BETWEEN -1 and +1) AS p95,
  PERCENTILE_DISC(value, 0.9) OVER (day BETWEEN -1 and +1) AS p90,
  PERCENTILE_DISC(value, 0.75) OVER (day BETWEEN -1 and +1) AS p75,
  PERCENTILE_DISC(value, 0.5) OVER (day BETWEEN -1 and +1) AS p50,
FROM `dataset`

This may look like this:

Latency in a Fuchsia’s CI subcomponent with a +/-3 days window

brb switching my title to Data Scientist.