Metabase tips and tricks.
flexible date bin'ed dashboards
You want to avoid duplicating questions when your data consumers want different timeframe views over the data.
E.g Total Payments by Day, Total Payments by Month, etc
We handle this with a few tricks that we can apply easily that the data consumers understand without issue.
Helper Model
We defined a Helper Model with the time periods we offer for analytics.
We're lucky that we don't need super granular stats and can just offer down to hour
but you could use date_bin
to support things like 15 minutes
if needed.
SELECT
time_period
FROM (
VALUES
('hour'),
('day'),
('week'),
('month'),
('quarter'),
('year'),
) t(time_period)
Filtering the result set
In any query that requires the ability to date bin we use one of the following as part of the query definition
Use alongside date field filter
This works great for simple queries where you are filtering on a single date field as the date field filter has a bunch of useful settings like inclusive/exclusive end, ability to exclude days of the week etc that are often requested by our consumers
SELECT
date_trunc({{time_period}}, created_at)::date AS time_period,
SUM(amount)
FROM payments
WHERE
1=1
[[AND {{payment_created_at_field_filter}}]]
GROUP BY time_period
When you can't use the field filter
Metabase doesnt support date field filters on columns inside CTE's, and there are times where you want to filter multiple tables to the same time range.
In those cases we add a number of periods
filter that defines the how far to look back over instead
SELECT
date_trunc({{time_period}}, created_at)::date as time_period,
sum(amount)
FROM payments
JOIN users on payments.user_id = users.id
WHERE
payments.created_at >= date_trunc({{time_period}}, NOW() - ({{num_periods}} || ' ' || {{time_period}})::interval)
AND
users.last_sign_in_at >= date_trunc({{time_period}}, NOW() - ({{num_periods}} || ' ' || {{time_period}})::interval)
GROUP BY time_period
Caveat 1: Quarter
date_trunc
works with quarter
but interval
doesnt.
It would be a lovely addition1 but for now we add a case statement that we're happy with if the question actually requires this size of window.
This is how you would handle using date_bin('15 minutes', ...)
as well, you'd have to do the math in order to move back over the right number of the periods.
payments.created_at >= date_trunc(
{{time_period}},
CASE {{time_period}}
WHEN 'quarter' THEN
date_trunc('quarter', NOW()) - ((({{num_periods}})::integer * 3) || ' month')::interval
ELSE
NOW() - ({{num_periods}} || ' ' || {{time_period}})::interval
END
)
Caveat 2: Incomplete periods
We include data from the incomplete current period but we can change this by adding a less than version to complement each filter
SELECT
date_trunc({{time_period}}, created_at)::date AS time_period,
SUM(amount)
FROM payments
payments.created_at >= date_trunc({{time_period}}, NOW() - ({{num_periods}} || ' ' || {{time_period}})::interval)
AND payments.created_at < date_trunc({{time_period}}, NOW())
The majority of our data analysis is done on historic data, but of course you can flip the signs, or add both complements so the query both before and after NOW()
if your data has a more leading than lagging profile.
Optional targets
You'll probably have a tracking board, daily or monthly that the leadership team love to watch.
They will also want to compare past periods the current one without going to a new dashboard.
Combining COALESCE
with optional filters and you can inject SQL defaults.
This gets around the fact that raw date filters can't be set to today
.
Something like this works
with filtered_payments as (/* some query */)
SELECT
COUNT(*)
FROM filtered_payments
WHERE
date_trunc('day', filtered_payments.created_at) = date_trunc('day', COALESCE( [[{{date_to_look_at}} ,]] NOW() ))
Percent function
To remove some error prone casting and coalesce'ing related to calculating percentages, especially over counts, we added a function to make it more readable
CREATE OR REPLACE FUNCTION public.percent(portion numeric, total numeric, round_level integer DEFAULT 1)
RETURNS numeric
LANGUAGE sql
IMMUTABLE PARALLEL SAFE STRICT
AS $function$
select
CASE
WHEN total = 0 THEN 0
ELSE ROUND(100 * (portion / total), round_level)
END
$function$
It replaces the following, while also handling counts that return 0
SELECT
ROUND(100 * cast(count(*) FILTER (where amount > 100) as float)/cast(count(*) as float)) AS old_big_payment_rate,
percent(
COUNT(*) FILTER (where amount > 100),
COUNT(*)
) AS new_big_payment_rate
FROM payments
Commonly accessed questions
We do periodic cleaning of the available dashboards; we'll archive things that havent been accessed in the last 6-12 months to keep the number of questions and dashboards down. The below will give you the commonly accessed queries, flipping or changing the order to last_execution will show you the ones that aren't often acessed. Use an Anti-Join to find the reports that haven't been run at all in the period you're checking.
with card_data as (
select
card_id,
COUNT(*) as execution_count,
MAX(started_at) as last_execution
from query_execution
where
started_at > NOW() - '12 months'::interval
GROUP BY 1
)
select
report_card.id,
report_card.name,
CONCAT('https://data.example.com/questions/', report_card.id),
card_data.execution_count,
card_data.last_execution
from report_card
join card_data on card_data.card_id = report_card.id
where
report_card.archived = false
ORDER BY card_data.execution_count DESC