Dynamic SQL Pivots for any database using dbt

Coupling SQL with iterative and conditional logic to create dynamic pivots for any database

Serge Gershkovich
5 min readApr 6, 2021
photo by Felipe Hoffa

There’s something about SQL and PIVOTs: Everyone wants to PIVOT their results, but there never seems to be an easy way for it.

— Felipe Hoffa

Felipe Hoffa recently posted an excellent article on creating dynamic pivots in Snowflake using javascript stored procedures. It's worth a read not only for his creative approach but also because it introduces functions that not everyone may be familiar with.

Querying a query withresult_scan(last_query_id()) , that’s so meta!

However, Felipe’s solution relies on features that are exclusive to Snowflake. So I decided to create an alternative that could be used on any database platform and parameterized to work with any table.

Using dbt, you can create your own PIVOT function and use it wherever you like. Something as simple as:

dynamic_pivot (columns_to_show, column_to_pivot, column_to_aggregate, table_name)

If you’re not familiar with dbt and how it could fit into your work stack, please see my earlier article:

Break it down

Since we can’t rely on Snowflake’s PIVOT function, we’ll need to adopt the classic pattern of CASE statements to test for pivot values. It’s a tedious approach and prone to error because expected values must be manually specified in the query.

Something like this:

As you can see, there’s absolutely nothing dynamic about this solution. What makes dynamic pivots so tricky is that there is no native way for SQL to anticipate the number of pivot columns that a query will generate.

If we want to make this dynamic, we will need to detect the pivot values in an automated way.

A distinct list of values will do:

SELECT DISTINCT pivot_column FROM my_table

Once we have this, all that’s needed is a FOR loop which can dynamically add a case statement for each distinct value.

for item in pivot_itemsCASE WHEN pivot_column = '{{item}}' THEN aggregate_column ELSE 0 END

Again, standard SQL falls short as it does not provide any mechanism for iterative looping. Luckily, there’s dbt, which allows you to augment your SQL code with Jinja (python) snippets. This means we can now leverage loops and conditional logic when constructing our SQL scripts.

Put it back together with dbt

Now that we’re broken down the problem into its parts, we can put them together through a dynamic script using dbt. We will be using what dbt calls “Analysis” for this exercise since all we are trying to do is run some SQL, not build anything permanent.

To review, the parts are:

  • input parameters (query columns, pivot column, aggregate column, and table_name)
  • getting a distinct list of pivot values
  • looping over the pivot values to generate a dynamic script

Here’s what that looks like as a script in dbt:

which compiles to this when you run the Analysis:

and returns the following results:

What makes this script work

As we discussed, the key to making this work is getting a list of pivot values which we can then iterate over.

{%- set my_query -%}SELECT distinct {{pivot_column}}  FROM {{my_table}} ;{%- endset -%}{%- set results = run_query(my_query) -%}{%- if execute -%}{%- set items = results.columns[0].values() -%}{%- endif %}

Jinja uses a run_query() function, which executes SQL and returns a result object. Using the result object properties, we obtain a distinct list of column values as an array which we can then assign to a variable for subsequent use.

Next, I want to highlight a piece of code with which you will undoubtedly become intimately familiar once you start using loops in dbt, but it may look strange at first.

{%- if not loop.last %} 

In SQL, patterns invariably break stride when they come to the end of a list, omitting a final comma. We can use loop.last to detect when we are not yet on the last item of our loop and place the corresponding comma.

('item_1', 'item_2', 'item_3')  --good
--instead of
('item_1', 'item_2', 'item_3',) --error

Analysis + parameters = Macro

If you look closely at the Analysis we’ve created, you’ll notice that except for the input variables we set at the start, there is nothing else in the script specific to our table or data values. This means that you can run this neat little script for any table and its corresponding columns—the very reason why dbt macros were invented.

By simply changing a few lines of our Analysis, we can transform this into a parameterizable Macro which can then be called from any model or subsequent analysis in our dbt project. We have effectively defined our own version of Snowflake’s PIVOT.

Now we can call this pivot function in a single line from anywhere in dbt:

Conclusion

Dynamic pivots are a frequent challenge when doing modeling or data analysis. Still, the key takeaway here is to become aware of the possibilities of what you can achieve when you couple SQL with conditional and iterative logic.

The code and data sample from this article is available on GitHub, and you can sign up for a free dbt account to experiment on your own. I encourage you to try it and see what other interesting use cases you come up with.

Happy pivots and any other iterative creations that you can think of!

--

--

Serge Gershkovich
Serge Gershkovich

Written by Serge Gershkovich

I am the author of "Data Modeling with Snowflake" and Head of Product at SqlDBM. I write about data and cloud cost optimization.

No responses yet