A while back, I found myself deep in the trenches of prepping a finance dataset for a performance dashboard. The data, sourced from ExactonLine, was neatly organized by projects, each with its own start and end date. The task seemed simple enough: provide a month-by-month breakdown of each project’s lifecycle, incorporating projections, work in progress, invoices, and costs. No big deal, right?
Naturally, my first instinct was to reach for the ‘trusty’ recursive CTE—after all, it’s the Swiss Army knife of SQL. It worked like a charm at first, slicing through the problem, delivering those monthly breakdowns. But as the dataset grew, so did my headaches. The queries became tangled, performance took a nosedive, and debugging turned into quite a puzzle.
Of course, there’s always more than one way to skin a cat (though no cats were harmed in the making of this dataset), and I soon realized that maybe, just maybe, the recursive CTE wasn’t the knight in shining armour I’d hoped for. So, I began exploring alternative methods—ones that were more efficient, easier to maintain, and didn’t require a detective to figure out. This is the story of how I ditched the recursive approach for something more streamlined, robust, and, frankly, less of a headache.
Initially, the recursive CTE was employed to generate a continuous list of months for each project, starting from the project start date and continuing until the end date. This helped in tracking invoices, calculating cumulative revenue, and forecasting future WIP (Work in Progress).
While recursive CTEs are functional, they often come with performance challenges, especially when scaling or handling large datasets. They also add complexity to the query and can be difficult to troubleshoot or optimize.
The logic now relies on Snowflake’s sequence generator using seq4() combined with a date-spine to generate the month breakdown.
row_number() using seq4() function). Each number represents a month offset from the project start date, making the query non-recursive.Recursive CTE:
-- Generate month breakdown using seq4()
month_series as (
select
p.id,
p.project_code,
p.account,
p.account_code,
p.account_name,
p.budget_amount,
p.project_start_date,
p.project_end_date,
p.description,
p.type,
p.type_description,
p.classification_description,
dateadd(month, seq.seq, date_trunc('month', p.project_start_date)) as project_month
from projects p
join (
select
row_number() over (order by seq4()) - 1 as seq
from
(select seq4() as seq from table(generator(rowcount => 10000))) as t
) seq
on seq.seq <= datediff(month, p.project_start_date, coalesce(p.project_end_date, last_day(current_date)))
)
Let’s break it down and check the join:
join (
select
row_number() over (order by seq4()) - 1 as seq
from
(select seq4() as seq from table(generator(rowcount => 10000))) as t
) seq
on seq.seq <= datediff(month, p.project_start_date, coalesce(p.project_end_date, last_day(current_date)))
This generates a sequence of numbers (representing months) using row_number(), which then gets joined to each project based on the number of months between the project start and end dates.
It replaces the need for recursion by simply creating a numerical series that aligns with the project’s timeline. It’s an easy-to-understand, flat logic that removes the need for complex recursive traversal.
One of the key challenges was handling Work in Progress (WIP) calculations and distributing values across time. This requires calculating cumulative revenue, costs, and projecting WIP evenly across remaining months for each project.
Using a Sequence Generator made further calculations more efficient. Window functions now work seamlessly across the sequence of months without needing recursive logic, further boosting performance. The simplified logic allowed easier integration of Work in Progress future calculations, and made the breakdown across months more accurate and easier to maintain.
While recursive CTEs may be extremely useful for specific tasks, leveraging built-in Snowflake features like sequence generators can lead to cleaner, more scalable, and robust solutions.
Happy Days!
Thank you for reading this blog.
Also check out our other blogs page to view more blogs on Power BI, Tableau, Alteryx, and Snowflake here.
Work together with one of our consultants and maximize the effects of your data.
Contact us, and we’ll help you right away.