I am writing this blog between projects. It's one of those "bench" moments that I've learned to use for reflection and skill-building. This time, I want to share something I've been thinking about for months: how AI has changed the way I work as a data engineer.
This isn't a think piece about whether AI will replace us. It won't. But it has changed what a productive day looks like for me, and I think it's worth being honest about how.
The biggest mistake I used to make with AI tools was treating them like Google: you type a question and expect a perfect answer. That's not how it works.
I treat AI as a peer. A very fast, very knowledgeable colleague who sometimes confidently tells me things that are wrong. Just like a real colleague, the quality of the conversation depends on how well I communicate what I need.
This means I don't just ask "write me a dbt model." I provide context: the schema, the business rules, the constraints. I ask for an outline before the code. I correct its assumptions. I iterate. A good result usually takes 3-5 exchanges, not one.
High-value output is negotiated, not requested.
Here are the tasks where AI saves me the most time on a project:
On my last project, I worked with dbt and Snowflake at a large logistics company. The codebase had plenty of SQL that had grown organically over time: triple-nested joins, hardcoded schemas, and no modularity.
AI is great at this kind of grunt work. I paste in a messy query, explain the target architecture (staging → intermediate → mart), and ask it to propose a decomposition plan. It gives me a starting point in minutes that would've taken me an hour to sketch out manually. I then review, adjust, and iterate.
The key word here is starting point. I never deploy AI-generated code without reviewing it. But having that first draft to react to? That's where the time savings come from.
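To make this concrete, here's a simplified sketch of the kind of decomposition I ask for. The table and column names are invented for illustration; the real models were project-specific:

```sql
-- Hypothetical staging model: models/staging/stg_shipments.sql
-- Replaces a hardcoded raw-schema reference buried deep in a nested join.
select
    shipment_id,
    order_id,
    carrier_code,
    shipped_at::timestamp_ntz as shipped_at
from {{ source('logistics', 'shipments') }}
```

```sql
-- Hypothetical intermediate model: models/intermediate/int_shipments_enriched.sql
-- One join per concern, instead of the original triple-nested query.
select
    s.shipment_id,
    s.order_id,
    c.carrier_name,
    s.shipped_at
from {{ ref('stg_shipments') }} as s
left join {{ ref('stg_carriers') }} as c
    on s.carrier_code = c.carrier_code
```

The mart layer then only has to aggregate over `ref()` calls, which is exactly the structure I describe to the AI before pasting in the legacy query.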
This is probably where AI has the highest ROI for me. Writing schema.yml files with proper column descriptions, not_null tests, unique tests, and relationship tests is important but repetitive. AI handles it well because the patterns are predictable.
I provide the model SQL, and I ask for the full YAML block with business-value descriptions. Not just "the order date" but "the UTC timestamp when the customer completed checkout, used as the primary date dimension in revenue reporting." That level of description matters for the team, and AI can generate it faster than I can type it.
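As an illustration, the kind of YAML block I ask for looks like this. The model and column names here are hypothetical, not from the actual project:

```yaml
# Hypothetical schema.yml entry; names are invented for illustration.
version: 2

models:
  - name: fct_orders
    description: One row per completed customer order.
    columns:
      - name: order_id
        description: Surrogate key for the order, unique across all sources.
        tests:
          - unique
          - not_null
      - name: order_completed_at
        description: >
          The UTC timestamp when the customer completed checkout,
          used as the primary date dimension in revenue reporting.
        tests:
          - not_null
      - name: customer_id
        description: Foreign key to dim_customers.
        tests:
          - relationships:
              to: ref('dim_customers')
              field: customer_id
```

Reviewing a generated block like this takes a minute or two; typing it from scratch for a wide model takes much longer.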
On the same project, I built a dbt macro that selected the Snowflake warehouse size based on the timeframe of data being processed. The logic wasn't complex, but getting the Jinja syntax right with all the edge cases (NULL handling, default values, variable scoping) took some back-and-forth with AI.
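A simplified sketch of that kind of macro. The warehouse names and thresholds here are invented; the production logic was project-specific:

```sql
-- Hypothetical macro: macros/select_warehouse.sql
-- Picks a Snowflake warehouse based on how many days of data a run processes.
{% macro select_warehouse(days_to_process) %}
    {# Fall back to the smallest warehouse if the value is missing or none #}
    {% set days = days_to_process if days_to_process is not none else 1 %}
    {% if days > 90 %}
        {% set wh = 'TRANSFORM_L' %}
    {% elif days > 7 %}
        {% set wh = 'TRANSFORM_M' %}
    {% else %}
        {% set wh = 'TRANSFORM_S' %}
    {% endif %}
    {{ return(wh) }}
{% endmacro %}
```

A model can then reference the macro from its `snowflake_warehouse` config; again, this is a sketch of the pattern rather than the production version.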
I've found that AI is particularly helpful when I know what I want to build but need help with the how of a specific syntax or framework.
When I'm stuck on a query that doesn't behave as expected, I've started asking AI to "explain lines 15-20" rather than staring at the code for twenty minutes. This targeted approach works much better than asking "what's wrong with this code?"
I also use it as a first-pass code reviewer. I ask it to act as a skeptical reviewer and find three reasons the logic might fail in production. It doesn't catch everything, but it catches the obvious things: missing WHERE clauses, potential NULL issues, non-idempotent operations. That frees up my human reviewers to focus on the business logic.
Being honest about the limitations is just as important as knowing where to use it.
It's a yes-man. AI tools are trained to be helpful and agreeable. If I ask "is this the best approach?" it will almost always say yes. I've learned to never give it my solution first. I ask for its approach, then compare. The quality of feedback improves dramatically this way.
It hallucinates. On my last project, the team used GitHub Copilot because the premium version doesn't use customer code for training (a strict company policy). Even with Copilot, I've seen it suggest functions that don't exist or were deprecated versions ago. You always need to verify against the actual documentation.
It drifts. In long conversations, AI gradually forgets the rules you set at the beginning. I told it to use snake_case consistently, and by message 10, it had switched to camelCase. The workaround is simple: either use a custom system prompt (like Gemini Gems) to lock in your standards, or add a periodic reminder to the conversation.
Security is your responsibility. I never paste API keys, passwords, or PII into any AI tool. For sensitive projects, I anonymize table and column names before sharing code. This is a non-negotiable habit.
If I had to summarize how AI fits into my daily work, it looks like this:
I start a task by dumping context into the AI: schema, business rules, constraints. Then I ask for an outline or approach, not the final code. I review the approach, correct what's wrong, and only then ask for the implementation. Once the code is there, I refine it iteratively: fix a join key here, add idempotency there, improve the formatting last.
This process (context first, outline next, code last, then refine) consistently produces better results than a single "write me the code" prompt. It takes 10-15 minutes of iteration, but it replaces what would have been 1-2 hours of writing from scratch.
In my previous blog post, I mentioned a conversation with an in-house Data Architect who recommended gaining more development experience to eventually move into a strategic role. AI is accelerating that path.
By automating the repetitive parts of development (boilerplate code, test generation, documentation) I have more time to focus on architecture decisions, stakeholder conversations, and understanding the business problem. These are the skills that lead to a strategic role, and AI is freeing up the bandwidth to develop them on the job rather than only during bench time.
I don't see AI as a threat. I see it as a tool that shifts my value from typing code to making decisions about code. And that feels like the right direction.
If you're a data engineer who hasn't yet integrated AI into your workflow, here's what I'd suggest based on my experience:
Fun fact: traffic on StackOverflow has seen sharp declines since the release of GitHub Copilot in 2021 and ChatGPT in November 2022. The way we find answers is changing. The engineers who learn to work with AI, while keeping their critical thinking sharp, will be the ones who thrive.
At DDBM, we help companies build modern data architectures using tools like Snowflake, dbt, and Matillion. If you're curious about how AI-augmented data engineering could help your team, feel free to reach out.