Context Engineering: The Bottleneck to Better Continuous AI

Back in 2019, I gave a presentation about building contextual AI assistants at the O'Reilly AI Conference. While AI has progressed dramatically since then, one challenge has remained constant: the art and science of providing the right context at the right time.
Today, as platform teams at large organizations explore Continuous AI, context engineering has emerged as the primary bottleneck preventing them from automating more of their software development workflows.
Context engineering isn't just about writing better prompts—it's about systematically solving the problem of getting relevant information to AI when it's needed. For platform teams responsible for enabling thousands of developers, this challenge is particularly acute.
“I really like the term ‘context engineering’ over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM.”
— tobi lutke (@tobi), June 19, 2025
The context crisis in large organizations
Most organizations are sitting on vast amounts of valuable context that could make their AI dramatically more effective, but this information exists in a form that's largely inaccessible to both humans and AI.
The context you need was never written down. Critical decisions were made in Zoom calls and hallway conversations. The reasoning behind architectural choices, the lessons learned from past incidents, the institutional knowledge that guides day-to-day decisions: much of this exists only in people's heads. When those people leave, that context often leaves with them.
When context has been documented, it's scattered everywhere. Organizational knowledge is spread across Confluence pages, SharePoint sites, Jira tickets, Slack threads, email chains, and countless other systems. Even when developers know this information exists, they often don't know where to find it or how to search for it effectively.
The documented context isn't optimized for AI consumption. Traditional documentation was written for humans to read sequentially. AI systems need structured, specific, searchable information that can be quickly retrieved and understood in isolation, and they need the right permissions to access it.
For platform teams, this means your AI tools are operating with a fraction of the context they need to be truly helpful. Your developers get generic suggestions instead of recommendations that understand your organization's specific patterns, constraints, and best practices.
The challenge of relevance at scale
Even when you have comprehensive documentation, determining what context is relevant for any given task becomes exponentially harder at enterprise scale.
Context windows create hard constraints. You can't just throw everything at AI and hope it figures out what's important. You need to make intelligent decisions about what context to include or point it towards, but those decisions require understanding both the immediate task and the broader organizational context.
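To make this concrete, context selection under a token budget often reduces to a ranking-and-packing problem: score candidate snippets for relevance to the task, then include the highest-scoring ones until the window is full. Here is a minimal Python sketch; the `score_relevance` function is a hypothetical stand-in for whatever retrieval or reranking system you actually use.

```python
from dataclasses import dataclass

@dataclass
class Snippet:
    source: str   # e.g. "confluence", "adr", "code"
    text: str
    tokens: int   # pre-computed token count

def score_relevance(snippet: Snippet, task: str) -> float:
    """Hypothetical relevance scorer; in practice this would be embedding
    similarity, a reranker, or rule-based matching."""
    overlap = set(snippet.text.lower().split()) & set(task.lower().split())
    return len(overlap) / (len(task.split()) + 1)

def pack_context(snippets: list[Snippet], task: str, budget: int) -> list[Snippet]:
    """Greedily fill the context window with the most relevant snippets."""
    ranked = sorted(snippets, key=lambda s: score_relevance(s, task), reverse=True)
    selected, used = [], 0
    for s in ranked:
        if used + s.tokens <= budget:
            selected.append(s)
            used += s.tokens
    return selected
```

The hard part is not the packing loop; it's building a relevance signal that reflects your organization's actual priorities rather than generic text similarity.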
Human intuition is not easy to document. A senior developer might instinctively know that a particular architectural decision from two years ago is relevant to today's feature request, but encoding that kind of contextual awareness into systems that serve hundreds of developers is incredibly difficult.
Bad context is worse than no context. When irrelevant or outdated information makes it into your context, you get context poisoning: the AI becomes overconfident in incorrect information and propagates errors throughout your codebase.
Asking individual developers to manually gather context (copy/pasting documentation snippets, @file references, etc.) doesn't scale for platform teams trying to enable consistent, reliable AI assistance across large development organizations.
The maintenance problem
Context engineering isn't a one-time setup problem; it's an ongoing maintenance challenge that gets harder as your organization grows.
Information becomes stale quickly. Your API documentation from last quarter might be completely wrong after recent refactoring. Your architectural decision records might not reflect the lessons learned from the incident that happened last week. Unlike code, which often breaks visibly when it's out of date, context frequently degrades silently.
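One lightweight defense is to compare when a document was last touched against when the code it describes last changed, and flag anything where the code has moved on. The sketch below assumes you maintain a mapping from docs to the code paths they describe (for example in doc frontmatter); that mapping and the git-based heuristic are illustrative, not a complete staleness strategy.

```python
import subprocess
from pathlib import Path

def last_commit_ts(path: str) -> int:
    """Unix timestamp of the most recent git commit touching path."""
    out = subprocess.run(["git", "log", "-1", "--format=%ct", "--", path],
                         capture_output=True, text=True, check=True)
    return int(out.stdout.strip() or 0)

def stale_docs(doc_to_code: dict[str, str]) -> list[str]:
    """Flag docs whose described code changed after the doc's last update.
    doc_to_code maps a doc path to the code path it documents (an assumed
    convention, e.g. recorded in the doc's frontmatter)."""
    return [doc for doc, code in doc_to_code.items()
            if last_commit_ts(code) > last_commit_ts(doc)]
```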
There's no clear ownership model. Who is responsible for keeping the context that feeds your AI up to date? In most organizations, this falls into the gap between development teams (who are focused on shipping features) and platform teams (who may not have deep domain knowledge about every service).
Context relevance changes over time. Information that was critical six months ago might be completely irrelevant today, but your AI systems don't automatically know this. You need systems for not just updating context, but for determining when context should be deprecated or removed entirely.
For platform teams, this creates a sustainability challenge. Building AI systems that work well initially is hard enough, but building systems that continue to work well as your organization evolves requires a fundamentally different approach to context management.
When context engineering isn't enough
Even as context windows grow larger and retrieval systems become more sophisticated, there will still be situations where stuffing more context into prompts isn't the right solution.
Some knowledge is better encoded in model weights. If your organization has specific coding patterns that appear in thousands of files, or domain-specific terminology that shows up everywhere, fine-tuning models to understand these patterns might be more effective than trying to provide examples in context every time.
Cost and latency matter at scale. When you're serving AI assistance to thousands of developers making millions of requests per day, the cost of large context windows could add up quickly. It might be more economical to invest in model customization than to pay for massive context windows and long-running agent explorations.
Quality vs. quantity trade-offs become critical. In large organizations, there's often so much potentially relevant context that you need sophisticated systems for ranking, summarizing, and compacting information. At a certain point, you might get better results from a smaller model that's been trained on your specific context than from a larger model trying to process that same context in real-time.
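One common way to trade quantity for quality is hierarchical compaction: keep the few most relevant documents verbatim and compress the rest into short summaries before they enter the prompt. A rough sketch of the idea, with `summarize` standing in for a call to a small, cheap summarization model rather than any particular provider's API:

```python
def summarize(text: str, max_words: int = 80) -> str:
    """Stand-in for a call to a small, cheap summarization model."""
    words = text.split()
    return " ".join(words[:max_words]) + ("…" if len(words) > max_words else "")

def compact_context(documents: list[str], relevance: list[float], top_k: int) -> list[str]:
    """Keep the top_k most relevant documents verbatim; compress the rest."""
    order = sorted(range(len(documents)), key=lambda i: relevance[i], reverse=True)
    keep_full = set(order[:top_k])
    return [documents[i] if i in keep_full else summarize(documents[i])
            for i in range(len(documents))]
```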
The key insight for platform teams is knowing when to solve context problems with better retrieval and when to solve them with better models. This decision depends on your specific constraints around cost, latency, privacy, and the nature of your organization's context.
Measuring what matters
The biggest gap in context engineering today is evaluation. Most organizations deploying AI tools have limited visibility into what's working and what isn't.
You need development data to improve context systems. The interactions between your developers and AI tools generate valuable signals about context quality. When developers reject suggestions, edit generated code, or repeatedly ask for the same type of information, that data can guide improvements to your context engineering systems.
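In practice, capturing these signals can start as simply as logging structured events from your AI tooling. The schema below is illustrative, not any particular product's telemetry format; the field names are assumptions about what a platform team might want to track.

```python
import json
import time
from pathlib import Path

def log_event(log_path: Path, event_type: str, **fields) -> None:
    """Append one JSON event per line (e.g. suggestion accepted, rejected,
    edited, or the same question asked repeatedly)."""
    event = {"ts": time.time(), "type": event_type, **fields}
    with log_path.open("a") as f:
        f.write(json.dumps(event) + "\n")

# Example usage (all values are placeholders):
# log_event(Path("dev_data.jsonl"), "suggestion_rejected",
#           repo="payments-service",
#           context_sources=["rules", "confluence"],
#           reason="suggested deprecated API")
```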
Traditional metrics miss the point. Measuring response time or accuracy on synthetic benchmarks doesn't tell you whether your AI tools are helping developers ship better code faster. You need metrics that connect AI assistance to real business outcomes: reduced time to resolution for incidents, faster onboarding for new team members, fewer bugs in production.
Context problems surface as symptoms elsewhere. When your AI tools generate code that doesn't follow your organization's patterns, when they suggest deprecated APIs, or when they give advice that conflicts with your architectural principles, the root cause is usually inadequate context engineering. You need instrumentation that can tie these symptoms back to their root causes.
Platform teams need systematic approaches to collecting this feedback and using it to continuously improve their context systems. This isn't just about better AI. It's about building organizational capabilities for knowledge management and sharing.

Building better context systems
For platform teams ready to tackle context engineering systematically, there are practical steps you can take today.
Start with rules. Instead of hoping AI will correctly learn your standards, encode them as explicit rules that can be stored alongside your code. Tools like Continue rules let you define context that gets automatically included when relevant.
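The mechanics can be simple: rule files live in the repo, each declares which paths it applies to, and the tooling injects matching rules into the prompt automatically. The sketch below shows the general pattern; the file layout and "glob on the first line" convention are illustrative, not Continue's exact rule format.

```python
from fnmatch import fnmatch
from pathlib import Path

def load_rules(rules_dir: Path) -> list[dict]:
    """Load rule files where the first line is a glob pattern and the rest
    is the rule text (an illustrative convention)."""
    rules = []
    for path in sorted(rules_dir.glob("*.md")):
        pattern, _, body = path.read_text().partition("\n")
        rules.append({"glob": pattern.strip(), "text": body.strip()})
    return rules

def rules_for_file(rules: list[dict], file_path: str) -> list[str]:
    """Return the rule text to inject when the AI is working on file_path."""
    return [r["text"] for r in rules if fnmatch(file_path, r["glob"])]
```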
Make more context accessible. Adopt MCP tools that allow your AI systems to query your existing data sources like Linear, Jira, Confluence, your service catalog, your incident management system, etc. Making context accessible is often more valuable than making it perfect.
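As a rough illustration of what "making context accessible" looks like, here is a minimal sketch of exposing an internal data source as an MCP tool, assuming the FastMCP helper from the MCP Python SDK. The incident-lookup function and server name are hypothetical stand-ins for your real systems.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("incident-history")

@mcp.tool()
def recent_incidents(service: str, limit: int = 5) -> list[dict]:
    """Return recent incidents for a service (placeholder data)."""
    # In a real server this would query your incident management system.
    return [{"service": service, "id": f"INC-{i}", "summary": "placeholder"}
            for i in range(limit)]

if __name__ == "__main__":
    mcp.run()
```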
Invest in context discovery. Your developers often know what context would be helpful but don't know where to find it. Adopt systems and tools, like Continue Hub, that make it easy to discover and share relevant context.
Measure the right things. Store your development data. Build dashboards on top of it and tie it to business outcomes that matter: developer productivity, code quality, time to market. Use this data to guide your context engineering investments.
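Building on the illustrative event log sketched earlier, a first dashboard can be a handful of aggregate queries. This sketch computes a suggestion acceptance rate from that JSONL log; the event type names are the same assumed ones, not a standard schema.

```python
import json
from collections import Counter
from pathlib import Path

def acceptance_rate(log_path: Path) -> float:
    """Fraction of AI suggestions that developers accepted, computed from
    the illustrative JSONL event log sketched earlier."""
    counts = Counter()
    for line in log_path.read_text().splitlines():
        event = json.loads(line)
        if event["type"] in ("suggestion_accepted", "suggestion_rejected"):
            counts[event["type"]] += 1
    total = counts["suggestion_accepted"] + counts["suggestion_rejected"]
    return counts["suggestion_accepted"] / total if total else 0.0
```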
Plan for maintenance. Context engineering isn't a project; it's an ongoing capability. Build systems and processes that can evolve as your organization grows and changes.
The organizations that figure out context engineering will have a significant advantage in the AI-native future of software development. Their AI tools won't just provide generic assistance. They'll provide assistance that understands and reflects their specific way of building software.
Context engineering is hard work, but it's the kind of hard work that platform teams are uniquely positioned to tackle. And for organizations serious about AI-native development, it's work that can't be avoided much longer.
If you are working on context engineering for Continuous AI in a large organization, I'd love to chat! You can find me (@tydunn) on the Continue Discord or reach out via this form.