226: The Eye of context (The Dungeon of martech architecture, part 2)

Logos of GrowthLoop, GrowthBench, MoEngage, and Knak with the caption 'Proudly brought to you by'.

What’s up folks, welcome to our 4-part series of crawling through the dungeon of martech architecture. You’ve arrived at Part 2: The Eye of Context.

Summary: A clean data warehouse is great and all, but agents operating on that data without anything to help them interpret it produce “believable nonsense.” Data quality stops agents from misrepresenting what the warehouse contains, but you need context engineering to put the right meaning, rules, and situational information in front of the model at the right moment. Agents also need a semantic layer and systems that can reason about why something happened.

Recommended Martech Tools and Agencies 🛠️

We only partner with products and agencies that are chosen and vetted by us. If you’re interested in partnering, reach out here.

🔄 GrowthLoop: The agentic, composable CDP that drives compound growth by uniting your cloud data + AI into one marketing engine.

🔌 GrowthBench: Twilio’s top-tier consulting partner, turning your Twilio investment into a customer engagement engine

📧 MoEngage: Customer engagement platform that executes cross-channel campaigns and automates personalized experiences based on behavior.

🎨 Knak: Go from idea to on-brand email and landing pages in minutes, using AI where it actually matters.

Welcome back to the Dungeon of Martech Architecture.

You’ve arrived at part 2. If this is your starting point, check out part 1 where we cleared the first floor’s boss in 2 forms: The False Truth King in the CRM, and The Export Hydra that spread it everywhere. That said, if you already have a data warehouse, you might be able to start right here.

Here is your quick guide to the floors ahead:

Episode 1: CRM Gravity. You’ll conquer the source of truth and discover that the data warehouse replaces the CRM with portable audiences.

Episode 2: The Eye of Context. You’ll learn why AI fails without shared meaning, why context engineering is the layer between data and agent authority, and why the industry built the wrong kind of meaning infrastructure in 2012.

Episode 3: The Correlation Masquerade. You’ll escape the correlation trap and build the causal memory layer that separates agents that optimize correctly from agents that confidently scale the wrong behavior.

Episode 4: The Dispatch Tower. You’ll tackle the governance chaos of 30 vendors all claiming authority, and confront the interface decision that most organizations already made without realizing it.

Let’s start our descent.

FLOOR 2: THE EYE OF CONTEXT — AI Hallucinations, Data Quality, and Context Engineering

A dimly lit, eerie hallway featuring ornate, dark furniture, large mirrors, and two figures walking towards a purple-lit archway, surrounded by overgrown plants and a mysterious atmosphere.

The layout of the second floor down the dungeon of martech architecture actually looks pretty fancy. It’s cozy, it looks modern, the whole palace is lined with mirrors. But it’s a bit creepy because once you look a little closer at the reflections, you notice that some of the details are off. 

The boss on this floor is a low-key danger that sneaks up on way too many teams, nothing like the big flashy monsters from the last floor.

Let’s say you have a new AI system running on your marketing data. You’ve got it producing stuff like scores, recommendations, campaign ideas. Initially, it actually looks solid.

  • There are no obvious AI sentence structures in the summaries; they read well.
  • The scores next to accounts seem to make sense: higher ones for well-known brands, lower ones for Gmail accounts.
  • The campaign ideas are actually pretty fresh, you can tell that it’s tailored for your ICP.
  • Every output is delivered with impeccable confidence.

So the next step is asking yourself… how would you know if it was wrong?

There are plenty of obvious hallucinations that you probably catch when you chat with GPT or Claude, like totally inventing stuff. I’m not talking about those. I’m talking about the details: the kind of wrong that passes the first glance.

We’ve all seen that viral post on r/analytics about a company that found out its AI tool had been making up analytics data for 3 months. Whether or not that post describes a real story, some version of it is happening inside companies today.

Screenshot of a Reddit post discussing an AI tool that has been generating false analytics data for three months, causing concern and frustration among users.

But this is happening everywhere right now. Even at some of the top AI companies on the planet. 

I’ve talked to technical marketing leaders who greenlit agentic tools at their startups before the data definitions were settled. One of them called it ‘believable nonsense’, and that term kinda stuck with me. It’s the most dangerous form of hallucination because it sneaks up on you.

This floor is harder than the last because the traps on the previous level were visible, once you knew what to look for: CRM exports nobody trusted, audience logic duplicated across platforms, copies drifting from the original. You felt that, you saw that. 

This floor’s failures are designed to look like success, until you dig into the details and look under the hood.

Let’s look at the origins of The Hallucination Oracle boss. 

Why AI Produces Believable Nonsense

A colorful, robotic parrot perched on a branch with flowering yellow leaves against a gradient background of deep blue and pink.

Humans are wired to trust smooth, confident talkers. It’s actually baked into our evolution and how our brains develop from infancy. Studies on babies and brain scans show this is an innate thing that kicks in early.

A person who sounds certain usually knows something, right?

LLMs and AI systems break this calibration, they produce fluency without necessarily producing correctness. At first glance, the output sounds right for structural reasons, not evidential ones. And once something sounds right, we engage with it differently. We forward it. We build on it. We present it to stakeholders who don’t have the context to question it.

This is similar to The False Truth King boss from the first floor in episode 1, but at the intelligence layer: output that has been processed, synthesized, and returned with all the structural markers of a trustworthy answer, while the reasoning underneath it is hollow.

That’s Jason Dobbs, Head of Marketing & GTM Engineering at Kumo. He greenlit agentic analytics and predictive workflows at his startup before the team had settled on shared data definitions. The system produced outputs that looked reasonable right up until someone started asking follow-up questions:

“What made it dangerous wasn’t really like the obvious hallucination. What we were seeing looked polished enough to be operational. At first glance the scores looked precise, the summary sounded coherent, the recommendations felt data-backed. But the moment you ask the simple follow-up questions — why did you choose this account? What data drove this decision? — the logic started to thin out. The lesson wasn’t that the data was bad or the warehouse was bad or the model was bad. What was wrong is we were trying to automate ambiguity. We were asking AI to solve for confusion that we hadn’t yet ourselves solved for internally. And once you do that, you enter the danger zone because the failure is essentially believable nonsense.”

JASON DOBBS, Episode 221

This is some scary stuff right? When this believable nonsense gets trusted long enough to make it into a campaign, a decision, a board slide. Obvious hallucinations are easy to catch. Confident, polished, data-backed nonsense gets through more often than we think.

The good news is that we already have the perfect weapon to slay this boss, and we shaped it in the last episode: the data warehouse. It houses data. Data is how we defeat believable nonsense… but we need to enhance it.

BOSS BATTLE: The Hallucination Oracle

An ominous figure, the Hallucination Oracle, sits on a throne in a dark, cluttered room filled with retro technology and flickering screens, with the title 'BOSS BATTLE: THE HALLUCINATION ORACLE' displayed prominently above.

The boss on this floor is The Hallucination Oracle, the ultimate creator of believable nonsense: output that sounds right, passes the first look, makes it into the deck, and gets trusted before anyone asks a follow-up question.

You have 3 potions.

*Inventory check.* 3 potions before the boss.

Game inventory screen displaying three potions: an Uncommon Data Quality Potion, a Rare Context Eng Potion, and an Epic Semantic Potion. The right side features a description of potions and their purpose.

[UNCOMMON POTION] Data Quality: Cleans and governs the foundation. Stops the agent from lying about what the warehouse contains. Required before the second potion works.
[RARE POTION] Context Engineering: Supplies the right meaning at the right time. The gap between a governed warehouse and a trustworthy agent.
[EPIC POTION] The Semantic Layer: Represents domain concepts, relationships, constraints, and the inference rules agents need to reason about your business. The upgrade the industry skipped in 2012.

| | Data Quality | Semantic Layer | Context Engineering |
|---|---|---|---|
| Ask | Is the data clean and trustworthy? | Do we have shared definitions for what the data means? | What extra meaning, rules, and situational information does the model need right now? |
| Focus | Missing values, duplicates, schema consistency, and lineage | Metrics, entities, relationships, business definitions, and governance rules that create a common language across the organization | Supplying the right context at the right time so the AI interprets data correctly instead of only processing raw facts |

Let’s pull out the first one.

Data Quality: When Agents Read Your Messy Data

A robotic figure with illuminated eyes sitting at a table, eating spaghetti with a fork. The scene is set in a dimly lit, industrial environment, surrounded by scattered papers.

Good old data quality. If you work in any ops or data role, you’ve pitched a data quality project before, and you’ve had it deprioritized, many times. AI changed that.

How AI Made Data Quality a CEO Priority

AI surfaced bad data problems and made fixing them feel urgent. For the first time in most organizations’ histories, database cleanup became something executives wanted to fund.

What a time to be alive folks. It’s almost funny… almost. We spent decades trying to get a budget for data quality. Then AI arrived, and suddenly cleaning the database became a prerequisite for the thing the CEO just announced as a strategic priority.

That’s Danielle Balestra, Fractional Marketing Technology Executive.

“Everyone wants to leverage this technology. It’s about making sure it’s not hallucinating. Guess what everyone’s focusing on? Their data. People are actually starting to clean their databases.”

DANIELLE BALESTRA, Episode 163

That’s Tiankai Feng, Data & AI Strategy Director at Thoughtworks and Author of Humanizing Data Strategy.

“I call it the comeback of data quality. Before, data quality was not sexy. Everyone complained about it; nobody wanted to take care of it. Now everyone realizes AI with bad data is just bad AI.”

TIANKAI FENG, Episode 179

The comeback is real, and for most organizations it’s long overdue.

That’s Lourenco Mello, Director of Product Marketing at Snowflake.

“the reality is one thing that will never change is the garbage in garbage out component of this, right? Which is if your data foundation is garbage, so too will your AI. You can have a beautiful UI, you can have the right kind of prompts in there, but the insights that’s going to surface are going to be directly linked to what you have on the back end.”

LOURENCO MELLO, Episode 142

What Data Quality Actually Means

For me, data quality looks like this:

  • your records are accurate and deduplicated
  • your field definitions are agreed upon across every team that reads them
  • your data pipelines run on a known schedule with a named owner
  • when something looks wrong you can trace it back to the source
  • two people asking the same question get the same answer, and that answer has an author
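
A couple of the items on that list can be checked mechanically. Here is a minimal sketch, not a real data quality tool: the record fields (`email`, `owner`, `source`) and the `quality_report` helper are made up for illustration, and a real pipeline would run checks like these inside the warehouse.

```python
from collections import Counter

# Hypothetical records; the field names are assumptions for illustration.
records = [
    {"email": "a@example.com", "owner": "revops", "source": "crm"},
    {"email": "A@Example.com ", "owner": "revops", "source": "webform"},
    {"email": "b@example.com", "owner": None, "source": "crm"},
    {"email": None, "owner": "marketing", "source": None},
]

REQUIRED_FIELDS = ("email", "owner", "source")


def normalize_email(value):
    """Lowercase and trim so trivial variants count as the same address."""
    return value.strip().lower() if value else None


def quality_report(rows):
    """Count duplicate emails and missing required fields."""
    emails = Counter(
        normalize_email(r["email"]) for r in rows if r.get("email")
    )
    duplicates = sorted(e for e, n in emails.items() if n > 1)
    missing = {
        field: sum(1 for r in rows if not r.get(field))
        for field in REQUIRED_FIELDS
    }
    return {"duplicate_emails": duplicates, "missing": missing}


report = quality_report(records)
print(report)
```

The point of the sketch is that "accurate and deduplicated" and "has a named owner" can be turned into counts someone is accountable for, instead of staying as aspirations in a doc.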

Keith Jones, who co-runs GTM systems at OpenAI and has built and broken more martech stacks than most, states the prerequisite:

“If you’re going to try to do anything involving agentic orchestration, you have to be really confident in the less sexy elements. The things that if we’re all being honest, we know we need to do and we probably could do better. Things like having really clear, understandable definitions of what the different data means in your organization. Having a consistent level of syntax for that data — when I say X equals Y, I need to say that every single time. And then you need to make sure your data is as consistent, tagged, categorized, codified as much as possible. Because the agents — if your data model is a mess, if field names mean different things in different reports, if you’re still reconciling accounts across three tools because no one agreed on a single syntax — agents will not fix that. They will magnify it. They will move fast and confidently in the wrong direction.”

KEITH JONES, Episode 170

And Austin Hay, a martech, revtech, and GTM systems advisor, AI builder, writer, and ex-founder, talked about the tension between new capabilities and old foundations:

“Unstructured data is really cool. But guess what? Unstructured data is not cool if it has to eventually be sent into a structured format, and you don’t have some concept of doing that. Some of the best companies are maintaining what’s good about old data structures while allowing and importing the ideas from agentic experiences.”

AUSTIN HAY, Episode 151

How to Enforce Data Quality Standards Across Teams

Getting the foundation right requires more than documentation. Ana Mourao, who managed data infrastructure at various companies, built an enforcement mechanism for a global enterprise. She called it the data template, the formal definition of what the CDP would accept, and she built the consequences directly into the workflow:

“The data template is the language that the CDP understands. If it doesn’t come formatted in that language, it won’t make its way to the unified profile. The data template is a living document.”

ANA MOURAO, Episode 159

Forms and data sources that matched the template’s structure flowed automatically into the unified customer profile within 36 hours. Data that didn’t match got stuck. It required someone to go back to the source, reformat manually, and upload through a separate process. The friction was intentional for Ana. Teams that paid attention at the beginning got clean, automatic data flow. Teams that ignored the template got extra work every time. The behavior changed fast.

Ana worked with every regional team to make the template something they helped build. When a regional team needed to add a new field (say, collecting occupation data for a new campaign), the template evolved. It’s a living document because the business is a living thing. The standard has to change when the business changes, consistently, everywhere, at the same time.
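
The mechanics of a gate like that can be sketched roughly as follows. This is an illustration, not Ana's implementation: the template fields, the `route` function, and the two destination names are all assumptions.

```python
# Hypothetical template: field name -> expected Python type.
DATA_TEMPLATE = {"email": str, "country": str, "consent": bool}


def template_errors(record, template=DATA_TEMPLATE):
    """Return a list of reasons the record does not match the template."""
    errors = []
    for field, expected in template.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}")
    for field in record:
        if field not in template:
            errors.append(f"unknown field: {field}")
    return errors


def route(record):
    """Matching records flow through; everything else waits for a manual fix."""
    errors = template_errors(record)
    if errors:
        return ("manual_review", errors)
    return ("unified_profile", [])


good = {"email": "x@example.com", "country": "CA", "consent": True}
bad = {"email": "y@example.com", "occupation": "farmer"}

print(route(good))
print(route(bad))
```

In this sketch, a team that wants to collect `occupation` has to get it added to `DATA_TEMPLATE` first. That single shared definition is what makes the change land everywhere at once.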

Potion 1 (data quality) enhances the weapon we wield to weaken the boss but doesn’t finish it. Clean, governed, traceable data stops the agent from lying about what the warehouse contains.

Screenshot of an inventory menu displaying various potions, including Data Quality Potion, Context Eng Potion, and Semantic Potion, along with their descriptions and rarity classification.

The Hallucination Oracle boss has a dirty second trick: an agent reading perfectly clean data and still misunderstanding what it means, acting outside the scope it was given, or producing a confident answer that passes the first glance for exactly the wrong reasons. Potion 2 (context engineering) is built for that gap. 

Take it out.

Context Engineering: What It Is and Why It’s Not the Same as Prompt Engineering

A large, stylized skull with a vibrant, intricate brain design inside, featuring a figure standing on scaffolding and examining the brain, set against a dark, industrial background.

Clean data is necessary. Teams that fix their schemas, tighten their pipelines, and eliminate duplicates often find the AI still doesn’t work reliably. The real problem is what the agent does with the data.

In my research on context engineering, I came across Chris Lema, who has spent 25 years in tech leadership and now builds and debugs AI systems in enterprise deployments.

Screenshot of the article titled 'AI Context Failures: Nine Ways Your AI Agent Breaks' by Chris Lema, dated February 9, 2026, with a brief description about AI context failures.

He wrote an article where he describes a pattern he sees in nearly every organization that runs into this wall. A team invests weeks in the retrieval layer, getting the right documents, the right CRM data, the right knowledge base, and gets the context pipeline working beautifully. Then the agent produces garbage. The team blames the data. They add more documents. They tweak the retrieval. They adjust the chunking strategy. None of it helps. Because the data was never the problem.

Most AI context failures happen after retrieval. The agent has the right information. It just does the wrong thing with it. And until teams can name the specific way it’s going wrong, they keep fixing the wrong layer.

A lot of this shows up tactically as RAG: retrieval-augmented generation. But RAG is the mechanism, not the strategy. The strategy is deciding which context deserves to be retrieved, when, for which agent, under which constraints.

“The data isn’t the problem. What you’re doing with it is.”

CHRIS LEMA, chrislema.com, February 2026

Context engineering controls what the AI knows; prompt engineering controls how you talk to it.

| | Prompt Engineering | Context Engineering |
|---|---|---|
| Input | How you talk to the AI | What the AI knows when it answers |
| Output | Can make it sound smarter | Makes it more dependable |

The State of Martech 2026 report named “context” the word of the year for the industry, and it applies directly to what’s on this floor. 

Context, the report argues, is the difference between AI that generates plausible output and AI that creates meaningful value. It’s the difference between “send an email” and “send the right email, to this customer, at this moment, with awareness of what they’ve already done, what they’re trying to accomplish, what we promised them, what we’re allowed to say, and what the brand should sound like while doing it.”

Illustration comparing generic email campaigns with personalized email strategies, highlighting the benefits of tailored communication.

The report describes 3 kinds of context that have to come together before that sentence becomes possible.

A Venn diagram illustrating the intersections of Company Context, Customer Context, and Systems Context, highlighting their distinct components related to goals, strategies, and customer preferences.

  1. Customer context: the customer’s situation, intent, history, and moments that matter. What you’d want to know about why a specific person is engaging with you right now.
  2. Company context: your goals, strategy, brand, processes, capabilities, and governance. What you and your people know (or should know) about who you are and how you operate.
  3. Systems context: what your stack can actually access, connect, and deliver. The deep customer insight in your data warehouse and the sharp brand strategy in a Google doc only matter if a system can act on them at the right moment.

Where all 3 converge (your goals, your customer’s needs, and your systems’ ability to deliver in real time) is what the report calls Golden Context. Value engineering identifies the value. Context engineering makes it actionable. Together they’re what the 2026 marketing architecture actually centers on.

Venn diagram illustrating the intersection of Company Context, Customer Context, and Systems Context, with 'Golden Context' highlighted in yellow at the center.

Where ID Resolution Complicates Things

Sonal Goyal, founder of Zingg and author of the Learning From Data newsletter, wrote about the piece of the context problem that keeps getting skipped: most organizations still don’t have clean customer identities. In the AI era, that’s the gap between golden context and garbage context.

An infographic titled 'Identity Resolution: The Universal Truth That Does Not Decay', illustrating the concept of identity resolution in marketing, featuring various components like CRM, support, billing, and web activity. It includes a central profile for an entity named Timothy Chen and layers such as semantic, memory, decision, orchestration, and growth.

Most stacks feed AI several different versions of the same customer before a model ever runs.

  • In the CRM, Miriam Dom is miriam.dom@cashmeregoatfarm.com. 
  • In the e-commerce platform, she’s prepotentes.mom@gmail.com with 3 purchases this quarter. 
  • In the support system, she’s miriam_d with 5 open tickets and an urgent issue. 
  • In the MAP, she signed up for the newsletter under miriam.dom+dontemailme@cashmeregoatfarm.com.

Infographic comparing 'Garbage Context' and 'Golden Context' for managing customer data, highlighting fragmented identity versus unified profiles.

Every system has a different stranger in it, and none of them know they’re looking at the same person. 
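
To see how far a naive fix gets you, here is a small sketch that groups the four records above by a normalized email. The `match_key` rule is an assumption made for this example (lowercase, and drop any `+tag` from the local part); it is not how Zingg or any particular identity resolution tool works.

```python
from collections import defaultdict

# The four records from the example above, keyed by system.
records = [
    {"system": "crm", "id": "miriam.dom@cashmeregoatfarm.com"},
    {"system": "ecommerce", "id": "prepotentes.mom@gmail.com"},
    {"system": "support", "id": "miriam_d"},
    {"system": "map", "id": "miriam.dom+dontemailme@cashmeregoatfarm.com"},
]


def match_key(identifier):
    """Normalize an email: lowercase and drop any +tag in the local part.

    Identifiers that are not emails are only trimmed and lowercased.
    """
    value = identifier.strip().lower()
    if "@" not in value:
        return value
    local, domain = value.rsplit("@", 1)
    return f"{local.split('+', 1)[0]}@{domain}"


groups = defaultdict(list)
for record in records:
    groups[match_key(record["id"])].append(record["system"])

for key, systems in groups.items():
    print(key, systems)
```

The rule links the CRM and MAP records, but the e-commerce and support records still look like strangers. Email cleanup alone leaves Miriam as three fragments; closing the rest of the gap would take additional signals beyond the identifier itself.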

“When a customer exists as three separate records across your systems, you don’t have customer context. You have customer fragments.”

SONAL GOYAL, Learning From Data, May 2026

What the Context Bundle Requires

Jason Dobbs describes what that layer has to include before any agent gets real authority over a workflow:

“Prompt engineering can make something smart. Context engineering is what makes it dependable. The warehouse is often where the richest relational signal already lives. The issue isn’t usually that the business has zero signal. The issue is that teams jump from ‘here’s a big pile of our data’ to ‘let’s throw an agent on top that should be able to act on any scenario’ without defining that layer in between.”

JASON DOBBS, Episode 221

That layer in between is the work. Before any agent gets real authority over a workflow, Jason argues it needs a context bundle. The first 3 items are prerequisites that data quality has to answer before context engineering can start:

  1. Shared definitions that sales, marketing, and rev ops all actually agree on
  2. Trusted access to the right records with the right joins and freshness
  3. A named owner who is accountable when something goes wrong

The last 2 are where context engineering actually begins:

  4. Clear authority boundaries on what the system can do without human review
  5. An eval path to validate outputs against historical examples before the system goes live

Think of it the same way you’d onboard a new hire: here’s what we mean, here’s what you have access to, here’s when you escalate.
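
One way to make the bundle concrete is to write it down as a data structure. The sketch below is my own illustration of the five items, not something Jason described: `ContextBundle`, its field names, and the toy agent are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ContextBundle:
    """Hypothetical container for the five items listed above."""
    definitions: dict   # shared definitions, e.g. {"MQL": "..."}
    data_sources: list  # trusted tables or views the agent may read
    owner: str          # who is accountable when something goes wrong
    allowed_actions: set = field(default_factory=set)  # no human review needed
    eval_cases: list = field(default_factory=list)     # (input, expected) pairs


def needs_human_review(bundle, action):
    """Anything outside the agent's authority boundary gets escalated."""
    return action not in bundle.allowed_actions


def eval_pass_rate(bundle, agent_fn):
    """Share of historical examples the agent reproduces."""
    if not bundle.eval_cases:
        return 0.0
    hits = sum(1 for given, expected in bundle.eval_cases if agent_fn(given) == expected)
    return hits / len(bundle.eval_cases)


bundle = ContextBundle(
    definitions={"MQL": "lead that meets the agreed scoring threshold"},
    data_sources=["warehouse.leads"],
    owner="revops-analytics",
    allowed_actions={"draft_email", "update_score"},
    eval_cases=[({"score": 90}, "route_to_sales"), ({"score": 10}, "nurture")],
)


def toy_agent(lead):
    """Stand-in for a real agent: a fixed threshold rule."""
    return "route_to_sales" if lead["score"] >= 50 else "nurture"


print(needs_human_review(bundle, "send_email"))
print(eval_pass_rate(bundle, toy_agent))
```

Sending an email is not in `allowed_actions`, so it escalates, which is the "here’s when you escalate" part of the onboarding analogy. The eval path is just a replay of past cases before anyone grants more authority.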

Lindsay Rothlisberger, Director of GTM Innovation at Zapier, built this layer from scratch for her go-to-market organization. Her team started with shared definitions. The RevOps analytics lead went through every key term and documented it so agents would actually have something to work from:

“We put a lot of time into context around the data definitions. I’d say that is such an important place to start. We had our RevOps person who works specifically on analytics really go through and build a lot of context into lead definitions, opportunity definitions. If someone asks for a lead report, make sure you’re asking them; are they asking for an MQL report? Are they asking for something else?”

LINDSAY ROTHLISBERGER, Episode 223

Her team organized the shared brain into 2 distinct layers. 

  1. The slow-moving layer (company strategy, ICP, data definitions, playbooks) requires deep upfront investment and changes infrequently. 
  2. The fast-moving layer (daily decisions, routing changes, experiments, anything sitting in Slack threads and meeting notes) is the part most teams haven’t solved yet. 

The slow layer can be built in a sprint. The fast layer is where context rot lives.

| | Slow-Moving Layer | Fast-Moving Layer |
|---|---|---|
| Content | Company strategy, ICP, data definitions, playbooks | Daily decisions, routing changes, experiments, Slack threads, meeting notes |
| Investment | Deep upfront work | Ongoing operational effort |
| Change frequency | Infrequent | Continuous |
| Core problem | Documentation problem | Decision provenance problem |
| Status for most teams | Can be built in a sprint | Largely unsolved |

The State of Martech 2026 report describes context as operating in 6 pace layers, each with its own decay rate.

  1. Moment Context changes by the second: your customer’s real-time state, the current query. 
  2. Session Context spans minutes and hours: the actions they’ve taken, the conversation, what they’ve consumed. 
  3. Journey Context shifts over days and weeks: buying stage, engagement pattern, active intent signals. 
  4. Relationship Context spans months and quarters: account history, preferences, LTV. 
  5. Company Context shifts over quarters and years: brand, strategy, capabilities, governance. 
  6. Market Context evolves over years: industry structure, macro trends, regulations.

The more granular and timely the layer, the faster it loses value. An intent signal from a morning browsing session may be worthless by afternoon if the customer has moved on or simply lost interest. The report calls this the context distribution problem: a real-time signal that can’t reach the right system before it decays is worthless regardless of where it lives.
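
A rough way to picture decay is a shelf life per layer. In this sketch the layer names come from the report, but the durations in `MAX_AGE` and the helper functions are assumptions I picked for the example, not numbers from the report.

```python
from datetime import datetime, timedelta

# Hypothetical shelf lives per pace layer. The report names the layers;
# these specific durations are assumptions chosen for the example.
MAX_AGE = {
    "moment": timedelta(seconds=30),
    "session": timedelta(hours=4),
    "journey": timedelta(weeks=2),
    "relationship": timedelta(days=90),
    "company": timedelta(days=365),
    "market": timedelta(days=3 * 365),
}


def is_fresh(layer, captured_at, now):
    """True if a signal is still inside its layer's assumed shelf life."""
    return now - captured_at <= MAX_AGE[layer]


def usable_context(signals, now):
    """Keep only signals that have not decayed past their layer's limit."""
    return [s for s in signals if is_fresh(s["layer"], s["captured_at"], now)]


now = datetime(2026, 3, 2, 15, 0)
signals = [
    {"layer": "session", "name": "viewed pricing page", "captured_at": datetime(2026, 3, 2, 9, 0)},
    {"layer": "relationship", "name": "annual plan customer", "captured_at": datetime(2026, 1, 10, 12, 0)},
]

for signal in usable_context(signals, now):
    print(signal["name"])
```

With these assumed limits, the morning's pricing-page visit has already expired by mid-afternoon, while the relationship-level fact from January is still usable.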

What complicates everything is that AI accelerates the oscillation across all of these layers. It lets systems respond faster in the moment, but only if the slower layers are stable, accessible, and aligned underneath.

  • A real-time agent that knows the current query but nothing about the customer relationship, the company’s commitments, or the governance boundaries is purely reactive. 
  • Fast without grounding is how an agent produces customers who are genuinely angry about an experience that looked personalized on the outside.

Golden Context, according to the report, is when the fast layers and slow layers move together: immediate enough to be relevant, grounded enough to be trusted. That 2-layer structure is the right architecture for any organization trying to manage context at this level.

An inventory screen showcasing various potions, including an epic potion titled 'The Semantic Layer,' alongside a data quality potion and a context engineering potion, each with distinct labels and color coding.

Potion 2 (context engineering) closes the gap between what the warehouse contains and what the agent understands. There is a third potion (the semantic layer), and it is the layer the industry got wrong back in 2012. Take it out.

Why the Industry Built the Wrong Semantic Layer in 2012

A dimly lit library filled with bookshelves, featuring a staircase leading to an upper level. A person sitting at a desk with a computer, wearing headphones, surrounded by vintage technology and illuminated by a hanging lamp.

The Semantic Layer Is an Organizational Problem

Scott Brinker, who co-authored a March 2026 report on AI-era architecture, described the organizational consequence of skipping this work: without a shared semantic layer, every agent becomes its own island of interpretation. 

You can have perfect warehousing, clean data, and well-scoped prompts, and still end up with 3 agents taking 3 different actions based on contradictory assumptions about what “customer” means, or what “pipeline” counts, or what this quarter’s ICP actually is. The shared definition problem is organizational and it’s nothing new for RevOps teams balancing marketing and sales definitions. Each team encodes meaning into the data they own, and agents inherit those local dialects.

David Chan, Managing Director at Deloitte Digital, is one of my favorite voices in martech. I had the pleasure of interviewing him in the earlier days of the podcast and probably should have him back on since I’m always reading his blog. At Deloitte he runs their CDP and marketing transformation practice.

He actually wrote a detailed response to that same report. His piece argues that Scott’s composable canvas vision is directionally right, but the friction points blocking it are organizational, not technical. He names 4:

  1. Decision paralysis: when every component is swappable, teams freeze because nothing forces a choice. 
  2. Semantic misalignment: internally, sales, marketing, and rev ops can’t agree on what “customer” or “pipeline” or “qualified lead” actually means, and externally, 15,000+ SaaS vendors encode their own incompatible versions of those definitions with no industry standard to align them. 
  3. False novelty: the new architecture may be a more flexible expression of what already existed, not the paradigm shift it’s being sold as. 
  4. Governance overload: AI-generated custom software creates accountability and maintenance problems that most teams aren’t set up to handle.

The semantic misalignment point is really important to this upcoming boss battle. David says:

“We’ve spent the last two decades solving system integration. And we’re about to spend the next two decades failing at semantic design. Book it.”

DAVID CHAN, Re: The New Martech Stack for the AI Age, March 2026

His conclusion: the constraint is agreement. The technology is the easier half.

Knowledge Architecture: The Bet the Industry Got Wrong in 2012

The deeper issue is one that took well over a decade to surface: the industry built the wrong kind of meaning infrastructure.

I’m a big fan of the Metadata Weekly newsletter, recently rebranded to Context & Chaos. The author behind it is Jessica Talisman, a semantic engineer and information architect who’s led architecture at big brands like Amazon and Adobe before running her own consulting practice.

In January 2026, in an article titled ‘Ontologies, Context Graphs, and Semantic Layers: What AI Actually Needs in 2026’, she argued that in 2012, 2 bets were placed simultaneously. One gave us consistent dashboards. The other gave us drug discovery AI and reasoning systems that intelligence agencies trust with life-or-death decisions.

| | BI Industry Bet | Healthcare / Life Sciences Bet |
|---|---|---|
| Technology | LookML / metrics layers | Formal ontologies and knowledge graphs |
| Goal | Define metrics once, govern centrally | Represent domain concepts, relationships, and constraints |
| Produced | Consistent dashboards | Drug discovery AI, intelligence agency reasoning systems |
| For agents | Answers what X is | Answers why X was allowed to happen |

The assumption behind the metrics bet, that consistent calculation would produce consistent meaning, turned out to be wrong. Knowing that revenue is calculated as SUM(order_total) WHERE order_status = ‘completed’ doesn’t tell you why revenue dropped in Q3, which customers are at risk, or what to do about it. Semantic layers modeled metrics. They failed to represent an organization’s reality.
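
Here is that metric running against a tiny in-memory table, as a minimal sketch. The `order_total` and `order_status` columns come from the definition above; the `orders` table name, the `quarter` column, and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_total REAL, order_status TEXT, quarter TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (100.0, "completed", "Q2"),
        (200.0, "completed", "Q2"),
        (150.0, "completed", "Q3"),
        (120.0, "cancelled", "Q3"),
    ],
)

# The governed metric: one definition, the same number for everyone.
rows = conn.execute(
    """
    SELECT quarter, SUM(order_total)
    FROM orders
    WHERE order_status = 'completed'
    GROUP BY quarter
    ORDER BY quarter
    """
).fetchall()

for quarter, revenue in rows:
    print(quarter, revenue)
```

The query tells everyone the same thing: revenue halved between Q2 and Q3. Nothing in the metric definition explains the cancelled order, who approved it, or whether that customer is at risk, which is exactly the gap an agent falls into.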

The distinction matters for agents in a way it never did for dashboards.

| | Semantic Layer | Context Graph |
|---|---|---|
| Answers | What X is | Why X was allowed to happen |
| Models | Metrics and calculations | Concepts, relationships, constraints, and inference rules |
| Built for | Dashboards | Agents |
| Limitation | Consistent calculation does not create consistent meaning | The layer the industry skipped building |

Without that second kind of infrastructure, decision traces stay trapped in Slack threads and email chains, institutional memory no system can index and no agent can read from.

“The era of metrics-based thinking is coming to a close. AI systems need to understand what your domain is: the concepts, the relationships, the constraints, the inference rules. It’s a knowledge architecture problem.”

JESSICA TALISMAN, Context & Chaos, January 2026

The teams who do this well in the next decade will spend more time on meaning design than on pipeline engineering. The pipeline is necessary. Meaning is what makes it useful.

The practical starting point is less intimidating than the terminology implies. Pick one entity your agents act on (usually “customer” or “opportunity”) and define 5 things about it: 

  1. what it is, 
  2. what states it can be in, 
  3. what relationships it has to other entities, 
  4. what decisions get made about it, 
  5. and who has authority to change those decisions. 

Write it down. That document is the seed of a context graph. Most teams have never done this manual exercise for even one entity. Doing it for one forces every assumption about that entity into the open, and those assumptions are usually where the contradictions between teams live.
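As a rough illustration, the 5-part definition could be captured in a small structure like the one below. The class name, field names, and the example "customer" values are all hypothetical. The only point is that each of the 5 questions gets an explicit slot, so missing answers become visible.

```python
from dataclasses import dataclass


@dataclass
class EntityDefinition:
    """One entity written down along the 5 dimensions listed above."""
    name: str
    what_it_is: str
    states: list[str]
    relationships: dict[str, str]       # other entity -> how it relates
    decisions: list[str]                # decisions made about the entity
    decision_authority: dict[str, str]  # decision -> role allowed to change it

    def gaps(self) -> list[str]:
        """Return decisions that nobody has been given authority over."""
        return [d for d in self.decisions if d not in self.decision_authority]


# Hypothetical example values, for illustration only.
customer = EntityDefinition(
    name="customer",
    what_it_is="An account with at least one completed order.",
    states=["prospect", "active", "at_risk", "churned"],
    relationships={"opportunity": "a customer can have many opportunities"},
    decisions=["mark as at_risk", "exclude from campaigns"],
    decision_authority={"mark as at_risk": "lifecycle marketing lead"},
)

print(customer.gaps())  # ['exclude from campaigns']
```

A plain document works just as well. The structured version simply makes it easy to check that no decision is left without an owner.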

How Context Rot and Fragmentation Break AI Agent Performance

An abandoned retro computer setup in a surreal landscape, surrounded by colorful mushrooms and vibrant purple clouds, featuring old monitors, a keyboard, and disorganized wires.

“Just build a context bundle” sounds way easier than it is in practice, though. Phase 2 of The Hallucination Oracle, the boss on this floor, is less visible and more structurally damaging. Context rot produces answers that were right when the context was current and that degrade silently as the organization changes around them. The slow layer stays frozen. The fast layer keeps moving. The gap accumulates invisibly until an agent acts on assumptions the team stopped believing months ago.

Fragmentation is the spatial version: the right context exists across the stack in pieces, with no single system holding all of it at once. Each piece is accurate. The synthesis never happens. A governance decision in a Slack thread, a campaign exception in a Google doc, an ICP update in a deck no one can find: the agent acts on whatever it reaches first. Build the shared context layer before the boss scales. The 4 questions above are the diagnostic. If you cannot answer all of them, the agent cannot either.

The 2 Failure Modes: What Context Rot and Fragmentation Look Like in Practice

In the same article I referenced earlier, Chris Lema catalogued a bunch of failure modes when it comes to context. The two that stood out to me as the most expensive are:

  1. Context rot: Early in a long interaction or multi-session workflow, the agent receives critical context. As more input accumulates, that early signal gets buried under volume. The agent still technically has the information. It just can’t surface it when it matters.
  2. Context fragmentation: The agent has all the relevant pieces distributed across its input but fails to synthesize them into a coherent picture. Each fact is accurate on its own. The agent can retrieve any individual piece. It never connects them into the insight that would actually be useful. Chris specifically calls this the most costly of the 9 failure modes he has catalogued.

Both failures have the same structural cause: context treated as an infinite resource when it isn’t. The models will keep improving. Context windows will keep expanding. That does not fix the structural problem. The more impactful questions about the LLMs are:

  • Are they getting the right context?
  • Is it surfaced when it matters most?
  • Can they synthesize it into a coherent picture?
  • Is it fresh?

BOSS BATTLE: Rotten Context Mage

A digital illustration of a zombie-like mage with a tattered dress and a large pointed hat, standing amidst piles of old computers and mushrooms. The background features a glowing, ominous circle, and the text 'BOSS BATTLE: ROTTEN CONTEXT MAGE' is prominently displayed in bold lettering.

The second version of the boss on this floor is context rot and fragmentation. 

We actually don’t need anything new for this battle. As ops people, most of us have an innate knack for organization.

*Inventory check.* 

Screenshot of a skills menu in a game, featuring the 'Organised Context Brain' and 'Decision Provenance' skills, with descriptions and status indicators.

[COMMON SKILL UPGRADE] Organised Context Brain.
[UNDERRATED SKILL] Decision Provenance

How to Build a Shared Context Layer for AI Agents 

A futuristic scene depicting a large, brain-like entity with multiple tendrils, suspended in a dimly lit industrial setting, surrounded by smaller robotic figures.

Context rot and fragmentation come from context that was never organized. Lindsay’s team built a shared brain in Google Drive as the first step: structured folders at the company, department, team, and working group level, with all existing documentation converted to markdown files before the folders were populated with anything new.

“We built a shared brain in Google Drive to start. Our initial version was a Google Drive, and it’s very structured folders. Company level, department level, team level, and working group level. We started by populating context from across the org into those Google Drive folders. And first, I will say converting the materials in those folders into markdown files. That was our first iteration of building the shared brain.”

LINDSAY ROTHLISBERGER, Episode 223

The sprint took 4 weeks, and most of the work was converting documentation the organization already had into a format agents could actually use, not writing new content from scratch.
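Here is a minimal sketch of how an agent could read from a folder structure like that. It assumes the folders are nested (company, then department, then team, then working group) and synced to local disk as markdown files. The function name, folder names, and file contents are hypothetical and not taken from Lindsay's setup.

```python
import tempfile
from pathlib import Path


def collect_context(root: Path, scope: list[str]) -> str:
    """Gather markdown from the company folder down to the deepest folder in
    `scope`, broadest level first, so narrower context appears later."""
    sections = []
    folder = root
    for level in [None, *scope]:
        if level is not None:
            folder = folder / level
        if not folder.is_dir():
            break  # stop at the first level that has not been created yet
        for md_file in sorted(folder.glob("*.md")):
            heading = md_file.relative_to(root).as_posix()
            body = md_file.read_text(encoding="utf-8").strip()
            sections.append(f"## {heading}\n{body}")
    return "\n\n".join(sections)


# Demo on a throwaway folder tree with hypothetical contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "marketing" / "lifecycle").mkdir(parents=True)
    (root / "icp.md").write_text("ICP: mid-market SaaS teams", encoding="utf-8")
    (root / "marketing" / "lifecycle" / "routing.md").write_text(
        "Trial signups route to the lifecycle team", encoding="utf-8"
    )
    bundle = collect_context(root, ["marketing", "lifecycle"])

print(bundle)
```

Ordering from broad to narrow is a design choice in this sketch: company-level definitions come first, and the working group's specifics land closest to the end of the bundle.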

How to Keep Your AI Context Layer From Going Stale

Lindsay Rothlisberger is working through these problems at Zapier from the operational side. Her team started being deliberate about closing out Slack threads with explicit decisions and documenting what was agreed in meetings, not just action items but the actual reasoning behind them. The joke about it captures what it actually requires:

“My partner is sort of in a similar field, and we joke: ‘Are you gonna have to start saying at the end of meetings, like, let the record show that…’ In a way that an agent can understand. What I’m thinking about is having an agent that collects all that context from Slack, from meeting notes, and puts it into a decision log at the end of the week. So we can see: here are all the decisions that were made or things that were discussed. Then having a human be able to review that and say — okay, this is an update we wanna make, this is not — and having agents then update the context layers.”

LINDSAY ROTHLISBERGER, Episode 223

The slow context layer (strategy, ICP, definitions) is a documentation problem. The fast layer (routing changes, pricing decisions, experiment outcomes) is a decision provenance problem. Both have to be solved for an agent to work from anything more than a frozen snapshot of what the organization used to believe.
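One way to make the 2 speeds visible is a simple staleness check, sketched below. The review windows (90 days for the slow layer, 7 for the fast layer), the file names, and the dates are assumptions for illustration, not recommendations from the episode.

```python
from datetime import date

# Hypothetical review windows: the slow layer tolerates a quarter,
# the fast layer goes stale within a week.
MAX_AGE_DAYS = {"slow": 90, "fast": 7}


def stale_entries(entries: list[dict], today: date) -> list[str]:
    """Return names of context entries whose last review is older than
    the window allowed for their layer."""
    stale = []
    for entry in entries:
        age = (today - entry["last_reviewed"]).days
        if age > MAX_AGE_DAYS[entry["layer"]]:
            stale.append(entry["name"])
    return stale


entries = [
    {"name": "icp.md", "layer": "slow", "last_reviewed": date(2026, 1, 10)},
    {"name": "routing-rules.md", "layer": "fast", "last_reviewed": date(2026, 2, 1)},
    {"name": "pricing-decisions.md", "layer": "fast", "last_reviewed": date(2026, 2, 26)},
]

print(stale_entries(entries, today=date(2026, 3, 1)))  # ['routing-rules.md']
```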

Decision Provenance: The Memory Your Context Layer Is Missing

There is one more component the context layer needs, and it’s a new class of company data: decision provenance (first introduced in 2018). These are the reasoning chains and approval histories that explain why consequential choices were made.

  • Why did the team hold the campaign? 
  • What service incident justified the exception? 
  • What precedent did leadership invoke when the policy changed?

In a world where agents take actions faster than humans can document them, that institutional memory has to live somewhere the next agent, or the next human, can read from it. Without it, the context layer has no memory of what it already decided, and no way to audit whether an agent’s action was consistent with intent.

Dael Williamson, EMEA Field CTO at Databricks, described the structural risk underneath all of this: without standardized semantic layers and programming interfaces, organizations risk creating isolated pools of intelligence, replicating the data silos of the past in a new, harder-to-see form. The data silos took 15 years to dismantle. The intelligence silos are forming now, one proprietary model at a time.

Decision provenance sounds way fancier than it is. The minimum viable version is a shared document, updated weekly, with one entry per consequential decision: what was decided, who made the call, what alternatives were considered, and what evidence was used. Almost like an “I told you so” ledger, haha. That document is what an agent reads instead of reconstructing logic from Slack threads. Once the habit exists, the tooling to automate collection becomes obvious.

Lindsay, who we heard from earlier, described the automated version of this: an agent-driven decision log at Zapier. Start with the manual version first.
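If it helps to picture the manual version, a single entry could look something like this sketch. The 4 content fields mirror the list above. The class name, the date field, and the example values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Decision:
    decided_on: date
    what: str                 # what was decided
    decided_by: str           # who made the call
    alternatives: list[str]   # what alternatives were considered
    evidence: list[str]       # what evidence was used

    def to_markdown(self) -> str:
        """Render one log entry as markdown an agent can read later."""
        return "\n".join([
            f"### {self.decided_on.isoformat()}: {self.what}",
            f"- Decided by: {self.decided_by}",
            f"- Alternatives considered: {'; '.join(self.alternatives)}",
            f"- Evidence: {'; '.join(self.evidence)}",
        ])


# Hypothetical entry, for illustration only.
entry = Decision(
    decided_on=date(2026, 3, 2),
    what="Hold the spring upsell campaign",
    decided_by="Head of Lifecycle Marketing",
    alternatives=["Send to unaffected segments only", "Delay one week"],
    evidence=["Open service incident affecting enterprise accounts"],
)

print(entry.to_markdown())
```

Appending each rendered entry to one shared markdown file is enough to start. The structure matters more than the tooling.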

For anyone who wants to go deeper on solving context rot and fragmentation beyond the practitioner approaches covered here, the Prompting Guide’s context engineering section covers the architectural patterns in more detail: layering context across system, task, tool, and memory levels; adjusting it dynamically as task complexity and execution history change; and embedding constraints at the exact decision point rather than loading everything into a static system prompt. The core principle is that context works best when delivered just before the agent needs it to decide. 
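To make the "just before the agent needs it" idea concrete, here is a small sketch of per-step context assembly. It is not the Prompting Guide's code or API. The function name, the bracketed section labels, and the example strings are assumptions made up for this illustration.

```python
def build_context(
    system: str,
    task: str,
    tool_notes: dict[str, str],
    memory: list[str],
    constraints: dict[str, list[str]],
    step: str,
    memory_limit: int = 3,
) -> str:
    """Assemble context for a single step: stable layers first, then only
    the tool note and constraints that apply to this decision point."""
    parts = [f"[system]\n{system}", f"[task]\n{task}"]
    if step in tool_notes:
        parts.append(f"[tool: {step}]\n{tool_notes[step]}")
    recent = memory[-memory_limit:]  # keep only the latest history items
    if recent:
        parts.append("[memory]\n" + "\n".join(recent))
    rules = constraints.get(step, [])
    if rules:
        rule_lines = "\n".join(f"- {rule}" for rule in rules)
        parts.append("[constraints for this step]\n" + rule_lines)
    return "\n\n".join(parts)


# Hypothetical inputs, for illustration only.
context = build_context(
    system="You are a lifecycle marketing assistant.",
    task="Send the renewal reminder to the drafted audience.",
    tool_notes={"send_email": "Sends through the ESP; cannot be undone."},
    memory=["Audience drafted: 1,200 accounts"],
    constraints={"send_email": ["Exclude accounts with an open service incident."]},
    step="send_email",
)

print(context)
```

The send-time constraint only appears when the step is `send_email`. A drafting step built with the same inputs would not carry it, which keeps the static layers short.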

For a more research-oriented angle on coherent synthesis across multiple sources, the NJUNLP context-synthesis project on GitHub explores generating synthetic background context from short instruction-answer pairs to train models to handle longer, multi-source inputs without losing coherence between them.

Testing Whether Your Context Layer Works

A woman sitting in a chair, engaging in conversation with a humanoid robot, both depicted in a colorful, illustrated style against a bright yellow background.

There’s a quick fun diagnostic for whether an AI system has been built on a stable context foundation or a shaky one. 

It comes from Istvan Meszaros, Founder and CEO of Mitzu.io, who built warehouse-native analytics at scale and was featured in a few parts of episode 1. He describes the test he runs before trusting AI outputs in production. Imagine a board meeting where someone asks a data question, a number appears in the chat box, and the room asks: do we trust this?

“I have a good test — the Turing test. Ask the AI the same question twice. If it gives you the same answer, that’s most likely okay. Even with the newest models, I very often still get two different answers for the same question.”

ISTVAN MESZAROS, Episode 180

If the answer changes depending on when you ask, the system is generating a plausible response, not reasoning from evidence. In a board meeting that’s embarrassing. In a production agent stack it’s a different category of problem.
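The ask-twice test is easy to automate. In the sketch below, `ask` is a placeholder for whatever function calls your model, and the 2 stub models exist only to exercise the check. Exact string matching is a deliberately strict assumption. A real check would probably compare the extracted number rather than the full wording.

```python
from typing import Callable


def is_repeatable(ask: Callable[[str], str], question: str, runs: int = 2) -> bool:
    """Ask the same question `runs` times and report whether every
    answer matched after trimming whitespace."""
    answers = {ask(question).strip() for _ in range(runs)}
    return len(answers) == 1


# Stand-ins for a real model call, used only to exercise the check.
def steady_model(question: str) -> str:
    return "Q3 revenue: 1.2M"


_drifting_answers = iter(["Q3 revenue: 1.2M", "Q3 revenue: 1.4M"])


def drifting_model(question: str) -> str:
    return next(_drifting_answers)


print(is_repeatable(steady_model, "What was Q3 revenue?"))    # True
print(is_repeatable(drifting_model, "What was Q3 revenue?"))  # False
```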

The fix is building the context layer that makes outputs traceable in the first place. Istvan describes what that looks like when it works:

“The end-to-end traceability of why we see a number on the chart is possible with this type of technology. We just present the SQL query that we generate. This end-to-end traceability is the one that is making it not a black box anymore — it’s an open horizon. You can look into how things work. You can see the joins. Data analysts, marketing ops people can review it line by line.”

ISTVAN MESZAROS, Episode 180

That’s the difference between a black box and a context layer. One produces an answer. The other produces an answer you can audit.
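A toy version of an auditable answer is sketched below using Python's built-in sqlite3 module. This is not how Mitzu works internally. The `orders` table, the `TracedAnswer` class, and the sample rows are assumptions, and the query reuses the revenue definition quoted earlier in this post.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class TracedAnswer:
    value: float
    sql: str  # the exact query that produced the value


def completed_revenue(conn: sqlite3.Connection) -> TracedAnswer:
    """Return the revenue number together with the SQL behind it."""
    sql = (
        "SELECT COALESCE(SUM(order_total), 0) "
        "FROM orders WHERE order_status = 'completed'"
    )
    (value,) = conn.execute(sql).fetchone()
    return TracedAnswer(value=value, sql=sql)


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_total REAL, order_status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(100.0, "completed"), (40.0, "refunded"), (60.0, "completed")],
)

answer = completed_revenue(conn)
print(answer.value)  # 160.0
print(answer.sql)
```

Returning the query next to the number is what lets an analyst review the filter and joins line by line instead of taking the figure on faith.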

Boom, you made it through. 

NEW ACHIEVEMENT: The Meaning Layer Is Live

An open book surrounded by vibrant cosmic elements, with a headline stating 'New Achievement: The Meaning Layer Is Live' in bold text.
  • Data quality stops agents from lying about what the warehouse contains. 
  • Context engineering closes the gap between governed data and what the agent understands. 
  • The semantic layer represents what your domain actually means: the concepts, relationships, constraints, and inference rules that dashboards never needed and agents cannot work without. 
  • Context rot and fragmentation are defeated by building and maintaining that shared layer before agents run on top of it.

What you have on the other side of this floor: agents that interpret data with the right context, shared definitions that hold across every system, and a meaning layer that carries into the next floor.

Two floors down. Two bosses with 4 forms, defeated.

See you on the next floor, martech crawler. 

Episode Recap

Digital illustration depicting a mysterious figure seated among an array of vintage computer monitors, titled 'Humans of Martech: Part 2 - The Eye of Context'. The background features a dark, intricate design representing a tech-themed environment.

A clean data warehouse is great and all, but the next problem is agents operating on that data without shared definitions, without context, and without a semantic layer that represents what the data actually means. That causes “believable nonsense.” The dangerous kind is polished and precise-looking, the kind that passes the first glance, gets forwarded to stakeholders, and becomes the basis for a campaign before anyone asks what assumptions the agent was working from.

Data quality, context engineering, and the semantic layer are 3 separate problems, and most teams collapse them into 1. Data quality stops agents from misrepresenting what the warehouse contains. Context engineering closes the gap between what the warehouse contains and what the agent understands, putting the right meaning, rules, and situational information in front of the model at the right moment. The semantic layer is the one the industry skipped in 2012, when the BI world bet on metrics layers and the healthcare industry bet on ontologies. Metrics layers produce consistent dashboards. Ontologies produce systems that can reason about why something happened. Agents need the second kind.

The context layer runs at 2 distinct speeds. The slow layer, covering strategy, definitions, and playbooks, is a documentation problem and can be built in a sprint. The fast layer, covering daily decisions, routing changes, and anything currently living in Slack threads and meeting notes, is a decision provenance problem that most teams haven’t solved. Context rot accumulates in the gap between them: the slow layer stays frozen while the fast layer keeps moving, and agents act on assumptions the team stopped believing months ago.

Lindsay Rothlisberger’s approach at Zapier offers the most concrete starting point: build the shared brain in Google Drive, organized at the company, department, team, and working group levels, with all existing documentation converted to markdown before anything new gets added. That sprint took 4 weeks. The harder work was the fast layer, where the practice shift was closing Slack threads with explicit decisions and documenting the reasoning behind them, not just the action items.

The honest admission running through this episode is that context is a maintenance problem, not a build problem. The slow layer buys you a foundation. The fast layer demands ongoing operational effort. The test for whether any of it is working is the one Istvan Meszaros offers: ask the AI the same question twice. If the answer changes, the system is generating a plausible response rather than reasoning from evidence. Listen to the full episode for the deeper breakdown on context rot, fragmentation, and how to start building a context graph from scratch.


✌️


Cover art created with Midjourney (check out how)
