The System Around the Model

Building an AI Behavior & Context Architecture Framework

Prompting is the visible surface, but reliable behavior inside workflows is shaped by the system around the model: the context, state, roles, boundaries, and runtime conditions that have to be designed, tested, and reviewed.

Author Chrys Li
Published May 17, 2026
Series Context Architecture
Sections 10

I have been asked many versions of the same question: how do we prompt the AI so it behaves the way we want?

It is a reasonable question. The prompt is the visible surface. It is the part people can read, edit, argue with, and quickly improve. But prompting alone is too narrow for systems that need reliable behavior inside workflows.

01 — The Model Is Not the System

A model's behavior is shaped by the system around it.

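In a workflow, that system includes workflow state, role and identity, permissions and data boundaries, retrieved knowledge, language rules, output format, uncertainty rules, escalation, and session state.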

Once those conditions matter, the design problem changes. The work becomes behavioral architecture.

A central Model surrounded by orbiting context elements — workflow state, role and identity, permissions and data boundaries, retrieved knowledge, language rules, output format, uncertainty rules, escalation and session state — assembled into a runtime context packet that produces the response.
The model is not the system — its behavior is shaped by the context architecture around it.

The system needs to define the operating conditions that shape the model's response before the model is asked to produce one. That is the purpose of an AI Behavior & Context Architecture Framework: to make the system around the model explicit enough to design, test, review, and improve.

02 — The Missing Layer

There is a layer almost nobody designs on purpose.

Visible interface decisions are useful as starting points for AI initiatives, but they rarely define enough of the operating environment.

Illustration of the missing layer between the interface and the model.
There is a layer almost nobody designs on purpose.

A workflow-aware AI system may need to understand:

Those decisions shape behavior more than wording alone. A prompt can describe desired behavior. A context architecture gives the model the conditions required to perform that behavior inside the workflow.

The practical question then becomes:

What does the model need to know, at this moment, to behave correctly?

That question should be answered by the system, not improvised inside the model response.

03 — Prompting Sits Inside Runtime Architecture

Prompts are instructions. Workflows are stateful operating environments.

A prompt can define tone, role, constraints, and output expectations. It can also encode behavioral guidance. That work still matters.

Runtime architecture defines what the model receives when it is asked to act.

In a workflow, the model may need a structured packet containing:

This packet becomes the model's operating frame for that moment.

Without this frame, the model is left to infer too much from the user message and whatever static instruction was provided. That can produce acceptable answers in simple interactions and unstable behavior in complex workflows.

With a runtime context packet, teams can decide what gets included, what gets excluded, what gets prioritized, and what the model should do with each type of context. The prompt still exists. It just carries less unsupported weight.
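One way to picture the packet is as a typed structure assembled per request. The sketch below is a minimal illustration, not a prescribed schema: the class name `RuntimeContextPacket`, its field names, the `render` method, and the sample values are all hypothetical, chosen only to mirror the kinds of context discussed in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeContextPacket:
    """Hypothetical operating frame handed to the model for one moment of action."""

    workflow_state: str   # e.g. "drafting" or "reviewing"
    user_role: str        # who the user is in the system
    tenant_id: str        # organizational boundary that retrieval must respect
    output_language: str  # language the response should be written in
    output_format: str    # e.g. "plain_text" or "json"
    retrieved_knowledge: tuple[str, ...] = ()  # passages already cleared for this user
    uncertainty_rule: str = "Say so when the context does not support an answer."

    def render(self) -> str:
        """Serialize the packet into labeled lines the model can be given."""
        lines = [
            f"[workflow_state] {self.workflow_state}",
            f"[user_role] {self.user_role}",
            f"[tenant_id] {self.tenant_id}",
            f"[output_language] {self.output_language}",
            f"[output_format] {self.output_format}",
            f"[uncertainty_rule] {self.uncertainty_rule}",
        ]
        lines += [f"[retrieved] {passage}" for passage in self.retrieved_knowledge]
        return "\n".join(lines)


# Made-up example values for a service request workflow.
packet = RuntimeContextPacket(
    workflow_state="drafting",
    user_role="resident",
    tenant_id="city-a",
    output_language="en",
    output_format="plain_text",
    retrieved_knowledge=("Service requests need a street address.",),
)
print(packet.render())
```

Because the packet is an explicit object, a team can review what went into it, log it alongside the response, and change one field without rewriting the prompt.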

04 — From Chat Interface to Workflow Intelligence

A chat interface responds to messages. A workflow intelligence layer participates in a process.

That participation requires awareness of task state. The system has to know whether the user is exploring, drafting, reviewing, editing, comparing, approving, submitting, or recovering from an interruption. Each state changes what context matters.

For example, in a public service request and feedback workflow, the assistant may help a resident, business owner, or internal service team describe a question, issue, complaint, feedback item, or service request. The same workflow may also require the system to:

The same need applies to summarization, classification, translation, drafting, recommendations, and decision support. AI capabilities become more useful when the workflow defines the conditions for their use.
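The idea that each task state changes what context matters can be sketched as a lookup from state to context sources. The state names come from the list above; the source names (`current_draft`, `workflow_rules`, and so on) and the specific mapping are assumptions made for illustration, not a recommended configuration.

```python
from enum import Enum


class TaskState(Enum):
    EXPLORING = "exploring"
    DRAFTING = "drafting"
    REVIEWING = "reviewing"
    EDITING = "editing"
    COMPARING = "comparing"
    APPROVING = "approving"
    SUBMITTING = "submitting"
    RECOVERING = "recovering"


# Hypothetical: sources every state receives.
BASE_SOURCES = frozenset({"role", "tenant", "language"})

# Hypothetical: extra sources that only some states need.
STATE_SOURCES = {
    TaskState.EXPLORING: {"retrieved_knowledge"},
    TaskState.DRAFTING: {"current_draft", "retrieved_knowledge"},
    TaskState.REVIEWING: {"current_draft", "workflow_rules"},
    TaskState.SUBMITTING: {"current_draft", "workflow_rules", "output_schema"},
    TaskState.RECOVERING: {"session_state"},
}


def context_sources_for(state: TaskState) -> set[str]:
    """Return the context sources to assemble for the given task state."""
    return set(BASE_SOURCES) | STATE_SOURCES.get(state, set())


print(sorted(context_sources_for(TaskState.RECOVERING)))
```

States without an entry fall back to the base sources, which makes gaps in the mapping visible during review instead of failing silently at runtime.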

05 — A Practical Framework Structure

Make the system reviewable in layers.

A behavior and context architecture framework should make the system reviewable in layers.

The framework, structured as numbered layers from system overview through behavioral specifications, context architecture, testing and diagnostics, and SD/UX workflow.
The framework — a set of reviewable layers.

Each folder contains documentation that answers a different kind of system question:

00-system-overview

This layer defines the purpose of the AI system and the principles that guide its behavior.

Example document

This document should define operational principles that can guide design, implementation, and testing. Useful principles might include:

A principle becomes useful when it can be tested against behavior.
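As a rough illustration of a testable principle, suppose one principle were "never reference a case that was not provided in context." That principle, the `CASE-<number>` identifier format, and the function name are all assumptions invented for this sketch. The check itself is then a few lines:

```python
import re

# Assumed identifier format for illustration only.
CASE_ID = re.compile(r"\bCASE-\d+\b")


def invented_case_ids(response: str, known_case_ids: set[str]) -> set[str]:
    """Return case IDs mentioned in the response that were not in the context."""
    return set(CASE_ID.findall(response)) - known_case_ids


violations = invented_case_ids(
    "Your request relates to CASE-12 and CASE-99.",
    known_case_ids={"CASE-12"},
)
print(violations)
```

A principle that cannot be reduced to some observable check like this one is still worth writing down, but it will be harder to hold the system to it.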

01-behavioral-specifications

This layer defines how the AI should behave in specific roles or workflows.

Example documents

A behavior spec should define:

For a public service request and feedback triage assistant, the behavior spec might include:

This is behavior architecture. The system is defining how the assistant should act before those expectations are compressed into prompts, flows, or code.

02-context-architecture

This is the core operational layer.

Example documents

Context architecture shown as a designed artifact rather than an accidental side effect of implementation.
Context architecture is a designed artifact.

This layer defines what context exists, where it comes from, when it is included, how it is prioritized, and how it should be interpreted. It should answer questions such as:

This layer turns context assembly into a design decision instead of an accidental side effect of implementation.

Context assembly depicted as a set of gates that decide what is included, excluded, prioritized, and labeled.
Assembly is a set of gates.

03-testing-diagnostics

This layer defines how the AI system will be tested and diagnosed.

Example document

AI systems need tests for behavior, reasoning, retrieval use, language handling, permissions, and workflow fit.

A weak test asks:

Was this answer good?

A stronger test asks:

Did the system behave correctly given the context it received?

The second question makes failures easier to diagnose because it connects output quality to system conditions.
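The stronger question can be made concrete by storing the context next to the expectation, so a result always records what the system was given. This is a minimal sketch under stated assumptions: `BehaviorCase`, `run_case`, the check function, and the stub responder standing in for a real model call are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

Context = dict[str, Any]


@dataclass(frozen=True)
class BehaviorCase:
    """A test case that pairs the context given with the behavior expected."""

    name: str
    context: Context
    check: Callable[[Context, str], bool]


def admits_gap_when_nothing_retrieved(context: Context, output: str) -> bool:
    """Expected behavior: with no retrieved material, the answer admits the gap."""
    if context.get("retrieved"):
        return True  # this check only constrains the empty-retrieval case
    return "not enough information" in output.lower()


def stub_respond(context: Context) -> str:
    """Stand-in for a real model call, used so the sketch runs on its own."""
    retrieved = context.get("retrieved") or []
    if not retrieved:
        return "There is not enough information in the record to answer that."
    return f"According to the record: {retrieved[0]}"


def run_case(case: BehaviorCase, respond: Callable[[Context], str]) -> dict[str, Any]:
    """Run one case and keep the context keys alongside the verdict."""
    output = respond(case.context)
    return {
        "name": case.name,
        "passed": case.check(case.context, output),
        "context_keys": sorted(case.context),
        "output": output,
    }


case = BehaviorCase(
    name="empty retrieval",
    context={"workflow_state": "exploring", "retrieved": []},
    check=admits_gap_when_nothing_retrieved,
)
result = run_case(case, stub_respond)
print(result["passed"], result["context_keys"])
```

When a case fails, the recorded context keys show whether the model was missing something it needed or misused something it had.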

04-sd-ux-workflow

This layer connects AI behavior to the user experience.

Example document

It defines how the AI interaction appears to the user across the workflow. It should include:

This layer should always be aligned with the behavior and context layers. If the interface asks the assistant to do something the context architecture does not support, the system will drift. If the context architecture supports a capability the UX never exposes, the value stays hidden.
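Both kinds of misalignment described here reduce to a set comparison once each layer lists its capabilities. The sketch below assumes capabilities can be named as plain strings; the function name, the result keys, and the sample capability names are hypothetical.

```python
def alignment_gaps(
    ux_capabilities: set[str], context_capabilities: set[str]
) -> dict[str, set[str]]:
    """Compare what the UX asks for against what the context layer supports."""
    return {
        # The interface promises these, but nothing supplies the context for them.
        "unsupported_by_context": ux_capabilities - context_capabilities,
        # The context layer supports these, but no flow exposes them to users.
        "hidden_from_users": context_capabilities - ux_capabilities,
    }


gaps = alignment_gaps(
    ux_capabilities={"summarize_case", "translate_reply"},
    context_capabilities={"summarize_case", "find_related_cases"},
)
print(gaps)
```

Running a check like this whenever either layer changes is one inexpensive way to notice drift before users do.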

Alignment between behavior specs, context architecture, and UX flows shown as the central point of the framework.
Alignment is the point.

06 — Runtime Context Architecture

What the model knows at the moment it is asked to act.

Runtime context architecture defines what the model knows at the moment it is asked to act. That context can come from several sources.

Context assembled from injected, inferred, retrieved, workflow-state, role, tenant, and language sources.
Context is assembled.

Injected context

Context deliberately passed into the model by the system. This may include system instructions, workflow rules, output schemas, tenant information, user role, and current task state.

Inferred context

Context derived from the current interaction. This may include the user's likely goal, missing information, language preference, draft maturity, or whether the user is exploring versus finalizing.

Inferred context should be labeled carefully. It can improve usefulness, but it creates false confidence when the system treats an inference as fact.

Retrieved context

Context pulled from a knowledge base, database, document store, vector index, or other source. Retrieved context needs governance. The system should define what can be retrieved, how relevance is determined, what metadata matters, and how the model should use retrieved material.

Workflow-state context

Context about where the user is in the process. A user asking a service question needs different support than someone filing a complaint. A resident following up on an existing case needs different context than a service team reviewing a routed request.

Role context

Context about who the user is in the system. Different roles may require different levels of explanation, different allowed actions, and different visibility into records.

Tenant context

Context about organizational or account boundaries. This is especially important in multi-tenant systems where retrieval must respect data separation.
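A minimal sketch of tenant-aware retrieval, assuming each retrieved record carries a tenant identifier: candidates from other tenants are dropped before anything reaches the model. The `Document` type and `filter_by_tenant` function are hypothetical names, and the records are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    tenant_id: str
    text: str


def filter_by_tenant(candidates: list[Document], tenant_id: str) -> list[Document]:
    """Keep only documents that belong to the requesting tenant."""
    return [doc for doc in candidates if doc.tenant_id == tenant_id]


candidates = [
    Document("d1", "city-a", "Street light outage procedure."),
    Document("d2", "city-b", "Internal routing notes."),
    Document("d3", "city-a", "Noise complaint intake form."),
]
allowed = filter_by_tenant(candidates, tenant_id="city-a")
print([doc.doc_id for doc in allowed])
```

A filter at this point is best treated as a final guard. In most designs the same boundary would also be applied where the retrieval query is issued, so that out-of-tenant records are never fetched at all.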

Language context

Context about input language, output language, translation rules, terminology preservation, and multilingual retrieval behavior. Language handling should be specified when the system operates across languages. The system needs rules for preserving original phrasing, translating content, summarizing across languages, and asking for confirmation when meaning may change.

All of these context types need assembly rules. Context assembly rules define how the system decides what to include, exclude, prioritize, compress, label, and pass into the model.
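The assembly steps named above can be sketched as a small pipeline. This is an illustration under stated assumptions, not a reference design: `ContextItem`, the numeric `priority` field, the `allowed` flag, and truncation as the form of compression are all choices made for the example, and the sample items are made up. A real system might summarize rather than truncate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    source: str           # "injected", "inferred", or "retrieved"
    text: str
    priority: int         # lower number means more important (assumed convention)
    allowed: bool = True  # outcome of permission and tenant checks done upstream


def assemble(items: list[ContextItem], max_items: int) -> list[str]:
    """Apply include/exclude, prioritize, compress, and label, in that order."""
    kept = [item for item in items if item.allowed]        # include / exclude
    kept.sort(key=lambda item: item.priority)              # prioritize
    kept = kept[:max_items]                                # compress by truncation
    return [f"[{item.source}] {item.text}" for item in kept]  # label by source


items = [
    ContextItem("retrieved", "Pothole reports are routed to Public Works.", 2),
    ContextItem("inferred", "User likely wants to file a new request.", 3),
    ContextItem("injected", "Respond in English.", 1),
    ContextItem("retrieved", "Record from another tenant.", 1, allowed=False),
]
for line in assemble(items, max_items=2):
    print(line)
```

Labeling each line with its source is what lets the model, and later a reviewer, tell an inference apart from an injected rule or a retrieved record.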

The runtime context packet — the structured frame of context passed to the model for a single moment of action.
The runtime context packet.

We already know that AI behavior quality depends heavily on context quality. A capable model given messy context may behave inconsistently. A smaller model given well-structured context may behave more predictably than expected. The model matters, but the operating conditions matter too.

07 — Testing & Diagnostics

"Sounds good" is not a QA methodology.

Failures traced across system layers rather than blamed on the model alone.
Failures are rarely the model alone.

AI systems need QA architecture. A production AI system should be tested for behavior across realistic scenarios and failure modes.

A set of named, testable failure modes including reasoning behavior, workflow drift, retrieval misuse, over-questioning, hallucinations, language instability, permission leakage, and false related-case matches.
Failure modes you can name and test.

Important diagnostic categories include:

Reasoning behavior

Workflow drift

Retrieval misuse

Over-questioning

Hallucinations

Language instability

Permission leakage

False related-case matches

These tests should connect to the context architecture. When a failure occurs, the team should be able to ask:

The goal is layer-level diagnosis.
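One way to support layer-level diagnosis is to record, for each named failure mode, which framework layer to inspect first. The failure names below follow the diagnostic categories listed above and the layer names follow the framework folders, but the pairing between them is purely an assumption for illustration; each team would need to decide its own mapping.

```python
from collections import defaultdict

# Hypothetical mapping from failure mode to the layer to inspect first.
FIRST_LAYER_TO_INSPECT = {
    "reasoning_behavior": "01-behavioral-specifications",
    "workflow_drift": "04-sd-ux-workflow",
    "retrieval_misuse": "02-context-architecture",
    "over_questioning": "01-behavioral-specifications",
    "hallucinations": "02-context-architecture",
    "language_instability": "02-context-architecture",
    "permission_leakage": "02-context-architecture",
    "false_related_case_matches": "02-context-architecture",
}


def group_by_layer(observed: list[str]) -> dict[str, list[str]]:
    """Group observed failures under the layer that should be reviewed first."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for failure in observed:
        layer = FIRST_LAYER_TO_INSPECT.get(failure, "unmapped")
        grouped[layer].append(failure)
    return dict(grouped)


report = group_by_layer(
    ["permission_leakage", "over_questioning", "retrieval_misuse", "tone_mismatch"]
)
print(report)
```

Failures that land under `unmapped` are a useful signal in their own right: they point to a failure mode the diagnostics layer has not named yet.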

08 — Human Review Does Not Scale Cleanly

As AI systems become more operational, their documentation becomes harder to review.

A mature behavior and context architecture may include:

Each document may be clear in isolation. But they need to work together. Review needs to catch issues such as:

These become architectural consistency problems. Dense AI documentation needs review formats that make dependencies visible. AI-mediated review layers can help convert system documentation into formats humans can actually inspect. For example, AI review layers could:

This keeps human responsibility where it belongs: judgment, decision-making, and accountability. AI can help expose the parts of the system that need that judgment. This may lead to stable review agents configured to understand the architecture, check for contradictions, and help teams evaluate changes over time.

A review layer that helps the architecture inspect itself for contradictions and gaps.
The architecture reviews itself.

The goal is practical: make dense AI architecture more reviewable, traceable, and discussable. If AI systems require layered documentation, the review process needs layered support.

09 — Practical Guidance for Product & Engineering Teams

Define the system around the model before relying on model behavior.

Useful questions to get started:

These are architecture questions.

For developers and technical product teams, the AI layer becomes a runtime participant in the product architecture. For UX and system designers, the work expands into conversational behavior, context availability, workflow continuity, and failure recovery. For enterprise teams, governance belongs inside the operating structure: permissioning, retrieval boundaries, diagnostics, and review processes.

The model should receive the conditions required to behave well. Those conditions have to be designed.

10 — In Closing

The model is important. But the system around the model determines what the model can reliably do.

The next phase of AI system design is about operating environments for models. Prompting remains part of the work. Behavior architecture, context architecture, runtime orchestration, diagnostics, and review systems determine whether the AI can function reliably inside an actual workflow.

AI systems are judged by repeated behavior under changing conditions:

That system defines what the model knows, what it can access, what it should avoid, how it recovers, how it is tested, and how humans can evaluate its behavior once it is live.

That is the work.

Templates & Documentation

Sample context architecture documentation and templates are available on GitHub: AI-Behavior-Context-Architecture-Framework
