Framework · Apr 12, 2026 · 8 min read

Why We Shipped an AI Assistant Inside the App, Not on Top of It

Two assistants, same request

A credit analyst at a commercial lender asked two different AI assistants the same question last quarter: “show me all vendors onboarded in the last 30 days with missing tax forms.” The first assistant — a chat widget bolted onto the existing portal — produced a polite paragraph explaining how to navigate to the vendor screen and apply filters. The second, running inside the DEX runtime, produced the filtered list, respected her RBAC scope, and logged the query to the audit trail. Time to answer: 22 seconds versus four clicks she never had to make.

The difference isn’t the model. It’s what the assistant can see.

What a bolt-on assistant actually knows

Most enterprise AI assistants live one layer above the application. They see the DOM, maybe a screenshot, and whatever the user types. They don’t see the descriptor. They don’t see the permission scope. They don’t see the query the grid just ran or the validation rules the form enforces. They’re guessing at the same structure the runtime already has in memory.

That gap is why bolt-on assistants give so much navigational advice. They tell users which button to click because clicking is the only action they can confidently describe. They can’t act on the user’s behalf because they don’t know what acting would mean.

What an in-runtime assistant can do

When the assistant is a first-class part of the runtime, it reads the descriptor directly. It knows the screen has a vendor entity, a tax_form_status field, and a created_at timestamp. It knows the current user’s RBAC scope restricts results to her business unit. It knows the filter component accepts a structured predicate, not a natural-language string.

So it doesn’t write a paragraph. It proposes a filter, shows the user what it’s about to do, and applies it on approval. The interaction surface shrinks from “navigate the app” to “tell me what you want.” Every action the assistant takes flows through the same authorization and audit paths the UI uses, because it is the UI.
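To make the propose-then-apply flow concrete, here is a minimal sketch of how an assistant might bind a request to a descriptor and emit a structured predicate for approval. All names and shapes (`ScreenDescriptor`, `Predicate`, `proposeFilter`) are illustrative assumptions, not the actual DEX descriptor format:

```typescript
// Hypothetical shapes -- the real DEX descriptor format is not shown in this post.
type FieldType = "string" | "timestamp" | "enum";

interface ScreenDescriptor {
  entity: string;
  fields: Record<string, FieldType>;
}

// A structured predicate of the kind a filter component could accept,
// instead of a natural-language string.
type Predicate =
  | { op: "and"; args: Predicate[] }
  | { field: string; cmp: "eq" | "gte" | "missing"; value?: unknown };

interface ProposedAction {
  kind: "apply_filter";
  predicate: Predicate;
  summary: string; // shown to the user before the action is applied
}

// The assistant maps intent onto fields the descriptor actually declares,
// rather than guessing at DOM structure.
function proposeFilter(desc: ScreenDescriptor, daysBack: number): ProposedAction {
  if (!("tax_form_status" in desc.fields) || !("created_at" in desc.fields)) {
    throw new Error(`descriptor for ${desc.entity} lacks required fields`);
  }
  const since = new Date(Date.now() - daysBack * 86_400_000).toISOString();
  return {
    kind: "apply_filter",
    predicate: {
      op: "and",
      args: [
        { field: "created_at", cmp: "gte", value: since },
        { field: "tax_form_status", cmp: "missing" },
      ],
    },
    summary: `Vendors onboarded in the last ${daysBack} days with missing tax forms`,
  };
}
```

The key design point is that the proposal is data, not a click script: the runtime can render the summary for approval, validate the predicate against the descriptor, and apply it through the same path the filter UI uses.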

The security story is simpler, not harder

The common objection to in-runtime assistants is that they expand the attack surface. We found the opposite. A bolt-on assistant that can click buttons for the user has to reimplement permission checks, or worse, run with elevated access. An in-runtime assistant that proposes descriptor-level actions inherits every check the runtime already enforces.

We don’t grant the assistant any capability the user doesn’t have. The LLM is a proposal engine. The runtime is the enforcement layer. If the user can’t approve a $50,000 invoice, neither can the assistant acting on her behalf. The audit log records both the human and the model as participants in the action.
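The proposal-engine/enforcement-layer split described above can be sketched in a few lines. The interfaces here (`UserScope`, `Proposal`, the audit shape) are assumptions for illustration, not the actual DEX API; the point is that the same check runs no matter who proposed the action, and the log records both participants:

```typescript
// Illustrative sketch of the enforcement boundary, not the real DEX runtime.
interface UserScope {
  userId: string;
  approvalLimitUsd: number;
}

interface Proposal {
  action: "approve_invoice";
  invoiceId: string;
  amountUsd: number;
  proposedBy: "assistant" | "user";
}

interface AuditEntry {
  action: string;
  human: string;
  model?: string; // present when an assistant proposed the action
  allowed: boolean;
}

const auditLog: AuditEntry[] = [];

// The runtime enforces the user's scope regardless of who proposed the
// action -- the assistant never carries capabilities of its own.
function execute(proposal: Proposal, scope: UserScope, modelId?: string): boolean {
  const allowed = proposal.amountUsd <= scope.approvalLimitUsd;
  auditLog.push({
    action: `${proposal.action}:${proposal.invoiceId}`,
    human: scope.userId,
    model: modelId, // both the human and the model appear in the record
    allowed,
  });
  return allowed;
}
```

Because denial happens at execution rather than at proposal time, even a confused or adversarial proposal can never exceed what the user herself could do.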

Why this is hard to retrofit

Building this pattern into an existing enterprise app is expensive because most enterprise apps don’t have a descriptor to read. The assistant has nothing structured to bind to, so it falls back to the DOM, and the limitations follow.

This is one of the reasons we designed the descriptor and the runtime before the assistant. The assistant is almost a consequence of the architecture rather than a product added to it. Once every screen is a descriptor, an assistant that reads and writes descriptors can work across the whole application without a per-screen integration.
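One way to see why no per-screen integration is needed: if every screen is a descriptor, a single generic routine can tell the assistant everything it may do anywhere in the app. The registry and descriptor shapes below are assumptions for the sake of the sketch:

```typescript
// Illustrative only: descriptor and registry shapes are assumed here,
// not taken from the actual DEX format.
interface Descriptor {
  entity: string;
  fields: string[];
  actions: string[]; // e.g. "filter", "export", "start_workflow"
}

// One routine, zero per-screen code: whatever capabilities exist anywhere
// in the application, the assistant discovers them by reading descriptors.
function capabilities(registry: Descriptor[]): string[] {
  return registry.flatMap((d) => d.actions.map((a) => `${d.entity}.${a}`));
}
```

Adding a new screen means adding a descriptor, and the assistant picks up its capabilities automatically; a bolt-on assistant would instead need a new DOM integration for every screen.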

What users notice

The behavior change is visible in the first session. Users stop asking the assistant where things are. They start asking for outcomes. “Find the three vendors flagged for duplicate bank details.” “Start a renewal workflow for the contracts expiring in May.” “Export this filtered view to the format finance uses.”

Our internal usage data from early pilots shows roughly 70% of assistant interactions resolve to a concrete action taken on the user’s behalf, compared to under 20% for the bolt-on pattern we tested against. The remaining interactions are explanatory, and even those pull from the descriptor instead of a generic help corpus.

The takeaway

An AI assistant is only as useful as the structure it can read. Bolting one onto a finished app gets you a better help widget. Shipping one inside a descriptor-driven runtime gets you a coworker that can actually do the work. The architecture decision comes first. The model comes second. Everything else follows from that order.