How an AI Project Manager Keeps a Kanban Board Honest

Letting an AI touch your project board sounds risky — until you understand the architecture. Here's how Vertically separates language understanding from state changes so the board is never corrupted by a hallucination.

The Vertically Team · Product & Research · May 18, 2026

The promise of an AI project manager is appealing right up until the moment you imagine it making a mistake. What happens when the model misreads a message and marks the wrong task done? What if it hallucinates a deadline, or reassigns work to the wrong person? For a system of record that leadership relies on, "the AI got confused" is not an acceptable failure mode.

This concern is legitimate, and it shaped the most important architectural decision in Vertically: the language model is never allowed to touch the board directly.

The problem with letting an LLM write to your database

Large language models are extraordinary at understanding messy human language. They are also, by their nature, probabilistic. They generate plausible outputs, and plausible is not the same as correct. If you wire a model directly to your project database and let it execute changes, you've coupled the accuracy of your source of truth to the model's worst-case behavior. One confident-but-wrong output, and your board now contains a fabricated state that everyone downstream trusts.

That's an unacceptable risk for the system leadership uses to make decisions. So we designed around it.

Two systems, cleanly separated

Vertically splits the job into two distinct components with a hard boundary between them.

The interpreter (the LLM)

The language model's role is strictly analytical. It reads incoming emails, extracts intent — what happened, to which task, with what implication — and drafts professional replies. It is a parser and a draftsman. Critically, it has no authority to change anything. Its output is a proposed interpretation, not an action.
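To make "a proposed interpretation, not an action" concrete, here is a minimal Python sketch of what such a proposal could look like. It is not Vertically's actual schema: the `Proposal` and `Intent` names and every field are assumptions chosen for illustration. The point is only that the interpreter's output is immutable data and carries no handle to the board.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    """Hypothetical set of things a message can be read as asking for."""
    MARK_DONE = "mark_done"
    REASSIGN = "reassign"
    CHANGE_DUE_DATE = "change_due_date"


@dataclass(frozen=True)  # frozen: a proposal cannot be altered after it is produced
class Proposal:
    """What the interpreter believes a message means. Data only, no board access."""
    message_id: str    # the email this interpretation came from
    task_id: str       # the task the model thinks is affected
    intent: Intent     # what the model thinks happened
    confidence: float  # the model's self-reported confidence, 0.0 to 1.0
    draft_reply: str   # a reply drafted for sending


# Illustrative values; a real interpreter would fill these from the model output.
proposal = Proposal(
    message_id="msg-1",
    task_id="TASK-12",
    intent=Intent.MARK_DONE,
    confidence=0.93,
    draft_reply="Thanks, I have noted that the review is complete.",
)
```

Nothing in this object can change state by itself; something else has to decide whether to act on it.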

The rules engine (deterministic)

Every actual change to the board — moving a card, closing a milestone, reassigning an owner, recalculating a date — is executed by a separate, deterministic rules engine. This engine doesn't guess. It applies explicit, validated rules to the interpreter's proposal, and rejects anything that doesn't satisfy them. A state transition only happens if it's valid according to the rules, full stop.

The model proposes. The rules engine disposes. Nothing reaches your board without passing deterministic validation.
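The propose-then-validate flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Vertically's implementation: the transition table, the `MIN_CONFIDENCE` threshold, and the `RulesEngine` and `Decision` names are all hypothetical, and the proposal is reduced to a single status change to keep the example short.

```python
from dataclasses import dataclass

# Hypothetical workflow: which status moves are legal on this board.
ALLOWED_TRANSITIONS = {
    "todo": {"in_progress"},
    "in_progress": {"done", "todo"},
    "done": set(),
}
MIN_CONFIDENCE = 0.8  # assumed threshold, not a documented Vertically value


@dataclass(frozen=True)
class Proposal:
    message_id: str
    task_id: str
    new_status: str
    confidence: float


@dataclass(frozen=True)
class Decision:
    applied: bool
    reason: str


class RulesEngine:
    """The only component in this sketch that holds a reference to board state."""

    def __init__(self, board: dict[str, str]) -> None:
        self._board = board  # task_id -> current status

    def apply(self, proposal: Proposal) -> Decision:
        current = self._board.get(proposal.task_id)
        if current is None:
            return Decision(False, "unknown task")
        if proposal.confidence < MIN_CONFIDENCE:
            return Decision(False, "confidence below threshold")
        if proposal.new_status not in ALLOWED_TRANSITIONS[current]:
            return Decision(False, f"illegal transition {current} -> {proposal.new_status}")
        # Every check passed: this is the single place where state changes.
        self._board[proposal.task_id] = proposal.new_status
        return Decision(True, f"{current} -> {proposal.new_status}")


board = {"TASK-12": "in_progress", "TASK-13": "todo"}
engine = RulesEngine(board)

ok = engine.apply(Proposal("msg-1", "TASK-12", "done", 0.95))        # applied
skipped = engine.apply(Proposal("msg-2", "TASK-13", "done", 0.97))   # rejected: todo -> done
unknown = engine.apply(Proposal("msg-3", "TASK-99", "done", 0.99))   # rejected: no such task
unsure = engine.apply(Proposal("msg-4", "TASK-13", "in_progress", 0.4))  # rejected: low confidence
```

The same proposal against the same board always produces the same decision, and a rejected proposal leaves the board exactly as it was.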

Why this matters in practice

This separation gives you the best of both worlds: the fluent language understanding of a modern model, and the predictability of traditional software. The model can be as flexible as it needs to be in interpreting how people actually write, while the part that mutates your data behaves like the deterministic software you'd expect of any serious system of record.

It also makes the system auditable. Because every change flows through the rules engine and traces back to a specific message and interpretation, you can always answer the question "why did the board change?" There's a clear chain from the email a person sent to the validated rule that fired. Nothing happens by magic, and nothing happens silently.
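One way to picture that chain is an append-only log in which each applied change records the message and the rule behind it. The Python sketch below is an assumption about how such a trail could be structured, not a description of Vertically's storage; `AuditEntry`, `AuditLog`, the rule name, and the field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    at: datetime      # when the change was applied (UTC)
    message_id: str   # the email that triggered the change
    task_id: str      # the card that changed
    rule: str         # name of the rule that validated the change
    before: str       # status before the change
    after: str        # status after the change


class AuditLog:
    """Append-only record of applied changes; entries are never edited."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, message_id: str, task_id: str, rule: str,
               before: str, after: str) -> None:
        self._entries.append(
            AuditEntry(datetime.now(timezone.utc), message_id, task_id,
                       rule, before, after)
        )

    def why(self, task_id: str) -> list[AuditEntry]:
        """Every recorded change to one task, oldest first."""
        return [e for e in self._entries if e.task_id == task_id]


log = AuditLog()
log.record("msg-1", "TASK-12", "status_transition", "todo", "in_progress")
log.record("msg-7", "TASK-40", "status_transition", "todo", "in_progress")
log.record("msg-9", "TASK-12", "status_transition", "in_progress", "done")

history = log.why("TASK-12")  # two entries, traced to msg-1 and msg-9
```

Answering "why did the board change?" then becomes a lookup rather than an investigation.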

The LLM never writes to the board — it only interprets and drafts.
A deterministic rules engine executes every change with strict validation.
Invalid or low-confidence interpretations are rejected, not applied.
Every change is traceable to the message that caused it.

Trust is an architecture decision

It's tempting to treat AI safety as something you bolt on afterward — a content filter here, a confidence threshold there. But for a system that maintains your source of truth, trust has to be structural. You earn it by designing the system so that the failure modes you're worried about are impossible by construction, not merely unlikely.

That's why Vertically keeps the model and the database decoupled. It's also why this approach maps cleanly onto frameworks like the NIST AI Risk Management Framework, which emphasize governable, measurable, and accountable AI behavior. An AI project manager should be powerful enough to save you real time and disciplined enough that you never have to double-check its work. Getting both at once isn't luck. It's the architecture.

FAQ

How does Vertically prevent AI hallucinations from corrupting the board?

Vertically decouples the language model from the database. The LLM only interprets intent and drafts messages; it never writes to the board directly. Every state change is executed by a separate deterministic rules engine with strict validation, so an incorrect model output cannot silently mutate your project data.