Beyond the Chat Box: The Answer Agent — grounded answers, or no answer at all

Post seven of a series on what live audience engagement actually looks like when there's an AI team in the room.

Of all the agents in ReactLive, the Answer Agent is the one most people are afraid of.

That's the right reaction.

For the last three years, the dominant experience of "AI answering questions" has been a chatbot confidently making things up. Hallucinated citations. Invented numbers. Plausible-sounding answers to questions whose real answer is that nobody actually knows. The pattern is so common it has its own folk diagnosis: the AI is yapping.

Now imagine that, but in front of 4,000 employees, the morning after earnings, with the CFO on stage. An AI in the corner of the screen confidently answering "what does this mean for the engineering org?" with a paragraph that sounds reasonable, contains numbers nobody recognises, and references a memo that doesn't exist.

That is the failure mode that, until very recently, kept AI out of live event Q&A entirely. Hosts looked at the technology, imagined the scenario above, and made the only sensible call: not today, not in front of my audience, not at this stage.

The Answer Agent exists to make that call wrong.

Not by being a better-trained chatbot. By being a fundamentally different kind of system — one whose architecture makes the failure mode above structurally impossible.

This post is about how.

The first principle: grounded or silent

Most AI assistants will answer your question one way or another. If they don't know, they guess. If the source material is thin, they fill the gaps from model priors. If the question is ambiguous, they pick an interpretation and run with it. Confidence and accuracy come apart constantly.

The Answer Agent is built around the inverse principle: if it can't answer from trusted sources at the confidence level you've set, it doesn't answer.

Not "tries to answer cautiously." Not "answers with disclaimers." Doesn't answer.

The trusted sources are explicit and bounded:

Documents you uploaded for the event (FAQ, brand guide, last quarter's results, the deck)
The live transcript of what's been said in this event
Prior Q&A from the same event

That's it. The agent does not draw on the open web. It does not fall back on model priors. It does not speculate based on patterns from its training. If the question isn't grounded in those three sources, it gets escalated to the moderator.

The architectural implication: the most common output of the Answer Agent, by design, is to surface a question for human handling rather than to answer it. That's not a bug. That's the product working correctly.
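To make the "bounded sources" rule concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not ReactLive's implementation: the names `SourceKind`, `Passage`, and `answer_or_escalate` are hypothetical, and the sketch assumes an upstream retriever labels each passage with where it came from.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SourceKind(Enum):
    """The three trusted sources named in the text (identifiers are hypothetical)."""
    UPLOADED_DOCUMENT = "uploaded_document"
    LIVE_TRANSCRIPT = "live_transcript"
    PRIOR_QA = "prior_qa"


@dataclass(frozen=True)
class Passage:
    kind: str  # origin label, assumed to be supplied by an upstream retriever
    text: str


TRUSTED_KINDS = {k.value for k in SourceKind}


def grounded_passages(candidates: list[Passage]) -> list[Passage]:
    """Keep only passages that come from one of the three trusted sources."""
    return [p for p in candidates if p.kind in TRUSTED_KINDS]


def answer_or_escalate(candidates: list[Passage]) -> Optional[list[Passage]]:
    """Return grounding material, or None to signal 'escalate to the moderator'."""
    grounded = grounded_passages(candidates)
    return grounded or None
```

In this sketch a passage labelled anything other than the three trusted kinds is dropped before any answer is attempted, so an empty result means escalation rather than a guess.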

Why "no answer" is sometimes the right answer

This is the part that takes some getting used to.

Modern AI products have trained users to expect an answer to every question, and to evaluate the AI by the quality of the answers it produces. We grade chatbots on how rarely they say "I don't know."

In a live event, that grading rubric inverts. The cost of a wrong-but-confident answer is enormous — it survives in screenshots, gets cited as company position, ends up in a journalist's notebook. The cost of "no answer, escalated to the moderator" is approximately zero — the moderator catches it, the human handles the question, and the event continues.

Asymmetric costs require asymmetric defaults. The Answer Agent defaults to silence on anything ambiguous, because the upside of an uncertain answer is small and the downside of a wrong one is large.

This is the most important design decision in ReactLive. Everything else in the Answer Agent — the three-state confidence model, the grounding rules, the source attribution — is in service of this one principle.

The three-state confidence model

The Answer Agent doesn't produce a binary "answered / not answered" output. It produces one of three states for every question:

Answered (confidence > 85%). The agent has high confidence that it can answer this question correctly from grounded sources. The answer is published — privately to the asker, or publicly to the feed, depending on configuration. The source attribution is shown so anyone reviewing can verify where the answer came from.

Possibly Addressed (confidence 60–85%). The agent has medium confidence — the question seems to be at least partly answered by something in the trusted sources, but the match isn't clean. Maybe the speaker covered a related topic but not the exact question. Maybe the FAQ has a similar but not identical entry. The asker is shown the partial answer with an explicit "this may not fully address your question" note, and the question stays in the queue for the moderator to consider whether to surface it for a fuller live answer.

Unanswered (confidence < 60%). The agent does not have grounded source material that confidently addresses this question. The agent doesn't try. The question goes to the moderator's queue with no AI-generated content attached.

The thresholds are not arbitrary. The 85% line is set conservatively — the agent is calibrated to fail toward Possibly Addressed rather than Answered when in doubt. The 60% line is set so that the Possibly Addressed bucket only contains questions where there's something useful in the sources, not just a thin pattern match.

These thresholds are the hard product gate. They are not tunable knobs the user can dial up to make the agent answer more questions. The conservative threshold is the product. Loosening it would change what ReactLive is.
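The three states can be sketched as a small Python function. This is an illustration only: the names are hypothetical, and the handling of a score of exactly 85% (treated here as Possibly Addressed, in line with "fail toward Possibly Addressed when in doubt") is an assumption about the boundary, which the description above does not spell out.

```python
from enum import Enum


class AnswerState(Enum):
    ANSWERED = "answered"
    POSSIBLY_ADDRESSED = "possibly_addressed"
    UNANSWERED = "unanswered"


# Module-level constants rather than parameters, echoing the point that the
# thresholds are a product gate and not a user-tunable setting.
ANSWERED_ABOVE = 0.85
POSSIBLY_ADDRESSED_FROM = 0.60


def classify(confidence: float) -> AnswerState:
    """Map a confidence score in [0, 1] to one of the three states."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence > ANSWERED_ABOVE:
        return AnswerState.ANSWERED
    if confidence >= POSSIBLY_ADDRESSED_FROM:
        # Assumption: a score of exactly 0.85 lands here, the more cautious state.
        return AnswerState.POSSIBLY_ADDRESSED
    return AnswerState.UNANSWERED
```

For example, `classify(0.92)` returns `AnswerState.ANSWERED`, `classify(0.85)` returns `AnswerState.POSSIBLY_ADDRESSED`, and `classify(0.40)` returns `AnswerState.UNANSWERED`.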

How the Answer Agent actually works

Five core skills, each in service of the principle above.

Question understanding. Before the agent can answer, it needs to understand what's being asked — including questions that are phrased badly, contain typos, embed multiple sub-questions, or use organisation-specific jargon the agent has been given context for. The Setup Agent's content ingestion matters here: a question about "the Phoenix migration" can be understood correctly because the deck the host uploaded explained what Phoenix is.

Answer retrieval. The agent searches across the three trusted sources — docs, transcript, prior Q&A — and assembles candidate answer material. This is retrieval, not generation. The first step is finding the source content; generation comes later.

Answer generation. Once retrieval has produced grounded source material, the agent generates an answer in the event's voice (defined in SOUL.md). The generation step is constrained: the answer must be derivable from the retrieved sources. The agent doesn't produce sentences whose factual content comes from outside those sources.

Dedup awareness. When a question is essentially a duplicate of one that's already been answered in this event — including by the speaker, in the transcript — the agent recognises it and either points the asker to the prior answer or attaches the answer directly. It doesn't generate a fresh answer that might phrase the same content differently and confuse the audience.

Confidence scoring. Every candidate answer is scored before publication. The score determines which of the three states the question lands in. This is where the rubber meets the road — the confidence score is the gate that decides whether the audience sees an AI answer or not.

The pipeline is conservative end-to-end. Every step is biased toward "escalate this" rather than "answer this." The product takes the cost of unanswered questions onto itself rather than passing the cost of wrong answers to the audience.
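The order of operations (retrieve first, gate on the score, and only then show content) can be sketched in Python as follows. Everything here is a toy stand-in: token overlap is used as the "confidence" score purely for illustration, "generation" is reduced to quoting the retrieved passage, and the names `Passage`, `Outcome`, and `handle` are hypothetical. How the real agent scores confidence is not described in this post.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Passage:
    source: str  # "docs", "transcript", or "prior_qa"
    text: str


@dataclass(frozen=True)
class Outcome:
    state: str                    # "answered", "possibly_addressed", "unanswered"
    answer: Optional[str]         # None when no AI content is attached
    sources: tuple[Passage, ...]  # the passages the answer was taken from


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(question: str, passage: Passage) -> float:
    """Toy confidence: the share of question tokens that appear in the passage."""
    q = tokens(question)
    return len(q & tokens(passage.text)) / len(q) if q else 0.0


def handle(question: str, corpus: list[Passage]) -> Outcome:
    # Retrieval comes first: find the best-matching trusted passage.
    scored = [(overlap_score(question, p), p) for p in corpus]
    if not scored:
        return Outcome("unanswered", None, ())
    confidence, best = max(scored, key=lambda pair: pair[0])
    # The score gates everything that follows.
    if confidence < 0.60:
        return Outcome("unanswered", None, ())
    # "Generation" here just quotes the passage, so the answer is trivially
    # derivable from the retrieved source and the attribution is real.
    state = "answered" if confidence > 0.85 else "possibly_addressed"
    return Outcome(state, best.text, (best,))
```

Note that the unanswered path returns no text at all, which mirrors the rule that escalated questions reach the moderator with no AI-generated content attached.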

The Auto-Answer Loop

Everything above is the agent's decision logic. The Auto-Answer Loop is the physical pipeline that runs that logic continuously, in real time, through the entire live event.

It works like this.

The stream comes in. ReactLive ingests the audio from whatever's broadcasting the event: Zoom, Teams, a hardware encoder, a browser tab, or the host's microphone directly. The stream is the source of truth for what's actually being said in the room.

Whisper transcribes it as it happens. The audio runs through a real-time speech-to-text pipeline. Every sentence the speaker says becomes searchable text within seconds of being spoken. The live transcript is one of the three trusted sources the Answer Agent grounds against — the other two being the documents you uploaded and the prior Q&A from this event.

Questions arrive in parallel. While the speaker is talking, the audience is submitting questions. Each submission lands in the queue and immediately gets evaluated by the Answer Agent against the current state of all three sources — including, crucially, the transcript of what the speaker has said in the last sixty seconds.

Matching happens against the live state. Most of the work is here. Did the speaker just answer this question two minutes ago in their last point? Is there a passage in the uploaded FAQ that addresses it? Did someone else ask a version of this in the prior Q&A and get an answer? The agent looks for grounded source material across all three.

The three-state classifier decides what happens. High-confidence match → answer published, source attribution attached. Medium-confidence partial match → asker sees "this may have been addressed" with a pointer, question stays in the moderator's queue. No grounded match → question goes to the moderator clean, no AI content attached.

The moderator handles what the agent didn't. They answer manually, mark the question for the speaker to take live, or queue it for post-event follow-up. Whatever they do feeds back into the system, so the next similar question can be handled faster.

Loop continues, every few seconds, for the duration of the event.

That's the loop. The thing that makes it work is the live transcript — without it, the Answer Agent has no idea whether the speaker just addressed the question or not. Whisper running continuously against the stream is what makes the agent able to say "the speaker just covered this two minutes ago" instead of redundantly answering a question whose answer is already in the room.
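The dependence on the live transcript can be shown with a small Python sketch. It is a simplification under loud assumptions: speech arrives as already-transcribed sentences (no speech-to-text call is made here), matching is a naive keyword check standing in for the real grounding logic, and `LiveEvent`, `naive_match`, and `run` are hypothetical names.

```python
import re
from dataclasses import dataclass, field


def naive_match(question: str, sentence: str) -> bool:
    """Toy matcher: every longer word in the question appears in the sentence."""
    words = {w for w in re.findall(r"[a-z]+", question.lower()) if len(w) > 4}
    return bool(words) and words <= set(re.findall(r"[a-z]+", sentence.lower()))


@dataclass
class LiveEvent:
    transcript: list[str] = field(default_factory=list)        # sentences so far
    moderator_queue: list[str] = field(default_factory=list)   # escalated, no AI content
    published: list[tuple[str, str]] = field(default_factory=list)  # (question, source)

    def on_transcript(self, sentence: str) -> None:
        self.transcript.append(sentence)

    def on_question(self, question: str) -> None:
        # Match against the transcript as it stands right now, most recent first.
        for sentence in reversed(self.transcript):
            if naive_match(question, sentence):
                self.published.append((question, sentence))
                return
        self.moderator_queue.append(question)


def run(stream: list[tuple[str, str]]) -> LiveEvent:
    """Process interleaved ("speech", text) and ("question", text) items in order."""
    event = LiveEvent()
    for kind, text in stream:
        if kind == "speech":
            event.on_transcript(text)
        elif kind == "question":
            event.on_question(text)
    return event
```

Run on a stream where the same question is asked once before and once after the speaker says "The hiring freeze is ending in July.", the first copy lands in `moderator_queue` and the second is published with that sentence as its source. The only thing that changed between the two is the state of the transcript.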

The compounding effect of hundreds of questions running through this loop in real time — most resolved automatically against the speaker's own words, the meaningful few surfaced to the human — is what closes the gap between "questions submitted" and "questions answered" that we identified all the way back in the series opener.

This is the loop most engagement tools don't close. It's the loop that justifies calling ReactLive "AI-native" rather than "AI-enabled." Every other agent in the system is in service of making this loop work — Setup gives it the grounding sources before the event starts, Protect keeps the question queue clean, Engage feeds it well-clustered submissions, Report shows what the loop did at the end. The Auto-Answer Loop is the heart.

The dial, applied to answers

The dial matters for answers in a different way than it does for moderation (Protect) and engagement (Engage), because what the agent does is the same at every level. The agent always grounds. The agent always defers when uncertain. The agent always shows its sources. What changes across the levels is whether its output reaches the audience automatically, and through what pathway.

Off. No AI answers. Questions go to the moderator's queue. ReactLive becomes a clean human-moderated Q&A platform with a live transcript and the rest of the agent system still active.

Suggest. The agent generates candidate answers and shows them to the moderator with confidence scores and sources attached. The moderator decides what to publish. This is the right setting for first events — every published AI answer has had a human review the grounding before it went out.

Assist. The agent auto-answers high-confidence questions (especially repeats and clear FAQ-style questions) and escalates anything below the threshold or anything contextually sensitive. The moderator's queue gets the meaningful calls. This is the workhorse setting.

Auto. The agent answers all questions where the confidence threshold is met. Questions below threshold still escalate. The moderator can override anything in real time but doesn't have to be in the answer queue to keep the event running. This is the setting for events where the host has confirmed the agent's calibration on this content multiple times.

The line that's worth making explicit, the same way we did for the other agents: Auto on Answer doesn't mean ungrounded. Auto means the agent doesn't wait for human approval to publish answers it's confident in. It does not mean the grounding rules relax. The agent will not answer a question without trusted sources at any level, ever — Auto included. The grounding is the product, not a setting.
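One way to picture the dial is as a routing function that sits after the confidence gate. The Python sketch below is an assumption-laden illustration: `Dial`, `Route`, and `route` are hypothetical names, it only covers the publishing path for answers that meet the threshold, and it assumes that a sensitive question on Assist goes to the moderator's queue. How Possibly Addressed notes interact with the dial is not specified above, so the sketch leaves that out.

```python
from enum import Enum


class Dial(Enum):
    OFF = "off"
    SUGGEST = "suggest"
    ASSIST = "assist"
    AUTO = "auto"


class Route(Enum):
    MODERATOR_QUEUE = "moderator_queue"    # escalated to a human
    MODERATOR_REVIEW = "moderator_review"  # draft answer and sources shown to moderator
    PUBLISH = "publish"                    # answer goes out without waiting for approval


def route(dial: Dial, meets_threshold: bool, sensitive: bool = False) -> Route:
    """Decide where a question goes once the confidence gate has been applied."""
    # The grounding gate comes first and is identical at every level, Auto included.
    if dial is Dial.OFF or not meets_threshold:
        return Route.MODERATOR_QUEUE
    if dial is Dial.SUGGEST:
        return Route.MODERATOR_REVIEW
    if dial is Dial.ASSIST and sensitive:
        # Assumption: Assist escalates contextually sensitive questions.
        return Route.MODERATOR_QUEUE
    return Route.PUBLISH
```

The first branch is the point of the sketch: `route(Dial.AUTO, meets_threshold=False)` returns `Route.MODERATOR_QUEUE`, the same result as at every other level.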

What the Answer Agent does not do

It does not answer from the open web. Not on Auto. Not when asked nicely. Not when the question would be easy to look up. Open-web answering is a different product with a different risk profile, and ReactLive is the wrong tool for it. If the question requires real-time external information, the moderator handles it.

It does not fall back on model priors. When the trusted sources don't cover a question, the agent doesn't use general training knowledge to fill in. What the agent will say with confidence about your subject matter is bounded entirely by what you've given it to work with. This is the rule that makes the agent safe to deploy in regulated and high-stakes contexts.

It does not invent attribution. When the agent shows sources for an answer, those sources are the actual material the answer was derived from. No fabricated citations. No "according to the document" when the document doesn't say that. If the agent can't show its work, it doesn't answer.

It does not soften "I don't know" into "I think." The agent doesn't generate "this might be the case" or "I believe" answers when its confidence is below threshold. Below threshold, it escalates. Hedged answers are the failure mode that produces yapping; the agent is built to avoid them entirely.

Why this is the agent that converts skeptics

Of all the agents in the system, Answer is the one prospective customers ask the hardest questions about. What if it gets it wrong? What if it makes something up? What if it answers a question we didn't want answered?

The honest response is: those scenarios are the ones we built the agent specifically to prevent.

The grounding rules prevent fabrication. The confidence threshold prevents hedged-but-wrong answers. The escalation pathway prevents the agent from forcing an answer when the right move is to defer to a human. The dial prevents the host from having to commit to autonomy before they've earned trust in the agent's calibration.

We're not asking customers to trust an AI. We're asking them to trust an architecture — one whose rules are explicit, whose limits are visible, and whose default behaviour is to do less rather than more.

Once they see a few events run through this architecture, the trust comes naturally. Not because the AI is impressive, but because it behaved well. It answered the obvious questions cleanly. It deferred on the ambiguous ones. It didn't pretend to know things it didn't. It made the moderator's job easier rather than harder.

That's the bar. The agent answered the questions you'd want it to answer, and stayed quiet on the questions a human should answer. If the Answer Agent clears that bar consistently, the rest of the trust-building takes care of itself.

The Answer Agent is the heart of ReactLive. It's also the agent we're most careful with — because in a live event, the cost of being wrong is exactly the cost most products externalise to the user, and we'd rather take that cost ourselves.

Next up: The Report Agent — the debrief nobody has time to write. How the post-event agent turns transcripts, Q&A, and engagement data into the recap your team would have written if they had the time.

Join the waitlist to get early access. Three free events. Locked-in pricing.