A2RAG is a decision layer between your retrieval pipeline and your users - intercepting unsafe answers, triggering clarifications, and routing unanswerable queries before they cause damage.
Standard RAG systems are built to answer. They don't know when to stop. When your knowledge base doesn't cover the full picture, they fill in the gaps - confidently, incorrectly.
In regulated industries - insurance, legal, healthcare, HR - a wrong answer isn't just unhelpful. It's a liability.
A2RAG gives your pipeline three options instead of one: answer safely, ask for clarification, or refuse to guess.
A2RAG evaluates every query against two independent scores and routes accordingly - in under 2ms, with no changes to your existing pipeline.
Pass the user query, retrieved documents, and draft answer to A2RAG. Your retrieval and LLM stay exactly as-is.
Evidence score - does the corpus support this answer?
Completeness score - is there enough context for a specific answer?
One of three actions returned instantly. You define what happens next - show it, ask a follow-up, or escalate to a human.
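The two-score routing described above can be pictured as a small threshold function. This is a minimal sketch, not A2RAG's actual internals: it assumes both scores lie in [0, 1], that low evidence leads to abstaining, and that low completeness leads to a clarification. The names `Action`, `Thresholds`, and `route`, and the cut-off values, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ANSWER = "answer"
    CLARIFY = "clarify"
    ABSTAIN = "abstain"


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cut-offs for illustration, not A2RAG's real values.
    min_evidence: float = 0.6
    min_completeness: float = 0.5


def route(evidence: float, completeness: float,
          t: Thresholds = Thresholds()) -> Action:
    """Map two independent scores in [0, 1] to one of three actions."""
    if evidence < t.min_evidence:
        # Assumed rule: the corpus does not support the draft answer.
        return Action.ABSTAIN
    if completeness < t.min_completeness:
        # Assumed rule: the topic is covered, but the query is underspecified.
        return Action.CLARIFY
    return Action.ANSWER
```

Checking evidence first means a query the corpus cannot support is never turned into a follow-up question, which would only delay the refusal.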
A2RAG is particularly valuable where wrong answers carry real consequences. Pre-tuned domain profiles are available for each.
Works with LangChain, LlamaIndex, OpenAI, Anthropic, or any custom RAG setup. No infrastructure changes. Your retrieval and LLM stay exactly as-is.
client.dashboard() opens a private browser dashboard. Your data stays on your machine.

    from a2rag import A2RAGClient

    client = A2RAGClient(api_key="your_key")

    # Your existing pipeline - unchanged
    contexts = rag.retrieve(user_query)
    draft_answer = llm.generate(user_query, contexts)

    # Add A2RAG - 3 lines
    decision = client.decide(
        query=user_query,
        contexts=contexts,
        draft_answer=draft_answer,
        domain="insurance",  # optional preset
    )

    if decision.should_answer:
        show_to_user(draft_answer)
    elif decision.should_clarify:
        # Generated follow-up question
        ask_user(decision.clarification)
    elif decision.should_abstain:
        # Topic not in corpus
        escalate_to_human()
Controlled evaluation across 200+ decision points, including edge cases, partial queries, multilingual inputs, and contradictory corpora.
A2RAG is architecturally designed so user query content never reaches our servers. We return a routing decision - that's all.
We're working with a small group of developers and teams to validate A2RAG on real production corpora. No pricing yet - this is about learning together.
Each call to client.decide() is one decision. A2RAG evaluates the query, retrieved contexts, and draft answer - and returns one of three actions: answer, clarify, or abstain. One API call = one decision, regardless of outcome.
No. Query content, documents, and answers are processed in-memory and immediately discarded. We never store, log, or train on the content passing through the API. We only retain anonymous decision metadata (action, confidence, latency) to improve accuracy - and only with your consent on the free tier.
No. A2RAG sits between your RAG retrieval and your users. Your LLM still generates the draft answer. A2RAG decides whether that answer is safe to show, needs clarification, or should be withheld. Your existing LLM and RAG setup stay completely unchanged.
You define what happens. Options include: route to a human agent, return a custom predefined message, fire a webhook to your internal systems, or handle it silently in your application. Every behavior is fully configurable per customer and per domain.
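One way to organize per-domain abstain behavior in your own application is a small handler table. This is a sketch under stated assumptions: the handler names, the domain keys, and the messages are hypothetical, and none of this is part of the A2RAG client. Your code would call `handle_abstain` when a decision comes back as abstain.

```python
from typing import Callable, Dict, List

# A handler takes the user query and returns the text shown to the user.
AbstainHandler = Callable[[str], str]

# Stand-in for a real ticketing or agent queue (hypothetical).
escalation_queue: List[str] = []


def route_to_human(query: str) -> str:
    # Record the query for a human agent and tell the user what happens next.
    escalation_queue.append(query)
    return "A specialist will follow up on your question."


def predefined_message(query: str) -> str:
    # Return a fixed message without exposing the withheld draft answer.
    return "I can't answer that from the available documentation."


# Hypothetical per-domain configuration.
HANDLERS_BY_DOMAIN: Dict[str, AbstainHandler] = {
    "insurance": route_to_human,
    "support": predefined_message,
}


def handle_abstain(query: str, domain: str,
                   handlers: Dict[str, AbstainHandler] = HANDLERS_BY_DOMAIN,
                   default: AbstainHandler = predefined_message) -> str:
    """Pick the abstain behavior for a domain, falling back to a default."""
    return handlers.get(domain, default)(query)
```

A webhook handler would fit the same signature: post the query to your internal endpoint, then return the message to display.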
Yes - A2RAG is a statistical system and decisions may be incorrect. It may answer when it should abstain (false positive) or abstain when an answer was available (false negative). Performance varies by domain, language, and corpus quality. We recommend testing on your specific corpus before production deployment. A2RAG reduces risk - it does not eliminate it.
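Testing on your own corpus can start with a small labeled set: for each query, record the action you expected and the action you actually got, then count the two error types defined above. This is a minimal sketch, assuming actions are represented as the strings "answer", "clarify", and "abstain"; the `error_rates` helper is hypothetical and not part of the A2RAG package.

```python
from typing import Dict, Iterable, Tuple


def error_rates(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute error rates from (expected_action, actual_action) pairs.

    false positive: answered when it should have abstained
    false negative: abstained when an answer was available
    """
    total = 0
    false_positives = 0
    false_negatives = 0
    for expected, actual in pairs:
        total += 1
        if expected == "abstain" and actual == "answer":
            false_positives += 1
        elif expected == "answer" and actual == "abstain":
            false_negatives += 1
    if total == 0:
        raise ValueError("need at least one labeled decision")
    return {
        "false_positive_rate": false_positives / total,
        "false_negative_rate": false_negatives / total,
    }


# Example labeled set: (expected, actual)
labeled = [
    ("answer", "answer"),
    ("abstain", "answer"),    # false positive
    ("answer", "abstain"),    # false negative
    ("clarify", "clarify"),
]
rates = error_rates(labeled)
```

In a regulated setting the false positive rate is usually the number to watch, since it counts answers that were shown when they should have been withheld.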
A2RAG includes automatic language detection and works across English, Hebrew, Arabic, French, Spanish, and other languages. Performance is best on English corpora. Hebrew and Arabic operate in a specialized heuristic mode that handles right-to-left text and morphology.
Yes. Docker deployment is available for teams that need private infrastructure. Enterprise tier includes full on-premise support with no data leaving your environment. Contact us to discuss your requirements.
Most developers are up and running in under an hour. Install with pip install a2rag, pass your existing RAG output to client.decide(), and handle the three possible outcomes. No infrastructure changes required. Demo notebooks are included for Insurance, HR, Legal, and Support domains.
No. Query content is never used for training. The free developer tier optionally shares anonymous decision metadata (not content) to help improve model accuracy - this is disclosed at signup and can be disabled. Pilot and Enterprise tiers have no telemetry by default.
Join the private beta. Limited spots for developers and teams.