Private Beta · Limited spots

Your RAG answers when it shouldn't.

A2RAG is a decision layer between your retrieval pipeline and your users - intercepting unsafe answers, triggering clarifications, and routing unanswerable queries before they cause damage.

0%
Unsafe Answer Rate
91%
Decision accuracy
<2ms
Added latency
5+
Languages
The problem

Confident answers from incomplete knowledge.

Standard RAG systems are built to answer. They don't know when to stop. When your knowledge base doesn't cover the full picture, they fill in the gaps - confidently, incorrectly.

In regulated industries - insurance, legal, healthcare, HR - a wrong answer isn't just unhelpful. It's a liability.

A2RAG gives your pipeline three options instead of one: answer safely, ask for clarification, or refuse to guess.

live example
// Without A2RAG
user › Can I return this item?
rag › Yes, returns accepted within 14 days. WRONG
User has a digital product - non-refundable.
// With A2RAG
user › Can I return this item?
a2rag › CLARIFY SAFE
"Was this a physical product or a digital download?"
User confirms → correct policy applied ✓
How it works

Three outcomes instead of one.

A2RAG evaluates every query against two independent scores and routes accordingly - in under 2ms, with no changes to your existing pipeline.

STEP 01

Your pipeline runs unchanged

Pass the user query, retrieved documents, and draft answer to A2RAG. Your retrieval and LLM stay exactly as-is.

STEP 02

Two scores computed

Evidence score - does the corpus support this answer?

Completeness score - is there enough context for a specific answer?

STEP 03

Routing decision returned

One of three actions returned instantly. You define what happens next - show it, ask a follow-up, or escalate to a human.
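
A minimal sketch of how two scores could map to three actions. The thresholds, the `route` function, and the order of the checks are illustrative assumptions, not A2RAG's actual decision logic.

```python
from enum import Enum


class Action(Enum):
    ANSWER = "answer"
    CLARIFY = "clarify"
    ABSTAIN = "abstain"


# Hypothetical thresholds, chosen only for illustration.
EVIDENCE_MIN = 0.5
COMPLETENESS_MIN = 0.7


def route(evidence: float, completeness: float) -> Action:
    """Map two scores in [0, 1] to one of three actions."""
    if evidence < EVIDENCE_MIN:
        # The corpus does not support the draft: refuse to guess.
        return Action.ABSTAIN
    if completeness < COMPLETENESS_MIN:
        # The topic is covered, but the query lacks context.
        return Action.CLARIFY
    return Action.ANSWER


print(route(0.91, 0.94))  # Action.ANSWER
print(route(0.91, 0.30))  # Action.CLARIFY
print(route(0.00, 0.90))  # Action.ABSTAIN
```

Checking evidence first means a query with no corpus support is never turned into a follow-up question that the knowledge base could not answer anyway.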

Conversation examples
User
How fast will support respond to my ticket?
Instance-specific - depends on plan
A2RAG
→ CLARIFY (conf: 0.85)
"What is your current plan? (Free / Starter / Growth)"
User
Growth plan
A2RAG
→ ANSWER (conf: 0.97)
Growth plan: 4-hour response SLA for priority tickets.
❓ Clarify first → context provided → ✓ Safe answer
User
What is the refund window for physical items?
Generic policy - directly in knowledge base
A2RAG
→ ANSWER (conf: 0.94, evidence: 91%)
Corpus fully supports draft. Safe to present to user.
✓ Direct answer · No friction added
User
What is your chargeback review timeline?
Topic not covered in knowledge base
RAG
Draft: "30-45 business days." (hallucinated)
A2RAG
→ ABSTAIN (evidence: 0%)
Topic not in corpus. Configured action fires → escalate / message / webhook.
🚫 Abstain · Hallucination blocked before user sees it
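
The clarify-then-answer flow in the first conversation can be sketched as a small loop. This is an assumption-laden illustration: `Decision` only mirrors the attribute names used in integration.py, while `resolve`, `fake_decide`, and the `max_rounds` cap are hypothetical helpers, not part of the SDK.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    # Field names mirror the attributes used in integration.py.
    should_answer: bool = False
    should_clarify: bool = False
    should_abstain: bool = False
    clarification: Optional[str] = None


def resolve(
    query: str,
    decide: Callable[[str], Decision],
    ask_user: Callable[[str], str],
    max_rounds: int = 2,
) -> str:
    """Return 'answer' or 'abstain' after at most max_rounds follow-ups."""
    for round_index in range(max_rounds + 1):
        decision = decide(query)
        if decision.should_answer:
            return "answer"
        can_ask = round_index < max_rounds
        if decision.should_clarify and decision.clarification and can_ask:
            reply = ask_user(decision.clarification)
            # Fold the user's reply into the query and decide again.
            query = f"{query} ({reply})"
            continue
        return "abstain"
    return "abstain"


# Stand-in decision function for the support-SLA example above.
def fake_decide(query: str) -> Decision:
    if "Growth" in query:
        return Decision(should_answer=True)
    return Decision(should_clarify=True, clarification="What is your current plan?")


outcome = resolve(
    "How fast will support respond to my ticket?",
    decide=fake_decide,
    ask_user=lambda question: "Growth plan",
)
print(outcome)  # answer
```

Capping the number of follow-ups keeps a vague conversation from looping forever; once the cap is reached, the sketch falls back to abstaining.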
Use cases

Built for high-stakes industries.

A2RAG is particularly valuable where wrong answers carry real consequences. Pre-tuned domain profiles are available for each industry below.

🛡️
Insurance
Prevent coverage misstatements and liability from hallucinated policy answers.
  • Coverage eligibility questions
  • Claim filing timelines
  • Policy exclusion queries
  • Deductible calculations
⚖️
Legal & Compliance
Stop jurisdiction-specific speculation and unsupported legal interpretations.
  • Contract term questions
  • NDA enforceability
  • IP ownership queries
  • Regulatory compliance
🏥
Medical & Clinical
Intercept unsupported clinical information before it reaches patients or staff.
  • Drug interaction queries
  • Treatment eligibility
  • Dosage questions
  • Clinical protocol lookups
👥
HR & Employee Support
Avoid wrong policy answers that create entitlement disputes or legal exposure.
  • Leave entitlement queries
  • Parental leave policy
  • Remote work requests
  • Benefits & expense policy
🎧
Customer Support
Route plan-specific questions correctly. Stop bots from making commitments they can't keep.
  • Plan-specific feature queries
  • SLA & support questions
  • Billing & cancellation
  • Integration availability
🏦
Financial Services
Prevent speculative financial guidance and product misrepresentation.
  • Product eligibility
  • Fee & rate questions
  • Regulatory disclosures
  • Transaction dispute handling
Integration

3 lines. Any pipeline.

Works with LangChain, LlamaIndex, OpenAI, Anthropic, or any custom RAG setup. No infrastructure changes. Your retrieval and LLM stay exactly as-is.

pip install a2rag
One package. No required infrastructure changes. Works alongside your existing setup.
🎛
Configurable actions
Define what happens on abstain - escalate, custom message, webhook, or silent fallback.
📊
Local analytics dashboard
client.dashboard() opens a private browser dashboard. Your data stays on your machine.
🌍
Multilingual
Automatic language detection. English, Hebrew, Arabic, French, Spanish - no configuration needed.
integration.py
from a2rag import A2RAGClient

client = A2RAGClient(api_key="your_key")

# Your existing pipeline - unchanged
contexts     = rag.retrieve(user_query)
draft_answer = llm.generate(user_query, contexts)

# Add A2RAG - 3 lines
decision = client.decide(
    query=user_query,
    contexts=contexts,
    draft_answer=draft_answer,
    domain="insurance",  # optional preset
)
if decision.should_answer:
    show_to_user(draft_answer)

elif decision.should_clarify:
    # Generated follow-up question
    ask_user(decision.clarification)

elif decision.should_abstain:
    # Topic not in corpus
    escalate_to_human()
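
The configurable abstain actions (escalate, custom message, webhook, silent fallback) could be wired up on your side with a dispatch table. A rough sketch follows; every handler name and message here is hypothetical, and the webhook is stubbed so that nothing touches the network.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[str], Optional[str]]


# Hypothetical handlers: each returns the text to show the user, or None.
def escalate(query: str) -> Optional[str]:
    return "Connecting you with a human agent."


def custom_message(query: str) -> Optional[str]:
    return "I don't have verified information on that."


def webhook(query: str) -> Optional[str]:
    # A real handler would notify an internal system here.
    return None


def silent(query: str) -> Optional[str]:
    return None


ABSTAIN_HANDLERS: Dict[str, Handler] = {
    "escalate": escalate,
    "message": custom_message,
    "webhook": webhook,
    "silent": silent,
}


def on_abstain(query: str, action: str = "escalate") -> Optional[str]:
    """Run the configured abstain action; unknown names fall back to escalate."""
    handler = ABSTAIN_HANDLERS.get(action, escalate)
    return handler(query)


print(on_abstain("What is your chargeback review timeline?", "message"))
```

Falling back to escalation for an unrecognized action name is a conservative default: a misconfigured value still ends with a human in the loop instead of a dropped query.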
Works with your existing stack
LangChain
LlamaIndex
OpenAI
Anthropic
Mistral
Cohere
Pinecone
Weaviate
ChromaDB
Any custom RAG
Early results

Tested across domains and languages.

Controlled evaluation across 200+ decision points, including edge cases, partial queries, multilingual inputs, and contradictory corpora.

0%
Unsafe Answer Rate
In testing, no confident answers were given when the corpus could not support them.
91%
Decision accuracy
Correct answer / clarify / abstain decisions across all test scenarios.
100%
Abstain precision
Every abstention was correct. Zero false refusals on answerable queries.
<2ms
Added latency
Decision overhead is negligible relative to RAG retrieval and LLM inference.
⚠ A2RAG is a statistical system. Results reflect controlled test sets and may vary by domain, corpus quality, and query type. See Terms of Service for full disclaimer.
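
To test on your own corpus, the three headline metrics can be computed from labeled decisions. The definitions below are one plausible reading of the metric names, not A2RAG's published methodology, and the four sample pairs are toy data rather than real results.

```python
from typing import List, Tuple

# Each pair is (expected_action, predicted_action).
Pair = Tuple[str, str]


def decision_accuracy(pairs: List[Pair]) -> float:
    """Share of decisions where the predicted action matches the label."""
    return sum(expected == predicted for expected, predicted in pairs) / len(pairs)


def unsafe_answer_rate(pairs: List[Pair]) -> float:
    """Share of all decisions that answered when the label said not to."""
    unsafe = sum(
        predicted == "answer" and expected != "answer"
        for expected, predicted in pairs
    )
    return unsafe / len(pairs)


def abstain_precision(pairs: List[Pair]) -> float:
    """Share of predicted abstentions that were labeled abstain."""
    labels = [expected for expected, predicted in pairs if predicted == "abstain"]
    if not labels:
        return float("nan")  # undefined when nothing was abstained
    return sum(label == "abstain" for label in labels) / len(labels)


# Toy labeled set for illustration only.
pairs = [
    ("answer", "answer"),
    ("clarify", "clarify"),
    ("abstain", "abstain"),
    ("answer", "clarify"),  # over-cautious, but not unsafe
]

print(decision_accuracy(pairs))   # 0.75
print(unsafe_answer_rate(pairs))  # 0.0
print(abstain_precision(pairs))   # 1.0
```

The toy set shows why the metrics are reported separately: an over-cautious clarification lowers accuracy without raising the unsafe answer rate.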
Security & privacy

Your data never leaves your pipeline.

A2RAG is architecturally designed so user query content never reaches our servers. We return a routing decision - that's all.

🔒
Query content never stored
Queries, documents, and answers are processed in-memory and immediately discarded. We cannot reconstruct any conversation.
📦
Local deployment available
Docker image available for teams that need private infrastructure. Enterprise tier includes full on-premise deployment support.
📊
Optional anonymized analytics
Free tier sends anonymous decision metadata (action, confidence, latency - no content) to improve accuracy. Opt-out available.
🛡️
EU infrastructure
Hosted on AWS EU (Frankfurt). GDPR compliant. Israeli Privacy Protection Law (Amendment 13) compliant.
🔑
API key security
Keys are one-way hashed (SHA-256). Plaintext keys are never stored. Rate limiting and automatic expiry available.
🚫
No training on your data
Customer data is never used for model training by default. Anonymized aggregate patterns are used only with explicit opt-in.
🔐
Minimal retention architecture.

We collect only what's necessary to run the service. IP addresses are deleted after 30 days. Anonymous decision metadata is retained for 12 months, then aggregated. You can request deletion at any time.

No query content stored · No training on your data · IP deleted after 30 days · Local dashboard available · Deletion on request · GDPR compliant
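
The one-way API key hashing mentioned in this section follows a common pattern that can be sketched with Python's standard library. This is a generic illustration of SHA-256 key storage, not A2RAG's server code; the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_api_key() -> str:
    """Generate a random, URL-safe API key."""
    return secrets.token_urlsafe(32)


def hash_key(api_key: str) -> str:
    """Return the SHA-256 hex digest; only this value is persisted."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare digests in constant time."""
    return hmac.compare_digest(hash_key(presented), stored_hash)


key = new_api_key()
stored = hash_key(key)  # the plaintext key is never written to storage

print(verify_key(key, stored))      # True
print(verify_key("wrong", stored))  # False
```

Because the key is long and random, a plain SHA-256 digest is a reasonable choice here; slow password-hashing functions matter mainly for low-entropy, human-chosen secrets.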
Early access

Join the private beta.

We're working with a small group of developers and teams to validate A2RAG on real production corpora. No pricing yet - this is about learning together.

For developers
Developer Access
Builders · indie AI · experimentation
Get started immediately. No review required.
  • 500 decisions / month
  • Sandbox API access
  • Python SDK + demo notebooks
  • Anonymous usage analytics
  • Community feedback program
Free during early access · Terms of Use apply
Request Access →
For enterprise
Enterprise
Insurance · Legal · Healthcare · Enterprise AI
Private infrastructure, custom integrations, full on-premise deployment.
  • Docker / local deployment
  • Private infrastructure
  • Custom integrations
  • Security-focused architecture
  • Custom decision volumes
Always starts with a conversation
Contact Us →
FAQ

Common questions.

What counts as a decision?
Each call to client.decide() is one decision. A2RAG evaluates the query, retrieved contexts, and draft answer - and returns one of three actions: answer, clarify, or abstain. One API call = one decision, regardless of outcome.

Do you store my queries or documents?
No. Query content, documents, and answers are processed in-memory and immediately discarded. We never store, log, or train on the content passing through the API. We only retain anonymous decision metadata (action, confidence, latency) to improve accuracy - and only with your consent on the free tier.

Does A2RAG replace my LLM or RAG pipeline?
No. A2RAG sits between your RAG retrieval and your users. Your LLM still generates the draft answer. A2RAG decides whether that answer is safe to show, needs clarification, or should be withheld. Your existing LLM and RAG setup stay completely unchanged.

What happens when A2RAG abstains?
You define what happens. Options include: route to a human agent, return a custom predefined message, fire a webhook to your internal systems, or handle it silently in your application. Every behavior is fully configurable per customer and per domain.

Can A2RAG make mistakes?
Yes - A2RAG is a statistical system and decisions may be incorrect. It may answer when it should abstain (false positive) or abstain when an answer was available (false negative). Performance varies by domain, language, and corpus quality. We recommend testing on your specific corpus before production deployment. A2RAG reduces risk - it does not eliminate it.

Which languages are supported?
A2RAG includes automatic language detection and works across English, Hebrew, Arabic, French, Spanish, and other languages. Performance is best on English corpora. Hebrew and Arabic operate in a specialized heuristic mode that handles right-to-left text and morphology.

Can I deploy A2RAG on my own infrastructure?
Yes. Docker deployment is available for teams that need private infrastructure. Enterprise tier includes full on-premise support with no data leaving your environment. Contact us to discuss your requirements.

How long does integration take?
Most developers are up and running in under an hour. Install with pip install a2rag, pass your existing RAG output to client.decide(), and handle the three possible outcomes. No infrastructure changes required. Demo notebooks are included for Insurance, HR, Legal, and Support domains.

Is my data used for training?
No. Query content is never used for training. The free developer tier optionally shares anonymous decision metadata (not content) to help improve model accuracy - this is disclosed at signup and can be disabled. Pilot and Enterprise tiers have no telemetry by default.

Get started

Stop your RAG from guessing.

Join the private beta. Limited spots for developers and teams.