Documentation

A2RAG is a decision layer that sits between your RAG pipeline and your users. It receives a query, the retrieved contexts, and a draft answer, and returns a routing decision: answer, clarify, or abstain.

Quickstart

Get from zero to your first decision in under 5 minutes.

# Install
pip install a2rag

# Run your first decision
from a2rag import A2RAGClient

client = A2RAGClient(api_key="your_key_here")

decision = client.decide(
    query="What is the refund window?",
    contexts=["Refunds within 14 days for unused items."],
    draft_answer="14 days.",
)

print(decision.action)       # "answer"
print(decision.confidence)   # 0.82
print(decision.should_answer) # True

Installation

pip install a2rag

Requires Python 3.8+. The client package has no dependencies beyond the standard library.

Authentication

All API requests require an API key passed as a header:

X-API-Key: a2rag_pilot_...

To get a key, request early access. Developer keys are issued automatically; pilot keys are issued within 24 hours.

POST /decide

The core endpoint. Evaluates a query against retrieved contexts and returns a routing decision.

POST https://api.a2rag.ai/decide
Content-Type: application/json
X-API-Key: your_key

Request body

Field	Type	Required	Description
query	string	required	The user's original question
contexts	string[]	required	Retrieved chunks from your RAG system
draft_answer	string	required	The LLM-generated draft answer
domain	string	optional	insurance · legal · hr · support · medical
tau_evidence	float	optional	Override evidence threshold (0.0–1.0)
tau_completeness	float	optional	Override completeness threshold
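
If you are not using the Python client, the request can be assembled with the standard library. The sketch below builds, but does not send, a POST to the endpoint above. The helper name build_decide_request is hypothetical; the URL, headers, and field names are the ones listed in this section.

```python
import json
import urllib.request

API_URL = "https://api.a2rag.ai/decide"

def build_decide_request(api_key, query, contexts, draft_answer, **optional):
    """Assemble a POST /decide request (hypothetical helper, not part of a2rag)."""
    body = {"query": query, "contexts": list(contexts), "draft_answer": draft_answer}
    # Optional fields (domain, tau_evidence, tau_completeness) are sent only when set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )

req = build_decide_request(
    "your_key",
    query="What is the refund window?",
    contexts=["Refunds within 14 days for unused items."],
    draft_answer="14 days.",
    domain="support",
)
# To send it: urllib.request.urlopen(req), then json.load() the response body.
```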

Response

Field	Type	Description
action	"answer" | "clarify" | "abstain"	The routing decision
confidence	float (0.0–1.0)	How confident A2RAG is in this decision
explanation	string	Human-readable reason for the decision
clarification	string | null	Follow-up question (only when action=clarify)
signals	object	Evidence signals: coverage, retrieval_strength, consistency, confidence
decision_id	UUID	Unique ID; use it when submitting feedback
latency_ms	integer	Processing time in milliseconds

{
  "decision_id": "5bbc7903-2250-4836-b0f5...",
  "action": "answer",
  "confidence": 0.82,
  "explanation": "Answer supported by evidence (91% of claims verified).",
  "clarification": null,
  "signals": {
    "coverage": 0.91,
    "retrieval_strength": 1.0,
    "consistency": 1.0,
    "confidence": 0.82
  },
  "latency_ms": 12
}
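
When calling the endpoint directly, it can help to parse the response into a typed object before routing on it. This is a minimal sketch: the Decision dataclass and parse_decision are hypothetical names that mirror the response fields above, not the client library's own types.

```python
import json
from dataclasses import dataclass, fields
from typing import Dict, Optional

@dataclass
class Decision:
    decision_id: str
    action: str
    confidence: float
    explanation: str
    clarification: Optional[str]
    signals: Dict[str, float]
    latency_ms: int

def parse_decision(raw: str) -> Decision:
    """Parse a /decide response body, ignoring any fields not modeled here."""
    data = json.loads(raw)
    if data.get("action") not in ("answer", "clarify", "abstain"):
        raise ValueError(f"unexpected action: {data.get('action')!r}")
    return Decision(**{f.name: data.get(f.name) for f in fields(Decision)})

raw = """{
  "decision_id": "5bbc7903-2250-4836-b0f5...",
  "action": "answer",
  "confidence": 0.82,
  "explanation": "Answer supported by evidence (91% of claims verified).",
  "clarification": null,
  "signals": {"coverage": 0.91, "retrieval_strength": 1.0,
              "consistency": 1.0, "confidence": 0.82},
  "latency_ms": 12
}"""
decision = parse_decision(raw)
```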

POST /feedback

Submit feedback on a decision. Feedback is used to improve calibration over time.

client.feedback(
    decision_id="5bbc7903-...",
    was_correct=True,   # or False
    comment="Correct — user confirmed refund"
)

GET /health

Returns the service status, the API version, and whether the NLI and embedding models are loaded.

{
  "ok": true,
  "version": "8.0.0",
  "nli_loaded": true,
  "embed_loaded": true
}
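
A readiness probe might combine these flags into a single boolean. The sketch below assumes the service should count as ready only when ok, nli_loaded, and embed_loaded are all true; that policy and the is_ready name are assumptions, not documented behavior.

```python
import json

def is_ready(health_body: str) -> bool:
    """Return True only if the service is up and both models are loaded."""
    health = json.loads(health_body)
    # Assumption: a missing or false flag means "not ready".
    return all(health.get(key) is True for key in ("ok", "nli_loaded", "embed_loaded"))

sample = '{"ok": true, "version": "8.0.0", "nli_loaded": true, "embed_loaded": true}'
print(is_ready(sample))  # True
```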

Actions

Action	Meaning	What to do
answer	Corpus supports the draft. Safe to show.	Display draft_answer to user
clarify	Info exists but query is instance-specific.	Ask the user the question in decision.clarification
abstain	Topic not covered in corpus.	Escalate, show a custom message, or trigger a webhook
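
The table above can be sketched as a small dispatch function. The route function, the fallback message, and the on_abstain hook are illustrative assumptions; substitute whatever escalation path your application uses.

```python
FALLBACK = "I don't have enough information to answer that."

def route(action, draft_answer, clarification=None, on_abstain=None):
    """Map a decision's action to the text shown to the user."""
    if action == "answer":
        return draft_answer
    if action == "clarify" and clarification:
        return clarification
    # "abstain", or "clarify" with no follow-up question: escalate if a hook is set.
    if on_abstain is not None:
        on_abstain()
    return FALLBACK

escalations = []
print(route("answer", "14 days."))
print(route("clarify", "14 days.", clarification="Which order do you mean?"))
print(route("abstain", "14 days.", on_abstain=lambda: escalations.append("ticket")))
```

Treating a clarify decision without a follow-up question as an abstain is a defensive choice made here, not something the API specifies.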

Evidence Signals

Signal	Range	Description
coverage	0.0–1.0	How much of the draft is supported by retrieved contexts
retrieval_strength	0.0–1.0	Average confidence of retrieved chunks
consistency	0.0–1.0	Whether contexts agree with each other (1.0 = consistent)
completeness	0.0–1.0	Whether query can be answered without instance-specific data
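
When a decision is not "answer", logging the lowest-scoring signal is one quick way to see what drove it. The helper and the signal values below are invented for illustration; only the signal names come from the table above.

```python
def weakest_signal(signals):
    """Return the (name, value) pair with the lowest score."""
    return min(signals.items(), key=lambda item: item[1])

# Hypothetical values for a query that needs instance-specific data.
signals = {
    "coverage": 0.9,
    "retrieval_strength": 0.8,
    "consistency": 1.0,
    "completeness": 0.4,
}
name, value = weakest_signal(signals)
print(f"{name}={value}")  # completeness=0.4
```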

Domain Presets

Pass domain to use calibrated thresholds for your industry.

Value	Use case	Behavior
insurance	Coverage, claims, policies	Conservative — high abstain rate
legal	Contracts, compliance, NDA	Very conservative
medical	Clinical, drug, treatment	Very conservative
hr	Leave, benefits, policy	Moderate
support	Product, billing, SLA	Moderate

Languages

A2RAG detects language automatically. No configuration needed.

Language	Model	Accuracy
English	MS-MARCO cross-encoder	94%
Hebrew	multilingual-e5-base	85–100%
Arabic	multilingual-e5-base	~85%
French / Spanish	MS-MARCO	~90%

LangChain Integration

from langchain.chains import RetrievalQA
from a2rag import A2RAGClient

# `llm` and `retriever` are your existing LangChain LLM and retriever objects.
client = A2RAGClient(api_key="your_key")
chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

def safe_answer(query):
    # Generate the draft answer, then fetch the contexts to check it against.
    result = chain.run(query)
    contexts = retriever.get_relevant_documents(query)
    ctx_texts = [doc.page_content for doc in contexts]

    decision = client.decide(
        query=query,
        contexts=ctx_texts,
        draft_answer=result,
    )

    # Route on the decision: show the draft, ask a follow-up, or fall back.
    if decision.should_answer:
        return result
    elif decision.should_clarify:
        return decision.clarification
    else:
        return "I don't have enough information to answer that."

Error Codes

Code	Meaning	Fix
401	Invalid or missing API key	Check X-API-Key header
403	Key suspended or revoked	Contact stav@aibee.co.il
422	Invalid request body	Check required fields: query, contexts, draft_answer
429	Monthly limit reached	Upgrade plan or request extension
500	Internal server error	Retry. If the error persists, contact support
503	Service temporarily unavailable	Check api.a2rag.ai/health and retry
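
The table suggests retrying on 500 and 503 only; the other codes need a fix on the caller's side first. A minimal sketch of that policy with exponential backoff follows. The call_with_retry helper, the attempt count, and the delays are assumptions, and send stands for any function of yours that performs the request and returns a (status_code, body) pair.

```python
import time

RETRYABLE = {500, 503}  # per the table: other codes will not succeed on retry

def call_with_retry(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    Assumes max_attempts >= 1. Returns the last (status_code, body) pair.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, body
        sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, then 1s, then 2s, ...
```

Passing sleep as a parameter keeps the function easy to test without real delays.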