Documentation
A2RAG is a decision layer that sits between your RAG retrieval and your users. It receives a query, the retrieved contexts, and a draft answer, and returns a routing decision: answer, clarify, or abstain.
Quickstart
Get from zero to your first decision in under 5 minutes.
# Install
pip install a2rag
# Run your first decision
from a2rag import A2RAGClient
client = A2RAGClient(api_key="your_key_here")
decision = client.decide(
    query="What is the refund window?",
    contexts=["Refunds within 14 days for unused items."],
    draft_answer="14 days.",
)
print(decision.action) # "answer"
print(decision.confidence) # 0.82
print(decision.should_answer) # True
Installation
pip install a2rag
Requires Python 3.8+. The client package has no required dependencies beyond the standard library.
Authentication
All API requests require an API key passed as a header:
X-API-Key: a2rag_pilot_...
Get your key: request early access. Developer keys are issued automatically; pilot keys are issued within 24 hours.
POST /decide
The core endpoint. Evaluates a query against retrieved contexts and returns a routing decision.
POST https://api.a2rag.ai/decide
Content-Type: application/json
X-API-Key: your_key
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| query | string | required | The user's original question |
| contexts | string[] | required | Retrieved chunks from your RAG system |
| draft_answer | string | required | The LLM-generated draft answer |
| domain | string | optional | insurance · legal · hr · support · medical |
| tau_evidence | float | optional | Override evidence threshold (0.0–1.0) |
| tau_completeness | float | optional | Override completeness threshold |
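For callers who are not using the Python client, the request described above can be assembled with the standard library alone. The following is a minimal sketch that uses only the endpoint, headers, and field names documented here; `build_decide_request` is a hypothetical helper name, not part of the a2rag package, and the request is built but not sent.

```python
import json
import urllib.request

DECIDE_URL = "https://api.a2rag.ai/decide"  # endpoint as documented above


def build_decide_request(api_key, query, contexts, draft_answer, **optional):
    """Build (but do not send) a POST /decide request.

    `optional` may carry the documented optional fields:
    domain, tau_evidence, tau_completeness.
    """
    body = {
        "query": query,
        "contexts": list(contexts),
        "draft_answer": draft_answer,
    }
    # Include only the optional fields the caller actually set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        DECIDE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


req = build_decide_request(
    "your_key",
    query="What is the refund window?",
    contexts=["Refunds within 14 days for unused items."],
    draft_answer="14 days.",
    domain="support",
)
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Leaving unset optional fields out of the body, rather than sending null, lets the server fall back to its own defaults.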
Response
action is one of "answer", "clarify", or "abstain".
{
  "decision_id": "5bbc7903-2250-4836-b0f5...",
  "action": "answer",
  "confidence": 0.82,
  "explanation": "Answer supported by evidence (91% of claims verified).",
  "clarification": null,
  "signals": {
    "coverage": 0.91,
    "retrieval_strength": 1.0,
    "consistency": 1.0,
    "confidence": 0.82
  },
  "latency_ms": 12
}
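When calling the endpoint directly, it can help to parse the response into a typed object before acting on it. The sketch below is one possible shape, assuming the response fields shown above; the `Decision` dataclass and `parse_decision` are illustrative names rather than the client library's own types, and the `decision_id` in the sample is a placeholder.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional

VALID_ACTIONS = ("answer", "clarify", "abstain")


@dataclass
class Decision:
    decision_id: str
    action: str
    confidence: float
    explanation: str
    clarification: Optional[str]
    signals: Dict[str, float]
    latency_ms: int


def parse_decision(raw: str) -> Decision:
    """Parse a /decide response body and reject unknown actions."""
    data = json.loads(raw)
    if data["action"] not in VALID_ACTIONS:
        raise ValueError(f"unexpected action: {data['action']!r}")
    # Read only the documented fields so extra keys do not break parsing.
    return Decision(
        decision_id=data["decision_id"],
        action=data["action"],
        confidence=data["confidence"],
        explanation=data["explanation"],
        clarification=data.get("clarification"),
        signals=dict(data.get("signals", {})),
        latency_ms=data["latency_ms"],
    )


sample = """
{
  "decision_id": "example-id",
  "action": "answer",
  "confidence": 0.82,
  "explanation": "Answer supported by evidence (91% of claims verified).",
  "clarification": null,
  "signals": {"coverage": 0.91, "retrieval_strength": 1.0,
              "consistency": 1.0, "confidence": 0.82},
  "latency_ms": 12
}
"""
decision = parse_decision(sample)
print(decision.action, decision.signals["coverage"])
```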
POST /feedback
Submit feedback on a decision. Used to improve calibration over time.
client.feedback(
    decision_id="5bbc7903-...",
    was_correct=True,  # or False
    comment="Correct, user confirmed refund",
)
GET /health
{
  "ok": true,
  "version": "8.0.0",
  "nli_loaded": true,
  "embed_loaded": true
}
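A deployment script or startup probe might treat the service as ready only when every flag in the health payload is true. This is a small sketch under that assumption; `is_ready` is a hypothetical helper, and whether the service can answer correctly with a model unloaded is not stated in this documentation.

```python
import json


def is_ready(health: dict) -> bool:
    """True only if the service reports ok and both models are loaded."""
    return all(health.get(key) is True for key in ("ok", "nli_loaded", "embed_loaded"))


# Payload copied from the example response above.
health = json.loads(
    '{"ok": true, "version": "8.0.0", "nli_loaded": true, "embed_loaded": true}'
)
print(is_ready(health))
```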
Actions
| Action | Meaning | What to do |
|---|---|---|
| answer | Corpus supports the draft. Safe to show. | Display draft_answer to user |
| clarify | Info exists but query is instance-specific. | Ask the user the question in decision.clarification |
| abstain | Topic not covered in corpus. | Escalate, custom message, or webhook |
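The table above maps directly onto a small dispatch function. The sketch below is one way to write it; `route` is a hypothetical name, and the fallback message is the one used in the LangChain example in this documentation. In production the abstain branch would more likely escalate or fire a webhook, as the table suggests.

```python
FALLBACK = "I don't have enough information to answer that."


def route(action, draft_answer, clarification=None, fallback=FALLBACK):
    """Return the text to show the user for a given routing action."""
    if action == "answer":
        return draft_answer
    if action == "clarify":
        # Fall back if the service returned no clarifying question.
        return clarification or fallback
    if action == "abstain":
        return fallback
    raise ValueError(f"unknown action: {action!r}")


print(route("clarify", "14 days.", clarification="Which order are you asking about?"))
```

Raising on an unknown action, instead of defaulting to the draft, keeps an unexpected response from silently reaching the user.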
Evidence Signals
| Signal | Range | Description |
|---|---|---|
| coverage | 0.0–1.0 | How much of the draft is supported by retrieved contexts |
| retrieval_strength | 0.0–1.0 | Average confidence of retrieved chunks |
| consistency | 0.0–1.0 | Whether contexts agree with each other (1.0 = consistent) |
| completeness | 0.0–1.0 | Whether query can be answered without instance-specific data |
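When a decision comes back as clarify or abstain, logging the lowest-scoring signal can hint at why. The sketch below simply picks the minimum; `weakest_signal` is a hypothetical helper, and treating the lowest value as the most likely cause is an assumption, since the way A2RAG combines signals is not described here.

```python
def weakest_signal(signals: dict):
    """Return (name, value) for the lowest-scoring signal."""
    name = min(signals, key=signals.get)
    return name, signals[name]


# Values taken from the example response in this documentation.
signals = {"coverage": 0.91, "retrieval_strength": 1.0, "consistency": 1.0}
print(weakest_signal(signals))
```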
Domain Presets
Pass domain to use calibrated thresholds for your industry.
| Value | Use case | Behavior |
|---|---|---|
| insurance | Coverage, claims, policies | Conservative — high abstain rate |
| legal | Contracts, compliance, NDA | Very conservative |
| medical | Clinical, drug, treatment | Very conservative |
| hr | Leave, benefits, policy | Moderate |
| support | Product, billing, SLA | Moderate |
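A typo in the domain value or an out-of-range threshold is cheaper to catch before the request is sent. The following sketch validates the optional fields locally; `decide_options` is a hypothetical helper, the domain list is copied from the table above, and only tau_evidence is range-checked because this documentation gives no range for tau_completeness.

```python
KNOWN_DOMAINS = {"insurance", "legal", "hr", "support", "medical"}


def decide_options(domain=None, tau_evidence=None, tau_completeness=None):
    """Validate and collect the optional /decide fields into a dict."""
    options = {}
    if domain is not None:
        if domain not in KNOWN_DOMAINS:
            raise ValueError(f"unknown domain: {domain!r}")
        options["domain"] = domain
    if tau_evidence is not None:
        # Documented range for the evidence threshold is 0.0 to 1.0.
        if not 0.0 <= tau_evidence <= 1.0:
            raise ValueError("tau_evidence must be between 0.0 and 1.0")
        options["tau_evidence"] = tau_evidence
    if tau_completeness is not None:
        # No range is documented, so the value is passed through unchecked.
        options["tau_completeness"] = tau_completeness
    return options


print(decide_options(domain="legal", tau_evidence=0.9))
```

The resulting dict can be merged into the JSON body of a POST /decide request.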
Languages
A2RAG detects language automatically. No configuration needed.
| Language | Model | Accuracy |
|---|---|---|
| English | MS-MARCO cross-encoder | 94% |
| Hebrew | multilingual-e5-base | 85–100% |
| Arabic | multilingual-e5-base | ~85% |
| French / Spanish | MS-MARCO | ~90% |
LangChain Integration
from langchain.chains import RetrievalQA
from a2rag import A2RAGClient

client = A2RAGClient(api_key="your_key")
# `llm` and `retriever` are your existing LangChain LLM and retriever objects.
chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

def safe_answer(query):
    # Generate the draft answer, then fetch the contexts for the same query.
    result = chain.run(query)
    contexts = retriever.get_relevant_documents(query)
    ctx_texts = [doc.page_content for doc in contexts]
    # Ask A2RAG whether the draft is safe to show.
    decision = client.decide(
        query=query,
        contexts=ctx_texts,
        draft_answer=result,
    )
    if decision.should_answer:
        return result
    elif decision.should_clarify:
        return decision.clarification
    else:
        return "I don't have enough information to answer that."
Error Codes
| Code | Meaning | Fix |
|---|---|---|
| 401 | Invalid or missing API key | Check X-API-Key header |
| 403 | Key suspended or revoked | Contact stav@aibee.co.il |
| 422 | Invalid request body | Check required fields: query, contexts, draft_answer |
| 429 | Monthly limit reached | Upgrade plan or request extension |
| 500 | Internal server error | Retry. If it persists, contact support |
| 503 | Service temporarily unavailable | Check api.a2rag.ai/health and retry |
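The table suggests retrying only on 500 and 503; the other codes need a fix on the caller's side. A minimal sketch of that policy follows, assuming exponential backoff with a cap. The helper names, delays, and attempt count are illustrative choices, not values recommended by A2RAG, and `send` stands in for whatever function performs the HTTP call and returns `(status, body)`.

```python
import time

RETRYABLE_STATUS = {500, 503}  # per the table: the only codes worth retrying


def backoff_delays(retries, base=0.5, cap=8.0):
    """Exponential delays in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(retries)]


def call_with_retry(send, attempts=3, sleep=time.sleep):
    """Call send() up to `attempts` times while it returns a retryable status."""
    status, body = send()
    for delay in backoff_delays(attempts - 1):
        if status not in RETRYABLE_STATUS:
            break
        sleep(delay)
        status, body = send()
    return status, body


# Demonstration with a fake transport that fails twice, then succeeds.
responses = iter([(503, None), (500, None), (200, "ok")])
slept = []
result = call_with_retry(lambda: next(responses), attempts=3, sleep=slept.append)
print(result, slept)
```

A 429 is deliberately not retried, since the table describes it as a monthly limit rather than a transient rate limit.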