Add optional retrieval confidence metadata to RAG response#11130

Open
Nithin00614 wants to merge 2 commits into deepset-ai:main from Nithin00614:feature/add-retrieval-confidence

Conversation

@Nithin00614

Add Retrieval Confidence Metadata to MultiQueryEmbeddingRetriever

Problem

When debugging poor RAG responses, it's difficult to determine whether the issue is:

  • Low-quality retrievals (bad embeddings/search)
  • Good retrievals but poor LLM synthesis

There's currently no visibility into retrieval confidence scores.

Solution

Add optional, non-breaking metadata to track retrieval quality:

Features:

  • Calculates top score and average score from retrieved documents
  • Configurable confidence threshold (default: 0.7)
  • Classifies confidence as "high" or "low"
  • Tracks scored vs unscored document counts
  • Attached to each document's metadata

Backward Compatibility:

  • Completely non-breaking
  • Only adds metadata if scores exist
  • No changes to existing behavior
  • Documents without scores are unaffected
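The features and compatibility guarantees above can be sketched as a small helper. This is an illustrative sketch, not the PR's actual code: the function name `compute_confidence_meta` is hypothetical, and `SimpleNamespace` stands in for a Haystack `Document` so the example is self-contained.

```python
from types import SimpleNamespace  # stands in for a Haystack Document in this sketch

def compute_confidence_meta(documents, confidence_threshold=0.7):
    """Build a retrieval_confidence dict from retrieved documents.

    Documents whose score is None are counted in the total but excluded
    from the statistics; if nothing is scored, return None so the caller
    attaches no metadata, keeping the change non-breaking.
    """
    scores = [d.score for d in documents if getattr(d, "score", None) is not None]
    if not scores:
        return None
    top_score = max(scores)
    return {
        "top_score": round(top_score, 4),
        "avg_score": round(sum(scores) / len(scores), 4),
        "threshold": confidence_threshold,
        "confidence_level": "high" if top_score >= confidence_threshold else "low",
        "scored_docs_count": len(scores),
        "total_docs_count": len(documents),
    }

docs = [SimpleNamespace(score=0.8524), SimpleNamespace(score=0.5938), SimpleNamespace(score=None)]
meta = compute_confidence_meta(docs)  # confidence_level "high", scored_docs_count 2
```

A key design point is the early `return None`: documents without scores never gain metadata, which is what makes the change backward compatible.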

Usage

Default behavior (no changes needed):

retriever = MultiQueryEmbeddingRetriever(document_store=document_store)
results = retriever.run(queries=["What is Haystack?"])
# Works exactly as before

With confidence threshold customization:

retriever = MultiQueryEmbeddingRetriever(document_store=document_store)
results = retriever.run(
    queries=["What is Haystack?"],
    retriever_kwargs={"confidence_threshold": 0.8}  # Custom threshold
)

# Access confidence metadata
for doc in results["documents"]:
    if "retrieval_confidence" in doc.meta:
        print(f"Confidence: {doc.meta['retrieval_confidence']}")

Example output:

{
  "retrieval_confidence": {
    "top_score": 0.8524,
    "avg_score": 0.7231,
    "threshold": 0.7,
    "confidence_level": "high",
    "scored_docs_count": 5,
    "total_docs_count": 5
  }
}
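Because every field in the metadata is a plain scalar, it serializes cleanly; a quick round-trip sketch using the values from the example output above:

```python
import json

# Values copied from the example output above.
confidence = {
    "top_score": 0.8524,
    "avg_score": 0.7231,
    "threshold": 0.7,
    "confidence_level": "high",
    "scored_docs_count": 5,
    "total_docs_count": 5,
}

# Round-trips through JSON without loss, since every value is a plain scalar.
assert json.loads(json.dumps(confidence)) == confidence
```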

Benefits

  1. Debugging Aid: Quickly identify when retrievals are poor quality
  2. Monitoring: Track retrieval performance in production
  3. Quality Control: Set alerts for low-confidence retrievals
  4. User Feedback: Show confidence indicators in UI
  5. Model Tuning: Identify when to retune embeddings
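As one way the "Quality Control" idea above might look in practice, here is a hedged sketch: the `retrieval_confidence` key matches this PR, but the function name `warn_on_low_confidence` and the logging setup are illustrative, and `SimpleNamespace` stands in for a Haystack `Document`.

```python
import logging
from types import SimpleNamespace  # stands in for a Haystack Document in this sketch

logger = logging.getLogger("rag.retrieval")

def warn_on_low_confidence(documents):
    """Return (and log a warning for) documents flagged as low confidence."""
    low = [
        d for d in documents
        if d.meta.get("retrieval_confidence", {}).get("confidence_level") == "low"
    ]
    if low:
        logger.warning(
            "%d of %d retrieved documents are low-confidence", len(low), len(documents)
        )
    return low

docs = [
    SimpleNamespace(meta={"retrieval_confidence": {"confidence_level": "low"}}),
    SimpleNamespace(meta={"retrieval_confidence": {"confidence_level": "high"}}),
    SimpleNamespace(meta={}),  # unscored document: no metadata, never flagged
]
flagged = warn_on_low_confidence(docs)  # flags only the first document
```

The same predicate could feed a metrics counter or an alerting hook instead of a logger.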

Testing

  • Tested with documents that have scores
  • Tested with documents without scores
  • Tested with mixed (some scored, some unscored)
  • Verified backward compatibility (existing pipelines unchanged)
  • Tested with custom threshold via retriever_kwargs
  • Verified metadata structure and JSON serialization

Changes

File: haystack/components/retrievers/multi_query_embedding_retriever.py

Lines modified: ~10 lines added after line 124

Changes:

  1. Calculate scores from documents with valid score attributes
  2. Compute top score, average score, and confidence classification
  3. Allow threshold configuration via retriever_kwargs
  4. Attach metadata to each document's meta dictionary
  5. Safe handling for documents without scores

Checklist

  • Non-breaking change (existing code works unchanged)
  • Added functionality is opt-in via metadata
  • Code follows Haystack style guidelines
  • Self-documenting code with clear variable names
  • No new dependencies
  • Handles edge cases (no scores, None values)

Related Issues

Closes #XXXX (if applicable)

Future Enhancements

This lays the groundwork for:

  • Confidence-based filtering (future PR)
  • Adaptive retrieval strategies
  • Automatic query refinement on low confidence
  • Retrieval quality metrics dashboard

Type: Enhancement
Impact: Low (non-breaking, metadata only)
Complexity: Low (simple calculation + metadata attachment)

@Nithin00614 Nithin00614 requested a review from a team as a code owner April 17, 2026 08:17
@Nithin00614 Nithin00614 requested review from anakin87 and removed request for a team April 17, 2026 08:17

vercel bot commented Apr 17, 2026

@Nithin00614 is attempting to deploy a commit to the deepset Team on Vercel.

A member of the Team first needs to authorize it.


CLAassistant commented Apr 17, 2026

CLA assistant check
All committers have signed the CLA.
