Understanding Recommendations¶

The AI Imaging Agent uses a sophisticated two-stage pipeline to provide ranked tool recommendations. This guide explains how recommendations are generated and how to interpret them.

How Recommendations Work¶

Two-Stage Pipeline¶

graph TD
    A[User Input: Image + Query] --> B[Stage 1: Retrieval]
    B --> C[Candidate Tools]
    C --> D[Stage 2: Agent Selection]
    D --> E[Ranked Recommendations]

Stage 1: Retrieval (Fast Text Search)¶

The retrieval stage quickly narrows down candidates:

Query Enhancement: Your query is enriched with format tokens

Original: "segment lungs"
Enhanced: "segment lungs format:DICOM format:CT format:3D"

Embedding Search: BGE-M3 model converts query to vector
FAISS Vector Search: Finds semantically similar tools
CrossEncoder Reranking: Re-scores candidates for better relevance
Result: Top-K candidates (default: 8)

No LLM calls - this stage is fast and deterministic.

Stage 2: Agent Selection (VLM-Powered)¶

The agent analyzes candidates with full context:

Vision Analysis (only for VLM): GPT-4o/4o-mini (or your custom model) sees your image preview
Context Integration: Considers query + metadata + candidates
Reasoning: Explains why each tool matches
Scoring: Assigns accuracy scores (0-100%)
Ranking: Orders tools by relevance

Single VLM call - comprehensive analysis with explanations.

Recommendation Format¶

Each recommendation includes several components:

Header Information¶

Rank Number¶

Position in the ranked list (1 = best match).

1️⃣ TotalSegmentator
2️⃣ MedSAM
3️⃣ nnU-Net

Tool Name¶

The software or tool identifier, typically matching: - GitHub repository name - Published tool name - Common community name

Accuracy Score¶

Confidence level from 0-100%:

90-100%: Excellent match, highly confident
70-89%: Good match, suitable for task
50-69%: Moderate match, may need adaptation
Below 50%: Weak match, alternative approach

Score Interpretation

Scores reflect match quality for your specific task and image, not overall tool quality.

Body Content¶

Description¶

Brief explanation of what the tool does:

TotalSegmentator: Automated multi-organ segmentation for CT scans supporting 104 anatomical structures.

Explanation¶

Why this tool matches your request:

Explanation: TotalSegmentator is specifically designed for whole-body CT segmentation including lung structures. It supports DICOM input and provides automated, accurate lung segmentation without manual intervention.

Key points in explanations:

Task Alignment: How well it matches your goal
Format Compatibility: Support for your file format
Relevant Features: Specific capabilities that help
Known Limitations: Caveats or requirements

Demo Link¶

Direct link to a runnable example:

🚀 Demo: https://huggingface.co/spaces/example/totalsegmentator

Types of demos:

HuggingFace Spaces: Interactive Gradio/Streamlit apps
Colab Notebooks: Jupyter notebooks you can run
Web Demos: Hosted web interfaces
Documentation: GitHub README with examples

Technical details about the tool:

Modality Support¶

Medical imaging modalities the tool works with:

Modalities: CT, MRI, X-ray

Common modalities:

CT: Computed Tomography
MRI: Magnetic Resonance Imaging
XR: X-ray radiography
US: Ultrasound
PET: Positron Emission Tomography
OCT: Optical Coherence Tomography
Microscopy: Various microscopy types

Dimension Support¶

Image/volume dimensions supported:

Dimensions: 2D, 3D

2D: Single slice images
3D: Volumetric data
4D: Time-series volumes

Format Support¶

File formats the tool can process:

Formats: DICOM, NIfTI, PNG, JPEG

Format Importance

Tools that support your exact format are prioritized in ranking.

Tags¶

Categorization and keywords:

Tags: segmentation, medical-imaging, deep-learning, pytorch

Used for: - Task categorization - Technology stack - Domain specificity - Feature indicators

Scoring Factors¶

The agent considers multiple factors when scoring:

Primary Factors (High Weight)¶

Task Match: How well the tool's purpose aligns with your request
Format Compatibility: Support for your input format
Image Content: Visual analysis of what's in your image
Dimension Match: 2D tool for 2D images, 3D for volumes

Secondary Factors (Medium Weight)¶

Modality Specificity: Tool designed for your imaging modality
Feature Coverage: Breadth of capabilities
Stated Requirements: Meets any specific requirements you mentioned
Quality Indicators: Stars, citations, community adoption

Tertiary Factors (Low Weight)¶

License: Open-source vs. proprietary
Recency: Recently updated tools
Documentation Quality: Demo availability, examples
Popularity: Community usage and validation

Interpreting Results¶

High-Scoring Recommendations¶

When you see scores above 85%:

✅ Strong match - Tool is designed for this task ✅ Format compatible - Handles your file type ✅ Proven capability - Demonstrated results in this domain

Action: These are your best options. Try the top recommendation first.

Medium-Scoring Recommendations¶

Scores 60-85%:

⚠️ Good match - Suitable but may need adaptation ⚠️ Possible format conversion - Might require preprocessing ⚠️ Partial capability - Covers some but not all requirements

Action: Worth trying, especially if top choices don't work. Read explanations carefully.

Low-Scoring Recommendations¶

Scores below 60%:

❌ Weak match - Limited alignment with task ❌ Format issues - May not support your format ❌ Alternative approach - Different methodology

Action: Consider as fallback or for exploring alternative approaches.

Why Rankings Change¶

Rankings depend on your specific context:

Same Tool, Different Queries¶

"Segment lungs" vs "Detect tumors": - Different tools excel at each task - Rankings change based on task specificity

Same Task, Different Formats¶

DICOM input vs PNG input: - DICOM-compatible tools rank higher for DICOM - General tools rank higher for standard images

Same Task, Different Images¶

CT scan vs X-ray: - Modality-specific tools get boosted - Visual content influences selection

Common Patterns¶

All High Scores¶

Most recommendations >80%:

Good news! Multiple excellent options
Strategy: Try top recommendation, then compare

Mixed Scores¶

Wide range (e.g., 90%, 65%, 45%):

Top choice clear - Focus on highest scorer
Strategy: Try #1, fall back to #2 if needed

All Low Scores¶

All recommendations <60%:

Limited options - Task may be specialized
Strategy: Try anyway, or rephrase query
Alternative: Ask for suggestions

Acting on Recommendations¶

First Time with a Tool¶

Read the explanation - Understand why it was recommended
Check format compatibility - Verify it supports your format
Click demo link - See it in action
Try on your data - Run if agent offers

Comparing Tools¶

When choosing between similar scores:

Check licenses if redistribution matters
Compare formats - prefer exact format match
Review tags - match technology preferences
Demo availability - easier to try

When Results Don't Match¶

If recommendations seem wrong:

Provide more context: "I need 3D volume support"
Mention specific requirements: "Must work with DICOM"
Exclude irrelevant tools: [EXCLUDE:toolname]
Request alternatives: "Can you search differently?"

Explanation Analysis¶

Read explanations to understand:

Positive Indicators¶

Look for phrases like: - "Specifically designed for..." - "Supports your exact format..." - "Demonstrated accuracy on..." - "Active development and maintained"

Caveats¶

Watch for: - "May require preprocessing..." - "Limited to 2D images..." - "Experimental feature..." - "Requires specific environment..."

Requirements¶

Note when explanations mention: - "Needs GPU for inference" - "Requires Python 3.8+" - "DICOM headers must include..." - "Minimum image resolution..."

Next Steps¶

Learn about Running Demos
Explore Advanced Features
Understand the Architecture Overview