An interactive playground for Retrieval-Augmented Generation (RAG). Experiment with private data, compare models, and visualize retrieval metrics in real time.
Get up and running with your own documents in minutes.
Navigate to the /laboratory page and sign in if required. Scientia offers three distinct modes depending on your research needs.
| Mode | Best For | Output |
|---|---|---|
| Simple | General Q&A, Summarization | Single answer + Citations |
| A/B Compare | Model evaluation, Prompt testing | Two side-by-side answers |
| Graph | Multi-hop reasoning, Complex relationships | Graph-traversed answer |
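For a concrete feel of how the modes differ, here is a minimal sketch of the kind of query each mode implies. The field names (`mode`, `question`, `models`) are illustrative assumptions, not Scientia's actual API.

```python
# Hypothetical query payloads; field names are illustrative assumptions,
# not Scientia's documented API.
simple_query = {
    "mode": "simple",                # single answer + citations
    "question": "What is the warranty period?",
}

ab_compare_query = {
    "mode": "ab_compare",            # two models answer side by side
    "question": "Summarize the contract obligations.",
    "models": ["model-a", "model-b"],
}

graph_query = {
    "mode": "graph",                 # multi-hop, graph-traversed answer
    "question": "Which suppliers are linked to the delayed shipments?",
}
```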
Max 20 files total.
Fine-tune your retrieval pipeline using the controls panel.
Top-k passages: Controls how many document chunks are retrieved for the LLM. Increase it for complex questions; decrease it for speed.
Temperature: Controls randomness. 0.0 is deterministic and factual; 1.0 is creative. Recommended: 0.1 for RAG.
Reranker: Choose Cross-encoder for slower but more accurate re-ordering of the retrieved chunks, or LLM to let the model itself pick the best chunks.
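If you are curious what the Cross-encoder option does under the hood, the snippet below sketches the general technique: each (query, chunk) pair is scored jointly by a cross-encoder model and the chunks are re-ordered by score. It uses the open-source sentence-transformers library as a stand-in; the model name and wiring are assumptions, not Scientia's implementation.

```python
# Generic cross-encoder reranking sketch (not Scientia's exact implementation).
from sentence_transformers import CrossEncoder

def rerank(query: str, chunks: list[str], top_k: int = 5) -> list[str]:
    """Score every (query, chunk) pair jointly, then keep the best top_k."""
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # example model
    scores = model.predict([(query, chunk) for chunk in chunks])
    ranked = sorted(zip(chunks, scores), key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

# Jointly encoding the query with each chunk is what makes this slower but
# more accurate than comparing independently computed embeddings.
```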
Verification: Decide how the answer is verified. Options are Skip verification, RAG-V cross-check (the default), and Fact-check LLM.
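Taken together, the controls read like a single query configuration. The sketch below models them as a hypothetical settings object; the parameter names mirror the panel's controls but are assumptions about a programmatic equivalent, not a documented API.

```python
# Hypothetical settings object mirroring the controls panel; names are assumptions.
from dataclasses import dataclass

@dataclass
class QuerySettings:
    top_k: int = 5              # how many chunks to retrieve for the LLM
    temperature: float = 0.1    # 0.0 deterministic, 1.0 creative; 0.1 suits RAG
    reranker: str = "cross_encoder"   # "cross_encoder" or "llm"
    verification: str = "rag_v"       # "skip", "rag_v" (default), or "fact_check_llm"

# Example: a deterministic configuration for a complex multi-part question.
settings = QuerySettings(top_k=10, temperature=0.0)
print(settings)
```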
When a query runs successfully, an Answer panel appears:
A Sources area lists the retrieved document chunks used to generate the answer, enabling users to review evidence.
The system automatically assesses multiple quality dimensions and displays scores on a 0–10 scale:
| Metric | What it measures |
|---|---|
| Answer Relevance | How well the answer addresses the question. A higher score means the answer stays on topic. |
| Faithfulness | Whether the answer's statements are backed by the retrieved sources. A perfect 10 indicates no hallucinations. |
| Context Precision | The proportion of the retrieved context that is actually used in the answer. Lower scores imply more irrelevant context. |
| Context Recall | How much of the relevant information from the sources has been used. |
| Completeness | Whether the answer covers all important aspects of the question. |
| Conciseness | Measures brevity; high scores indicate the answer isn't overly verbose. |
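How these scores are produced is not documented here, but a common approach is LLM-as-judge: prompting a model to grade one dimension at a time on the 0–10 scale. The sketch below illustrates that pattern for Faithfulness; the `judge_llm` callable is a placeholder for any chat-completion client, not part of Scientia.

```python
# Illustrative LLM-as-judge scoring for Faithfulness; `judge_llm` is a placeholder,
# not a Scientia API.
from typing import Callable

FAITHFULNESS_PROMPT = """\
Rate from 0 to 10 how well every claim in the ANSWER is supported by the CONTEXT.
10 = fully supported (no hallucinations), 0 = unsupported. Reply with the number only.

CONTEXT:
{context}

ANSWER:
{answer}
"""

def score_faithfulness(answer: str, context: str,
                       judge_llm: Callable[[str], str]) -> float:
    prompt = FAITHFULNESS_PROMPT.format(context=context, answer=answer)
    reply = judge_llm(prompt)
    try:
        return max(0.0, min(10.0, float(reply.strip())))  # clamp to the 0-10 scale
    except ValueError:
        return 0.0  # treat unparseable replies as a failed check
```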
A Verification section summarizes whether the answer is supported by the retrieved context.
Beneath the evaluation panel is a diagnostics section with a Show trace toggle.
With Show trace expanded, you can see the full prompt text sent to the model along with the generated answer. This is useful for debugging prompt issues or understanding how the model formed the answer.
Click Show metrics in the header to reveal session statistics: events, average latency, and query history.
Check the API Status card at the bottom of the lab for connectivity health.
The footer displays aggregate counts: Total sessions, Total indices, Queries by mode, and System version.
Click Build index to enable querying. Uploading alone does not index the files.
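Conceptually, the Build index step is what turns raw uploads into something searchable: chunk the text, embed each chunk, and store the vectors for similarity search. The sketch below illustrates that idea with sentence-transformers and NumPy; it is a generic illustration, not Scientia's indexing pipeline.

```python
# Generic chunk-embed-search sketch to show why uploading alone is not enough;
# this is not Scientia's indexing code.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model

def chunk(text: str, size: int = 500) -> list[str]:
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_index(documents: list[str]) -> tuple[list[str], np.ndarray]:
    """Chunk every document and embed the chunks into a vector matrix."""
    chunks = [c for doc in documents for c in chunk(doc)]
    embeddings = model.encode(chunks, normalize_embeddings=True)
    return chunks, np.asarray(embeddings)

def search(query: str, chunks: list[str], embeddings: np.ndarray,
           top_k: int = 5) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = embeddings @ q  # cosine similarity, since vectors are normalized
    best = np.argsort(scores)[::-1][:top_k]
    return [chunks[i] for i in best]
```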
The underlying models may be busy. Try a simpler question or check system status.
Try adjusting Top-k passages to retrieve more context, or refine your query wording.
Strict privacy features may block scripts. Please use Chrome or Firefox for the best experience.