Caliper and Discovery surface and synthesize published medical literature with verifiable citations. Both are evaluated against the same honesty bar: zero fabricated PMIDs, zero invented evidence, explicit refusal to claim what the literature does not show.
Both engines retrieve from the same corpus using the same hybrid retrieval chain (lexical + dense-vector + cross-encoder rerank). The index covers ~29.6M medical documents, the SPECTER2-embedded medical subset of a 48M+ document corpus. The difference is in the synthesis layer and the output shape, each built for a distinct audience.
Caliper: structured findings with primary-venue grounding, confidence statements, and verifiable citations, on hard research questions, in under a minute.
Discovery: prose synthesis with mechanism-level context, inline citations, and methodology-pending transparency, for literature reviews and research workflows.
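To make the shared retrieval chain concrete, here is a minimal sketch of a lexical + dense-vector + rerank pipeline. It is an illustration under stated assumptions, not the engines' implementation: the toy corpus, the 3-dimensional vectors, the term-overlap scorer (standing in for a lexical ranker such as BM25), reciprocal rank fusion as the merge step, and the Jaccard scorer (standing in for a cross-encoder) are all hypothetical choices made so the example runs on its own.

```python
import math
from collections import Counter

# Hypothetical toy corpus: doc id -> (text, made-up 3-d embedding).
DOCS = {
    "d1": ("niclosamide biofilm staphylococcus aureus coating", [0.9, 0.1, 0.0]),
    "d2": ("metformin pancreatic cancer survival statin", [0.1, 0.9, 0.1]),
    "d3": ("niclosamide intestinal decolonization", [0.8, 0.2, 0.1]),
}

def lexical_rank(query, docs):
    """Rank docs by number of shared query terms (stand-in for BM25)."""
    q = Counter(query.split())
    scores = {d: sum((Counter(t.split()) & q).values()) for d, (t, _) in docs.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def dense_rank(query_vec, docs):
    """Rank docs by cosine similarity to the query embedding."""
    scores = {d: cosine(query_vec, v) for d, (_, v) in docs.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))

def rrf_fuse(rankings, k=60):
    """Merge rankings with reciprocal rank fusion (an assumed fusion method)."""
    fused = Counter()
    for ranking in rankings:
        for pos, d in enumerate(ranking, start=1):
            fused[d] += 1.0 / (k + pos)
    return sorted(fused, key=lambda d: (-fused[d], d))

def rerank(query, candidates, docs, top_n=2):
    """Score each (query, doc) pair jointly; Jaccard stands in for a cross-encoder."""
    q_terms = set(query.split())

    def pair_score(d):
        terms = set(docs[d][0].split())
        return len(q_terms & terms) / len(q_terms | terms)

    return sorted(candidates, key=lambda d: (-pair_score(d), d))[:top_n]

def hybrid_search(query, query_vec, docs, top_n=2):
    """Lexical and dense retrieval, fused, then reranked."""
    fused = rrf_fuse([lexical_rank(query, docs), dense_rank(query_vec, docs)])
    return rerank(query, fused, docs, top_n)

print(hybrid_search("niclosamide biofilm", [1.0, 0.0, 0.0], DOCS))
```

The point of the staged design is that the two first-stage rankers are cheap and complementary (exact terms versus semantic similarity), while the more expensive pairwise scorer only sees the short fused candidate list.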
Internal validation, May 2026. Each engine was run independently on five pharma-grade questions drawn from distinct medical domains. Every cited PMID was verified against the source corpus.
The questions were chosen to stress-test honest behavior, not to flatter the system: drug-trial contradictions (HAT protocol), recent paradigm shifts (SELECT trial), RCT-real-world gaps (DOAC in eGFR < 30), obscure repurposing signals (niclosamide for MDR-TB), and evidence discrepancies (metformin retrospective vs prospective in pancreatic cancer).
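The PMID verification step described above can be sketched as a simple membership check. This is a hedged illustration, not the validation harness itself: the function names, the regular expression, and the in-memory `corpus_pmids` set are assumptions, and `99999999` is a deliberate placeholder for a fabricated identifier, not a real citation.

```python
import re

# Assumed citation format: "PMID" followed by up to 8 digits.
PMID_PATTERN = re.compile(r"\bPMID[:\s]*(\d{1,8})\b")

def extract_pmids(text):
    """Return cited PMIDs in order of first appearance, without duplicates."""
    return list(dict.fromkeys(PMID_PATTERN.findall(text)))

def verify_citations(text, corpus_pmids):
    """Split cited PMIDs into those present in the corpus and those absent."""
    cited = extract_pmids(text)
    return {
        "verified": [p for p in cited if p in corpus_pmids],
        "unverified": [p for p in cited if p not in corpus_pmids],
    }

# Hypothetical corpus index holding two PMIDs that appear on this page.
corpus_pmids = {"36173387", "28471351"}
output = "PMID 36173387 J Antimicrob Chemother 2022 · PMID 99999999 placeholder"
print(verify_citations(output, corpus_pmids))
```

Under a zero-fabrication bar, any non-empty `unverified` list would count as a failure for that answer.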
What the engines do not do: they do not generate novel scientific findings. They surface, organize, and explain what is already in the published literature, with citations a reader can verify. They refuse to fill gaps with model knowledge when the literature is silent.
Two outputs from the same evaluation: one where the engine refused to claim evidence that does not exist (the more important example), and one where it surfaced and explained a known confounder.
Discovery synthesis (verbatim, with inline citations to the actual retrieved papers):
“Specialist analysis identifies a critical translational gap, confirming that published literature contains no clinical or preclinical data specifically evaluating niclosamide against multidrug-resistant Mycobacterium tuberculosis or systematically targeting biofilm-associated Staphylococcus aureus. Preclinical investigations instead demonstrate efficacy solely through in vitro assays and murine topical models against resistant Gram-positive pathogens, alongside structural biofilm disruption against S. aureus [1]. Pharmacological assessments indicate that niclosamide’s profound first-pass glucuronidation and lipophilicity-driven distribution limits create a fundamental pharmacokinetic mismatch that restricts its therapeutic utility to non-systemic applications [1]. Consequently, validated antimicrobial strategies remain confined to localized interventions, such as intestinal decolonization and direct device surface coatings [4; 2].”
What this proves. The retrieved corpus contained real, on-topic niclosamide papers — Gram-positive antimicrobial activity, surface-coating applications, intestinal decolonization. The engine cited them accurately. But for the specific clinical claim the user asked about (MDR-TB), the literature is genuinely absent — and the engine said so, in plain language, rather than extrapolating from adjacent results.
PMID 36173387 J Antimicrob Chemother 2022 · PMID 28471351 Biomed Mater 2017 · PMID 29432868 Int J Antimicrob Agents 2018.
Discovery synthesis (verbatim):
“The retrospective survival advantage attributed to adjunctive metformin in pancreatic ductal adenocarcinoma is fundamentally distorted by methodological confounders, particularly concomitant statin use and variable glycemic control [1], [3]. While unadjusted observational cohorts consistently link metformin exposure to prolonged overall survival [5], this association vanishes in prospective interventional trials where the drug’s unfavorable pharmacokinetic profile obscures antitumor efficacy [2]. Although pathobiological models propose AMPK/mTOR-mediated antiproliferative effects as a theoretical rationale [3], the systematic divergence between uncontrolled registry data and rigorously monitored clinical trials confirms that metformin does not deliver validated survival benefits for PDAC [4].”
What this proves. The statin co-use confounder is a known but underdiscussed pattern in the literature. The engine surfaced it from the cited papers (it did not invent it) and explained how it accounts for the retrospective–prospective discrepancy. The discipline is in the framing: the engine reads what is in the literature and reports it; it does not claim new mechanistic discovery.
PMID 26067687 Lancet Oncology 2015 (Reni RCT) · PMID 26474429 Pancreas 2016 (statin + metformin) · PMID 27069086 J Clin Oncol 2016 (Cautionary Lesson).
Honest scope. Honest exclusions.
We deploy the engines against your evidence corpus. You evaluate the output on questions you already know the right answer to. You tell us what works and what does not. The same honesty discipline applies to the pilot itself — we report what fired and what did not, and we do not soften the result.
Start a pilot conversation →