How to read a clinical trial

An interactive guide to the four dimensions of trial quality

Step 1 — Eligibility
Defining who can participate
Inclusion and exclusion criteria are set before recruitment begins. These determine what population the trial actually tests — not necessarily the population most likely to buy the supplement. A trial in elderly people with low baseline status may tell us little about healthy middle-aged adults.
⚠ Check: does the study population match you?
Step 2 — Randomisation & allocation concealment
Assigning participants to groups
Participants are randomly assigned so that known and unknown differences between people are distributed evenly across groups. Allocation concealment means the person doing the enrolling does not know which group a participant will be placed in — preventing conscious or unconscious selection bias before the trial starts.
Intervention group
Receives the active compound at the tested dose and formulation
Control group
Receives a placebo — ideally identical in appearance, taste and smell
⚠ Check: was allocation sequence concealed from enrollers?
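The idea of random assignment can be made concrete with a short sketch. This is a minimal, illustrative block-randomisation scheme in Python (blocks of four, two arms); real trials use dedicated, centrally concealed systems, and the function and arm names here are invented for illustration.

```python
import random

def block_randomise(n_participants, block_size=4):
    """Return an allocation list with equal arms within each block.

    Block randomisation keeps group sizes balanced as enrolment
    proceeds, while the order within each block stays unpredictable.
    """
    assignments = []
    block = ["intervention", "control"] * (block_size // 2)
    while len(assignments) < n_participants:
        random.shuffle(block)          # unpredictable order within the block
        assignments.extend(block)
    return assignments[:n_participants]

random.seed(0)                          # fixed seed for a reproducible demo
print(block_randomise(8))
```

Allocation concealment is a separate requirement: whoever enrols participants must not be able to see or predict the next entry in a list like this.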
Step 3 — Blinding
Keeping group assignment unknown
In a double-blind trial, neither participants nor researchers know who received the active intervention until after analysis. Participant blinding controls for the placebo effect. Researcher blinding prevents differential treatment of groups and biased recording of outcomes.
✓ Strong: double-blind with a blinding integrity check
⚠ Weak: open-label with no justification
Step 4 — Follow-up & analysis
Collecting and interpreting outcomes
Outcomes are measured at pre-specified time points. Intention-to-treat analysis includes all randomised participants regardless of whether they completed the protocol — preventing the distortion that comes from only analysing those who responded well enough to stay in the trial.
⚠ Check: was there significant dropout? Were dropouts reported?
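The distortion that intention-to-treat prevents can be shown with a toy calculation. In this sketch the numbers are entirely invented: 100 participants are randomised to the intervention, the 20 who saw no benefit drop out, and analysing only the completers inflates the apparent effect.

```python
from statistics import mean

# Hypothetical symptom-improvement scores (higher = better); values invented.
responders = [5.0] * 80   # completed the trial, improved
dropouts = [0.0] * 20     # left early with no improvement

# Intention-to-treat: every randomised participant counts.
itt = mean(responders + dropouts)
# Completers-only ("per-protocol"): dropouts silently excluded.
per_protocol = mean(responders)

print(f"ITT estimate: {itt}")                        # 4.0
print(f"Completers-only estimate: {per_protocol}")   # 5.0
```

Excluding the dropouts makes the intervention look 25% better than the fair comparison supports, which is exactly the distortion described above.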
Step 5 — Publication
Reporting the findings
A well-reported trial publishes all pre-specified outcomes, including null and negative findings. Comparing the published paper with the trial's registry entry reveals whether outcomes were switched or selectively reported after data was collected.
⚠ Check: does the registry entry match the published outcomes?

Patient-relevant outcomes

  • Reduced symptom burden
  • Improved physical function
  • Avoided hospitalisation
  • Better quality of life
  • Reduced disease incidence
  • Longer survival

Surrogate markers

  • Blood test result
  • Biomarker level
  • Imaging measurement
  • Physiological variable
  • Enzyme activity
  • Tissue concentration

The gap that marketing exploits
A surrogate marker may be genuinely related to a clinical outcome — but changing a blood number does not automatically mean clinical benefit follows. Many interventions that shifted surrogates in the right direction failed to reduce disease incidence or improve wellbeing in trials large enough to measure those outcomes. The inferential leap from "this changes a biomarker" to "this improves your health" is where most supplement overclaiming lives.
When are surrogates acceptable?
Not all biomarker evidence is equal
Some surrogate markers have been formally validated as reliable predictors of clinical outcomes through decades of evidence. LDL cholesterol predicting cardiovascular events is an example. Most biomarkers used in supplement research have not been validated in this way. The question to ask is: has a change in this biomarker been directly linked to a change in the clinical outcome in large, independent trials?
What does Evidentia do with this distinction?
How the library labels outcome types
Evidentia separates clinical outcome evidence from biomarker evidence throughout the library. A rating of Moderate for a biomarker change is not equivalent to a Moderate rating for a clinical outcome. Each entry states the type of evidence the rating reflects, so readers can see whether the evidence base rests on blood test results or on outcomes that matter in practice.

A p-value tells you the probability of seeing a result this extreme if the intervention had no effect. It says nothing about how large the effect is or whether it matters. A confidence interval tells you both things at once: the most likely effect size and the range of plausible values around it. These four examples show how to read them.

Effect estimates with 95% confidence intervals — four scenarios
(← favours control | favours intervention →)

  • Clear benefit, precise estimate
  • Possible benefit, wide interval
  • Crosses zero — cannot rule out no difference
  • Statistically significant but trivially small

Legend: point estimate (most likely effect) · 95% confidence interval · zero line (no difference)

These scenarios illustrate uncertainty within a single trial. Across multiple studies, estimates may vary further due to differences in populations, doses, and methods — a concept explored in Part 4 of this series.

Width = precision
A narrow interval means the trial had enough participants and a consistent enough outcome to estimate the effect precisely. A wide interval means substantial uncertainty remains about the true size of the effect.
Crossing zero
When the interval includes zero, the estimate is compatible with both benefit and no effect. This applies regardless of whether the p-value is just above or just below 0.05.
Size matters
A statistically significant effect that is clinically trivial — too small to be noticed or to change quality of life — is not evidence of a meaningful benefit, even if the p-value is very small.
p-values alone mislead
A result reported as "significant (p = 0.04)" without an effect size and confidence interval is incomplete. Ask what the actual difference was, and how large the uncertainty around it is.
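Turning reported summary statistics into an effect size with a confidence interval is simple arithmetic. The sketch below uses a normal approximation for the difference between two group means; all numbers are invented and the function name is ours.

```python
import math

def diff_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Difference in means (a - b) with an approximate 95% CI."""
    diff = mean_a - mean_b
    # Standard error of the difference between two independent means.
    se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return diff, (diff - z * se, diff + z * se)

# A "significant" result can still be trivially small: intervention mean
# 101.0 vs control 100.0, SD 4, 500 participants per group (invented data).
diff, (lo, hi) = diff_ci(101.0, 4.0, 500, 100.0, 4.0, 500)
print(f"difference = {diff:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

Here the interval excludes zero, so the p-value would be small, yet the whole plausible range is a one-point shift on a scale of 100 — precisely the "significant but trivial" scenario in the chart above.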
1. Selection bias
Inadequate randomisation or concealment
Occurs when the method of assigning participants to groups is flawed or the assignment can be predicted or influenced. Even well-intentioned investigators may unconsciously enrol participants likely to respond well into the intervention group. Trials with inadequate allocation concealment consistently report larger apparent treatment effects than those with proper concealment.
Ask: was sequence generation random and allocation concealed from enrollers?
2. Performance bias
Unblinded participants or researchers
When participants know which group they are in, they may behave differently, report outcomes differently, or respond to the expectation of benefit through the placebo effect. When researchers know group assignment, they may interact with participants or record ambiguous outcomes in ways that favour the intervention.
Ask: were both participants and investigators blinded? Was blinding tested?
3. Attrition bias
Differential dropout between groups
When participants drop out of a trial at different rates across groups, and when those dropouts are not accounted for, the remaining participants in each group may no longer be comparable. People who drop out because of side effects or lack of benefit are systematically different from those who complete the trial, and excluding them distorts the results.
Ask: what was the dropout rate? Were dropouts analysed under intention-to-treat?
4. Reporting bias
Selective publication of outcomes
Trials commonly measure multiple outcomes but publish only those that reached significance. When this happens without pre-registration, it is not possible to know how many outcomes were measured and not reported. Studies comparing trial protocols with published papers consistently find that the majority of trials change, add, or omit at least one primary outcome.
Ask: was the trial pre-registered? Do published outcomes match the registry?
5. Funding bias
Industry sponsorship and results
Trials funded by manufacturers of the product being tested consistently produce more favourable findings than independently funded trials on the same interventions. This operates through design choices, selective reporting, choice of comparator, and publication practices. It does not mean every industry-funded trial is biased, but it is a documented systematic effect across multiple fields.
Ask: who funded the trial? Are the authors independent of the sponsor?
6. Small-study effect
Effect size inflation in underpowered trials
Small trials can only produce statistically significant results when the observed effect is large, either because the true effect is large or because chance produced an unusually extreme result. This means that small trials that do reach significance systematically overestimate the true effect size. As larger, better-powered trials are conducted, effects from small trials often shrink considerably.
Ask: was the sample size justified? Has the finding been replicated in larger independent trials?
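This inflation can be demonstrated by simulation. The sketch below assumes a small true standardised effect (0.2) and repeatedly runs tiny two-arm trials (20 per arm); among the trials that happen to reach "significant benefit", the average observed effect is far larger than the true one. All parameters are invented for illustration.

```python
import math
import random
from statistics import mean, stdev

random.seed(1)
TRUE_EFFECT = 0.2   # true standardised mean difference
N = 20              # participants per arm: an underpowered trial

def one_trial():
    """Run one small trial; return its observed effect and significance."""
    a = [random.gauss(TRUE_EFFECT, 1) for _ in range(N)]
    b = [random.gauss(0, 1) for _ in range(N)]
    diff = mean(a) - mean(b)
    se = math.sqrt(stdev(a)**2 / N + stdev(b)**2 / N)
    return diff, diff / se > 1.96   # crude z-test for "significant benefit"

results = [one_trial() for _ in range(5000)]
significant = [d for d, sig in results if sig]
print(f"true effect: {TRUE_EFFECT}")
print(f"mean observed effect in 'significant' trials: {mean(significant):.2f}")
```

Only trials whose chance fluctuations pushed the observed effect past the significance threshold get counted, so the surviving estimates are systematically too large — the same filtering that makes early small-trial findings shrink on replication.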
7. Applicability (external validity)
Population, dose, and formulation mismatch
A well-conducted trial in one population does not automatically apply to another. Trials in elderly people with documented deficiency, in clinical populations, or at doses far higher or lower than those in typical supplement use may say little about a healthy adult reading a product label. Dose and formulation differences further limit transferability — evidence from one form of a compound does not establish that a different form will behave the same way.
Ask: does the study population, dose, and formulation match the situation in question?