What EPRI Tested
EPRI’s initial benchmarking uses a three-phase methodology that increases realism step by step.

Phase 1: Baseline Knowledge (multiple choice)
Models answer a large set of power-sector multiple-choice questions using only their embedded training knowledge (no external tools).

Phase 2: Knowledge Plus Web Search (multiple choice)
The same questions are repeated, but this time the models are allowed to perform internet searches to isolate the impact of retrieval augmentation through web search.

Phase 3: Open-Ended Short Answers (with and without web search)
A subset of questions is reformulated into open-ended prompts that require models to generate free-form answers under conditions closer to real-world use.