# Data Set and Benchmark (MedGPTEval) to Evaluate Responses From Large Language Models in Medicine: Evaluation Development and Validation

> Research article (JMIR Medical Informatics, 2024) · cited 22× · AI/ML

**Wikidata**: [openalex:W4400261110](https://www.wikidata.org/wiki/openalex:W4400261110)  
**Source**: https://4ort.xyz/entity/data-set-and-benchmark-medgpteval-to-evaluate-responses-from-large-language-models-in-medicine-evaluation-development-an
