# SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading

> Research article (Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024) · cited 11× · AI/ML

**Wikidata**: [openalex:W4404792913](https://www.wikidata.org/wiki/openalex:W4404792913)  
**Source**: https://4ort.xyz/entity/sciex-benchmarking-large-language-models-on-scientific-exams-with-human-expert-grading-and-automatic-grading