# Direct Preference Optimization: Your Language Model is Secretly a Reward Model

> Research article (Advances in Neural Information Processing Systems 36, 2023) · cited 72× · AI/ML

**Wikidata**: [openalex:W7133208539](https://www.wikidata.org/wiki/openalex:W7133208539)  
**Source**: https://4ort.xyz/entity/direct-preference-optimization-your-language-model-is-secretly-a-reward-model
