# DeepSpeed-Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale

> Research article (SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, 2022) · cited 219× · AI/ML

**Wikidata**: [openalex:W4321636575](https://www.wikidata.org/wiki/openalex:W4321636575)  
**Source**: https://4ort.xyz/entity/deepspeed-inference-enabling-efficient-inference-of-transformer-models-at-unprecedented-scale