# Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs

> Research article (ACM Transactions on Architecture and Code Optimization, 2023) · cited 11× · AI/ML

**Wikidata**: [openalex:W4386191499](https://www.wikidata.org/wiki/openalex:W4386191499)  
**Source**: https://4ort.xyz/entity/improving-computation-and-memory-efficiency-for-real-world-transformer-inference-on-gpus
