# A 17–95.6 TOPS/W Deep Learning Inference Accelerator with Per-Vector Scaled 4-bit Quantization for Transformers in 5nm

> Research article (2022 IEEE Symposium on VLSI Technology and Circuits, 2022) · cited 42× · AI/ML

**Wikidata**: [openalex:W4286571858](https://www.wikidata.org/wiki/openalex:W4286571858)  
**Source**: https://4ort.xyz/entity/a-1795-6-tops-w-deep-learning-inference-accelerator-with-per-vector-scaled-4-bit-quantization-for-transformers-in-5nm