# Fast On-device LLM Inference with NPUs

> Research article (Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, 2025) · cited 17× · AI/ML

**Wikidata**: [openalex:W4407196790](https://www.wikidata.org/wiki/openalex:W4407196790)  
**Source**: https://4ort.xyz/entity/fast-on-device-llm-inference-with-npus
