# ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching

> Research article (ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), 2024) · cited 23× · AI/ML

**Wikidata**: [openalex:W4401211590](https://www.wikidata.org/wiki/openalex:W4401211590)  
**Source**: https://4ort.xyz/entity/alisa-accelerating-large-language-model-inference-via-sparsity-aware-kv-caching
