# ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

> Research article (IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024) · cited 50× · AI/ML

**Wikidata**: [openalex:W4402716166](https://www.wikidata.org/wiki/openalex:W4402716166)  
**Source**: https://4ort.xyz/entity/vip-llava-making-large-multimodal-models-understand-arbitrary-visual-prompts