# Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning

> Research article (2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)) · cited 64× · AI/ML

**Wikidata**: [openalex:W4386076398](https://www.wikidata.org/wiki/openalex:W4386076398)  
**Source**: https://4ort.xyz/entity/rethinking-video-vits-sparse-video-tubes-for-joint-image-and-video-learning