Official code for the Goldfish model for long video understanding and MiniGPT4-video for short video understanding
Updated Dec 10, 2024 - Python
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
"VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos"
🔥🔥MLVU: Multi-task Long Video Understanding Benchmark
This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Language Repository for Long Video Understanding
Winning solution to the Generic Event Boundary Captioning task in the LOVEU Challenge (CVPR 2023 workshop)
Official implementation of the paper "Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos"
[ICLR 2025] TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
Official implementation (PyTorch) of "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning" (AAAI 2025)
This is the official implementation of ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos