Caption videos
Use AI to caption videos with an API
Models in the collection
Sorted by popularity (run_count)
Google’s hybrid “thinking” AI model optimized for speed and cost-efficiency
A multimodal LLM-based AI assistant trained with alignment techniques. Qwen-VL-Chat supports more flexible interaction, such as multi-round question answering, and creative capabilities.
CogVLM2: Visual Language Models for Image and Video Understanding
Latest model in the Qwen family for chatting with video and image models
Generate TikTok-style captions powered by Whisper (GPU)
Apollo 7B - An Exploration of Video Understanding in Large Multimodal Models
Automatically add captions to a video
VideoLLaMA 3: Frontier Multimodal Foundation Models for Video Understanding
Qwen2.5-Omni is an end-to-end multimodal model designed to perceive diverse modalities, including text, images, audio, and video, while simultaneously generating text and natural speech responses in a streaming manner.
SOTA open-source model for chatting with videos and the newest model in the Qwen family
MiniCPM-V 4.0 has strong image and video understanding performance
Video preprocessing tool for captioning multiple videos using GPT, Claude, or Gemini
Apollo 3B - An Exploration of Video Understanding in Large Multimodal Models