All models
The full list is sorted by popularity on Replicate.
Generate a video that morphs between subjects, with an optional style
The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts.
Studio-grade lipsync in minutes, not weeks
SDRV_2.0
(wip) Audiocraft is a library for audio processing and generation with deep learning.
An extremely fast all-in-one model to use LCM with SDXL, ControlNet and custom LoRA URLs!
Take an image and an audio file and create a video clip
Image-to-video - SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
Tuning-Free Longer Video Diffusion via Noise Rescheduling
Audio-Driven Synthesis of Photorealistic Portrait Animations
Deliberate V5 Model (Text2Img, Img2Img and Inpainting)
Emotionally Expressive and Duration-Controlled Text-to-Speech
High-Quality Image Restoration Following Human Instructions
Moonshot AI's latest open model. It unifies vision and text, thinking and non-thinking modes, and single-agent and multi-agent execution into one model
Babes XL Model (Text2Img, Img2Img and Inpainting)
Fuyu-8B is a multi-modal text and image transformer trained by Adept AI
🗣️ TalkNet-ASD: Detect who is speaking in a video
Automatically fuses a user's face onto a template image while preserving the user's appearance
A FLUX LoRA; use "sftsrv style illustration" to trigger the image generation
FLUX Schnell Model (Text2Img and Img2Img)
Great text-to-image model by Cagliostro Lab
Cartoonifies a video
Super-resolves an LR video frame (ultra-wide) using a reference video frame (wide-angle)
Use Wan 2.2 Animate to copy the motion of a video to another scene
Fine tuned to generate cute mascot avatars, by aistartupkit.com
A diffusion model for generating human motion video from a text prompt
Generate high-quality 2K resolution images from text prompts
NeverSleep's MiquMaid v1 70B Miqu fine-tune, GGUF Q3_K_M quantized.
A highly practical solution for robust monocular depth estimation, trained on a combination of 1.5M labeled images and 62M+ unlabeled images
The "newspaper illustration" model specializes in creating black-and-white, cartoon-style drawings reminiscent of classic newspaper illustrations.
SDXL fine-tune to generate images of people in Germain's drawing style
High-Quality Video Generation with Cascaded Latent Diffusion Models
An SDXL fine-tune based on bad 2004 digital photography
Janus-Pro is a novel autoregressive framework for multimodal understanding
Proteus v0.2 Model (Text2Img, Img2Img and Inpainting)
ASR with word alignment based on whisperx using whisper medium (769M)
MetaVoice-1B: 1.2B parameter base model trained on 100K hours of speech
Stable Diffusion XL specifically trained on inpainting, by Hugging Face
Fine-tuned on Google Material Symbols
Mistral-7B-v0.1 fine-tuned for chat with the Dolphin dataset (an open-source implementation of Microsoft's Orca)
Stable diffusion, but with more powerful in-painting & out-painting capabilities
(development branch) Inpainting for Stable Diffusion
Ovi: generate videos with audio from image and text inputs
The Picsart Text2Video-Zero model leverages the power of existing text-to-image synthesis methods (e.g., Stable Diffusion), making them suitable for the video domain.
Stable Diffusion models for high-quality, detailed anime images