All models
The full list is sorted by popularity on Replicate.
Hermes 2 Pro is trained on an updated and cleaned version of the OpenHermes 2.5 dataset, plus a newly introduced Function Calling and JSON Mode dataset developed in-house
Mistoon Anime XL Model (Text2Img, Img2Img and Inpainting)
GFPGAN for upscaling human faces in videos
Realistic Vision V4.0 Model (Text2Img, Img2Img and Inpainting)
Modern line icons with a consistent weight and style.
Generate a new image given any input text with majicMix realistic v6
nomic-embed-text-v1 is an 8192-context-length text encoder that surpasses OpenAI's text-embedding-ada-002 and text-embedding-3-small on short- and long-context tasks
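As a rough sketch of how an encoder like this is typically called (assuming the Hugging Face checkpoint nomic-ai/nomic-embed-text-v1, the sentence-transformers library, and Nomic's documented task prefixes; not the only way to run it):

    # Minimal embedding sketch; assumes: pip install sentence-transformers einops
    import numpy as np
    from sentence_transformers import SentenceTransformer

    # trust_remote_code is needed because the checkpoint ships custom modeling code
    model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

    # Nomic's usage notes recommend task prefixes like "search_document:" / "search_query:"
    doc_emb = model.encode(["search_document: The quick brown fox jumps over the lazy dog."])
    query_emb = model.encode(["search_query: What does the fox jump over?"])

    # Cosine similarity between the query and the document
    sim = np.dot(query_emb[0], doc_emb[0]) / (
        np.linalg.norm(query_emb[0]) * np.linalg.norm(doc_emb[0])
    )
    print(f"cosine similarity: {sim:.3f}")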
Gemma 2 2B by Google
Counterfeit XL v2 Model (Text2Img, Img2Img and Inpainting)
Generates unrestricted images from text prompts using a fine-tuned Stable Diffusion model
An optimised version of the hidream-full model, created with the Pruna AI optimisation toolkit
Orpheus 3B - high-quality, emotive text-to-speech
Flux version of FormFinder-XL, trained to create moody, atmospheric images, yet versatile enough to be mixed with other LoRAs to generate diverse styles with interesting architectural design shapes. As the name suggests, it will always tend to surprise you.
LoRA model trainer with presets for faces, objects, and styles
Tempo BPM estimation with Essentia
A Whisper model that can be used to add domain-specific words
SDXL model trained on Hiroshi Nagai's illustrations.
Leonardo AI’s first foundational model produces images up to 5 megapixels (fast, quality and ultra modes)
Text-to-video generation
Flux LoRA; use "in the style of TOK a trtcrd tarot style" in your prompt to trigger image generation
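For reference, invoking a LoRA like this on Replicate usually reduces to a single client call with the trigger phrase embedded in the prompt. A minimal sketch; the slug owner/tarot-flux-lora is hypothetical and stands in for the real model page:

    # Minimal Replicate client sketch; assumes: pip install replicate,
    # plus REPLICATE_API_TOKEN set in the environment.
    import replicate

    output = replicate.run(
        "owner/tarot-flux-lora",  # hypothetical slug; use the model's actual identifier
        input={
            # The trigger phrase from the description activates the tarot-card style
            "prompt": "a raven holding a key, in the style of TOK a trtcrd tarot style",
        },
    )
    print(output)  # typically one or more URLs to the generated images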
SDXL fine-tuned on MJv6 Simpsons generated images
SDXL ControlNet - Depth
A model for experimenting with all the SD3 settings. Non-commercial use only, unless you have a Stability AI Self Hosted License.
Audio-based Lip Synchronization for Talking Head Video
Add lip-sync to any video with an audio file or text
Spotify's Basic Pitch Model
Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.
A 70 billion parameter Llama tuned for coding and conversation
(Research only) IP-Adapter-FaceID can generate various style images conditioned on a face with only text prompts
A 6B-parameter open-source bilingual (English/Chinese) chat LLM
A high-quality, highly detailed anime-style latent diffusion model
Alibaba's Wan 2.5 text-to-video generation model
Image generation model from Reve which handles multiple input reference images
A StyleGAN encoder for image-to-image translation
VideoLLaMA 3: Frontier Multimodal Foundation Models for Video Understanding
Cinematic Flux LoRA: Use "r3dcma" in your prompt to trigger this LoRA model.
PyTorch implementation of AnimeGAN for fast photo animation
Qwen2.5-Omni is an end-to-end multimodal model designed to perceive diverse modalities, including text, images, audio, and video, while simultaneously generating text and natural speech responses in a streaming manner.
stable-video-diffusion
Fast animation using a latent consistency model
An online demo of Bread (Low-light Image Enhancement via Breaking Down the Darkness), built to enhance images with poor or irregular illumination and heavy noise.
🔥 SeedVR2: one-step video & image restoration with 3B/7B hot‑swap and optional color fix 🎬✨
Upscale videos by 4x, up to a maximum of 4k
A mixed Stable Diffusion model
An SDXL fine-tune that generates slick icons and flat pop-constructivist graphics with thick edges. Trained on Bing-generated images.