All models
The full list is sorted by popularity on Replicate.
Simple tool to merge together separate video snippets
Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.
Alibaba Wan 2.6 image to video generation model
Waifu Diffusion v1.4 16bit
Un-distilled version of FLUX.2 [klein]. A foundation model for maximum flexibility and control
A 7 billion parameter Llama tuned for coding and conversation
SDXL ControlNet - OpenPose
Generate expressive, natural speech in 23 languages. Features instant voice cloning from short audio, emotion control, and seamless cross-language voice transfer.
High-Fidelity GAN Inversion for Image Attribute Editing
merge a video and an audio file
Three models in one Cog: Absolute Reality v1.8.1, DreamShaper v8 and Meina V4
Modify images using sketches
Some 4x ESRGAN upscalers
Joint Low-light Enhancement and Deblurring in the Dark
Learning to Animate Images via Latent Space Navigation
A capable large language model for natural language to SQL generation.
A 2x faster Qwen 3 model through Pruna OSS
90s anime
Turn any description into wallpaper tiles
Scale-Arbitrary Super-Resolution
Create your own Realistic Voice Cloning (RVC v2) dataset using a YouTube link
Flux LoRA for creating whimsical illustrations. Use "illustration in the style of WHMSCPE001" to trigger the model.
Notion-style illustration
An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
Generate high-quality music and sound from text prompts
VQ-Diffusion for Text-to-Image Synthesis
Facial Expression Recognition using Residual Masking Network
Stylized Audio-Driven Single Image Talking Face Animation
Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild. This is the SUPIR-v0F model and does NOT use LLaVA-13b.
Meta's Llama 2 7b Chat - GPTQ
Detect and transcribe text in images with accurate bounding boxes, layout analysis, reading order, and table recognition in 90 languages
An advanced open-source multimodal large language model from the InternVL3.5 family, specializing in versatile vision-language tasks and enhanced reasoning
Midjourney v6 text-to-image quality model but Open and Decentralized
ControlNet with SD 2.1
fancyfeast/joytag
Stable Diffusion 3 medium with added variability in outputs. Non-commercial use only, unless you have a Stability AI Self Hosted License.
An autocomplete API that runs on the CPU :)
Generate 360 panorama images.
Detects one paragraph of text in an image.
Create beautiful icons & emojis
ASR from video URL based on whisperx using large-v2 model
A FLUX Kontext fine-tune to fix plastic AI skin textures
AuraSR v2: Second-gen GAN-based Super-Resolution for real-world applications
Merge input images and audio into a keyframe video
Generate realistic lipsyncs with Sync Labs' 2.0 model
sdxs-512-0.9 can generate high-resolution images in real-time based on prompt texts, trained using score distillation and feature matching
LoRA & Openjourney V4
In-Context LoRA with Image-to-Image and Inpainting to apply your logo to anything