All models
The full list is sorted by popularity on Replicate.
MOSS-TTSD (text to spoken dialogue) is an open-source bilingual spoken dialogue synthesis model that supports both Chinese and English. It can transform dialogue scripts between two speakers into natural, expressive conversational speech.
Welcome to Danny K's World
Some scrappy experiments 🫣
Model
Stream previews as the output is generated
controlnet-lineart-brightness-tile-inpainting + low res fix with tile
Latent diffusion models, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches
Wan2.1 14B 480p LoRA inference via Diffusers (Work in progress)
SDXL 1.0 + Wrong LoRA weights + Better VAE | WIP
Stable Diffusion 2.1 - NSFW - Supabase
Open-weight version of FLUX.1 Kontext via Hugging Face Diffusers
Adds Vietnamese punctuation and capitalization to raw text from an ASR system.
A fine-tuned gopher FLUX LoRA; the trigger word is GOGOPH
Psst..
Manmaru mix v3.0
Extract text with pixel coordinates from screenshots and images. GPU-accelerated, multi-language, perfect for camera-translation overlays.
The Segmind-Vega model is a distilled version of SDXL, offering a 70% reduction in size and a 100% speedup
The Fish Speech V1.5 model.
Shiba stable diffusion model
OCR receipt into JSON
Super High Quality Depth Maps 🗺️: An End-to-End Tile-Based Framework 🏗️ for High-Resolution Monocular Metric Depth Estimation 🔍📏
pitch correction on your voice
Agentic image model optimized for high-quality, fast generations supporting font control
MARS5, a fully open-source (commercially usable) voice-cloning/TTS model with breakthrough prosody and realism.
Music Generator
This is the VACE-1.3B model optimised with Pruna AI. Wan2.1 VACE is an all-in-one model for video creation and editing.
Easily create video datasets with auto-captioning for Hunyuan-Video LoRA finetuning
A version of FLUX.2 [klein] 9B-base that supports fast fine-tuned LoRA inference
PuLID-FLUX-v0.9.0
Change eye (iris) color
Object Detector Using Yolo
Generate multilingual text-to-speech audio in over 30 languages
DreamCraft3D is a text-and-image-to-3D model. DreamCraft3D uses DeepFloyd IF and Stable Zero123, which are non-commercial, research-only models. Please make sure you read and abide by the relevant licenses before using it.
A model trained for the task of story visualization: generating images to pair with captions in a story.
A LoRA based on Foundation