Official AI models
Official AI models: Always available, stable, and predictably priced
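Every model in this collection is invoked the same way through the Replicate client. A minimal sketch, assuming the `black-forest-labs/flux-schnell` slug and a prompt/aspect-ratio input schema for illustration — check each model's page for its actual input fields:

```python
# Sketch of running an official model via the Replicate Python client.
# The model slug and input fields are illustrative assumptions; every
# model documents its own input schema on its page.
import os

def build_input(prompt: str, aspect_ratio: str = "1:1") -> dict:
    """Assemble the input payload for a text-to-image run."""
    return {"prompt": prompt, "aspect_ratio": aspect_ratio}

payload = build_input("an astronaut riding a horse")

if os.environ.get("REPLICATE_API_TOKEN"):
    import replicate  # pip install replicate
    # replicate.run() blocks until the prediction finishes and
    # returns the model's output (for image models, file URLs).
    output = replicate.run("black-forest-labs/flux-schnell", input=payload)
    print(output)
else:
    print("Set REPLICATE_API_TOKEN to actually run the model.")
```

Because official models are always on, `replicate.run()` needs no deployment step: the same call shape works for any slug in the list below.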
Models in the collection
Sorted by popularity (run_count)
The fastest image generation model tailored for local development and personal use
An 8 billion parameter language model from Meta, fine tuned for chat completions
A 70 billion parameter language model from Meta, fine tuned for chat completions
Real-ESRGAN with optional face correction and adjustable upscale
Google's latest image editing model in Gemini 2.5
Fine-Tuned Vision Transformer (ViT) for NSFW Image Classification
Faster, better FLUX Pro. Text-to-image model with excellent image quality, prompt adherence, and output diversity.
Base version of Llama 3, an 8 billion parameter language model from Meta.
A state-of-the-art text-based image editing model that delivers high-quality outputs with excellent prompt following and consistent results for transforming images through natural language
A 12 billion parameter rectified flow transformer capable of generating images from text descriptions
This is the fastest Flux endpoint in the world.
Low latency, low cost version of OpenAI's GPT-4o model
Unified text-to-image generation and precise single-sentence editing at up to 4K resolution
Z-Image Turbo is a super fast text-to-image model of 6B parameters developed by Tongyi-MAI.
FLUX1.1 [pro] in ultra and raw modes. Images are up to 4 megapixels. Use raw mode for realism.
Ultra fast flux kontext endpoint
A 7 billion parameter language model from Meta, fine tuned for chat completions
A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, check out p-image here: https://replicate.com/prunaai/p-image
State-of-the-art image generation with top of the line prompt following, visual quality, image detail and output diversity.
Google's state of the art image generation and editing model 🍌🍌
A 70 billion parameter language model from Meta, fine tuned for chat completions
A premium text-based image editing model that delivers maximum performance and improved typography generation for transforming images through natural language prompts
The latest iteration of Qwen-Image, with improved multi-image editing, single-image consistency, and native support for ControlNet
Text-to-Audio (T2A) that offers voice synthesis, emotional expression, and multilingual capabilities. Designed for real-time applications with low latency
An optimised version of the hidream-l1 model, built with the Pruna AI optimisation toolkit
Turbo is the fastest and cheapest Ideogram v3. v3 creates images with stunning realism, creative designs, and consistent styles
Recraft V3 (code-named red_panda) is a text-to-image model with the ability to generate long texts, and images in a wide range of styles. As of today, it is SOTA in image generation, proven by the Text-to-Image Benchmark by Artificial Analysis
Google's Imagen 4 flagship model
A very fast and cheap PrunaAI optimized version of Wan 2.2 A14B image-to-video
Meta's flagship 405 billion parameter language model, fine-tuned for chat completions
Open-weight version of FLUX.1 Kontext
A version of flux-dev, a text to image model, that supports fast fine-tuned lora inference
DeepSeek-V3-0324 is the leading non-reasoning model, a milestone for open source
A 13 billion parameter language model from Meta, fine tuned for chat completions
Fastest, most cost-effective GPT-5 model from OpenAI
Use this fast version of Imagen 4 when speed and cost are more important than quality
Official CLIP models, generate CLIP (clip-vit-large-patch14) text & image embeddings
A sub-1-second text-to-image model built for production use cases.
Professional inpainting and outpainting model with state-of-the-art performance. Edit or extend images with natural, seamless results.
The most intelligent Claude model and the first hybrid reasoning model on the market (claude-3-7-sonnet-20250219)
The fastest image generation model tailored for fine-tuned use
Very fast image generation and editing model. 4 steps distilled, sub-second inference for production and near real-time applications.
A 17 billion parameter model with 16 experts
Use Kling v2.1 to generate 5s and 10s videos in 720p and 1080p resolution from a starting image (image-to-video)
Seedream 4.5: Upgraded Bytedance image model with stronger spatial understanding and world knowledge
A 17 billion parameter model with 128 experts
A text-to-image model with support for native high-resolution (2K) image generation
High-quality image generation model optimized for creative professional workflows and ultra-high fidelity outputs
Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)
A fast image model with state of the art inpainting, prompt comprehension and text rendering.
OpenAI's latest image generation model with better instruction following and adherence to prompts
High-quality image generation and editing with support for eight reference images
An opinionated text-to-image model from Black Forest Labs in collaboration with Krea that excels in photorealism. Creates images that avoid the oversaturated "AI look".
A video generation model that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 720p resolution
An excellent image model with state of the art inpainting, prompt comprehension and text rendering
Minimax's first image model, with character reference support
An experimental model with FLUX Kontext Pro that can combine two input images
Claude Sonnet 4 is a significant upgrade to 3.7, delivering superior coding and reasoning while responding more precisely to your instructions
A reasoning model trained with reinforcement learning, on par with OpenAI o1
The highest quality Ideogram v3 model. v3 creates images with stunning realism, creative designs, and consistent styles
An enhanced version over Qwen-Image-Edit-2509, featuring multiple improvements including notably better consistency
Like Ideogram v2, but faster and cheaper
An efficient, intelligent, and truly open-source language model
Google's highest quality text-to-image model, capable of generating images with detail, rich lighting and beauty
A text-to-image model that generates high-resolution images with fine details. It supports various artistic styles and produces diverse outputs from the same prompt, thanks to Query-Key Normalization.
A 7 billion parameter language model from Mistral.
Designed to make images sharper and cleaner, Crisp Upscale increases overall quality, making visuals suitable for web use or print-ready materials.
Kling 2.5 Turbo Pro: Unlock pro-level text-to-video and image-to-video creation with smooth motion, cinematic depth, and remarkable prompt adherence.
A text-to-image model with greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency
Text-to-Audio (T2A) that offers voice synthesis, emotional expression, and multilingual capabilities. Optimized for high-fidelity applications like voiceovers and audiobooks.
Google’s hybrid “thinking” AI model optimized for speed and cost-efficiency
Granite-3.3-8B-Instruct is an 8-billion-parameter, 128K-context-length language model fine-tuned for improved reasoning and instruction-following capabilities.
Edit images using a prompt. This model extends Qwen-Image’s unique text rendering capabilities to image editing tasks, enabling precise text editing
A pro version of Seedance that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 1080p resolution
Fast, affordable version of GPT-4.1
An image generation foundation model in the Qwen series that achieves significant advances in complex text rendering.
Generate 5s and 10s videos in 720p resolution at 30fps
Open-weight inpainting model for editing and extending images. Guidance-distilled from FLUX.1 Fill [pro].
A multimodal image generation model that creates high-quality images. You need to bring your own verified OpenAI key to use this model. Your OpenAI account will be charged for usage.
Use this ultra version of Imagen 4 when quality matters more than speed and cost
Professional-grade image upscaling, from Topaz Labs
OpenAI's new model excelling at coding, writing, and reasoning.
Fastest, most cost-effective GPT-4.1 model from OpenAI
Quality image generation and editing with support for reference images
Faster version of OpenAI's flagship GPT-5 model
Open-weight depth-aware image generation. Edit images while preserving spatial relationships.
This model generates beautiful cinematic 2 megapixel images in 3-4 seconds and is derived from the Wan 2.2 model through optimisation techniques from the pruna package
Use FLUX Kontext to restore, fix scratches and damage, and colorize old photos
A text-to-image model that generates high-resolution images with fine details. It supports various artistic styles and produces diverse outputs from the same prompt, with a focus on fewer inference steps
Base version of Llama 3, a 70 billion parameter language model from Meta.
Google's most advanced reasoning Gemini model
Video Upscaling from Topaz Labs
Generate 5s and 10s videos in 1080p resolution
Granite-3.1-8B-Instruct is a lightweight, open-source 8B parameter model designed to excel in instruction-following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.
Create 5s-8s videos with enhanced character movement, visual effects, and exclusive 1080p-8s support. Optimized for anime characters and complex actions
Google's latest image generation model in Gemini 2.5
A faster and cheaper version of Seedance 1 Pro
Runway's Gen-4 Image model with references. Use up to 3 reference images to create the exact image you need. Capture every angle.
Generate 6s videos with prompts or images. (Also known as Hailuo). Use a subject reference to make a video with a character and the S2V-01 model.
Base version of Llama 2 7B, a 7 billion parameter language model
The highest fidelity image model from Black Forest Labs
Anthropic's most intelligent language model to date, with a 200K token context window and image understanding (claude-3-5-sonnet-20241022)
Claude Sonnet 4.5 is the best coding model to date, with significant improvements across the entire development lifecycle
A faster and cheaper Imagen 3 model, for when price or speed are more important than final image quality
Join the Granite community where you can find numerous recipe workbooks to help you get started with a wide variety of use cases using this model. https://github.com/ibm-granite-community
Generate consistent characters from a single reference image. Outputs can be in many styles. You can also use inpainting to add your character to an existing image.
Quickly generate up to 1 minute of music with lyrics and vocals in the style of a reference track
OpenAI's high-intelligence chat model
High-precision image upscaler optimized for portraits, faces and products. One of the upscale modes powered by Clarity AI. X: https://x.com/philz1337x
Bria AI's remove background model
Camera-aware edits for Qwen/Qwen-Image-Edit-2509 with Lightning + multi-angle LoRA
Granite-3.2-8B-Instruct is an 8-billion-parameter, 128K-context-length language model fine-tuned for reasoning and instruction-following capabilities.
A joint audio-video model that accurately follows complex instructions.
The fastest Wan 2.2 text-to-image and image-to-video model
Accelerated inference for Wan 2.1 14B image to video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
Granite-3.0-2B-Instruct is a lightweight and open-source 2B parameter model designed to excel in instruction following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.
Professional edge-guided image generation. Control structure and composition using Canny edge detection
A cost-efficient version of GPT Image 1
New and improved version of Veo 3 Fast, with higher-fidelity video, context-aware audio and last frame support
OpenAI's fast, lightweight reasoning model
Automated background removal for images. Tuned for AI-generated content, product photos, portraits, and design workflows
Like Ideogram v2 turbo, but now faster and cheaper
Balance speed, quality and cost. Ideogram v3 creates images with stunning realism, creative designs, and consistent styles
GPT-5 with support for structured outputs, web search and custom tools
Recraft V3 SVG (code-named red_panda) is a text-to-image model with the ability to generate high-quality SVG images including logotypes and icons. The model supports a wide range of styles.
Base version of Llama 2, a 70 billion parameter language model from Meta.
Bria Expand extends images beyond their borders in high quality, generating new pixels to reach the desired aspect ratio. Trained exclusively on licensed data for safe and risk-free commercial use
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
Hailuo 2 is a text-to-video and image-to-video model that can make 6s or 10s videos at 768p (standard) or 1080p (pro). It excels at real world physics.
Affordable and fast images
Latest hybrid thinking model from Deepseek
Open-weight image variation model. Create new versions while preserving key elements of your original.
Professional depth-aware image generation. Edit images while preserving spatial relationships.
SOTA object removal: enables precise removal of unwanted objects from images while maintaining high-quality outputs. Trained exclusively on licensed data for safe and risk-free commercial use
Accelerated variant of Photon prioritizing speed while maintaining quality
OpenAI's flagship GPT model for complex tasks.
Generate realistic lipsync animations from audio for high-quality synchronization
Updated Qwen3 model for instruction following
Quickly make 5s or 8s videos at 540p, 720p or 1080p. It has enhanced motion, prompt coherence and handles complex actions well.
Artistic and high-quality visuals with improved prompt adherence, diversity, and definition
Generate expressive, natural speech. Features unique emotion control, instant voice cloning from short audio, and built-in watermarking.
An experimental FLUX Kontext model that can combine two input images
OpenAI's flagship video generation with synced audio
Sound on: Google’s flagship Veo 3 text to video model, with audio
The best model for coding and agentic tasks with configurable reasoning effort.
Color match and white balance fixes for images
20b open-weight language model from OpenAI
Low-latency MiniMax Speech 2.6 Turbo brings multilingual, emotional text-to-speech to Replicate with 300+ voices and real-time-friendly pricing
Base version of Llama 2 13B, a 13 billion parameter language model
Upscale images 2x or 4x times
Open-weight edge-guided image generation. Control structure and composition using Canny edge detection.
Qwen Image Edit 2509 LoRA explorer, uses HuggingFace URLs to load any safetensor
Convert raster images to high-quality SVG format with precision and clean vector paths, perfect for logos, icons, and scalable graphics.
The best model for coding and agentic tasks across industries
Change the aspect ratio of any photo using AI (not cropping)
Google's most intelligent model built for speed with frontier intelligence, superior search, and grounding
FLUX.1 Kontext[dev] image editing model for running lora finetunes
Alibaba Wan 2.5 Image to video generation with background audio
A very fast and cheap PrunaAI optimized version of Wan 2.2 A14B text-to-video
Accelerated inference for Wan 2.1 14B text to video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
Granite-3.0-8B-Instruct is a lightweight, open-source 8B parameter model designed to excel in instruction-following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.
Quickly change someone's hair style and hair color, powered by FLUX.1 Kontext [pro]
FLUX Kontext max with list input for multiple images
An image-to-video (I2V) model specifically trained for Live2D and general animation use cases
A faster and cheaper version of Google’s Veo 3 video model, with audio
Enables precise control of character actions and expressions from a reference image.
Commercial-ready text-to-image model, trained entirely on licensed data. With only 4B parameters it provides exceptional aesthetics and text rendering, evaluated to be on par with other leading models in the market
Turns your audio/video/images into professional-quality animated videos
120b open-weight language model from OpenAI
Granite-vision-3.3-2b is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.
Granite-4.0-H-Small is a 32B parameter long-context instruct model finetuned from Granite-4.0-H-Small-Base using a combination of open source instruction datasets with permissive license and internally collected synthetic datasets.
MiniMax Speech 2.6 HD delivers studio-quality multilingual text-to-audio on Replicate with nuanced prosody, subtitle export, and premium voices
A new way to edit, transform and generate video
Max-quality image generation and editing with support for ten reference images
Kling 2.6 Pro: Top-tier image-to-video with cinematic visuals, fluid motion, and native audio generation
Turn your image into a cartoon with FLUX.1 Kontext [pro]
The fastest open source TTS model without sacrificing quality.
4MP text-to-image generation with enhanced cinematic-quality image generation with precise style control, improved text rendering, and commercial design optimization.
An AI system that can create realistic images and art from a description in natural language.
Join the Granite community where you can find numerous recipe workbooks to help you get started with a wide variety of use cases using this model. https://github.com/ibm-granite-community
Inference model for FLUX 1.1 [pro] Ultra using custom `finetune_id`. Supports 4MP images and raw mode for realism
Bria Increase Resolution upscales any image using a dedicated upscaling method that preserves the original content without regeneration.
Gen-4 Image Turbo is cheaper and 2.5x faster than Gen-4 Image. An image model with references: use up to 3 reference images to create the exact image you need. Capture every angle.
State of the art video generation model. Veo 2 can faithfully follow simple and complex instructions, and convincingly simulates real-world physics as well as a wide range of visual styles.
2.5 billion parameter image model with improved MMDiT-X architecture
Affordable and fast vector images
Claude Haiku 4.5 gives you similar levels of coding performance but at one-third the cost and more than twice the speed
Generate 5s and 10s videos in 720p resolution
A premium version of Kling v2.1 with superb dynamics and prompt adherence. Generate 1080p 5s and 10s videos from text or an image
Accelerated inference for Wan 2.1 14B image to video with high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
Remove all text from an image with FLUX.1 Kontext
OpenAI's most advanced synced-audio video generation
Create a series of portrait photos from a single image
Become a character, in style
Image generation model from Reve
Generate videos with specific camera movements
Create a professional headshot photo from any single image
Fast, efficient image variation model for rapid iteration and experimentation.
Fast, high quality text-to-video and image-to-video (Also known as Dream Machine)
Image editing model from Reve
4 step distilled version of FLUX.2 [klein]. A foundation model for maximum flexibility and control
Generate a video from an audio clip and a reference image
Lyria 2 is a music generation model that produces 48kHz stereo audio through text-based prompts
Generate 5s and 10s 720p videos fast
Ideal for rapid ideation and mobile workflows. Perfect for creators who need instant feedback, real-time previews, or high-throughput content.
Generate 5s and 9s 540p videos, faster and cheaper than Ray 2
A high-fidelity video generation model optimized for realistic human motion, cinematic VFX, expressive characters, and strong prompt and style adherence across both text-to-video and image-to-video workflows
Music-1.5: Full-length songs (up to 4 mins) with natural vocals & rich instrumentation
A powerful native multimodal model for image generation (PrunaAI squeezed)
Wan 2.5 image-to-video, optimized for speed
Qwen Image 2512 is an improved version of Qwen Image with more realistic human generation, finer textures, and stronger text rendering
Bria Background Generation allows for efficient swapping of backgrounds in images via text prompts or reference image, delivering realistic and polished results. Trained exclusively on licensed data for safe and risk-free commercial use
Image-to-video at 720p and 480p with Wan 2.2 A14B
Generate 5s 480p videos. Wan is an advanced and powerful visual generation model developed by Tongyi Lab of Alibaba Group
Grok 4 is xAI’s most advanced reasoning model. Excels at logical thinking and in-depth analysis. Ideal for insightful discussions and complex problem-solving.
A lower-latency image-to-video version of Hailuo 2.3 that preserves core motion quality, visual consistency, and stylization performance while enabling faster iteration cycles.
Generate 5s and 9s 720p videos, faster and cheaper than Ray 2
Wan 2.5 text-to-video, optimized for speed
Clone voices to use with Minimax's speech-02-hd and speech-02-turbo
A low cost and fast version of Hailuo 02. Generate 6s and 10s videos in 512p
Convert PDF to markdown + JSON quickly with high accuracy
Quickly generate smooth 5s or 8s videos at 540p, 720p or 1080p
Use one or two face images to create AI avatars
Generate 5s and 9s 720p videos
Generate videos using xAI's Grok Imagine Video model
Change the aspect ratio of any video up to 30 seconds long, outputs will be 720p
A speech-to-text model that uses GPT-4o to transcribe audio
Accelerated inference for Wan 2.1 14B text to video with high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
A unified Text-to-Speech demo featuring three powerful modes: Voice, Clone and Design
Leonardo AI’s first foundational model produces images up to 5 megapixels (fast, quality and ultra modes)
Add lip-sync to any video with an audio file or text
Alibaba Wan 2.5 text to video generation model
Image generation model from Reve which handles multiple input reference images
Upscale videos by 4x, up to a maximum of 4k
End-to-end AI speech model designed for natural-sounding conversational speech synthesis, with support for context-aware prosody, intonation, and emotional expression.
Reve's fast image edit model at only $0.01 per edit
Use Wan 2.2 Animate to replace a character in a video scene
Alibaba Wan 2.6 image to video generation model
Un-distilled version of FLUX.2 [klein]. A foundation model for maximum flexibility and control
Generate high-quality music and sound from text prompts
Detect and transcribe text in images with accurate bounding boxes, layout analysis, reading order, and table recognition, in 90 languages
Generate realistic lipsyncs with Sync Labs' 2.0 model
Delivers high visual fidelity with fast turnaround. Great for daily content creation, marketing teams, and iterative creative workflows.
Generate expressive, natural speech with Resemble AI's Chatterbox.
Granite-speech-3.3-8b is a compact and efficient speech-language model, specifically designed for automatic speech recognition (ASR) and automatic speech translation (AST).
OpenAI's first o-series reasoning model
A film-grade digital human model that generates realistic video from a single image, audio clip, and optional text prompt.
Studio-grade lipsync in minutes, not weeks
Moonshot AI's latest open model. It unifies vision and text, thinking and non-thinking modes, and single-agent and multi-agent execution into one model
Use Wan 2.2 Animate to copy the motion of a video to another scene
Generate high-quality 2K resolution images from text prompts
Ovi: generate videos with audio from image and text inputs
Skin – Natural beauty retouch that enhances pores and tonal variation (no plastic skin) via the Skin LoRA.
Bria GenFill enables high-quality object addition or visual transformation. Trained exclusively on licensed data for safe and risk-free commercial use.
Creative Upscale focuses on enhancing details and refining complex elements in the image. It doesn’t just increase resolution but adds depth by improving textures, fine details, and facial features.
Un-distilled version of FLUX.2 [klein]. Optimized for fine-tuning, customization, and post-training workflows
An open-source, 2B-parameter model built for real-world applications
A speech-to-text model that uses GPT-4o mini to transcribe audio
SOTA Open source model trained on licensed data, transforming intent into structured control for precise, high-quality AI image generation in enterprise and agentic workflows.
Generate 5s and 9s 540p videos
Inference model for FLUX.1 [pro] using custom `finetune_id`
Create 5s 480p videos from a text prompt
Image-to-video generation with optional audio, multi-shot narrative support, and faster inference
Put yourself in an iconic location around the world from a single image
A 20B MMDiT model for next-gen text-to-image generation
Granite-3.1-2B-Instruct is a lightweight and open-source 2B parameter model designed to excel in instruction following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.
LTX-2: The first open source audio-video model
Minimax Speech 2.8 Turbo: Turn text into natural, expressive speech with voice cloning, emotion control, and support for 40+ languages
Fast pixel art image generation
Modify a video with style transfer and prompt-based editing
Add simple filters to your images
High quality and authentic pixel art image generation
The most expressive Text to Speech model
Experience impossible adventures and extreme scenarios from a single image
Generate synced sounds for any video and return it with its new soundtrack - now enhanced in version 1.5 for improved sound synchronization and realism
Anthropic's most intelligent model with state-of-the-art coding, reasoning, and agentic capabilities
Agentic image model optimized for robust, high-precision generations supporting font control
Alibaba Wan 2.6 text to video generation model
VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video
Next Scene – “Next beat” cinematic edits that keep subject identity while steering to the next camera move via the Next Scene LoRA
Translate videos into over 150 languages
Latest video model from Pixverse with astonishing physics
SOTA image model from xAI
Compose a song from a prompt or a composition plan
Generate synced sounds for any video, and return it with its new sound track
A small model alternative to o1
Generate complex 3D models from images with Rodin Gen-2
Minimax Speech 2.8 HD focuses on high-fidelity audio generation with features like studio-grade quality, flexible emotion control, multilingual support, and voice cloning capabilities
Kimi K2 Thinking is the latest, most capable version of an open-source thinking model.
Create videos in as little as 10 seconds. 5s or 8s videos at 360p, 540p, 720p or 1080p.
Generate vivid, realistic images based on a text prompt. Excels at generating images for marketing, social media, and entertainment.
Style consistent animated pixel art sprite generation
High quality, low latency text to speech in 32 languages
Add consistent, customizable shadows to product cutouts for enhanced visual appeal
Bring your subjects into focus with FLUX.1 Kontext [pro]
Generate 5s and 10s videos in 1080p resolution at 30fps
Photo to Anime – Stylized conversion that turns photos into crisp cel-shaded anime frames using the Photo-to-Anime LoRA.
Relight – Soft, curtain-filtered relighting that repaints the scene with golden-hour or moody tones using the Relight LoRA.
All the tools you need for generating pixel art tilesets
Granite-Embedding-278M-Multilingual is a 278M parameter model from the Granite Embeddings suite that can be used to generate high quality text embeddings
The smartest, fastest, most useful model yet, with built-in thinking that puts expert-level intelligence in everyone’s hands
Use trained LoRAs from the https://replicate.com/prunaai/p-image-trainer. Find or contribute LoRAs here https://huggingface.co/collections/PrunaAI/p-image-loras
Take any shot and edit specific sections. Rephrase, change the action, camera angles and more
High-precision video upscaler optimized for portraits, faces and products. One of the upscale modes powered by Clarity AI. X: https://x.com/philz1337x
Create avatar videos with realistic humans, animals, cartoons, or stylized characters
Use trained LoRAs from the https://replicate.com/prunaai/p-image-edit-trainer. Find or contribute LoRAs here: https://huggingface.co/collections/PrunaAI/p-image-edit-loras.
Upscale – Detail-loving upscale/restore pass that sharpens textures and color fidelity with the Upscale LoRA.
The original classic DALLᐧE 2
Precise AI-powered product cutout with 256-level transparency for eCommerce
Animate any character, humans, cartoons, animals, even non-humans, from a single image + driving video
Fusion – Product/object blending that fixes perspective and lighting so the subject melts into a new background via the Fusion LoRA.
3D models with texture fidelity and geometry precision
Generate 5s and 10s videos in 720p resolution at 30fps
Transform any product photo into professional 2000x2000px packshots with optimal positioning
Agentic image model optimized for high-quality, fast generations supporting font control
A version of FLUX.2 [klein] 9B-base that supports fast fine-tuned lora inference
Generate multilingual text-to-speech audio in over 30 languages
Use flux-kontext-pro to change the first or last frame of a video. Useful to use as inputs for restyling an entire video in a certain way
Use audio input with an image or prompt to generate videos
Realistic lipsync with refined human emotion capabilities
Modify an existing video through natural-language commands, changing subjects, environments, and visual style while preserving the original motion and timing.
Fine-tunable Qwen Image model with exceptional composition abilities - train custom LoRAs for any style or subject
A version of FLUX.2 [klein] 4B-base that supports fast fine-tuned lora inference
ElevenLabs's fastest speech synthesis model
Render product images with 100% accuracy and environmental blending
Remove dust and scratches from old photos
Image colorization model from Topaz Labs
Fast LoRA trainer for p-image, a super fast text-to-image model developed by Pruna AI. Use LoRAs here: https://replicate.com/prunaai/p-image-lora. Find or contribute LoRAs here: https://huggingface.co/collections/PrunaAI/p-image
Automatically remove backgrounds from videos, perfect for creating clean, professional content without a green screen.
FIBO-Edit brings the power of structured prompt generation to image editing
Upscale videos up to 8K output resolution. Trained on fully licensed and commercially safe data.
A high-fidelity capability for erasing unwanted objects, people, or visual elements from videos while maintaining aesthetic quality and temporal consistency
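The collection above is ordered by popularity (`run_count`), the cumulative prediction count that Replicate's HTTP API exposes on each model object. A minimal sketch of reproducing that ordering locally, using illustrative model names and counts rather than live API data:

```python
# Sketch of the collection's sort order: descending run_count.
# The catalog entries below are illustrative placeholders, not real
# counts; in practice the records would come from Replicate's
# models API, whose model objects include a run_count field.
def sort_by_popularity(models: list[dict]) -> list[dict]:
    """Most-run models first; records missing a count sort last."""
    return sorted(models, key=lambda m: m.get("run_count", 0), reverse=True)

catalog = [
    {"name": "real-esrgan", "run_count": 50_000_000},
    {"name": "flux-schnell", "run_count": 400_000_000},
    {"name": "llama-3-8b-instruct", "run_count": 250_000_000},
]
ranked = sort_by_popularity(catalog)
print([m["name"] for m in ranked])
# → ['flux-schnell', 'llama-3-8b-instruct', 'real-esrgan']
```

Treating a missing `run_count` as 0 keeps newly published models at the bottom of the ranking rather than raising an error.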