Google Veo 3.1 Official on Vidgo API with Lite, Fast, and Quality tiers for text-to-video, image-to-video, first/last-frame, reference, native audio, and 4K workflows.

Happy Horse

Per Request: $0.08

Save 43%

Alibaba Happy Horse video generation and editing on Vidgo API with text-to-video, image-to-video, reference-to-video, and video-edit workflows.

GPT Image 2

OpenAI image generation and editing with quality tiers, flexible size controls, and single-image outputs up to 4K.

Seedance 2

Sora 2 Official

Per Request: $0.24

Sora 2 Official with synced audio, improved physics, optional reference image input, fixed 4-second to 20-second tiers, and Pro Official resolution-based output.

Kling 3.0 Motion Control

Save 65%

Kling 3.0 Motion Control is a reference-driven motion transfer model that combines one character image and one source video with transparent per-second pricing.

Kling 2.6 Motion Control

Save 30%

Kuaishou's motion control model that transfers motion from reference videos to character images while maintaining identity and adapting environments.

Wan 2.6

Per Request: $0.40

Save 20%

Alibaba's Wan 2.6 video generation family for text-to-video, image-to-video, and video-to-video with multi-shot 1080p output.

GPT Image 1.5

Save 23%

OpenAI's latest image model with 4x speed, precision editing, and superior text rendering.

Hailuo 02

Save 22%

MiniMax's video model, ranked #2 globally, with NCR architecture, ultra-realistic physics, and 1080p cinematic output.

Seedance 1.0 Pro

Save 83%

ByteDance's #1 ranked video model with multi-shot storytelling, cinema-grade motion, and bilingual text-to-video generation.

Z-Image

Alibaba's efficient 6B-parameter image model with sub-second generation and exceptional Chinese-English bilingual text rendering capabilities.

Kling 2.6

Per Request: $0.33

Kuaishou's revolutionary video model that simultaneously generates visuals with synchronized dialogue, sound effects, and ambient audio in one pass.

Seedream 4.5

Save 37.5%

ByteDance's unified 4K image generation and editing model with professional-grade text rendering and commercial photography quality.

Seedream 5.0 Lite

ByteDance Seedream 5.0 Lite family on Vidgo API: seedream-5.0-lite for text-to-image and seedream-5.0-lite-edit for image-to-image editing, with 2K/3K presets, custom sizes, and up to 10 reference images.

FLUX.2

Black Forest Labs' production-grade model combining 4MP image generation and editing with multi-reference support, precise typography, and hex color control.

Wan Animate

Alibaba's 14B-parameter character animation model that transfers motion from reference videos to static characters with exceptional identity preservation.

GPT-4o Image

Save 80%

OpenAI's native multimodal image generator with exceptional text rendering, precise prompt following, and conversational editing capabilities.

Nano Banana

Save 36%

Google's leaderboard-topping image model (Gemini 2.5 Flash) excelling in natural language editing, character consistency, and multi-image blending.

Nano Banana Pro

Per Request: $0.05

Save 83%

Google's Nano Banana Pro image model powered by Gemini 3 Pro for 4K generation, strong text rendering, multi-image blending, and production-ready image editing.

Sora 2

Per Request: $0.15

Save 85%

OpenAI's advanced video model with realistic physics simulation, synchronized audio generation, and an innovative Cameo feature for personalized content.

Sora 2 Pro

Per Request: $0.50

Save 95%

Premium Sora 2 variant delivering professional-grade 1024p video with enhanced fidelity, extended duration, and sophisticated audio-visual coherence.

Veo 3.1

Google DeepMind's 1080p video model with native audio generation, scene extension to 60+ seconds, and advanced creative controls for cinematic storytelling.

Suno v5

Save 90%

Suno's most advanced AI music model with studio-quality audio, authentic vocals, 10x faster generation, and support for tracks up to 8 minutes long.

Suno Music

Save 90%

AI music generator with customizable styles, vocals, and full creative control over musical characteristics and quality.

Extend Music

Extend or modify existing music tracks by creating sequels based on source audio. Supports custom mode with full parameter control or simple mode inheriting original parameters. Specify continuation points and maintain style consistency across extensions up to 8 minutes.

Upload and Cover Audio

Transform audio tracks into new styles while preserving original melodies. Upload your audio files (up to 2 minutes) and convert them with AI-powered style transfer. Supports custom and simplified modes with vocal/instrumental options and audio weight controls.

Upload and Extend Audio

Upload audio files and extend them while maintaining the original style and characteristics. AI generates seamless continuations from specified time points. Supports multiple model versions with style weight and creative controls for natural extensions.

Add Instrumental

Generate musical accompaniment for uploaded audio files containing vocals or melodies. AI creates matching instrumental backing tracks with customizable style tags, genre preferences, and quality controls. Perfect for adding professional-quality backing to vocal recordings.

Add Vocals

Layer AI-generated vocals onto existing instrumental tracks. Provide lyrics or descriptions and the API generates matching vocal performances with customizable gender, style, and expression. Transform instrumental music into complete songs with professional AI singing.

Get Timestamped Lyrics

Retrieve lyrics synchronized with precise timestamps from generated music. Returns word-by-word timing data, waveform visualization, and alignment accuracy scores. Essential for karaoke applications, lyric videos, and music synchronization projects.

Boost Music Style

AI-enhanced music style description generator. Transform simple style inputs like 'pop, mysterious' into detailed, comprehensive musical descriptions. Optimize your prompts for better music generation results with enriched genre, mood, and instrumentation details.

Generate Music Cover

Generate alternative cover versions of existing music tracks. Create variations with automatic style changes while maintaining the essence of the original composition. Perfect for producing multiple versions or exploring different interpretations of your generated music.

Replace Section

Per Request: $0.05

Replace specific sections of generated music tracks with precision timing control. Modify choruses, verses, or any segment by specifying start and end times. Maintains overall coherence while allowing targeted changes to lyrics, style, or musical elements.

Generate Persona

Create reusable music personas from existing audio tracks. Save distinctive vocal characteristics, musical styles, and personality traits for consistent use across multiple generations. Build your own AI artist profiles for brand consistency and style continuity.

Generate Lyrics

AI-powered lyrics generation based on themes, moods, and descriptions. Create original song lyrics from simple prompts of up to 200 characters. Generate creative, coherent lyrics for any genre or emotional tone with professional songwriting quality.

Convert to WAV

Obtain high-quality WAV format files from your generated music. Convert any PoYo-generated audio to lossless WAV format for professional use, further editing, or high-fidelity playback. Essential for production workflows requiring uncompressed audio.

Vocal Remover

Per Request: $0.07

Separate vocals from instrumentals or split audio into multiple stem tracks. Two modes available: vocal separation for isolating vocals and backing tracks, or stem splitting for extracting drums, bass, vocals, and other instruments individually. Professional-grade audio source separation.

AI Music Video

Generate visualized music videos from audio tracks. Create engaging visual content automatically synchronized to your music with optional author attribution and brand watermarks. Perfect for social media content, promotional materials, and music distribution.

Hailuo 2.3

Per Request: $0.18

MiniMax's Hailuo 2.3 video model for realistic human motion, expressive characters, and text-to-video or first-frame guided generation at 768p and 1080p.

Kling 1.6

Cost-effective Kling 1.6 access on Vidgo API for realistic video generation, text-to-video, image-to-video, Pro first/last-frame control, and Elements reference-image workflows.

Kling 2.1

Per Request: $0.15

Kling 2.1 on PoYo provides Standard and Pro image-to-video modes with 5-second and 10-second clips, start-frame control, and optional end-frame control in Pro.

Kling 2.5 Turbo Pro

Per Request: $0.21

Kling 2.5 Turbo Pro is a flexible short-form video model with text-to-video, optional frame guidance, smooth motion, cinematic depth, and fixed 5-second and 10-second tiers.

Wan 2.2 Fast

Wan 2.2 Fast provides fast text-to-video and image-to-video generation with low-cost 480p and 720p tiers for quick iteration.

Wan 2.5

Per Request: $0.15

Wan 2.5 combines text-to-video and image-to-video generation with 5-second and 10-second output, synchronized audio support, and multiple size and resolution tiers.

Runway Gen-4.5

Per Request: $0.38

Runway Gen-4.5 is a high-fidelity video model focused on prompt adherence, cinematic motion, visual fidelity, and optional reference image guidance.

Nano Banana 2

Google's next-gen image model powered by Gemini 3.1 Flash with native 2K/4K resolution, chain-of-thought reasoning, precise multi-language text rendering, and up to 14 reference images.

Kling 3.0

Per Request: $0.14

Kuaishou's most advanced video model with native 4K/60fps output, multi-shot storyboarding, multilingual audio, and character consistency for up to 3 people.

Seedance 1.5 Pro

Save 24%

ByteDance's latest video model with synchronized audio generation, flexible aspect ratios, and enhanced motion control.

Grok Imagine

Save 57%

xAI's Aurora-powered visual AI for image generation and video creation with Fun, Normal, and Spicy creative modes.

Seedream 4