AI API Playbook · 14 min read

Veo 3 API Tutorial: Generate Cinematic Video with Google’s Latest Model

Veo 3.1 generates 8-second videos at resolutions up to 4K with natively embedded audio. Generation typically takes 2–4 minutes per request; the fast model at 720p can finish in about 90 seconds, while 4K renders can take 6–8 minutes. Pricing starts at $0.35 per second of generated video for Veo 3 Fast and rises to $0.75 per second for the full Veo 3 model, so an 8-second clip costs between $2.80 and $6.00. Independent evaluations on cinematic quality benchmarks place Veo 3.1 ahead of Sora and Kling 1.6 on motion coherence and photorealism as of mid-2025.

This tutorial gets you from zero to a working video generation pipeline in Python. Every code block runs without modification once you have valid credentials.


Prerequisites

Accounts and Access

  • A Google Cloud account with billing enabled
  • Access to the Gemini API (via Google AI Studio or Vertex AI)
  • A Gemini API key with Veo 3 access enabled — note that as of mid-2025, Veo 3 is gated behind a paid tier; the free tier does not include video generation
  • Python 3.9 or higher

Install dependencies

# Install the official Google GenAI SDK
# Quote the requirement: an unquoted ">=" is treated by the shell as an output redirect
pip install "google-generativeai>=0.8.0"

# Install supporting libraries
pip install python-dotenv requests tqdm

# Verify your installation
python -c "import google.generativeai as genai; print(genai.__version__)"

Note on SDK version: The google-generativeai SDK added Veo 3 support in version 0.8.0. If you’re on an older version, the generate_videos method will not exist. Run pip install --upgrade google-generativeai if you hit AttributeError.
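If you would rather check the version up front than wait for an AttributeError, a small helper can do it. This is a minimal sketch, not part of the SDK: the function names (parse_version, meets_minimum, installed_sdk_version) are hypothetical, and the 0.8.0 threshold is simply the one quoted in the note above.

```python
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn '0.8.3' into (0, 8, 3); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.8.0") -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_sdk_version(package: str = "google-generativeai"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_sdk_version()
    if found is None:
        print("SDK is not installed")
    elif not meets_minimum(found):
        print(f"SDK {found} is older than 0.8.0; upgrade before continuing")
    else:
        print(f"SDK {found} is new enough")
```

Comparing tuples of integers rather than raw strings matters here: as strings, "0.10.1" sorts before "0.8.0", which would wrongly flag a newer release as outdated.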

Environment setup

Create a .env file in your project root. Never hardcode keys in source files.

# .env
GEMINI_API_KEY=your_api_key_here
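A common slip is copying the .env template and forgetting to replace the placeholder. The sketch below shows one way to catch that before any API call is made. The helper name read_api_key and the PLACEHOLDER constant are illustrative assumptions; the function reads from any mapping so it can be exercised without touching your real environment.

```python
import os

# The dummy value used in the .env example above
PLACEHOLDER = "your_api_key_here"


def read_api_key(env=None, name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from a mapping (defaults to os.environ).

    Raises ValueError if the key is missing, blank, or still the placeholder.
    """
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise ValueError(f"{name} is not set")
    if value == PLACEHOLDER:
        raise ValueError(f"{name} still holds the placeholder value")
    return value
```

Call load_dotenv() first in a real script so the values from .env are present in os.environ before read_api_key() runs.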

Authentication and Client Setup

# auth_setup.py
# Purpose: Configure the GenAI client once and reuse it across your application.
# The client holds connection pooling and retry logic — don't instantiate it per request.

import os
import google.generativeai as genai
from dotenv import load_dotenv

# Load environment variables from .env file
# This keeps credentials out of source control
load_dotenv()

def get_configured_client():
    """
    Configure and return the GenAI client.
    
    Returns the configured genai module (the SDK uses a module-level configuration
    pattern rather than returning a client object in version 0.8.x).
    
    Raises:
        ValueError: If GEMINI_API_KEY is not set in the environment.
    """
    api_key = os.environ.get("GEMINI_API_KEY")
    
    if not api_key:
        raise ValueError(
            "GEMINI_API_KEY environment variable is not set. "
            "Get your key from https://aistudio.google.com/apikey"
        )
    
    # Configure the SDK globally — subsequent calls inherit this config
    genai.configure(api_key=api_key)
    
    return genai

# Quick validation test — run this file directly to verify your credentials
if __name__ == "__main__":
    client = get_configured_client()
    print("Authentication configured successfully.")
    
    # List available models to confirm API access is working
    # This is a lightweight call that doesn't incur video generation costs
    models = [m.name for m in client.list_models() if "veo" in m.name.lower()]
    print(f"Available Veo models: {models}")

Run python auth_setup.py before continuing. If you get a 403, your API key either lacks Veo access or billing isn’t enabled on the associated project.


Understanding the API: Model Names and Parameters

Before writing generation code, here’s the complete parameter reference for the Veo 3 API as exposed through the Gemini SDK.

Model identifiers

| Model Name | Speed | Max Resolution | Audio | Cost per Second |
| --- | --- | --- | --- | --- |
| veo-3.0-generate-preview | Slower | 4K | Yes | $0.75 |
| veo-3.0-fast-generate-preview | ~2× faster | 1080p | Yes | $0.35 |
| veo-2.0-generate-001 | Baseline | 1080p | No | $0.35 |

Generation parameter reference

| Parameter | Type | Default | Valid Range / Options | What It Affects |
| --- | --- | --- | --- | --- |
| prompt | str | required | 1–4000 characters | The scene description; more detail = higher prompt adherence |
| negative_prompt | str | None | Up to 1000 characters | Steers generation away from described content |
| aspect_ratio | str | "16:9" | "16:9", "9:16" | Output dimensions; 9:16 targets mobile/vertical video |
| resolution | str | "720p" | "720p", "1080p", "4k" | Pixel resolution; 4K only on veo-3.0-generate-preview |
| duration_seconds | int | 8 | 5–8 | Length of generated clip in seconds |
| number_of_videos | int | 1 | 1–4 | Number of variants generated in parallel |
| person_generation | str | "allow_adult" | "allow_adult", "dont_allow" | Controls whether human figures are rendered |
| enhance_prompt | bool | True | True, False | When True, the API rewrites your prompt for better visual quality |

On enhance_prompt: This is enabled by default and generally improves output, but it means the model may generate content that diverges from your exact wording. If you need precise prompt adherence — for example, for branded content with specific visual requirements — set enhance_prompt=False and invest in manual prompt engineering.
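Because every request is billed, it can be worth checking parameters against the reference table before sending anything. The following is a rough client-side sketch, not an SDK feature: validate_params and ALLOWED_VALUES are hypothetical names, and reading the duration and variant-count ranges as 5–8 and 1–4 is an assumption taken from the table above. The API remains the final authority on what it accepts.

```python
ALLOWED_VALUES = {
    "aspect_ratio": {"16:9", "9:16"},
    "resolution": {"720p", "1080p", "4k"},
    "person_generation": {"allow_adult", "dont_allow"},
}
FULL_MODEL = "veo-3.0-generate-preview"


def validate_params(model, prompt, *, resolution="720p", aspect_ratio="16:9",
                    duration_seconds=8, number_of_videos=1,
                    negative_prompt=None, person_generation="allow_adult"):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not 1 <= len(prompt) <= 4000:
        problems.append("prompt must be 1-4000 characters")
    if negative_prompt is not None and len(negative_prompt) > 1000:
        problems.append("negative_prompt must be at most 1000 characters")

    choices = {
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "person_generation": person_generation,
    }
    for name, value in choices.items():
        if value not in ALLOWED_VALUES[name]:
            problems.append(
                f"{name}={value!r} is not one of {sorted(ALLOWED_VALUES[name])}"
            )

    # Per the table, 4K is limited to the full model
    if resolution == "4k" and model != FULL_MODEL:
        problems.append("4k is only available on " + FULL_MODEL)
    if not 5 <= duration_seconds <= 8:
        problems.append("duration_seconds must be 5-8")
    if not 1 <= number_of_videos <= 4:
        problems.append("number_of_videos must be 1-4")
    return problems
```

Returning a list of problems instead of raising on the first one lets you report every mistake in a single pass, which is handy when configurations come from a batch file.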


Core Implementation

Step 1: Basic video generation

This is the minimal working example. It generates one 8-second clip and saves it to disk.

# basic_generation.py
# Generates a single video from a text prompt.
# Run this first to confirm your API access works before adding complexity.

import os
import time
import google.generativeai as genai
from dotenv import load_dotenv

load_dotenv()
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

def generate_video_basic(prompt: str, output_path: str = "output.mp4") -> str:
    """
    Generate an 8-second video from a text prompt using Veo 3.
    
    Args:
        prompt: Scene description. Be specific about lighting, camera angle,
                and motion — these directly improve output quality.
        output_path: Where to save the mp4 file.
    
    Returns:
        Path to the saved video file.
    """
    # Use the ImageGenerationModel interface — Veo is accessed through
    # the same client as Imagen, just with a video-specific model name
    model = genai.ImageGenerationModel("veo-3.0-generate-preview")
    
    # This call is ASYNCHRONOUS — it returns an operation object immediately
    # The actual generation happens server-side and takes 2–4 minutes
    operation = model.generate_videos(
        prompt=prompt,
        # Explicitly set 720p to minimize cost during testing
        # Switch to "1080p" or "4k" for production renders
        resolution="720p",
        number_of_videos=1,
    )
    
    print(f"Generation started. Operation name: {operation.operation.name}")
    print("Polling for completion (this typically takes 2–4 minutes)...")
    
    # Poll until the operation completes
    # The SDK does not block automatically — you must poll manually
    while not operation.done:
        time.sleep(10)  # Poll every 10 seconds to avoid rate limiting
        operation.refresh()
        print(".", end="", flush=True)
    
    print("\nGeneration complete.")
    
    # The operation result contains a list of generated videos
    # Each video object has a .video property with the raw bytes
    video_data = operation.result.generated_videos[0].video
    
    # Write the video bytes to disk
    with open(output_path, "wb") as f:
        f.write(video_data._data)
    
    print(f"Video saved to: {output_path}")
    return output_path


if __name__ == "__main__":
    # A concrete, detailed prompt performs significantly better than a vague one.
    # Include: subject, action, environment, lighting, camera behavior.
    prompt = (
        "A lone astronaut walks across a rust-red Martian plain at golden hour. "
        "The camera tracks slowly from behind at shoulder height. "
        "Dust particles catch the low sunlight. The horizon curves slightly. "
        "Cinematic color grading, anamorphic lens flare."
    )
    
    generate_video_basic(prompt, "mars_walk.mp4")

Step 2: Production-ready generation with error handling and retry logic

The basic version has no protection against real-world conditions: rate limits and transient network errors raise unhandled exceptions, and a stuck operation keeps the polling loop running indefinitely. This version is the one to deploy.

# production_generation.py
# Production-grade Veo 3 generation with retry logic, timeout handling,
# progress tracking, and structured error reporting.

import os
import time
import logging
from dataclasses import dataclass
from typing import Optional
import google.generativeai as genai
from google.api_core import exceptions as google_exceptions
from dotenv import load_dotenv

load_dotenv()
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger(__name__)

genai.configure(api_key=os.environ["GEMINI_API_KEY"])


@dataclass
class VideoGenerationConfig:
    """
    Typed configuration for a single generation job.
    Using a dataclass instead of raw dicts catches typos at definition time,
    not at runtime when the API call fails.
    """
    prompt: str
    model: str = "veo-3.0-generate-preview"
    resolution: str = "720p"
    aspect_ratio: str = "16:9"
    duration_seconds: int = 8
    number_of_videos: int = 1
    negative_prompt: Optional[str] = None
    enhance_prompt: bool = True
    person_generation: str = "allow_adult"


@dataclass
class GenerationResult:
    success: bool
    output_paths: list[str]
    operation_name: str
    elapsed_seconds: float
    error_message: Optional[str] = None


def generate_video_production(
    config: VideoGenerationConfig,
    output_dir: str = ".",
    poll_interval: int = 15,
    timeout_seconds: int = 600,  # 10 minutes is a safe upper bound for Veo 3
    max_retries: int = 3,
) -> GenerationResult:
    """
    Generate video(s) from a VideoGenerationConfig with full error handling.
    
    Retry logic covers: transient 503s, rate limit 429s (with backoff).
    Does NOT retry: 400 (bad prompt), 403 (auth), 404 (invalid model).
    These are developer errors that need code changes, not retries.
    """
    model = genai.ImageGenerationModel(config.model)
    
    # Build kwargs dict — only pass optional params if they're set,
    # because passing None to the SDK can trigger unexpected behavior
    generation_kwargs = {
        "prompt": config.prompt,
        "resolution": config.resolution,
        "aspect_ratio": config.aspect_ratio,
        # Pass the configured clip length; otherwise the field is silently ignored
        "duration_seconds": config.duration_seconds,
        "number_of_videos": config.number_of_videos,
        "enhance_prompt": config.enhance_prompt,
        "person_generation": config.person_generation,
    }
    
    if config.negative_prompt:
        generation_kwargs["negative_prompt"] = config.negative_prompt
    
    start_time = time.time()
    operation = None
    
    # Retry loop for the initial API call (not the polling)
    for attempt in range(max_retries):
        try:
            operation = model.generate_videos(**generation_kwargs)
            logger.info(f"Operation started: {operation.operation.name}")
            break
        except (google_exceptions.ResourceExhausted,
                google_exceptions.ServiceUnavailable) as e:
            # 429 rate limit or transient 503: back off exponentially before retrying
            wait = 2 ** attempt * 30  # 30s, 60s, 120s
            logger.warning(f"Retryable error (attempt {attempt + 1}): {e}. Waiting {wait}s...")
            time.sleep(wait)
        except google_exceptions.InvalidArgument as e:
            # 400 error — the prompt or parameters are invalid, don't retry
            logger.error(f"Invalid request: {e}. Check your prompt and parameters.")
            return GenerationResult(
                success=False, output_paths=[], operation_name="",
                elapsed_seconds=time.time() - start_time,
                error_message=f"InvalidArgument: {e}"
            )
        except google_exceptions.PermissionDenied as e:
            # 403 — API key is wrong or lacks Veo access, don't retry
            logger.error(f"Permission denied: {e}. Check your API key and billing.")
            return GenerationResult(
                success=False, output_paths=[], operation_name="",
                elapsed_seconds=time.time() - start_time,
                error_message=f"PermissionDenied: {e}"
            )
    
    if operation is None:
        return GenerationResult(
            success=False, output_paths=[], operation_name="",
            elapsed_seconds=time.time() - start_time,
            error_message="Failed to start operation after max retries"
        )
    
    # Poll for completion with a hard timeout
    deadline = start_time + timeout_seconds
    while not operation.done:
        if time.time() > deadline:
            logger.error(f"Operation timed out after {timeout_seconds}s")
            return GenerationResult(
                success=False, output_paths=[],
                operation_name=operation.operation.name,
                elapsed_seconds=time.time() - start_time,
                error_message=f"Timeout after {timeout_seconds}s"
            )
        
        elapsed = int(time.time() - start_time)
        logger.info(f"Waiting... {elapsed}s elapsed")
        time.sleep(poll_interval)
        
        try:
            operation.refresh()
        except google_exceptions.ServiceUnavailable:
            # 503 during polling is usually transient — log and continue
            logger.warning("503 during poll, will retry next interval")
    
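    # Output filtering can reject every variant, so a completed operation
    # may contain zero videos. Report that instead of returning an empty success.
    if not operation.result.generated_videos:
        logger.error("Operation completed but returned no videos (likely content policy rejection)")
        return GenerationResult(
            success=False, output_paths=[],
            operation_name=operation.operation.name,
            elapsed_seconds=time.time() - start_time,
            error_message="No videos returned (likely content policy rejection)"
        )

    # Save all generated videos
    os.makedirs(output_dir, exist_ok=True)
    output_paths = []
    timestamp = int(time.time())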
    
    for idx, generated_video in enumerate(operation.result.generated_videos):
        filename = f"veo3_{timestamp}_{idx}.mp4"
        filepath = os.path.join(output_dir, filename)
        
        with open(filepath, "wb") as f:
            f.write(generated_video.video._data)
        
        output_paths.append(filepath)
        logger.info(f"Saved video {idx + 1}: {filepath}")
    
    elapsed = time.time() - start_time
    logger.info(f"Generation complete. {len(output_paths)} video(s) in {elapsed:.1f}s")
    
    return GenerationResult(
        success=True,
        output_paths=output_paths,
        operation_name=operation.operation.name,
        elapsed_seconds=elapsed,
    )


if __name__ == "__main__":
    config = VideoGenerationConfig(
        prompt=(
            "Extreme close-up of a hummingbird hovering at a red flower. "
            "Wings blur in slow motion. Water droplets on petals catch sunlight. "
            "Shallow depth of field, macro lens, morning light."
        ),
        negative_prompt="blurry, low quality, distorted, cartoon, anime",
        resolution="1080p",
        number_of_videos=2,  # Generate 2 variants to pick the best one
        enhance_prompt=False,  # Disable enhancement for precise prompt control
    )
    
    result = generate_video_production(config, output_dir="./videos")
    
    if result.success:
        print(f"Success! Files: {result.output_paths}")
        print(f"Time: {result.elapsed_seconds:.1f}s")
    else:
        print(f"Failed: {result.error_message}")
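The polling loop above is hard to unit test because it sleeps for real and talks to a live operation. One way around that, sketched here under assumptions, is to pull the loop into a helper that takes the "is it done?" check, the refresh call, the clock, and the sleep function as arguments. The name wait_until_done and its signature are hypothetical; nothing here comes from the SDK.

```python
import time


def wait_until_done(is_done, refresh, poll_interval=15.0, timeout=600.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until is_done() returns True or the timeout elapses.

    is_done: zero-argument callable reporting completion.
    refresh: zero-argument callable that updates the operation state.
    Returns True on completion, False on timeout.
    """
    deadline = clock() + timeout
    while not is_done():
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline
        sleep(min(poll_interval, remaining))
        refresh()
    return True
```

With the tutorial's operation object, a call might look like wait_until_done(lambda: operation.done, operation.refresh). The sketch uses time.monotonic rather than time.time so that a system clock adjustment during a long render cannot shorten or extend the timeout.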

Error Handling Reference

These are the errors you will actually encounter in production. The official documentation under-specifies error conditions, so this table is based on observed API behavior.

| HTTP Code | Exception Class | Root Cause | Solution |
| --- | --- | --- | --- |
| 400 | InvalidArgument | Prompt contains restricted content, or parameters are out of range | Review person_generation setting; check prompt for policy violations |
| 403 | PermissionDenied | API key invalid, billing not enabled, or Veo 3 not in your tier | Enable billing; confirm your key has Veo API scope at AI Studio |
| 404 | NotFound | Model name string is wrong or model isn’t available in your region | Use exact model IDs from the parameter table above |
| 429 | ResourceExhausted | Rate limit exceeded; Veo 3 has strict QPM limits | Implement exponential backoff; default limit is 2 QPM per project |
| 500 | InternalServerError | Server-side generation failure (usually transient) | Retry with same parameters after 60 seconds |
| 503 | ServiceUnavailable | Server overloaded | Retry with exponential backoff; usually resolves within 5 minutes |
| N/A | DeadlineExceeded | Generation took longer than your client timeout | Increase timeout_seconds; 4K at max quality can take 8+ minutes |
| N/A | OperationError | The operation completed but generation failed (content policy) | Inspect operation.result for a policy rejection message |
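The table above boils down to a simple decision: retry or give up, and if retrying, how long to wait. Here is one possible encoding of that decision as plain Python. The function name retry_delay is hypothetical, and the 30-second backoff base is borrowed from the production code earlier in this tutorial rather than from any documented limit.

```python
# Status codes the table treats as worth retrying with exponential backoff
BACKOFF_CODES = {429, 503}
# The table suggests a flat 60-second wait for 500s
FIXED_DELAY_CODES = {500: 60.0}


def retry_delay(status_code: int, attempt: int, base: float = 30.0):
    """Return seconds to wait before retrying, or None if retrying won't help.

    attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    if status_code in BACKOFF_CODES:
        return base * (2 ** attempt)  # 30s, 60s, 120s, ...
    if status_code in FIXED_DELAY_CODES:
        return FIXED_DELAY_CODES[status_code]
    # 400, 403, 404 and anything unrecognized need a code or config change
    return None
```

Treating unknown codes as non-retryable is a deliberately conservative choice: it avoids burning paid requests on a failure mode you have not looked at yet.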

Content policy failures

Veo 3 has strict content safety filters that run both on the prompt and the generated output. A generation can fail at two points: prompt evaluation (immediate 400) or output filtering (operation completes with an error state rather than a video). Always check operation.result.generated_videos length before indexing — a successful operation can return zero videos if all variants were filtered.

# Always guard against empty results
results = operation.result.generated_videos
if not results:
    logger.error("Operation completed but returned no videos — likely content policy rejection")
    # Log the prompt for manual review; do not auto-retry policy failures

Performance and Cost Reference

Numbers based on Veo 3 API pricing as of mid-2025. All costs are per second of generated output, not per API call.

| Model | Resolution | Cost/Second | Cost per 8s Clip | Avg. Generation Time | Use Case |
| --- | --- | --- | --- | --- | --- |
| veo-3.0-generate-preview | 720p | $0.75 | $6.00 | ~3 min | Prototyping, testing prompts |
| veo-3.0-generate-preview | 1080p | $0.75 | $6.00 | ~4 min | Standard production |
| veo-3.0-generate-preview | 4K | $0.75 | $6.00 | ~6–8 min | Broadcast / high-end delivery |
| veo-3.0-fast-generate-preview | 720p | $0.35 | $2.80 | ~90 sec | Rapid iteration, preview renders |
| veo-3.0-fast-generate-preview | 1080p | $0.35 | $2.80 | ~2 min | Cost-sensitive production |

Cost note: Resolution does not affect the per-second rate for Veo 3 — you pay the same $0.75/s for 720p and 4K on the full model. The cost decision is model tier (full vs. fast), not resolution. Run your iteration cycles on veo-3.0-fast-generate-preview and switch to the full model only for final renders.
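To make the per-second pricing concrete, here is a small estimator built from the rates quoted in this section. It is a sketch with one notable assumption: that each variant requested through number_of_videos is billed separately at the same per-second rate. The names PRICE_PER_SECOND and estimate_cost are illustrative only.

```python
# Rates as quoted in this tutorial (USD per generated second, mid-2025)
PRICE_PER_SECOND = {
    "veo-3.0-generate-preview": 0.75,
    "veo-3.0-fast-generate-preview": 0.35,
}


def estimate_cost(model: str, duration_seconds: int = 8,
                  number_of_videos: int = 1) -> float:
    """Estimate the USD cost of one request.

    Assumes every variant is billed per generated second; resolution
    is ignored because it does not change the rate.
    """
    rate = PRICE_PER_SECOND[model]
    return round(rate * duration_seconds * number_of_videos, 2)


if __name__ == "__main__":
    for model in PRICE_PER_SECOND:
        print(f"{model}: ${estimate_cost(model):.2f} per 8s clip")
```

Under that billing assumption, asking for two variants on the full model doubles the bill to $12.00 per request, which is worth remembering before raising number_of_videos during prompt iteration.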

When not to use Veo 3

  • Clips longer than 8 seconds: Veo 3 is capped at 8 seconds. For longer content, you’ll need to stitch multiple generations — which creates consistency problems at cut points.
  • Real-time or near-real-time applications: even the fast model at 720p averages about 90 seconds per clip, which rules out any UX where users expect feedback in under 30 seconds.
  • Very high volume at low cost: At $0.35–$0.75 per second, generating a 5-minute video via clips costs $105–$225. For bulk content, evaluate whether the quality delta over cheaper alternatives justifies the price.
  • Precise motion control: Veo 3 interprets camera and motion descriptions probabilistically. If you need frame-accurate control (e.g., product demos showing specific UI interactions), this model is not the right tool.
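The arithmetic behind the long-video bullet can be sketched as a quick planning helper. It assumes every clip is generated at the full 8 seconds, so the result is an upper bound that comes out slightly above the $105–$225 figure, which prices exactly 300 seconds. The function name plan_stitched_video is hypothetical.

```python
import math

MAX_CLIP_SECONDS = 8  # per-clip cap described above


def plan_stitched_video(total_seconds: float, rate_per_second: float):
    """Return (clips, billed_seconds, cost_usd) for a stitched video.

    Assumes each clip is rendered at the full 8 seconds, so the final
    clip may be longer than strictly needed.
    """
    clips = math.ceil(total_seconds / MAX_CLIP_SECONDS)
    billed_seconds = clips * MAX_CLIP_SECONDS
    cost = round(billed_seconds * rate_per_second, 2)
    return clips, billed_seconds, cost
```

For a 5-minute target, plan_stitched_video(300, 0.35) works out to 38 clips and 304 billed seconds. That also means 37 cut points where visual consistency has to be managed, which is often the bigger cost.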

Prompt Engineering for Veo 3

Prompt quality is the highest-leverage variable in output quality. Vague prompts produce mediocre results regardless of resolution.

A prompt that consistently performs well includes four elements:

  1. Subject: what is in the frame and what is it doing
  2. Environment: location, time of day, weather, context
  3. Camera: angle, movement, lens characteristics
  4. Aesthetic: color grade, film stock reference, mood

Weak prompt: "A car driving on a road at night"

Strong prompt: "A matte-black sports car accelerates through a rain-slicked mountain road at 2am. The camera follows at bumper height, tracking the red tail lights. Neon reflections from a tunnel streak across wet asphalt. High contrast, desaturated blues, cinematic anamorphic."

The second prompt consistently produces clips that match intent. The first produces random interpretations of a generic scene.

Set enhance_prompt=False when you’ve invested in prompt engineering — you don’t want the model overwriting your precise language. Use enhance_prompt=True (the default) when prototyping quickly and you want the model to compensate for sparse descriptions.
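If you generate prompts programmatically, the four-element structure can be enforced in code. The helper below is one illustrative way to do that; build_prompt is a hypothetical name, and the 4000-character ceiling is the limit listed in the parameter reference earlier.

```python
MAX_PROMPT_CHARS = 4000  # prompt limit from the parameter reference


def build_prompt(subject: str, environment: str, camera: str, aesthetic: str) -> str:
    """Join the four prompt elements into one prompt string.

    Raises ValueError if any element is blank or the result is too long.
    """
    parts = [subject, environment, camera, aesthetic]
    if any(not part.strip() for part in parts):
        raise ValueError("subject, environment, camera, and aesthetic are all required")

    # Normalize each element into a sentence ending with a single period
    sentences = [part.strip().rstrip(".") + "." for part in parts]
    prompt = " ".join(sentences)

    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt is {len(prompt)} characters; limit is {MAX_PROMPT_CHARS}")
    return prompt


if __name__ == "__main__":
    print(build_prompt(
        subject="A matte-black sports car accelerates through a rain-slicked mountain road at 2am",
        environment="Neon reflections from a tunnel streak across wet asphalt",
        camera="The camera follows at bumper height, tracking the red tail lights",
        aesthetic="High contrast, desaturated blues, cinematic anamorphic",
    ))
```

Requiring all four elements up front makes a sparse prompt fail locally, before it costs a generation.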


Conclusion

The Veo 3 API is accessible through the standard Gemini SDK using veo-3.0-generate-preview or veo-3.0-fast-generate-preview, with generation times between 90 seconds and 8 minutes depending on model and resolution. The production code in this tutorial handles the two most common failure modes — rate limiting and content policy rejections — that the official documentation doesn’t address. Run the fast model for iteration at $2.80 per clip, then switch to the full model only for your final renders.


Frequently Asked Questions

How much does the Veo 3 API cost per video generation?

Veo 3 API pricing is tiered by model variant. Veo 3 Fast costs $0.35 per second of generated video, while the full Veo 3 model costs $0.75 per second. For the standard 8-second clip, that translates to $2.80 per request on Veo 3 Fast and $6.00 per request on the full Veo 3 model. There is no flat per-request fee: you pay strictly based on output video duration, so longer clips scale linearly in cost.

How long does Veo 3 API take to generate a video?

Veo 3 and Veo 3.1 have a generation latency of approximately 2–4 minutes per request, depending on the resolution selected. Lower resolutions trend toward the 2-minute end, while 4K output can push closer to 4 minutes. This is an asynchronous operation, meaning your API call will poll or use a callback rather than receiving an immediate synchronous response. Developers should design their pipeline

How does Veo 3.1 compare to Sora and Kling 1.6 on video quality benchmarks?

As of mid-2025, independent cinematic quality evaluations place Veo 3.1 ahead of both OpenAI Sora and Kling 1.6 specifically on motion coherence and photorealism benchmark categories. Veo 3.1 also differentiates itself by supporting natively embedded audio in generated video, a feature not present in Sora or Kling 1.6 at the time of comparison. Maximum output resolution for Veo 3.1 is 4K, which al

What are the API access requirements to use Veo 3 in a Python project?

To call the Veo 3 API you need three things: (1) a Google Cloud account with billing enabled, (2) access to the Gemini API via either Google AI Studio or Vertex AI, and (3) a Gemini API key with Veo 3 access explicitly enabled — note that as of mid-2025 this access is not automatically granted and must be requested separately. On the SDK side, you install the Google GenAI Python client and authent
