Veo 3 API Tutorial: Generate Cinematic Video with Google’s Latest Model
Veo 3.1 generates 8-second videos at resolutions up to 4K with natively embedded audio. Generation time runs approximately 2–4 minutes per request depending on resolution. Pricing starts at $0.35 per second of generated video for Veo 3 Fast, rising to $0.75 per second for the full Veo 3 model — meaning an 8-second clip costs between $2.80 and $6.00. Independent evaluations on cinematic quality benchmarks place Veo 3.1 ahead of Sora and Kling 1.6 on motion coherence and photorealism as of mid-2025.
This tutorial gets you from zero to a working video generation pipeline in Python. Every code block runs without modification once you have valid credentials.
Prerequisites
Accounts and Access
- A Google Cloud account with billing enabled
- Access to the Gemini API (via Google AI Studio or Vertex AI)
- A Gemini API key with Veo 3 access enabled — note that as of mid-2025, Veo 3 is gated behind a paid tier; the free tier does not include video generation
- Python 3.9 or higher
Install dependencies
# Install the official Google GenAI SDK
pip install "google-generativeai>=0.8.0"
# Install supporting libraries
pip install python-dotenv requests tqdm
# Verify your installation
python -c "import google.generativeai as genai; print(genai.__version__)"
Note on SDK version: The `google-generativeai` SDK added Veo 3 support in version 0.8.0. If you're on an older version, the `generate_videos` method will not exist. Run `pip install --upgrade google-generativeai` if you hit `AttributeError`.
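If you'd rather fail fast at startup than hit that `AttributeError` mid-pipeline, a small version guard works. This is a sketch that assumes `__version__` is a plain dotted `major.minor.patch` string (pre-release suffixes like `rc1` would need extra handling); `version_tuple` and `sdk_supports_veo` are illustrative helpers, not SDK functions:

```python
# Guard against running on a pre-0.8.0 SDK that lacks generate_videos.
# Assumes __version__ is a plain "major.minor.patch" string.

MIN_SDK = (0, 8, 0)

def version_tuple(version: str) -> tuple[int, ...]:
    # Compare numerically: string comparison would wrongly rank
    # "0.10.0" below "0.8.0".
    return tuple(int(part) for part in version.split(".")[:3])

def sdk_supports_veo(version: str) -> bool:
    return version_tuple(version) >= MIN_SDK

# Example check at import time:
# import google.generativeai as genai
# if not sdk_supports_veo(genai.__version__):
#     raise RuntimeError("Upgrade: pip install --upgrade google-generativeai")
```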
Environment setup
Create a .env file in your project root. Never hardcode keys in source files.
# .env
GEMINI_API_KEY=your_api_key_here
Authentication and Client Setup
# auth_setup.py
# Purpose: Configure the GenAI client once and reuse it across your application.
# The client holds connection pooling and retry logic — don't instantiate it per request.
import os
import google.generativeai as genai
from dotenv import load_dotenv
# Load environment variables from .env file
# This keeps credentials out of source control
load_dotenv()
def get_configured_client():
"""
Configure and return the GenAI client.
Returns the configured genai module (the SDK uses a module-level configuration
pattern rather than returning a client object in version 0.8.x).
Raises:
ValueError: If GEMINI_API_KEY is not set in the environment.
"""
api_key = os.environ.get("GEMINI_API_KEY")
if not api_key:
raise ValueError(
"GEMINI_API_KEY environment variable is not set. "
"Get your key from https://aistudio.google.com/apikey"
)
# Configure the SDK globally — subsequent calls inherit this config
genai.configure(api_key=api_key)
return genai
# Quick validation test — run this file directly to verify your credentials
if __name__ == "__main__":
client = get_configured_client()
print("Authentication configured successfully.")
# List available models to confirm API access is working
# This is a lightweight call that doesn't incur video generation costs
models = [m.name for m in client.list_models() if "veo" in m.name.lower()]
print(f"Available Veo models: {models}")
Run python auth_setup.py before continuing. If you get a 403, your API key either lacks Veo access or billing isn’t enabled on the associated project.
Understanding the API: Model Names and Parameters
Before writing generation code, here’s the complete parameter reference for the Veo 3 API as exposed through the Gemini SDK.
Model identifiers
| Model Name | Speed | Max Resolution | Audio | Cost per Second |
|---|---|---|---|---|
| veo-3.0-generate-preview | Slower | 4K | Yes | $0.75 |
| veo-3.0-fast-generate-preview | ~2× faster | 1080p | Yes | $0.35 |
| veo-2.0-generate-001 | Baseline | 1080p | No | $0.35 |
Generation parameter reference
| Parameter | Type | Default | Valid Range / Options | What It Affects |
|---|---|---|---|---|
| prompt | str | required | 1–4000 characters | The scene description; more detail = higher prompt adherence |
| negative_prompt | str | None | Up to 1000 characters | Steers generation away from described content |
| aspect_ratio | str | "16:9" | "16:9", "9:16" | Output dimensions; 9:16 targets mobile/vertical video |
| resolution | str | "720p" | "720p", "1080p", "4k" | Pixel resolution; 4K only on veo-3.0-generate-preview |
| duration_seconds | int | 8 | 5–8 | Length of generated clip in seconds |
| number_of_videos | int | 1 | 1–4 | Number of variants generated in parallel |
| person_generation | str | "allow_adult" | "allow_adult", "dont_allow" | Controls whether human figures are rendered |
| enhance_prompt | bool | True | True, False | When True, the API rewrites your prompt for better visual quality |
On `enhance_prompt`: This is enabled by default and generally improves output, but it means the model may generate content that diverges from your exact wording. If you need precise prompt adherence — for example, for branded content with specific visual requirements — set `enhance_prompt=False` and invest in manual prompt engineering.
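The ranges in the table can be enforced client-side before you spend money on a request. A minimal sketch of that idea — `validate_params` is a hypothetical helper mirroring the table above, not part of the SDK:

```python
# Client-side sanity checks mirroring the parameter table.
# Catching an out-of-range value here avoids a billed round trip
# that would fail with a 400 InvalidArgument anyway.

VALID_ASPECT_RATIOS = {"16:9", "9:16"}
VALID_RESOLUTIONS = {"720p", "1080p", "4k"}

def validate_params(prompt: str, resolution: str = "720p",
                    aspect_ratio: str = "16:9", duration_seconds: int = 8,
                    number_of_videos: int = 1) -> list[str]:
    """Return a list of human-readable problems; an empty list means the
    parameters are valid per the reference table."""
    problems = []
    if not 1 <= len(prompt) <= 4000:
        problems.append(f"prompt length {len(prompt)} outside 1-4000")
    if resolution not in VALID_RESOLUTIONS:
        problems.append(f"resolution {resolution!r} not in {sorted(VALID_RESOLUTIONS)}")
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        problems.append(f"aspect_ratio {aspect_ratio!r} not in {sorted(VALID_ASPECT_RATIOS)}")
    if not 5 <= duration_seconds <= 8:
        problems.append(f"duration_seconds {duration_seconds} outside 5-8")
    if not 1 <= number_of_videos <= 4:
        problems.append(f"number_of_videos {number_of_videos} outside 1-4")
    return problems
```

Call this before building your generation kwargs and surface the problem list to the caller instead of letting the API reject the request.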
Core Implementation
Step 1: Basic video generation
This is the minimal working example. It generates one 8-second clip and saves it to disk.
# basic_generation.py
# Generates a single video from a text prompt.
# Run this first to confirm your API access works before adding complexity.
import os
import time
import google.generativeai as genai
from dotenv import load_dotenv
load_dotenv()
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
def generate_video_basic(prompt: str, output_path: str = "output.mp4") -> str:
"""
Generate an 8-second video from a text prompt using Veo 3.
Args:
prompt: Scene description. Be specific about lighting, camera angle,
and motion — these directly improve output quality.
output_path: Where to save the mp4 file.
Returns:
Path to the saved video file.
"""
# Use the ImageGenerationModel interface — Veo is accessed through
# the same client as Imagen, just with a video-specific model name
model = genai.ImageGenerationModel("veo-3.0-generate-preview")
# This call is ASYNCHRONOUS — it returns an operation object immediately
# The actual generation happens server-side and takes 2–4 minutes
operation = model.generate_videos(
prompt=prompt,
# Explicitly set 720p to minimize cost during testing
# Switch to "1080p" or "4k" for production renders
resolution="720p",
number_of_videos=1,
)
print(f"Generation started. Operation name: {operation.operation.name}")
print("Polling for completion (this typically takes 2–4 minutes)...")
# Poll until the operation completes
# The SDK does not block automatically — you must poll manually
while not operation.done:
time.sleep(10) # Poll every 10 seconds to avoid rate limiting
operation.refresh()
print(".", end="", flush=True)
print("\nGeneration complete.")
# The operation result contains a list of generated videos
# Each video object has a .video property with the raw bytes
video_data = operation.result.generated_videos[0].video
# Write the video bytes to disk
with open(output_path, "wb") as f:
f.write(video_data._data)
print(f"Video saved to: {output_path}")
return output_path
if __name__ == "__main__":
# A concrete, detailed prompt performs significantly better than a vague one.
# Include: subject, action, environment, lighting, camera behavior.
prompt = (
"A lone astronaut walks across a rust-red Martian plain at golden hour. "
"The camera tracks slowly from behind at shoulder height. "
"Dust particles catch the low sunlight. The horizon curves slightly. "
"Cinematic color grading, anamorphic lens flare."
)
generate_video_basic(prompt, "mars_walk.mp4")
Step 2: Production-ready generation with error handling and retry logic
The basic version will fail silently under real conditions — rate limits, transient network errors, and operation timeouts all require explicit handling. This version is what you actually deploy.
# production_generation.py
# Production-grade Veo 3 generation with retry logic, timeout handling,
# progress tracking, and structured error reporting.
import os
import time
import logging
from dataclasses import dataclass
from typing import Optional
import google.generativeai as genai
from google.api_core import exceptions as google_exceptions
from dotenv import load_dotenv
load_dotenv()
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger(__name__)
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
@dataclass
class VideoGenerationConfig:
"""
Typed configuration for a single generation job.
Using a dataclass instead of raw dicts catches typos at definition time,
not at runtime when the API call fails.
"""
prompt: str
model: str = "veo-3.0-generate-preview"
resolution: str = "720p"
aspect_ratio: str = "16:9"
duration_seconds: int = 8
number_of_videos: int = 1
negative_prompt: Optional[str] = None
enhance_prompt: bool = True
person_generation: str = "allow_adult"
@dataclass
class GenerationResult:
success: bool
output_paths: list[str]
operation_name: str
elapsed_seconds: float
error_message: Optional[str] = None
def generate_video_production(
config: VideoGenerationConfig,
output_dir: str = ".",
poll_interval: int = 15,
timeout_seconds: int = 600, # 10 minutes is a safe upper bound for Veo 3
max_retries: int = 3,
) -> GenerationResult:
"""
Generate video(s) from a VideoGenerationConfig with full error handling.
Retry logic covers: transient 503s, rate limit 429s (with backoff).
Does NOT retry: 400 (bad prompt), 403 (auth), 404 (invalid model).
These are developer errors that need code changes, not retries.
"""
model = genai.ImageGenerationModel(config.model)
# Build kwargs dict — only pass optional params if they're set,
# because passing None to the SDK can trigger unexpected behavior
generation_kwargs = {
"prompt": config.prompt,
"resolution": config.resolution,
"aspect_ratio": config.aspect_ratio,
"number_of_videos": config.number_of_videos,
"enhance_prompt": config.enhance_prompt,
"person_generation": config.person_generation,
}
if config.negative_prompt:
generation_kwargs["negative_prompt"] = config.negative_prompt
start_time = time.time()
operation = None
# Retry loop for the initial API call (not the polling)
for attempt in range(max_retries):
try:
operation = model.generate_videos(**generation_kwargs)
logger.info(f"Operation started: {operation.operation.name}")
break
except google_exceptions.ResourceExhausted as e:
# 429 rate limit — back off exponentially before retrying
wait = 2 ** attempt * 30 # 30s, 60s, 120s
logger.warning(f"Rate limited (attempt {attempt + 1}). Waiting {wait}s...")
time.sleep(wait)
except google_exceptions.InvalidArgument as e:
# 400 error — the prompt or parameters are invalid, don't retry
logger.error(f"Invalid request: {e}. Check your prompt and parameters.")
return GenerationResult(
success=False, output_paths=[], operation_name="",
elapsed_seconds=time.time() - start_time,
error_message=f"InvalidArgument: {e}"
)
except google_exceptions.PermissionDenied as e:
# 403 — API key is wrong or lacks Veo access, don't retry
logger.error(f"Permission denied: {e}. Check your API key and billing.")
return GenerationResult(
success=False, output_paths=[], operation_name="",
elapsed_seconds=time.time() - start_time,
error_message=f"PermissionDenied: {e}"
)
if operation is None:
return GenerationResult(
success=False, output_paths=[], operation_name="",
elapsed_seconds=time.time() - start_time,
error_message="Failed to start operation after max retries"
)
# Poll for completion with a hard timeout
deadline = start_time + timeout_seconds
while not operation.done:
if time.time() > deadline:
logger.error(f"Operation timed out after {timeout_seconds}s")
return GenerationResult(
success=False, output_paths=[],
operation_name=operation.operation.name,
elapsed_seconds=time.time() - start_time,
error_message=f"Timeout after {timeout_seconds}s"
)
elapsed = int(time.time() - start_time)
logger.info(f"Waiting... {elapsed}s elapsed")
time.sleep(poll_interval)
try:
operation.refresh()
except google_exceptions.ServiceUnavailable:
# 503 during polling is usually transient — log and continue
logger.warning("503 during poll, will retry next interval")
# Save all generated videos
os.makedirs(output_dir, exist_ok=True)
output_paths = []
timestamp = int(time.time())
for idx, generated_video in enumerate(operation.result.generated_videos):
filename = f"veo3_{timestamp}_{idx}.mp4"
filepath = os.path.join(output_dir, filename)
with open(filepath, "wb") as f:
f.write(generated_video.video._data)
output_paths.append(filepath)
logger.info(f"Saved video {idx + 1}: {filepath}")
elapsed = time.time() - start_time
logger.info(f"Generation complete. {len(output_paths)} video(s) in {elapsed:.1f}s")
return GenerationResult(
success=True,
output_paths=output_paths,
operation_name=operation.operation.name,
elapsed_seconds=elapsed,
)
if __name__ == "__main__":
config = VideoGenerationConfig(
prompt=(
"Extreme close-up of a hummingbird hovering at a red flower. "
"Wings blur in slow motion. Water droplets on petals catch sunlight. "
"Shallow depth of field, macro lens, morning light."
),
negative_prompt="blurry, low quality, distorted, cartoon, anime",
resolution="1080p",
number_of_videos=2, # Generate 2 variants to pick the best one
enhance_prompt=False, # Disable enhancement for precise prompt control
)
result = generate_video_production(config, output_dir="./videos")
if result.success:
print(f"Success! Files: {result.output_paths}")
print(f"Time: {result.elapsed_seconds:.1f}s")
else:
print(f"Failed: {result.error_message}")
Error Handling Reference
These are the errors you will actually encounter in production. The official documentation under-specifies error conditions, so this table is based on observed API behavior.
| HTTP Code | Exception Class | Root Cause | Solution |
|---|---|---|---|
| 400 | InvalidArgument | Prompt contains restricted content, or parameters are out of range | Review person_generation setting; check prompt for policy violations |
| 403 | PermissionDenied | API key invalid, billing not enabled, or Veo 3 not in your tier | Enable billing; confirm your key has Veo API scope at AI Studio |
| 404 | NotFound | Model name string is wrong or model isn’t available in your region | Use exact model IDs from the model identifier table above |
| 429 | ResourceExhausted | Rate limit exceeded — Veo 3 has strict QPM limits | Implement exponential backoff; default limit is 2 QPM per project |
| 500 | InternalServerError | Server-side generation failure (usually transient) | Retry with same parameters after 60 seconds |
| 503 | ServiceUnavailable | Server overloaded | Retry with exponential backoff; usually resolves within 5 minutes |
| N/A | DeadlineExceeded | Generation took longer than your client timeout | Increase timeout_seconds; 4K at max quality can take 8+ minutes |
| N/A | OperationError | The operation completed but generation failed (content policy) | Inspect operation.result for a policy rejection message |
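The retry column of this table reduces to a small helper. A sketch — the status-code mapping encodes the table above, and `is_retryable`/`backoff_seconds` are hypothetical names, not SDK functions:

```python
# Classify failures per the error table: retry transient server-side
# conditions, surface developer errors immediately.

RETRYABLE_CODES = {429, 500, 503}   # rate limit / transient server failures
FATAL_CODES = {400, 403, 404}       # developer errors: fix code, don't retry

def is_retryable(http_code: int) -> bool:
    """True if the table above recommends retrying this status code."""
    return http_code in RETRYABLE_CODES

def backoff_seconds(attempt: int, base: int = 30) -> int:
    """Exponential backoff used for 429s: 30s, 60s, 120s for attempts 0-2."""
    return base * 2 ** attempt
```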
Content policy failures
Veo 3 has strict content safety filters that run both on the prompt and the generated output. A generation can fail at two points: prompt evaluation (immediate 400) or output filtering (operation completes with an error state rather than a video). Always check operation.result.generated_videos length before indexing — a successful operation can return zero videos if all variants were filtered.
# Always guard against empty results
results = operation.result.generated_videos
if not results:
logger.error("Operation completed but returned no videos — likely content policy rejection")
# Log the prompt for manual review; do not auto-retry policy failures
Performance and Cost Reference
Numbers based on Veo 3 API pricing as of mid-2025. All costs are per second of generated output, not per API call.
| Model | Resolution | Cost/Second | Cost per 8s Clip | Avg. Generation Time | Use Case |
|---|---|---|---|---|---|
| veo-3.0-generate-preview | 720p | $0.75 | $6.00 | ~3 min | Prototyping, testing prompts |
| veo-3.0-generate-preview | 1080p | $0.75 | $6.00 | ~4 min | Standard production |
| veo-3.0-generate-preview | 4K | $0.75 | $6.00 | ~6–8 min | Broadcast / high-end delivery |
| veo-3.0-fast-generate-preview | 720p | $0.35 | $2.80 | ~90 sec | Rapid iteration, preview renders |
| veo-3.0-fast-generate-preview | 1080p | $0.35 | $2.80 | ~2 min | Cost-sensitive production |
Cost note: Resolution does not affect the per-second rate for Veo 3 — you pay the same $0.75/s for 720p and 4K on the full model. The cost decision is model tier (full vs. fast), not resolution. Run your iteration cycles on `veo-3.0-fast-generate-preview` and switch to the full model only for final renders.
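The table reduces to a one-line calculation; a budgeting sketch, where `estimate_cost` is a hypothetical helper built on the published per-second rates:

```python
# Back-of-envelope budgeting from the pricing table above.

RATE_PER_SECOND = {
    "veo-3.0-generate-preview": 0.75,       # full model, all resolutions
    "veo-3.0-fast-generate-preview": 0.35,  # fast model, all resolutions
}

def estimate_cost(model: str, duration_seconds: int = 8,
                  number_of_videos: int = 1) -> float:
    """Cost in USD: per-second rate x clip length x number of variants.
    Resolution is deliberately absent — it does not change the rate."""
    return RATE_PER_SECOND[model] * duration_seconds * number_of_videos
```

Note that `number_of_videos` multiplies cost: requesting two 8-second variants on the fast model costs $5.60, still cheaper than a single full-model clip.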
When not to use Veo 3
- Clips longer than 8 seconds: Veo 3 is capped at 8 seconds. For longer content, you’ll need to stitch multiple generations — which creates consistency problems at cut points.
- Real-time or near-real-time applications: 90 seconds is the minimum generation time. This rules out any UX where users expect sub-30-second feedback.
- Very high volume at low cost: At $0.35–$0.75 per second, generating a 5-minute video via clips costs $105–$225. For bulk content, evaluate whether the quality delta over cheaper alternatives justifies the price.
- Precise motion control: Veo 3 interprets camera and motion descriptions probabilistically. If you need frame-accurate control (e.g., product demos showing specific UI interactions), this model is not the right tool.
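The 5-minute figure in the third point falls out of simple arithmetic. A sketch — `long_video_cost` is a hypothetical helper, and it ignores the stitching work needed at cut points:

```python
import math

def long_video_cost(target_seconds: int, rate_per_second: float,
                    clip_seconds: int = 8) -> tuple[int, float]:
    """Return (number of clips needed, total generation cost in USD).
    Cost scales with total output seconds; the clip count is what drives
    stitching and consistency work at cut points."""
    clips = math.ceil(target_seconds / clip_seconds)
    return clips, target_seconds * rate_per_second
```

A 5-minute (300-second) video needs 38 clips and costs $105 on the fast model ($0.35/s) or $225 on the full model ($0.75/s), matching the range above.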
Prompt Engineering for Veo 3
Prompt quality is the highest-leverage variable in output quality. Vague prompts produce mediocre results regardless of resolution.
A prompt that consistently performs well includes four elements:
- Subject: what is in the frame and what is it doing
- Environment: location, time of day, weather, context
- Camera: angle, movement, lens characteristics
- Aesthetic: color grade, film stock reference, mood
Weak prompt: "A car driving on a road at night"
Strong prompt: "A matte-black sports car accelerates through a rain-slicked mountain road at 2am. The camera follows at bumper height, tracking the red tail lights. Neon reflections from a tunnel streak across wet asphalt. High contrast, desaturated blues, cinematic anamorphic."
The second prompt consistently produces clips that match intent. The first produces random interpretations of a generic scene.
Set enhance_prompt=False when you’ve invested in prompt engineering — you don’t want the model overwriting your precise language. Use enhance_prompt=True (the default) when prototyping quickly and you want the model to compensate for sparse descriptions.
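The four-element checklist can be made mechanical so every prompt in your pipeline includes all four parts. A sketch — `build_prompt` is an illustrative helper, not an SDK feature:

```python
def build_prompt(subject: str, environment: str, camera: str,
                 aesthetic: str) -> str:
    """Compose the four elements into one prompt string.
    Order mirrors how the strong example reads: subject and action first,
    aesthetic notes last. Empty elements are skipped."""
    parts = [subject, environment, camera, aesthetic]
    return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())

prompt = build_prompt(
    subject="A matte-black sports car accelerates through the rain",
    environment="A rain-slicked mountain road at 2am",
    camera="The camera follows at bumper height, tracking the red tail lights",
    aesthetic="High contrast, desaturated blues, cinematic anamorphic",
)
```

Pairing a builder like this with `enhance_prompt=False` keeps your carefully structured wording intact end to end.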
Conclusion
The Veo 3 API is accessible through the standard Gemini SDK using veo-3.0-generate-preview or veo-3.0-fast-generate-preview, with generation times between 90 seconds and 8 minutes depending on model and resolution. The production code in this tutorial handles the two most common failure modes — rate limiting and content policy rejections — that the official documentation doesn’t address. Run the fast model for iteration at $2.80 per clip, then switch to the full model only for your final renders.
Note: If you’re integrating multiple AI models into one pipeline, AtlasCloud provides unified API access to 300+ models including Kling, Flux, Seedance, Claude, and GPT — one API key, no per-provider setup. New users get a 25% credit bonus on first top-up (up to $100).
Frequently Asked Questions
How much does the Veo 3 API cost per video generation?
Veo 3 API pricing is tiered by model variant. Veo 3 Fast costs $0.35 per second of generated video, while the full Veo 3 model costs $0.75 per second. For the standard 8-second clip, that translates to $2.80 per request on Veo 3 Fast and $6.00 per request on the full Veo 3 model. There is no flat per-request fee — you pay strictly based on output video duration, so longer clips scale linearly in cost.
How long does Veo 3 API take to generate a video?
Veo 3 and Veo 3.1 have a generation latency of approximately 2–4 minutes per request, depending on the resolution selected. Lower resolutions trend toward the 2-minute end, while 4K output can push closer to 4 minutes. This is an asynchronous operation, meaning your API call will poll or use a callback rather than receiving an immediate synchronous response. Developers should design their pipelines around this asynchronous pattern.
How does Veo 3.1 compare to Sora and Kling 1.6 on video quality benchmarks?
As of mid-2025, independent cinematic quality evaluations place Veo 3.1 ahead of both OpenAI Sora and Kling 1.6 specifically on motion coherence and photorealism benchmark categories. Veo 3.1 also differentiates itself by supporting natively embedded audio in generated video, a feature not present in Sora or Kling 1.6 at the time of comparison. Maximum output resolution for Veo 3.1 is 4K.
What are the API access requirements to use Veo 3 in a Python project?
To call the Veo 3 API you need three things: (1) a Google Cloud account with billing enabled, (2) access to the Gemini API via either Google AI Studio or Vertex AI, and (3) a Gemini API key with Veo 3 access explicitly enabled — note that as of mid-2025 this access is not automatically granted and must be requested separately. On the SDK side, you install the Google GenAI Python client and authenticate with your API key.