Claude can analyze images as well as text — this is called vision capability. Whether you want to describe a product photo, interpret a scanned document, or break down a chart, the Claude API accepts images through three distinct methods. This guide walks through each method so you can pick the right one for your use case and get up and running quickly.
Before You Start
- Anthropic API key: Required to call the Claude API. You can generate one at platform.claude.com.
- Python environment: Examples in this guide use Python. Other language SDKs follow the same structure.
- anthropic package: Install it by running pip install anthropic in your terminal.
Quick Glossary
- Vision — The ability to accept images as input and understand their content, just like reading text.
- Base64 — An encoding scheme that converts binary files (like images) into plain text strings so they can be embedded in JSON payloads. Think of it like packing a photo into a text message.
- Multimodal — Handling more than one type of input at once — for example, both text and images in a single request.
- Payload — The actual data bundle sent in an API request. Including large images inflates payload size.
- file_id — A unique identifier returned when you upload a file via the Files API. Use this ID in later requests instead of re-uploading the file.
- Content block — Each distinct input element (image or text) in an API message. Multiple blocks can be sent in one request as an array.
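To make the Base64 entry above concrete, here is a minimal standard-library sketch of the round trip. The sample bytes are a made-up stand-in for real image data.

```python
import base64

# Hypothetical stand-in for image bytes; any binary data behaves the same way.
raw_bytes = bytes(range(256)) * 3  # 768 bytes

# Encode to an ASCII-safe string that can sit inside a JSON payload.
encoded = base64.standard_b64encode(raw_bytes).decode("utf-8")

# Decoding restores the original bytes exactly.
decoded = base64.standard_b64decode(encoded)

print(len(raw_bytes), len(encoded))  # 768 1024
print(decoded == raw_bytes)          # True
```

Base64 emits four characters for every three input bytes, so the encoded string is roughly one third larger than the file. That overhead is why large images inflate payload size.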
A Key Ordering Tip
According to the official documentation, Claude works best when images are placed before the text in a message. Images placed after text or interspersed with text still work, but if your use case allows it, put the image first. This is similar to how providing a long document before your question tends to yield better results.
One important platform note: on Amazon Bedrock and Google Cloud, only the Base64 encoding method is currently available. URL reference and Files API methods require a direct call to the Anthropic API.
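The image-first ordering described above can be sketched as a small helper that builds the message dictionary. The function name build_user_message and the example.com URL are illustrative placeholders, not part of the SDK.

```python
def build_user_message(image_source: dict, prompt: str) -> dict:
    """Return a user message with the image block placed before the text block."""
    return {
        "role": "user",
        "content": [
            {"type": "image", "source": image_source},  # image first
            {"type": "text", "text": prompt},           # question second
        ],
    }


message = build_user_message(
    {"type": "url", "url": "https://example.com/photo.jpg"},  # placeholder URL
    "Describe this image.",
)
print([block["type"] for block in message["content"]])  # ['image', 'text']
```

The returned dictionary has the same shape as the entries passed in the messages list in the method examples in this guide.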
Method 1: Base64 Encoding
Convert your image file into a Base64 string and embed it directly in the request body. No external server or public URL needed — ideal for small or one-off images.
import anthropic
import base64

# Read and encode the image file
with open("my_image.jpg", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",  # match your file's actual format
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
)
print(message.content)
Success signal
Because message.content is a list of content blocks, the terminal shows a list containing one block with type='text' whose text field holds Claude's description of the image.
Supported image formats
- JPEG: image/jpeg
- PNG: image/png
- GIF: image/gif
- WebP: image/webp
Make sure the media_type value matches the actual file format. A mismatch will cause an error or unreliable results.
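One way to avoid a mismatch is to derive media_type from the file's leading bytes rather than from its extension. The sketch below checks the well-known signatures of the four supported formats. The helper name detect_media_type is hypothetical, and this is a rough guess at the format, not full validation of the file.

```python
from typing import Optional


def detect_media_type(data: bytes) -> Optional[str]:
    """Guess the media_type from the file's leading bytes (its signature)."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP files are RIFF containers with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None  # unknown or unsupported format


print(detect_media_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))  # image/png
print(detect_media_type(b"not an image"))                      # None
```

In practice you would pass in the first few bytes read from the file and use the result as the media_type value, falling back to an error message when the function returns None.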
Method 2: URL Reference
If your image is already hosted online and publicly accessible, you can simply pass the URL. No encoding or uploading required.
import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
)
print(message.content)
Things to watch out for
- The URL must be publicly accessible. Local file paths or login-protected URLs will not work.
- If the image is later removed or the URL changes, the same request will fail. For stable references, use the Files API instead.
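A cheap pre-check can catch the local-path case before a request is sent. The sketch below only inspects the string and makes no network call, so it cannot tell whether a URL is actually public or still live. The helper name looks_like_remote_url is hypothetical.

```python
from urllib.parse import urlparse


def looks_like_remote_url(candidate: str) -> bool:
    """Return True only for http(s) URLs that include a host name."""
    parsed = urlparse(candidate)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(looks_like_remote_url("https://example.com/photo.jpg"))  # True
print(looks_like_remote_url("/home/me/photo.jpg"))             # False
print(looks_like_remote_url("file:///home/me/photo.jpg"))      # False
```

A login-protected URL still passes this check, so the incognito-window test remains the more reliable way to confirm public access.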
Method 3: Files API (Upload Once, Reuse Many Times)
The Files API is the most efficient approach when you need to reference the same image across multiple requests or in a long multi-turn conversation. Upload the file once and reference it by its file_id from then on.
In multi-turn conversations, every request resends the full conversation history. If images are Base64-encoded, the full image bytes travel in every payload, ballooning request size as the conversation grows. With the Files API, only a short ID is included each time — the image bytes stay on Anthropic's servers.
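A back-of-the-envelope calculation shows the size of the difference. The image size, turn count, and ID string below are assumed values chosen for illustration, and the count covers only the image reference, not the rest of the conversation.

```python
import math


def base64_length(num_bytes: int) -> int:
    """Length of the padded Base64 string for num_bytes of input."""
    return 4 * math.ceil(num_bytes / 3)


def cumulative_image_chars(per_request_chars: int, requests: int) -> int:
    """Total image-related characters sent when every request resends the image reference."""
    return per_request_chars * requests


image_bytes = 1_500_000              # assumed 1.5 MB image
requests = 10                        # assumed number of requests in the conversation
file_id_chars = len("file_abc123")   # hypothetical placeholder ID

b64_total = cumulative_image_chars(base64_length(image_bytes), requests)
id_total = cumulative_image_chars(file_id_chars, requests)
print(b64_total, id_total)  # 20000000 110
```

Under these assumptions, ten requests carry about 20 million characters of image data with Base64, versus about a hundred characters of ID with the Files API.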
import anthropic

client = anthropic.Anthropic()

# Step 1: Upload the image once
with open("image.jpg", "rb") as f:
    file_upload = client.beta.files.upload(
        file=("image.jpg", f, "image/jpeg")
    )
file_id = file_upload.id  # e.g. "file_abc123..."
print(f"Upload complete. file_id: {file_id}")

# Step 2: Reference the image by file_id
message = client.beta.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    betas=["files-api-2025-04-14"],
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "file",
                        "file_id": file_id,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
)
print(message.content)
Success signal
First, you'll see Upload complete. file_id: file_… printed. Then Claude's image description follows. Save the file_id value — you can use it in future requests without re-uploading the image.
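One possible way to save the file_id is a small JSON cache keyed by a hash of the image contents, so a later run can look up the ID instead of uploading again. Everything below is an illustrative sketch: the helper names, the cache file name, and the sample ID are hypothetical and not part of the SDK.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Optional


def content_key(image_bytes: bytes) -> str:
    """Key the cache on the SHA-256 digest so renamed copies of a file still match."""
    return hashlib.sha256(image_bytes).hexdigest()


def load_cache(cache_path: Path) -> dict:
    """Read the cache file, or return an empty mapping if it does not exist yet."""
    if cache_path.exists():
        return json.loads(cache_path.read_text(encoding="utf-8"))
    return {}


def remember_file_id(cache_path: Path, image_bytes: bytes, file_id: str) -> None:
    """Store the file_id for this image and write the cache back to disk."""
    cache = load_cache(cache_path)
    cache[content_key(image_bytes)] = file_id
    cache_path.write_text(json.dumps(cache, indent=2), encoding="utf-8")


def lookup_file_id(cache_path: Path, image_bytes: bytes) -> Optional[str]:
    """Return the cached file_id for this image, or None if it was never stored."""
    return load_cache(cache_path).get(content_key(image_bytes))


# Demo in a temporary directory with made-up bytes and a made-up ID.
with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "file_ids.json"
    remember_file_id(cache_file, b"fake image bytes", "file_abc123")
    print(lookup_file_id(cache_file, b"fake image bytes"))  # file_abc123
    print(lookup_file_id(cache_file, b"other bytes"))       # None
```

A cache like this does not know whether the uploaded file still exists on the server, so a request that fails with a stored ID should fall back to uploading the image again.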
Common Problems & Fixes
- Error: invalid_request_error (media_type mismatch)
  Cause: The media_type value doesn't match the actual file format.
  Fix: Check the file extension and set the correct value: image/jpeg, image/png, image/gif, or image/webp.
- URL method fails to load the image
  Cause: The URL is not publicly accessible, or requires authentication.
  Fix: Open the URL in a private/incognito browser window. If the image doesn't load, switch to Base64 or Files API.
- Files API error: missing betas parameter
  Cause: The Files API is a beta feature and requires the betas parameter to be included.
  Fix: Use client.beta.messages.create and add betas=["files-api-2025-04-14"] as shown in the example above.
- URL method not working on Amazon Bedrock or Google Cloud
  Cause: According to official documentation, only Base64 encoding is currently available on those platforms.
  Fix: Use Base64 encoding when working within Bedrock or Google Cloud environments.
Summary: Which Method Should You Use?
- Base64 encoding — Simplest setup, no external dependencies. Gets heavy if you use many large images or long conversations.
- URL reference — Shortest code when the image is already public. Doesn't work for private images.
- Files API — Best for repeated use or multi-turn conversations. Requires an upload step, but pays off by keeping payloads small.
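The trade-offs above can be condensed into a small decision helper. This is only one reading of the guidance: the function name, its parameters, and the choice to prefer the Files API over a URL when an image is reused are all assumptions made for illustration.

```python
def choose_image_method(direct_anthropic_api: bool, publicly_hosted: bool, reused: bool) -> str:
    """Pick an image input method from three yes/no facts about the use case."""
    # Per the platform note, only Base64 is available outside the direct Anthropic API.
    if not direct_anthropic_api:
        return "base64"
    # Repeated use or long conversations: upload once, then send only the ID.
    if reused:
        return "files_api"
    # One-off image that is already public: pass the URL.
    if publicly_hosted:
        return "url"
    # One-off local or private image: embed it directly.
    return "base64"


print(choose_image_method(direct_anthropic_api=True, publicly_hosted=True, reused=False))   # url
print(choose_image_method(direct_anthropic_api=True, publicly_hosted=False, reused=True))   # files_api
print(choose_image_method(direct_anthropic_api=False, publicly_hosted=True, reused=True))   # base64
```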
Next Steps
- Send multiple images at once — Add more than one image block to the content array in a single request to analyze several images together.
- Mix images and text — Combine image blocks and text blocks in one content array for compound analysis tasks.
- Coordinates and bounding boxes — The official documentation's "Coordinates and bounding boxes" section covers how to reference specific regions within an image.
- PDF input — Claude also accepts PDF files as input. Pairing this with the Files API is an efficient way to process long documents.