ขอแนะนำโหมดกลุ่มที่มีขีดจำกัดอัตราที่สูงขึ้นและส่วนลดโทเค็น 50% ดูข้อมูลเพิ่มเติม

หน้านี้ได้รับการแปลโดย Cloud Translation API

โหมดการประมวลผลแบบเป็นกลุ่ม

โหมดแบตช์ของ Gemini API ออกแบบมาเพื่อประมวลผลคำขอจำนวนมาก แบบไม่พร้อมกันที่50% ของต้นทุนมาตรฐาน เวลาในการดำเนินการตามเป้าหมายคือ 24 ชั่วโมง แต่ในกรณีส่วนใหญ่จะเร็วกว่านั้นมาก

ใช้โหมดกลุ่มสำหรับงานขนาดใหญ่ที่ไม่เร่งด่วน เช่น การประมวลผลข้อมูลเบื้องต้นหรือการเรียกใช้การประเมินที่ไม่จำเป็นต้องมีการตอบกลับทันที

เริ่มต้นใช้งาน

ส่วนนี้จะช่วยให้คุณเริ่มต้นใช้งานการส่งคำขอแรกในโหมดแบตช์

การสร้างงานแบบกลุ่ม

คุณส่งคำขอในโหมดเป็นกลุ่มได้ 2 วิธี ดังนี้

คำขอแบบอินไลน์: รายการออบเจ็กต์ GenerateContentRequest ที่รวมอยู่ในคำขอสร้างแบบกลุ่มโดยตรง เหมาะสำหรับกลุ่มเล็กๆ ที่มีขนาดคำขอรวมไม่เกิน 20 MB เอาต์พุตที่โมเดลส่งกลับมา คือรายการออบเจ็กต์ inlineResponse
ไฟล์อินพุต: ไฟล์ JSON Lines (JSONL) โดยแต่ละบรรทัดจะมีออบเจ็กต์ GenerateContentRequest ที่สมบูรณ์ เราขอแนะนำให้ใช้วิธีนี้สำหรับคำขอที่มีขนาดใหญ่ เอาต์พุต ที่ได้จากโมเดลคือไฟล์ JSONL ซึ่งแต่ละบรรทัดจะเป็น GenerateContentResponse หรือออบเจ็กต์สถานะ

คำขอในหน้า

สำหรับคำขอจำนวนเล็กน้อย คุณสามารถฝังออบเจ็กต์ GenerateContentRequestภายใน BatchGenerateContentRequest ได้โดยตรง ตัวอย่างต่อไปนี้เรียกใช้เมธอด BatchGenerateContent ด้วยคำขอแบบอินไลน์

Python


from google import genai
from google.genai import types

client = genai.Client()

# A list of dictionaries, where each is a GenerateContentRequest
inline_requests = [
    {
        'contents': [{
            'parts': [{'text': 'Tell me a one-sentence joke.'}],
            'role': 'user'
        }]
    },
    {
        'contents': [{
            'parts': [{'text': 'Why is the sky blue?'}],
            'role': 'user'
        }]
    }
]

inline_batch_job = client.batches.create(
    model="models/gemini-2.5-flash",
    src=inline_requests,
    config={
        'display_name': "inlined-requests-job-1",
    },
)

print(f"Created batch job: {inline_batch_job.name}")

REST

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:batchGenerateContent \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-X POST \
-H "Content-Type:application/json" \
-d '{
    "batch": {
        "display_name": "my-batch-requests",
        "input_config": {
            "requests": {
                "requests": [
                    {
                        "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}]},
                        "metadata": {
                            "key": "request-1"
                        }
                    },
                    {
                        "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}]},
                        "metadata": {
                            "key": "request-2"
                        }
                    }
                ]
            }
        }
    }
}'

คุณใช้คำขอใดก็ได้ที่ใช้ในโหมดที่ไม่ใช่แบบกลุ่ม (หรือโหมดอินเทอร์แอกทีฟ) เช่น คุณสามารถระบุอุณหภูมิ คำสั่งของระบบ หรือแม้กระทั่งส่งผ่าน รูปแบบอื่นๆ ได้ ตัวอย่างต่อไปนี้แสดงคำขอแบบอินไลน์บางส่วนที่มี คำสั่งของระบบสำหรับคำขอรายการใดรายการหนึ่ง

inline_requests_list = [
    {'contents': [{'parts': [{'text': 'Write a short poem about a cloud.'}]}]},
    {'contents': [{'parts': [{'text': 'Write a short poem about a cat.'}]}], 'system_instructions': {'parts': [{'text': 'You are a cat. Your name is Neko.'}]}}
]

ในทำนองเดียวกัน คุณยังระบุเครื่องมือที่จะใช้สำหรับคำขอได้ด้วย ตัวอย่างต่อไปนี้ แสดงคำขอที่เปิดใช้เครื่องมือ Google Search

inline_requests_list = [
    {'contents': [{'parts': [{'text': 'Who won the euro 1998?'}]}]},
    {'contents': [{'parts': [{'text': 'Who won the euro 2025?'}]}], 'tools': [{'google_search ': {}}]}
]

ไฟล์อินพุต

สำหรับชุดคำขอขนาดใหญ่ ให้เตรียมไฟล์ JSON Lines (JSONL) แต่ละบรรทัดในไฟล์นี้ต้องเป็นออบเจ็กต์ JSON ที่มีคีย์ที่ผู้ใช้กำหนดและออบเจ็กต์คำขอ โดยที่คำขอเป็นออบเจ็กต์ GenerateContentRequest ที่ถูกต้อง ระบบจะใช้คีย์ที่ผู้ใช้กำหนดในคำตอบเพื่อระบุว่าเอาต์พุตใดเป็นผลลัพธ์ ของคำขอใด เช่น คำขอที่มีคีย์กำหนดเป็น request-1 จะมีคำอธิบายประกอบการตอบกลับด้วยชื่อคีย์เดียวกัน

ระบบจะอัปโหลดไฟล์นี้โดยใช้ File API ขนาดไฟล์สูงสุด ที่อนุญาตสำหรับไฟล์อินพุตคือ 2 GB

ตัวอย่างไฟล์ JSONL มีดังนี้ คุณสามารถบันทึกไว้ในไฟล์ชื่อ my-batch-requests.json

{"key": "request-1", "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}], "generation_config": {"temperature": 0.7}}}
{"key": "request-2", "request": {"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}]}}

คุณระบุพารามิเตอร์อื่นๆ เช่น คำสั่งของระบบ เครื่องมือ หรือการกำหนดค่าอื่นๆ ใน JSON ของคำขอแต่ละรายการได้เช่นเดียวกับคำขอแบบอินไลน์

คุณอัปโหลดไฟล์นี้ได้โดยใช้ File API ตามที่แสดงในตัวอย่างต่อไปนี้ หาก คุณทำงานกับอินพุตหลายรูปแบบ คุณสามารถอ้างอิงไฟล์อื่นๆ ที่อัปโหลด ภายในไฟล์ JSONL ได้

Python


from google import genai
from google.genai import types

client = genai.Client()

# Create a sample JSONL file
with open("my-batch-requests.jsonl", "w") as f:
    requests = [
        {"key": "request-1", "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}]}},
        {"key": "request-2", "request": {"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}]}}
    ]
    for req in requests:
        f.write(json.dumps(req) + "\n")

# Upload the file to the File API
uploaded_file = client.files.upload(
    file='my-batch-requests.jsonl',
    config=types.UploadFileConfig(display_name='my-batch-requests', mime_type='jsonl')
)

print(f"Uploaded file: {uploaded_file.name}")

REST

tmp_batch_input_file=batch_input.tmp
echo -e '{"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}], "generationConfig": {"temperature": 0.7}}\n{"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}]}' > batch_input.tmp
MIME_TYPE=$(file -b --mime-type "${tmp_batch_input_file}")
NUM_BYTES=$(wc -c < "${tmp_batch_input_file}")
DISPLAY_NAME=BatchInput

tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload url is in the response headers dump them to a file.
curl "https://generativelanguage.googleapis.com/upload/v1beta/files \
-D "${tmp_header_file}" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "X-Goog-Upload-Protocol: resumable" \
-H "X-Goog-Upload-Command: start" \
-H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
-H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
-H "Content-Type: application/jsonl" \
-d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
-H "Content-Length: ${NUM_BYTES}" \
-H "X-Goog-Upload-Offset: 0" \
-H "X-Goog-Upload-Command: upload, finalize" \
--data-binary "@${tmp_batch_input_file}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)

ตัวอย่างต่อไปนี้เรียกใช้เมธอด BatchGenerateContent โดยมีไฟล์อินพุต ที่อัปโหลดโดยใช้ File API

Python


# Assumes `uploaded_file` is the file object from the previous step
file_batch_job = client.batches.create(
    model="gemini-2.5-flash",
    src=uploaded_file.name,
    config={
        'display_name': "file-upload-job-1",
    },
)

print(f"Created batch job: {file_batch_job.name}")

REST

BATCH_INPUT_FILE='files/123456' # File ID
curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:batchGenerateContent \
-X POST \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type:application/json" \
-d "{
    'batch': {
        'display_name': 'my-batch-requests',
        'input_config': {
            'requests': {
                'file_name': ${BATCH_INPUT_FILE}
            }
        }
    }
}"

เมื่อสร้างงานแบบกลุ่ม คุณจะได้รับชื่องานที่ส่งคืน ใช้ชื่อนี้ เพื่อตรวจสอบสถานะงานและดึงข้อมูลผลลัพธ์เมื่องานเสร็จสมบูรณ์

ต่อไปนี้เป็นตัวอย่างเอาต์พุตที่มีชื่องาน


Created batch job from file: batches/123456789

สถานะของงานตรวจสอบ

ใช้ชื่อการดำเนินการที่ได้รับเมื่อสร้างงานแบบกลุ่มเพื่อสำรวจสถานะ ฟิลด์สถานะของงานแบบกลุ่มจะระบุสถานะปัจจุบันของงาน งานแบบกลุ่ม อาจอยู่ในสถานะใดสถานะหนึ่งต่อไปนี้

JOB_STATE_PENDING: สร้างงานแล้วและกำลังรอให้บริการประมวลผล
JOB_STATE_SUCCEEDED: งานเสร็จสมบูรณ์แล้ว ตอนนี้คุณสามารถดึงข้อมูลผลลัพธ์ได้แล้ว
JOB_STATE_FAILED: งานล้มเหลว ดูรายละเอียดข้อผิดพลาดสำหรับข้อมูลเพิ่มเติม
JOB_STATE_CANCELLED: ผู้ใช้ยกเลิกงาน

คุณสามารถสำรวจสถานะของงานเป็นระยะๆ เพื่อตรวจสอบว่าเสร็จสมบูรณ์แล้วหรือไม่

Python


# Use the name of the job you want to check
# e.g., inline_batch_job.name from the previous step
job_name = "YOUR_BATCH_JOB_NAME"  # (e.g. 'batches/your-batch-id')
batch_job = client.batches.get(name=job_name)

completed_states = set([
    'JOB_STATE_SUCCEEDED',
    'JOB_STATE_FAILED',
    'JOB_STATE_CANCELLED',
])

print(f"Polling status for job: {job_name}")
batch_job = client.batches.get(name=job_name) # Initial get
while batch_job.state.name not in completed_states:
  print(f"Current state: {batch_job.state.name}")
  time.sleep(30) # Wait for 30 seconds before polling again
  batch_job = client.batches.get(name=job_name)

print(f"Job finished with state: {batch_job.state.name}")
if batch_job.state.name == 'JOB_STATE_FAILED':
    print(f"Error: {batch_job.error}")

กำลังดึงข้อมูลผลลัพธ์

เมื่อสถานะของงานระบุว่างานแบบกลุ่มสำเร็จแล้ว ผลลัพธ์จะพร้อมใช้งานในฟิลด์ response

Python

import json

# Use the name of the job you want to check
# e.g., inline_batch_job.name from the previous step
job_name = "YOUR_BATCH_JOB_NAME"
batch_job = client.batches.get(name=job_name)

if batch_job.state.name == 'JOB_STATE_SUCCEEDED':

    # If batch job was created with a file
    if batch_job.dest and batch_job.dest.file_name:
        # Results are in a file
        result_file_name = batch_job.dest.file_name
        print(f"Results are in file: {result_file_name}")

        print("Downloading result file content...")
        file_content = client.files.download(file=result_file_name)
        # Process file_content (bytes) as needed
        print(file_content.decode('utf-8'))

    # If batch job was created with inline request
    elif batch_job.dest and batch_job.dest.inlined_responses:
        # Results are inline
        print("Results are inline:")
        for i, inline_response in enumerate(batch_job.dest.inlined_responses):
            print(f"Response {i+1}:")
            if inline_response.response:
                # Accessing response, structure may vary.
                try:
                    print(inline_response.response.text)
                except AttributeError:
                    print(inline_response.response) # Fallback
            elif inline_response.error:
                print(f"Error: {inline_response.error}")
    else:
        print("No results found (neither file nor inline).")
else:
    print(f"Job did not succeed. Final state: {batch_job.state.name}")
    if batch_job.error:
        print(f"Error: {batch_job.error}")

REST

BATCH_NAME="batches/123456" # Your batch job name

curl https://generativelanguage.googleapis.com/v1beta/$BATCH_NAME \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type:application/json" 2> /dev/null > batch_status.json

if jq -r '.done' batch_status.json | grep -q "false"; then
    echo "Batch has not finished processing"
fi

batch_state=$(jq -r '.metadata.state' batch_status.json)
if [[ $batch_state = "JOB_STATE_SUCCEEDED" ]]; then
    if [[ $(jq '.response | has("inlinedResponses")' batch_status.json) = "true" ]]; then
        jq -r '.response.inlinedResponses' batch_status.json
        exit
    fi
    responses_file_name=$(jq -r '.response.responsesFile' batch_status.json)
    curl https://generativelanguage.googleapis.com/download/v1beta/$responses_file_name:download?alt=media \
    -H "x-goog-api-key: $GEMINI_API_KEY" 2> /dev/null
elif [[ $batch_state = "JOB_STATE_FAILED" ]]; then
    jq '.error' batch_status.json
elif [[ $batch_state == "JOB_STATE_CANCELLED" ]]; then
    echo "Batch was cancelled by the user"
fi

การยกเลิกงานแบบกลุ่ม

คุณยกเลิกงานแบบกลุ่มที่กำลังดำเนินการได้โดยใช้ชื่อของงาน เมื่อมีการยกเลิกงาน ระบบจะหยุดประมวลผลคำขอใหม่

Python

# Cancel a batch job
client.batches.cancel(name=batch_job_to_cancel.name)

REST

BATCH_NAME="batches/123456" # Your batch job name

# Cancel the batch
curl https://generativelanguage.googleapis.com/v1beta/$BATCH_NAME:cancel \
-H "x-goog-api-key: $GEMINI_API_KEY" \

# Confirm that the status of the batch after cancellation is JOB_STATE_CANCELLED
curl https://generativelanguage.googleapis.com/v1beta/$BATCH_NAME \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type:application/json" 2> /dev/null | jq -r '.metadata.state'

การลบงานแบบกลุ่ม

คุณลบงานแบบกลุ่มที่มีอยู่ได้โดยใช้ชื่อของงาน เมื่อมีการลบงาน ระบบจะหยุดประมวลผลคำขอใหม่และนำงานออกจากรายการ งานแบบกลุ่ม

Python

# Delete a batch job
client.batches.delete(name=batch_job_to_delete.name)

REST

BATCH_NAME="batches/123456" # Your batch job name

# Cancel the batch
curl https://generativelanguage.googleapis.com/v1beta/$BATCH_NAME:delete \
-H "x-goog-api-key: $GEMINI_API_KEY" \

รายละเอียดทางเทคนิค

โมเดลที่รองรับ: โหมดเป็นกลุ่มรองรับโมเดล Gemini หลายรุ่น โปรดดูรายการล่าสุดของรุ่นที่เข้ากันได้ในหน้าโมเดล รูปแบบที่รองรับสำหรับโหมดแบตช์จะเหมือนกับรูปแบบที่รองรับใน API แบบอินเทอร์แอกทีฟ (หรือโหมดที่ไม่ใช่แบตช์)
ราคา: การใช้งานโหมดแบตช์มีราคาอยู่ที่ 50% ของต้นทุน API แบบอินเทอร์แอกทีฟมาตรฐาน สำหรับโมเดลที่เทียบเท่า
เป้าหมายระดับการให้บริการ (SLO): งานแบบกลุ่มได้รับการออกแบบมาให้เสร็จสมบูรณ์ ภายในเวลาในการตอบกลับ 24 ชั่วโมง งานจำนวนมากอาจเสร็จสมบูรณ์เร็วกว่านี้มาก โดยขึ้นอยู่กับขนาดและภาระงานปัจจุบันของระบบ
การแคช: เปิดใช้การแคชบริบท สำหรับคำขอแบบกลุ่ม หากคำขอในกลุ่มของคุณทำให้เกิดการเข้าถึงแคช ระบบจะกำหนดราคาโทเค็นที่แคชไว้เหมือนกับการเข้าชมในโหมดที่ไม่ใช่แบบกลุ่ม

แนวทางปฏิบัติแนะนำ

ใช้ไฟล์อินพุตสำหรับคำขอจำนวนมาก: สำหรับคำขอจำนวนมาก ให้ใช้วิธีการป้อนไฟล์เสมอ เพื่อการจัดการที่ดีขึ้นและหลีกเลี่ยงการเกินขีดจำกัดขนาดคำขอสำหรับ การเรียกใช้ BatchGenerateContent เอง โปรดทราบว่าไฟล์อินพุตแต่ละไฟล์ต้องมีขนาดไม่เกิน 2 GB
การจัดการข้อผิดพลาด: ตรวจสอบ batchStats สำหรับ failedRequestCount หลังจาก งานเสร็จสมบูรณ์ หากใช้เอาต์พุตไฟล์ ให้แยกวิเคราะห์แต่ละบรรทัดเพื่อตรวจสอบว่าเป็น GenerateContentResponse หรือออบเจ็กต์สถานะที่ระบุข้อผิดพลาดสำหรับ คำขอที่เฉพาะเจาะจงนั้น
ส่งงานครั้งเดียว: การสร้างงานแบบกลุ่มไม่ใช่การดำเนินการที่ทำซ้ำได้ หากคุณส่งคำขอสร้างเดียวกัน 2 ครั้ง ระบบจะสร้างงานแบบกลุ่มแยกกัน 2 งาน
แบ่งกลุ่มงานขนาดใหญ่มาก: แม้ว่าเวลาในการดำเนินการตามเป้าหมายคือ 24 ชั่วโมง แต่เวลาในการประมวลผลจริงอาจแตกต่างกันไปตามภาระงานของระบบและขนาดงาน สำหรับงานขนาดใหญ่ ให้ลองแบ่งออกเป็นกลุ่มเล็กๆ หากต้องการผลลัพธ์ระดับกลางเร็วขึ้น

ขั้นตอนถัดไป

ดูตัวอย่างเพิ่มเติมได้ในสมุดบันทึกโหมดเป็นชุด