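Send a processing request

After you set up your Google Cloud account and create a processor, you can send requests to your Document AI processor.

All processors use the same code to send requests. The differences between processors show up in the information that each processor returns.

When you use the v1 API version of Document AI or the Google Cloud console, you can send a processing request to a specific processor version. If you don't specify a processor version, the default version is used. For more information, see "Managing processor versions".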

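Online processing

Online (synchronous) requests let you send a single document for processing. Document AI processes the request immediately and returns a document.

Send a request to a processor

The following code samples show how to send a request to a processor.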

REST

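This example shows how to provide the document content in the rawDocument object (the raw document content in bytes, supplied as a Base64-encoded string).

Alternatively, you can specify an inlineDocument, which uses the same Document JSON format that Document AI returns. Because the input and output formats match, you can chain requests (for example, classify a document and then extract its contents).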
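The chaining idea can be sketched in Python as follows. This is a minimal illustration, not an official sample: the helper name build_chained_request and the trimmed classifier response are hypothetical, and the sketch assumes the earlier response carries its Document under the "document" key.

```python
import json


def build_chained_request(previous_response: dict) -> dict:
    """Build a follow-up request body that reuses a previously returned Document."""
    # Assumption: the earlier process response holds the Document under "document".
    document = previous_response["document"]
    # Pass the Document back unchanged as inlineDocument instead of rawDocument.
    return {"inlineDocument": document}


# Hypothetical, heavily trimmed response from a first (classification) request.
classifier_response = {
    "document": {
        "mimeType": "application/pdf",
        "text": "Invoice 123",
        "entities": [{"type": "invoice", "confidence": 0.98}],
    }
}

extractor_request = build_chained_request(classifier_response)
print(json.dumps(extractor_request, indent=2))
```

The resulting dictionary can be serialized and sent as the body of the second request in place of a rawDocument body.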

Before using any of the request data, make the following replacements:

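LOCATION: the location of your processor, for example:
- us - United States
- eu - European Union
PROJECT_ID: your Google Cloud project ID.
PROCESSOR_ID: the ID of your custom processor.
skipHumanReview: a boolean that disables human review (supported only by Human-in-the-Loop processors).
- true - skips human review
- false - enables human review (default)
MIME_TYPE†: one of the valid MIME type options.
IMAGE_CONTENT†: valid inline document content, represented as a stream of bytes. For JSON representations, this is the Base64 encoding (an ASCII string) of your binary image data. The string should look similar to the following:
- /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
For more information, see the "Base64 encoding" topic.
FIELD_MASK: specifies which fields to include in the Document output. This is a comma-separated list of fully qualified field names in FieldMask format.
- Example: text,entities,pages.pageNumber
INDIVIDUAL_PAGES: a list of individual pages to process.
- Alternatively, provide the fromStart or fromEnd field to process a specific number of pages from the beginning or end of the document.

† This content can also be specified using Base64-encoded content in the inlineDocument object.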

HTTP method and URL:

POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process

Request JSON body:

{
  "skipHumanReview": skipHumanReview,
  "rawDocument": {
    "mimeType": "MIME_TYPE",
    "content": "IMAGE_CONTENT"
  },
  "fieldMask": "FIELD_MASK",
  "processOptions": {
    "individualPageSelector": {
      "pages": [INDIVIDUAL_PAGES]
    }
  }
}
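A request body of this shape can be generated with a short script instead of being written by hand. The following Python sketch uses only the standard library; the helper name build_process_body and the placeholder bytes are hypothetical and stand in for a real file read from disk.

```python
import base64
import json
from typing import List, Optional


def build_process_body(
    content: bytes,
    mime_type: str,
    field_mask: Optional[str] = None,
    pages: Optional[List[int]] = None,
    skip_human_review: bool = False,
) -> dict:
    """Assemble the JSON body for a :process request from raw document bytes."""
    body = {
        "skipHumanReview": skip_human_review,
        "rawDocument": {
            "mimeType": mime_type,
            # JSON cannot carry raw bytes, so the content is Base64-encoded ASCII.
            "content": base64.b64encode(content).decode("ascii"),
        },
    }
    if field_mask:
        body["fieldMask"] = field_mask
    if pages:
        body["processOptions"] = {"individualPageSelector": {"pages": pages}}
    return body


# Placeholder bytes for illustration; in practice use open(path, "rb").read().
sample_bytes = b"%PDF-1.4 placeholder bytes"
body = build_process_body(
    sample_bytes,
    "application/pdf",
    field_mask="text,entities,pages.pageNumber",
    pages=[1, 2],
)
request_json = json.dumps(body, indent=2)
print(request_json)
```

Saving request_json to a file named request.json yields a body that matches the template above.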

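To send your request, choose one of these options:

curl

Note: The following command assumes that you have signed in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which signs you in to the gcloud CLI automatically. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: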

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process"

PowerShell

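Note: The following command assumes that you have signed in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: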

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process" | Select-Object -Expand Content

If the request succeeds, the server returns a 200 OK HTTP status code and the response in JSON format. The response body contains an instance of Document.
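Once the JSON response is parsed, paragraph text can be recovered from the text anchors, much like the getText helpers in the Java and Node.js samples. The Python sketch below works on a hypothetical, trimmed response. It assumes that index values may arrive as strings in JSON and that a startIndex of 0 may be omitted, and unlike those samples it joins every segment rather than only the first.

```python
from typing import Any, Dict


def anchor_text(document: Dict[str, Any], text_anchor: Dict[str, Any]) -> str:
    """Return the slice(s) of document["text"] that a textAnchor points to."""
    text = document.get("text", "")
    parts = []
    for segment in text_anchor.get("textSegments", []):
        # Assumption: indices may be JSON strings, and a zero startIndex may be absent.
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        parts.append(text[start:end])
    return "".join(parts)


# Hypothetical, trimmed response body used only for illustration.
response = {
    "document": {
        "text": "Hello world\nSecond paragraph\n",
        "pages": [
            {
                "pageNumber": 1,
                "paragraphs": [
                    {"layout": {"textAnchor": {"textSegments": [{"endIndex": "12"}]}}},
                    {
                        "layout": {
                            "textAnchor": {
                                "textSegments": [{"startIndex": "12", "endIndex": "29"}]
                            }
                        }
                    },
                ],
            }
        ],
    }
}

document = response["document"]
paragraphs = [
    anchor_text(document, paragraph["layout"]["textAnchor"])
    for paragraph in document["pages"][0]["paragraphs"]
]
for paragraph_text in paragraphs:
    print(repr(paragraph_text))
```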

Send a request to a processor version

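Before using any of the request data, make the following replacements:

LOCATION: the location of your processor, for example:
- us - United States
- eu - European Union
PROJECT_ID: your Google Cloud project ID.
PROCESSOR_ID: the ID of your custom processor.
PROCESSOR_VERSION: the processor version identifier. For more information, see "Select a processor version". For example:
- pretrained-TYPE-vX.X-YYYY-MM-DD
- stable
- rc
skipHumanReview: a boolean that disables human review (supported only by Human-in-the-Loop processors).
- true - skips human review
- false - enables human review (default)
MIME_TYPE†: one of the valid MIME type options.
IMAGE_CONTENT†: valid inline document content, represented as a stream of bytes. For JSON representations, this is the Base64 encoding (an ASCII string) of your binary image data. The string should look similar to the following:
- /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
For more information, see the "Base64 encoding" topic.
FIELD_MASK: specifies which fields to include in the Document output. This is a comma-separated list of fully qualified field names in FieldMask format.
- Example: text,entities,pages.pageNumber

† This content can also be specified using Base64-encoded content in the inlineDocument object.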

HTTP method and URL:

POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process

Request JSON body:

{
  "skipHumanReview": skipHumanReview,
  "rawDocument": {
    "mimeType": "MIME_TYPE",
    "content": "IMAGE_CONTENT"
  },
  "fieldMask": "FIELD_MASK"
}
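The only difference between the two endpoints is the optional processorVersions path segment. A small Python helper can make that explicit; the function name process_url and the example IDs are hypothetical, while the URL layout follows the two POST lines shown on this page.

```python
from typing import Optional


def process_url(
    location: str,
    project_id: str,
    processor_id: str,
    processor_version: Optional[str] = None,
) -> str:
    """Build the :process URL for a processor or for one of its versions."""
    name = f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    if processor_version:
        # Target a specific version; without this segment the default version is used.
        name += f"/processorVersions/{processor_version}"
    return f"https://{location}-documentai.googleapis.com/v1/{name}:process"


# Hypothetical IDs for illustration only.
default_url = process_url("us", "my-project", "abc123")
stable_url = process_url("us", "my-project", "abc123", "stable")
print(default_url)
print(stable_url)
```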

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process"

PowerShell

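Note: The following command assumes that you have signed in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: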

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process" | Select-Object -Expand Content

If the request succeeds, the server returns a 200 OK HTTP status code and the response in JSON format. The response body contains an instance of Document.
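For readers who prefer not to shell out to curl, the same POST can be prepared with the Python standard library. The sketch below only constructs the request object and does not send it. The helper name build_http_request, the example URL values, and the token string are hypothetical; a real token would come from gcloud auth print-access-token as in the commands above.

```python
import json
import urllib.request


def build_http_request(url: str, body: dict, access_token: str) -> urllib.request.Request:
    """Prepare a POST request carrying the same headers as the curl command."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
url = (
    "https://us-documentai.googleapis.com/v1/projects/my-project/locations/us/"
    "processors/my-processor/processorVersions/stable:process"
)
body = {
    "skipHumanReview": True,
    "rawDocument": {"mimeType": "application/pdf", "content": "JVBERi0="},
    "fieldMask": "text",
}
request = build_http_request(url, body, "ACCESS_TOKEN")
print(request.get_method(), request.full_url)

# To actually send it (requires network access and a valid token):
# with urllib.request.urlopen(request) as response:
#     document = json.load(response)["document"]
```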

C#

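For more information, see the Document AI C# API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".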


using Google.Cloud.DocumentAI.V1;
using Google.Protobuf;
using System;
using System.IO;

public class QuickstartSample
{
    public Document Quickstart(
        string projectId = "your-project-id",
        string locationId = "your-processor-location",
        string processorId = "your-processor-id",
        string localPath = "my-local-path/my-file-name",
        string mimeType = "application/pdf"
    )
    {
        // Create client
        var client = new DocumentProcessorServiceClientBuilder
        {
            Endpoint = $"{locationId}-documentai.googleapis.com"
        }.Build();

        // Read in local file
        using var fileStream = File.OpenRead(localPath);
        var rawDocument = new RawDocument
        {
            Content = ByteString.FromStream(fileStream),
            MimeType = mimeType
        };

        // Initialize request argument(s)
        var request = new ProcessRequest
        {
            Name = ProcessorName.FromProjectLocationProcessor(projectId, locationId, processorId).ToString(),
            RawDocument = rawDocument
        };

        // Make the request
        var response = client.ProcessDocument(request);

        var document = response.Document;
        Console.WriteLine(document.Text);
        return document;
    }
}

Java

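For more information, see the Document AI Java API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".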


import com.google.cloud.documentai.v1.Document;
import com.google.cloud.documentai.v1.DocumentProcessorServiceClient;
import com.google.cloud.documentai.v1.DocumentProcessorServiceSettings;
import com.google.cloud.documentai.v1.ProcessRequest;
import com.google.cloud.documentai.v1.ProcessResponse;
import com.google.cloud.documentai.v1.RawDocument;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

public class ProcessDocument {
  public static void processDocument()
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-project-location"; // Format is "us" or "eu".
    String processorId = "your-processor-id";
    String filePath = "path/to/input/file.pdf";
    processDocument(projectId, location, processorId, filePath);
  }

  public static void processDocument(
      String projectId, String location, String processorId, String filePath)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs
    // to be created once, and can be reused for multiple requests. After completing
    // all of your requests, call the "close" method on the client to safely clean up
    // any remaining background resources.
    String endpoint = String.format("%s-documentai.googleapis.com:443", location);
    DocumentProcessorServiceSettings settings =
        DocumentProcessorServiceSettings.newBuilder().setEndpoint(endpoint).build();
    try (DocumentProcessorServiceClient client = DocumentProcessorServiceClient.create(settings)) {
      // The full resource name of the processor, e.g.:
      // projects/project-id/locations/location/processors/processor-id
      // You must create new processors in the Cloud Console first
      String name =
          String.format("projects/%s/locations/%s/processors/%s", projectId, location, processorId);

      // Read the file.
      byte[] imageFileData = Files.readAllBytes(Paths.get(filePath));

      // Wrap the raw file bytes in a ByteString (no manual base64 encoding is needed).
      ByteString content = ByteString.copyFrom(imageFileData);

      RawDocument document =
          RawDocument.newBuilder().setContent(content).setMimeType("application/pdf").build();

      // Configure the process request.
      ProcessRequest request =
          ProcessRequest.newBuilder().setName(name).setRawDocument(document).build();

      // Recognizes text entities in the PDF document
      ProcessResponse result = client.processDocument(request);
      Document documentResponse = result.getDocument();

      // Get all of the document text as one big string
      String text = documentResponse.getText();

      // Read the text recognition output from the processor
      System.out.println("The document contains the following paragraphs:");
      Document.Page firstPage = documentResponse.getPages(0);
      List<Document.Page.Paragraph> paragraphs = firstPage.getParagraphsList();

      for (Document.Page.Paragraph paragraph : paragraphs) {
        String paragraphText = getText(paragraph.getLayout().getTextAnchor(), text);
        System.out.printf("Paragraph text:\n%s\n", paragraphText);
      }

      // Form parsing provides additional output about
      // form-formatted PDFs. You must create a form
      // processor in the Cloud Console to see full field details.
      System.out.println("The following form key/value pairs were detected:");

      for (Document.Page.FormField field : firstPage.getFormFieldsList()) {
        String fieldName = getText(field.getFieldName().getTextAnchor(), text);
        String fieldValue = getText(field.getFieldValue().getTextAnchor(), text);

        System.out.println("Extracted form fields pair:");
        System.out.printf("\t(%s, %s)\n", fieldName, fieldValue);
      }
    }
  }

  // Extract shards from the text field
  private static String getText(Document.TextAnchor textAnchor, String text) {
    if (textAnchor.getTextSegmentsList().size() > 0) {
      int startIdx = (int) textAnchor.getTextSegments(0).getStartIndex();
      int endIdx = (int) textAnchor.getTextSegments(0).getEndIndex();
      return text.substring(startIdx, endIdx);
    }
    return "[NO TEXT]";
  }
}

Node.js

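For more information, see the Document AI Node.js API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".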

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION'; // Format is 'us' or 'eu'
// const processorId = 'YOUR_PROCESSOR_ID'; // Create processor in Cloud Console
// const filePath = '/path/to/local/pdf';

const {DocumentProcessorServiceClient} =
  require('@google-cloud/documentai').v1;

// Instantiates a client
const client = new DocumentProcessorServiceClient();

async function processDocument() {
  // The full resource name of the processor, e.g.:
  // projects/project-id/locations/location/processors/processor-id
  // You must create new processors in the Cloud Console first
  const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;

  // Read the file into memory.
  const fs = require('fs').promises;
  const imageFile = await fs.readFile(filePath);

  // Convert the image data to a Buffer and base64 encode it.
  const encodedImage = Buffer.from(imageFile).toString('base64');

  const request = {
    name,
    rawDocument: {
      content: encodedImage,
      mimeType: 'application/pdf',
    },
  };

  // Recognizes text entities in the PDF document
  const [result] = await client.processDocument(request);
  const {document} = result;

  // Get all of the document text as one big string
  const {text} = document;

  // Extract shards from the text field
  const getText = textAnchor => {
    if (!textAnchor.textSegments || textAnchor.textSegments.length === 0) {
      return '';
    }

    // First shard in document doesn't have startIndex property
    const startIndex = textAnchor.textSegments[0].startIndex || 0;
    const endIndex = textAnchor.textSegments[0].endIndex;

    return text.substring(startIndex, endIndex);
  };

  // Read the text recognition output from the processor
  console.log('The document contains the following paragraphs:');
  const [page1] = document.pages;
  const {paragraphs} = page1;

  for (const paragraph of paragraphs) {
    const paragraphText = getText(paragraph.layout.textAnchor);
    console.log(`Paragraph text:\n${paragraphText}`);
  }

  // Form parsing provides additional output about
  // form-formatted PDFs. You  must create a form
  // processor in the Cloud Console to see full field details.
  console.log('\nThe following form key/value pairs were detected:');

  const {formFields} = page1;
  for (const field of formFields) {
    const fieldName = getText(field.fieldName.textAnchor);
    const fieldValue = getText(field.fieldValue.textAnchor);

    console.log('Extracted key value pair:');
    console.log(`\t(${fieldName}, ${fieldValue})`);
  }
}

Python

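For more information, see the Document AI Python API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".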

from typing import Optional

from google.api_core.client_options import ClientOptions
from google.cloud import documentai  # type: ignore

# TODO(developer): Uncomment these variables before running the sample.
# project_id = "YOUR_PROJECT_ID"
# location = "YOUR_PROCESSOR_LOCATION" # Format is "us" or "eu"
# processor_id = "YOUR_PROCESSOR_ID" # Create processor before running sample
# file_path = "/path/to/local/pdf"
# mime_type = "application/pdf" # Refer to https://cloud.google.com/document-ai/docs/file-types for supported file types
# field_mask = "text,entities,pages.pageNumber"  # Optional. The fields to return in the Document object.
# processor_version_id = "YOUR_PROCESSOR_VERSION_ID" # Optional. Processor version to use


def process_document_sample(
    project_id: str,
    location: str,
    processor_id: str,
    file_path: str,
    mime_type: str,
    field_mask: Optional[str] = None,
    processor_version_id: Optional[str] = None,
) -> None:
    # You must set the `api_endpoint` if you use a location other than "us".
    opts = ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")

    client = documentai.DocumentProcessorServiceClient(client_options=opts)

    if processor_version_id:
        # The full resource name of the processor version, e.g.:
        # `projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}`
        name = client.processor_version_path(
            project_id, location, processor_id, processor_version_id
        )
    else:
        # The full resource name of the processor, e.g.:
        # `projects/{project_id}/locations/{location}/processors/{processor_id}`
        name = client.processor_path(project_id, location, processor_id)

    # Read the file into memory
    with open(file_path, "rb") as image:
        image_content = image.read()

    # Load binary data
    raw_document = documentai.RawDocument(content=image_content, mime_type=mime_type)

    # For more information: https://cloud.google.com/document-ai/docs/reference/rest/v1/ProcessOptions
    # Optional: Additional configurations for processing.
    process_options = documentai.ProcessOptions(
        # Process only specific pages
        individual_page_selector=documentai.ProcessOptions.IndividualPageSelector(
            pages=[1]
        )
    )

    # Configure the process request
    request = documentai.ProcessRequest(
        name=name,
        raw_document=raw_document,
        field_mask=field_mask,
        process_options=process_options,
    )

    result = client.process_document(request=request)

    # For a full list of `Document` object attributes, reference this page:
    # https://cloud.google.com/document-ai/docs/reference/rest/v1/Document
    document = result.document

    # Read the text recognition output from the processor
    print("The document contains the following text:")
    print(document.text)

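Batch processing

Batch (asynchronous) requests let you send multiple documents in a single request. Document AI responds with an operation that you can poll for the status of the request. When the operation finishes, it contains BatchProcessMetadata that points to the Cloud Storage bucket where the processed results are stored.

If the input files that you want to access are in a bucket in a different project, you must grant access to that bucket before the files can be accessed. See "Set up file access".

Send a request to a processor

The following code samples show how to send a batch process request to a processor.

REST

This example shows how to send a POST request to the batchProcess method to process a large document asynchronously. The example uses the access token for a service account set up for the project using the Google Cloud CLI. For instructions on installing the Google Cloud CLI, setting up a project with a service account, and obtaining an access token, see "Before you begin".

The batchProcess request starts a long-running operation and stores the results in a Cloud Storage bucket. This example also shows how to get the status of the long-running operation after it has started.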
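Polling a long-running operation usually comes down to fetching it repeatedly until it reports done. The Python sketch below shows that loop in isolation. The helper name wait_for_operation, the operation name, and the canned states are hypothetical, and the fetch step is injected as a callable so that the loop can be exercised without network access.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    get_operation: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call get_operation until the operation reports done, then return it."""
    deadline = time.monotonic() + timeout_s
    while True:
        operation = get_operation()
        if operation.get("done"):
            # A finished operation carries either an error or a result.
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish before the timeout")
        sleep(interval_s)


# Canned states standing in for successive GET responses (illustrative only).
states = iter(
    [
        {"name": "operations/example", "metadata": {"state": "RUNNING"}},
        {"name": "operations/example", "done": True, "metadata": {"state": "SUCCEEDED"}},
    ]
)
finished = wait_for_operation(lambda: next(states), interval_s=0, sleep=lambda _s: None)
print(finished["metadata"]["state"])
```

In real use, get_operation would issue an authenticated GET for the operation name returned by the batchProcess call and return the parsed JSON.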