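Dataproc on GKE deploys Dataproc virtual clusters on a GKE cluster. Unlike Dataproc on Compute Engine clusters, Dataproc on GKE virtual clusters do not include separate master and worker VMs. Instead, when you create a Dataproc on GKE virtual cluster, Dataproc on GKE creates node pools within the GKE cluster. Dataproc on GKE jobs run as pods on these node pools. GKE manages the node pools and the scheduling of pods on them.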
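To make the relationship between a virtual cluster, its GKE cluster, and its node pools concrete, here is a minimal sketch that assembles the JSON body of a cluster-creation request. The field names follow the Dataproc `Cluster` REST resource as commonly documented (`virtualClusterConfig`, `gkeClusterConfig`, `nodePoolTarget`); treat them as assumptions and verify them against the current API reference before use. The helper name `build_virtual_cluster_request` and every argument value are hypothetical, and the sketch only builds a dictionary without calling any Google Cloud service.

```python
import json


def build_virtual_cluster_request(project, gke_location, gke_cluster,
                                  cluster_name, namespace, node_pool,
                                  spark_version, staging_bucket):
    """Build a request body for a Dataproc on GKE virtual cluster.

    Hypothetical helper: field names are assumed from the Dataproc
    Cluster REST resource and should be checked against the API reference.
    """
    # Resource name of the existing GKE cluster that hosts the virtual cluster.
    gke_target = (
        f"projects/{project}/locations/{gke_location}/clusters/{gke_cluster}"
    )
    return {
        "projectId": project,
        "clusterName": cluster_name,
        # No master or worker VM settings appear here: the virtual cluster
        # only points at a GKE cluster and the node pools it should use.
        "virtualClusterConfig": {
            "stagingBucket": staging_bucket,
            "kubernetesClusterConfig": {
                "kubernetesNamespace": namespace,
                "gkeClusterConfig": {
                    "gkeClusterTarget": gke_target,
                    "nodePoolTarget": [
                        {
                            # Node pool where job pods are scheduled by GKE.
                            "nodePool": f"{gke_target}/nodePools/{node_pool}",
                            "roles": ["DEFAULT"],
                        }
                    ],
                },
                "kubernetesSoftwareConfig": {
                    # Caller supplies the Spark version string.
                    "componentVersion": {"SPARK": spark_version},
                },
            },
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    body = build_virtual_cluster_request(
        project="example-project",
        gke_location="us-central1",
        gke_cluster="example-gke",
        cluster_name="example-virtual-cluster",
        namespace="example-namespace",
        node_pool="dp-default",
        spark_version="SPARK_VERSION_PLACEHOLDER",
        staging_bucket="example-bucket",
    )
    print(json.dumps(body, indent=2))
```

The body contains no machine types or instance counts for masters and workers. It only names a GKE cluster, a Kubernetes namespace, and a node pool with a role, which mirrors the description above: GKE owns the node pools and schedules the job pods onto them.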
Dataproc on GKE lets you run big data applications on GKE clusters through the Dataproc `jobs` API. After you create a Dataproc on GKE virtual cluster, you can submit Spark, PySpark, SparkR, or Spark SQL jobs through the Google Cloud console, the Cloud CLI, or the Dataproc API. Dataproc on GKE supports Spark 3.5 versions.

Last updated (UTC): 2025-04-29.