Spark运行各个时间段的解释

本文详细解释了Apache Spark UI中各个性能指标的具体含义,包括调度延迟、任务反序列化时间等,有助于理解Spark作业的执行情况及进行性能调优。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

package org.apache.spark.ui


private[spark] object ToolTips {
  val SCHEDULER_DELAY =
    """Scheduler delay includes time to ship the task from the scheduler to
       the executor, and time to send the task result from the executor to the scheduler. If
       scheduler delay is large, consider decreasing the size of tasks or decreasing the size
       of task results."""

  val TASK_DESERIALIZATION_TIME =
    """Time spent deserializing the task closure on the executor, including the time to read the
       broadcasted task."""

  val KSHUFFLE_READ_BLOCED_TIME =
    "Time that the task spent blocked waiting for shuffle data to be read from remote machines."

  val INPUT = "Bytes and records read from Hadoop or from Spark storage."

  val OUTPUT = "Bytes and records written to Hadoop."

  val STORAGE_MEMORY =
    "Memory used / total available memory for storage of data " +
      "like RDD partitions cached in memory. "

  val SHUFFLE_WRITE =
    "Bytes and records written to disk in order to be read by a shuffle in a future stage."

  val SHUFFLE_READ =
    """Total shuffle bytes and records read (includes both data read locally and data read from
       remote executors). """

  val SHUFFLE_READ_REMOTE_SIZE =
    """Total shuffle bytes read from remote executors. This is a subset of the shuffle
       read bytes; the remaining shuffle data is read locally. """

  val GETTING_RESULT_TIME =
    """Time that the driver spends fetching task results from workers. If this is large, consider
       decreasing the amount of data returned from each task."""

  val RESULT_SERIALIZATION_TIME =
    """Time spent serializing the task result on the executor before sending it back to the
       driver."""

  val GC_TIME =
    """Time that the executor spent paused for Java garbage collection while the task was
       running."""

  val JOB_TIMELINE =
    """Shows when jobs started and ended and when executors joined or left. Drag to scroll.
       Click Enable Zooming and use mouse wheel to zoom in/out."""

  val STAGE_TIMELINE =
    """Shows when stages started and ended and when executors joined or left. Drag to scroll.
       Click Enable Zooming and use mouse wheel to zoom in/out."""

  val JOB_DAG =
    """Shows a graph of stages executed for this job, each of which can contain
       multiple RDD operations (e.g. map() and filter()), and of RDDs inside each operation
       (shown as dots)."""

  val STAGE_DAG =
    """Shows a graph of RDD operations in this stage, and RDDs inside each one. A stage can run
       multiple operations (e.g. two map() functions) if they can be pipelined. Some operations
       also create multiple RDDs internally. Cached RDDs are shown in green.
    """
}

转载于:https://www.cnblogs.com/wzyxidian/p/4853619.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值