About SparkSession in Spark 2.x and later

Apache Spark 2.0 introduced SparkSession, which gives users a single, unified entry point to Spark's functionality and lets them write Spark programs against the DataFrame and Dataset APIs. Most importantly, it reduces the number of concepts a user has to learn, making it much easier to interact with Spark: SparkSession encapsulates SparkContext, SparkConf, and so on, resolving the earlier confusion over which context was in use (SparkContext, SQLContext, or HiveContext). In practice it is used much like SparkContext.

val spark = SparkSession.builder().master("local").config("key", "value").getOrCreate() // builder (factory) pattern; config() must be called on the builder before getOrCreate()
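
A slightly fuller sketch (the app name and config key below are illustrative assumptions, not from the original article): the builder collects the settings, getOrCreate() builds or reuses the session, and the encapsulated SparkContext and SparkConf are reachable from the session itself:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")                            // run locally with all cores
  .appName("spark-session-demo")                 // hypothetical app name
  .config("spark.sql.shuffle.partitions", "4")   // example config entry
  .getOrCreate()

val sc = spark.sparkContext                      // the wrapped SparkContext
println(spark.conf.get("spark.sql.shuffle.partitions"))
spark.stop()                                     // also stops the underlying SparkContext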

spark.read.textFile(path: String) reads a text file as a Dataset[String]; besides textFile, the DataFrameReader returned by spark.read also provides load, csv, json, text, format, jdbc, and other read methods, all wrapped in one place, which is very convenient.
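
For example (a minimal sketch; the file paths are hypothetical):

val lines  = spark.read.textFile("data/people.txt")                   // Dataset[String]
val csvDf  = spark.read.option("header", "true").csv("data/people.csv")
val jsonDf = spark.read.json("data/people.json")
val pqDf   = spark.read.format("parquet").load("data/people.parquet") // generic form
// jdbc additionally needs a URL, table name, and connection properties, e.g.:
// val jdbcDf = spark.read.jdbc("jdbc:mysql://host:3306/db", "people", new java.util.Properties())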

Part of the SparkSession source code (the Builder) is shown below:

@InterfaceStability.Stable
class Builder extends Logging {

  private[this] val options = new scala.collection.mutable.HashMap[String, String]

  private[this] var userSuppliedContext: Option[SparkContext] = None

  private[spark] def sparkContext(sparkContext: SparkContext): Builder = synchronized {
    userSuppliedContext = Option(sparkContext)
    this
  }

  // excerpt from getOrCreate():
  def getOrCreate(): SparkSession = synchronized {
    // ...
    val sparkContext = userSuppliedContext.getOrElse {
      // set app name if not given
      val randomAppName = java.util.UUID.randomUUID().toString
      val sparkConf = new SparkConf()
      options.foreach { case (k, v) => sparkConf.set(k, v) }
      if (!sparkConf.contains("spark.app.name")) {
        sparkConf.setAppName(randomAppName)
      }
      val sc = SparkContext.getOrCreate(sparkConf)
      // maybe this is an existing SparkContext, update its SparkConf which maybe used
      // by SparkSession
      // ...
      sc
    }
    // ...
  }
}
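
As the excerpt shows, getOrCreate() first looks for an existing SparkContext (and session) and only builds a new one when none is running. A small sketch of that behavior (assuming both calls run in the same JVM):

val s1 = SparkSession.builder().master("local").getOrCreate()
val s2 = SparkSession.builder().getOrCreate() // no master needed: an active session already exists
println(s1 eq s2)                             // true: the same session instance is reused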
