Let's start with an ordinary left join.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

case class People(id: Int, name: String)
case class Student(sid: Int, sname: String)

object TestBroadcastNestedLoopJoin3 {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .master("local")
      .appName("TestBroadcastNestedLoopJoin")
      //.config("spark.sql.autoBroadcastJoinThreshold", -1)
      .getOrCreate()
    import spark.implicits._

    val df1 = spark.createDataFrame(Seq(
      People(1, "darren"),
      People(2, "zhang"),
      People(3, "jonathan"),
      People(4, "li"),
      People(5, "joyce"),
      People(6, "chen"),
      People(7, "chaoshu"),
      People(8, "sha"),
      People(9, "cindy"),
      People(10, "xie"))).as[People]
    // df1.show()

    val df2 = spark.createDataFrame(Seq(
      Student(1, "darren"),
      Student(2, "jonathan"))).as[Student]

    val df3 = df1.join(df2, $"id" === $"sid", joinType = "left")
    df3.show()
    df3.explain()
    spark.stop()
  }
}
+---+--------+----+--------+
| id| name| sid| sname|
+---+--------+----+--------+
| 1| darren| 1| darren|
| 2| zhang| 2|jonathan|
| 3|jonathan|null| null|
| 4| li|null| null|
| 5| joyce|null| null|
| 6| chen|null| null|
| 7| chaoshu|null| null|
| 8| sha|null| null|
| 9| cindy|null| null|
| 10| xie|null| null|
+---+--------+----+--------+
== Physical Plan ==
*(1) BroadcastHashJoin [id#0], [sid#7], LeftOuter, BuildRight
:- LocalTableScan [id#0, name#1]
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
+- LocalTableScan [sid#7, sname#8]
This is a normal BroadcastHashJoin, and its performance is good.
But now let's modify the join condition:
Problem 1: with multiple OR-ed join conditions, the broadcast cannot be disabled
val df3 = df1.join(df2, $"id" === $"sid" || $"name" === $"sname", joinType = "left")
+---+--------+----+--------+
| id| name| sid| sname|
+---+--------+----+--------+
| 1| darren| 1| darren|
| 2| zhang| 2|jonathan|
| 3|jonathan| 2|jonathan|
| 4| li|null| null|
| 5| joyce|null| null|
| 6| chen|null| null|
| 7| chaoshu|null| null|
| 8| sha|null| null|
| 9| cindy|null| null|
| 10| xie|null| null|
+---+--------+----+--------+
== Physical Plan ==
BroadcastNestedLoopJoin BuildRight, LeftOuter, ((id#0 = sid#7) || (name#1 = sname#8))
:- LocalTableScan [id#0, name#1]
+- BroadcastExchange IdentityBroadcastMode
+- LocalTableScan [sid#7, sname#8]
Here the poorly-performing BroadcastNestedLoopJoin appears.
This suggests an obvious idea: just disable broadcasting.
Approach 1: disable broadcasting (a wrong approach)
Uncomment the config line and run again:
.config("spark.sql.autoBroadcastJoinThreshold", -1)
+---+--------+----+--------+
| id| name| sid| sname|
+---+--------+----+--------+
| 1| darren| 1| darren|
| 2| zhang| 2|jonathan|
| 3|jonathan| 2|jonathan|
| 4| li|null| null|
| 5| joyce|null| null|
| 6| chen|null| null|
| 7| chaoshu|null| null|
| 8| sha|null| null|
| 9| cindy|null| null|
| 10| xie|null| null|
+---+--------+----+--------+
== Physical Plan ==
BroadcastNestedLoopJoin BuildRight, LeftOuter, ((id#0 = sid#7) || (name#1 = sname#8))
:- LocalTableScan [id#0, name#1]
+- BroadcastExchange IdentityBroadcastMode
+- LocalTableScan [sid#7, sname#8]
As you can see, nothing changed; the broadcast cannot be disabled this way.
Note: on reflection this makes sense. Why doesn't disabling broadcast affect BroadcastNestedLoopJoin? Because BroadcastNestedLoopJoin has to scan the right table row by row for every left-side row, the entire right table must be broadcast to every executor.
If the right table is large, this causes a broadcastTimeout exception. So BroadcastNestedLoopJoin should be avoided whenever possible.
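If a BroadcastNestedLoopJoin cannot be avoided right away, one stopgap (my own addition, not part of the approaches below) is to raise Spark's broadcast timeout so a large right table has more time to ship; the config key is real, the value below is only an example:

```scala
import org.apache.spark.sql.SparkSession

// Stopgap only: give the broadcast more time (the default is 300 seconds).
// This does NOT fix the O(left x right) cost of the nested loop join itself.
val spark = SparkSession.builder()
  .master("local")
  .appName("BroadcastTimeoutExample")
  .config("spark.sql.broadcastTimeout", 1200) // seconds; example value
  .getOrCreate()
```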
Approach 2: the correct approach
With broadcasting left enabled:
// keep broadcast enabled
//.config("spark.sql.autoBroadcastJoinThreshold", -1)
// import the lit function
import org.apache.spark.sql.functions.lit

val df3_1 = df1.join(df2, $"id" === $"sid", joinType = "inner")
  .union(df1.join(df2, $"name" === $"sname", joinType = "inner"))
  .distinct()
val df3 = df1.join(df3_1.selectExpr("id as id_1"), $"id" === $"id_1", joinType = "left_anti")
  .withColumn("sid", lit(null))
  .withColumn("sname", lit(null))
  .union(df3_1)
+---+--------+----+--------+
| id| name| sid| sname|
+---+--------+----+--------+
| 4| li|null| null|
| 5| joyce|null| null|
| 6| chen|null| null|
| 7| chaoshu|null| null|
| 8| sha|null| null|
| 9| cindy|null| null|
| 10| xie|null| null|
| 1| darren| 1| darren|
| 2| zhang| 2|jonathan|
| 3|jonathan| 2|jonathan|
+---+--------+----+--------+
== Physical Plan ==
Union
:- *(5) Project [id#0, name#1, null AS sid#132, null AS sname#133]
: +- *(5) BroadcastHashJoin [id#0], [id_1#90], LeftAnti, BuildRight
: :- LocalTableScan [id#0, name#1]
: +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
: +- *(4) HashAggregate(keys=[id#0, name#1, sid#7, sname#8], functions=[])
: +- Exchange hashpartitioning(id#0, name#1, sid#7, sname#8, 200)
: +- *(3) HashAggregate(keys=[id#0, name#1, sid#7, sname#8], functions=[])
: +- Union
: :- *(1) BroadcastHashJoin [id#0], [sid#7], Inner, BuildRight
: : :- LocalTableScan [id#0, name#1]
: : +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
: : +- LocalTableScan [sid#7, sname#8]
: +- *(2) BroadcastHashJoin [name#1], [sname#8], Inner, BuildRight
: :- LocalTableScan [id#0, name#1]
: +- BroadcastExchange HashedRelationBroadcastMode(List(input[1, string, true]))
: +- LocalTableScan [sid#7, sname#8]
+- *(9) HashAggregate(keys=[id#0, name#1, sid#7, sname#8], functions=[])
+- ReusedExchange [id#0, name#1, sid#7, sname#8], Exchange hashpartitioning(id#0, name#1, sid#7, sname#8, 200)
With broadcasting disabled:
== Physical Plan ==
Union
:- *(11) Project [id#0, name#1, null AS sid#84, null AS sname#85]
: +- SortMergeJoin [id#0], [id_1#42], LeftAnti
: :- *(1) Sort [id#0 ASC NULLS FIRST], false, 0
: : +- Exchange hashpartitioning(id#0, 200)
: : +- LocalTableScan [id#0, name#1]
: +- *(10) Sort [id_1#42 ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(id_1#42, 200)
: +- *(9) HashAggregate(keys=[id#0, name#1, sid#7, sname#8], functions=[])
: +- Exchange hashpartitioning(id#0, name#1, sid#7, sname#8, 200)
: +- *(8) HashAggregate(keys=[id#0, name#1, sid#7, sname#8], functions=[])
: +- Union
: :- *(4) SortMergeJoin [id#0], [sid#7], Inner
: : :- *(2) Sort [id#0 ASC NULLS FIRST], false, 0
: : : +- ReusedExchange [id#0, name#1], Exchange hashpartitioning(id#0, 200)
: : +- *(3) Sort [sid#7 ASC NULLS FIRST], false, 0
: : +- Exchange hashpartitioning(sid#7, 200)
: : +- LocalTableScan [sid#7, sname#8]
: +- *(7) SortMergeJoin [name#1], [sname#8], Inner
: :- *(5) Sort [name#1 ASC NULLS FIRST], false, 0
: : +- Exchange hashpartitioning(name#1, 200)
: : +- LocalTableScan [id#0, name#1]
: +- *(6) Sort [sname#8 ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(sname#8, 200)
: +- LocalTableScan [sid#7, sname#8]
+- *(19) HashAggregate(keys=[id#0, name#1, sid#7, sname#8], functions=[])
+- ReusedExchange [id#0, name#1, sid#7, sname#8], Exchange hashpartitioning(id#0, name#1, sid#7, sname#8, 200)
As you can see, Approach 2 is also quite complex. The union itself is not the problem; the concern is that there are now three joins. Whether it actually performs better has to be verified with real data; this only outlines the idea.
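When running that test, one way to confirm the rewrite really avoided the slow operator (this check is my own suggestion, using Spark's public queryExecution API) is to scan the executed plan text:

```scala
// df3 is the rewritten DataFrame from the approach above.
// If BroadcastNestedLoopJoin still shows up in the physical plan,
// fail fast instead of silently running the slow join.
val planText = df3.queryExecution.executedPlan.toString
assert(!planText.contains("BroadcastNestedLoopJoin"),
  "plan still contains BroadcastNestedLoopJoin")
```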
Reference:
Avoid Broadcast nested loop join
Problem 2: with a NOT IN condition, the broadcast cannot be disabled
Turn on the broadcast-disabling switch (i.e. uncomment the autoBroadcastJoinThreshold line) and run:
df1.createOrReplaceTempView("df1")
df2.createOrReplaceTempView("df2")
val df3 = spark.sql(
  """
    |select * from df1 where id not in (select sid from df2)
    |""".stripMargin)
+---+--------+
| id| name|
+---+--------+
| 3|jonathan|
| 4| li|
| 5| joyce|
| 6| chen|
| 7| chaoshu|
| 8| sha|
| 9| cindy|
| 10| xie|
+---+--------+
== Physical Plan ==
BroadcastNestedLoopJoin BuildRight, LeftAnti, ((id#0 = sid#7) || isnull((id#0 = sid#7)))
:- LocalTableScan [id#0, name#1]
+- BroadcastExchange IdentityBroadcastMode
+- LocalTableScan [sid#7]
As you can see, the broadcast cannot be disabled.
Approach 1: use NOT EXISTS (a correct approach)
Note that NOT IN and NOT EXISTS are only equivalent when the subquery column contains no NULLs: if any sid were NULL, NOT IN would return no rows at all, while NOT EXISTS would simply ignore it. That is why the plan above needs the null-aware condition ((id#0 = sid#7) || isnull((id#0 = sid#7))). Here sid is a non-nullable Int, so the rewrite is safe.
val df3 = spark.sql(
  """
    |select * from df1 where not exists (select 1 from df2 where df1.id = df2.sid)
    |""".stripMargin)
+---+--------+
| id| name|
+---+--------+
| 6| chen|
| 3|jonathan|
| 5| joyce|
| 9| cindy|
| 4| li|
| 8| sha|
| 7| chaoshu|
| 10| xie|
+---+--------+
== Physical Plan ==
SortMergeJoin [id#0], [sid#7], LeftAnti
:- *(1) Sort [id#0 ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(id#0, 200)
: +- LocalTableScan [id#0, name#1]
+- *(3) Sort [sid#7 ASC NULLS FIRST], false, 0
+- Exchange hashpartitioning(sid#7, 200)
+- *(2) Project [sid#7]
+- LocalTableScan [1#26, sid#7]
As you can see, BroadcastNestedLoopJoin is no longer used.
References:
Disable broadcast when query plan has BroadcastNestedLoopJoin
BroadcastNestedLoopJoin example
Approach 2: use a left anti join (also correct)
val df3 = df1.join(df2, $"id" === $"sid", joinType = "left_anti")
+---+--------+
| id| name|
+---+--------+
| 6| chen|
| 3|jonathan|
| 5| joyce|
| 9| cindy|
| 4| li|
| 8| sha|
| 7| chaoshu|
| 10| xie|
+---+--------+
Behavior with broadcasting disabled:
== Physical Plan ==
SortMergeJoin [id#0], [sid#7], LeftAnti
:- *(1) Sort [id#0 ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(id#0, 200)
: +- LocalTableScan [id#0, name#1]
+- *(2) Sort [sid#7 ASC NULLS FIRST], false, 0
+- Exchange hashpartitioning(sid#7, 200)
+- LocalTableScan [sid#7]
Behavior with broadcasting enabled:
== Physical Plan ==
*(1) BroadcastHashJoin [id#0], [sid#7], LeftAnti, BuildRight
:- LocalTableScan [id#0, name#1]
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
+- LocalTableScan [sid#7]
Neither plan uses BroadcastNestedLoopJoin, so you can choose either one depending on your situation.
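When df2 is known to be small, you can also make the broadcast choice explicit with the broadcast hint that is already imported at the top of the file. This variant is my own addition; the hint forces a BroadcastHashJoin even when spark.sql.autoBroadcastJoinThreshold is set to -1:

```scala
import org.apache.spark.sql.functions.broadcast

// Explicitly hint that df2 should be broadcast; the hint overrides
// the autoBroadcastJoinThreshold setting.
val df3 = df1.join(broadcast(df2), $"id" === $"sid", joinType = "left_anti")
df3.explain()
```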