1. Install Java
- Download and extract
Download a suitable JDK from the official Oracle website. Note that you need the Linux build. This guide uses jdk1.8.0_60.tar.gz as the example; the version you download may differ, which is fine as long as it is a Linux .tar.gz archive. Extract it under /usr/java so that it matches the JAVA_HOME set below:
mkdir -p /usr/java
tar -xvf jdk1.8.0_60.tar.gz -C /usr/java
- Set the Java environment variables
vim /etc/profile
JAVA_HOME=/usr/java/jdk1.8.0_60
CLASSPATH=$JAVA_HOME/lib/
PATH=$PATH:$JAVA_HOME/bin
export PATH JAVA_HOME CLASSPATH
source /etc/profile
- Verify
java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
2. Install Scala
- Download Scala
wget https://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz
- Move, extract, and rename the archive
mv scala-2.11.8.tgz /usr/local
cd /usr/local
tar -xvf scala-2.11.8.tgz
mv scala-2.11.8 scala
- Configure the environment variables
vim /etc/profile
export PATH=$PATH:/usr/local/scala/bin
source /etc/profile
- Verify
scala -version
// Scala code runner version 2.11.8 -- Copyright 2002-2016, LAMP/EPFL
3. Install Spark
- Download, extract, and rename the archive (Spark 2.4.5 built for Hadoop 2.7; the Apache archive is one download source)
wget https://archive.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz
mv spark-2.4.5-bin-hadoop2.7.tgz /usr/local
cd /usr/local
tar -xvf spark-2.4.5-bin-hadoop2.7.tgz
mv spark-2.4.5-bin-hadoop2.7 spark
- Configure the environment variables
vim /etc/profile
export PATH=$PATH:/usr/local/spark/bin
source /etc/profile
- Verify
spark-shell
[root@iZbp1e67wmz30hykub1u4tZ spark]# spark-shell
20/03/02 00:43:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://iZbp1e67wmz30hykub1u4tZ:4040
Spark context available as 'sc' (master = local[*], app id = local-1583081034003).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.4.5
/_/
Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
- Run the official Pi-estimation example
spark-submit --class org.apache.spark.examples.SparkPi --executor-memory 1G --total-executor-cores 2 /usr/local/spark/examples/jars/spark-examples_2.11-2.4.5.jar 100
// SparkPi estimates Pi with a Monte Carlo method
20/03/02 00:58:35 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 4.368829 s
Pi is roughly 3.1419255141925513
20/03/02 00:58:35 INFO SparkUI: Stopped Spark web UI at http://iZbp1e67wmz30hykub1u4tZ:4040
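The Monte Carlo idea behind SparkPi can be sketched without Spark at all: throw random points into the square [-1,1]x[-1,1] and count the fraction that lands inside the unit circle, which approaches Pi/4. A minimal awk sketch (the sample size and seed below are arbitrary choices, not taken from SparkPi):

```shell
# Monte Carlo estimate of Pi: hits/n approximates the ratio of the
# circle's area (Pi) to the square's area (4), so Pi ~ 4 * hits / n.
awk 'BEGIN {
  srand(1); n = 200000; hits = 0
  for (i = 0; i < n; i++) {
    x = rand() * 2 - 1; y = rand() * 2 - 1   # random point in [-1,1]x[-1,1]
    if (x * x + y * y <= 1) hits++           # inside the unit circle?
  }
  printf "Pi is roughly %f\n", 4 * hits / n
}'
```

SparkPi follows the same scheme, except it splits the samples across partitions with `parallelize` and combines the per-partition hit counts with `reduce`, which is why more cores make the job finish faster rather than more accurate.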