1. Download the Spark package, choosing the build that matches your Hadoop version: https://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz
# Download Spark
cd /export/softwares
wget https://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz
2. Extract
# Extract the Spark package
tar xzvf spark-2.2.0-bin-hadoop2.7.tgz
# Move the extracted Spark directory (not the .tgz archive)
mv spark-2.2.0-bin-hadoop2.7 /export/servers/spark
3. Edit the configuration file spark-env.sh in the spark/conf directory
#!/usr/bin/env bash
# Set Java Home
export JAVA_HOME=/usr/local/jdk1.8.0_231/
export SCALA_HOME=/usr/local/scala/scala-2.12.11
export HADOOP_HOME=/usr/local/hadoop-2.9.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
SPARK_LOCAL_DIRS=/usr/local/spark
SPARK_DRIVER_MEMORY=512M
# Set the Spark Master address
export SPARK_MASTER_HOST=master
export SPARK_MASTER_PORT=7077
export LD_LIBRARY_PATH=$JAVA_LIBRARY_PATH
While you are at it, check /etc/profile, where JAVA_LIBRARY_PATH is set:
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/
4. List the worker nodes in the slaves file under conf
node-1
node-2
5. Configure the HistoryServer in spark-defaults.conf
# First copy spark-defaults.conf.template to spark-defaults.conf
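cp spark-defaults.conf.template spark-defaults.conf
# Then add the following entries to spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://master:8020/spark_log
spark.eventLog.compress true
The port used here must match the one configured in Hadoop's core-site.xml.
Next, configure spark-env.sh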
# Set the Spark History runtime parameters
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=4000 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://master:8020/spark_log"
Finally, create the directory on HDFS
hdfs dfs -mkdir -p /spark_log
6. Distribute Spark to each worker node
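Because the worker host names are already listed in conf/slaves, the copy commands for this step can be generated from that file rather than typed once per node. The following is a minimal sketch under stated assumptions: `print_scp_commands` is a hypothetical helper name (not part of Spark), the remote user is assumed to be root as in these notes, and the function only prints the commands so they can be reviewed first.

```shell
# print_scp_commands SLAVES_FILE SRC_DIR DEST_DIR
# Hypothetical helper: prints one scp command per host listed in SLAVES_FILE.
# It does not copy anything itself (dry run).
print_scp_commands() {
  local slaves_file="$1" src_dir="$2" dest_dir="$3" host
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines and comment lines
    case "$host" in
      ''|'#'*) continue ;;
    esac
    printf 'scp -r %s root@%s:%s\n' "$src_dir" "$host" "$dest_dir"
  done < "$slaves_file"
}
```

For example, `print_scp_commands spark/conf/slaves spark "$PWD"` run from the directory that holds spark prints the commands; once they look right, the output can be piped to `sh`.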
# cd to the directory that holds the spark directory
cd /export/servers
scp -r spark root@node-1:$PWD
scp -r spark root@node-2:$PWD
7. Start
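Before starting, it can help to confirm that spark.eventLog.dir in spark-defaults.conf and spark.history.fs.logDirectory in SPARK_HISTORY_OPTS point at the same location, since the HistoryServer only lists applications whose event logs are in the directory it reads. The following is a minimal sketch, not part of Spark: the helper names `conf_value`, `history_log_dir`, and `check_log_dirs` are hypothetical, and it assumes whitespace-separated entries in spark-defaults.conf and a single SPARK_HISTORY_OPTS line in spark-env.sh.

```shell
# conf_value FILE KEY
# Print the value of KEY from a spark-defaults.conf style file
# (assumes "key value" pairs separated by whitespace).
conf_value() {
  awk -v key="$2" '$1 == key { print $2 }' "$1"
}

# history_log_dir FILE
# Print the value of -Dspark.history.fs.logDirectory found in a
# spark-env.sh style file.
history_log_dir() {
  sed -n 's/.*-Dspark\.history\.fs\.logDirectory=\([^" ]*\).*/\1/p' "$1"
}

# check_log_dirs DEFAULTS_FILE ENV_FILE
# Succeed only when both settings are present and identical.
check_log_dirs() {
  local event_dir history_dir
  event_dir=$(conf_value "$1" spark.eventLog.dir)
  history_dir=$(history_log_dir "$2")
  if [ -n "$event_dir" ] && [ "$event_dir" = "$history_dir" ]; then
    echo "OK: $event_dir"
  else
    echo "MISMATCH: eventLog.dir='$event_dir' history logDirectory='$history_dir'"
    return 1
  fi
}
```

Run from the spark directory, the check would be `check_log_dirs conf/spark-defaults.conf conf/spark-env.sh`.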
sbin/start-all.sh
sbin/start-history-server.sh
1. First configure ZooKeeper
See: ZooKeeper installation and configuration; Spark + ZooKeeper high-availability cluster study notes
Stop the Spark cluster
sbin/stop-all.sh
Then update spark-env.sh
# Spark Master address: commented out, because ZooKeeper now elects the active Master
#export SPARK_MASTER_HOST=master
# Set the Spark runtime parameters
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=node-1:2181,node-2:2181 -Dspark.deploy.zookeeper.dir=/spark"
Result after restarting:
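One way to check the result from the command line is to read the state a Master reports about itself. The following is a sketch under assumptions: it relies on the standalone Master web UI serving its state as JSON at `/json` on the default UI port 8080 with a top-level `status` field (values such as ALIVE or STANDBY), and `master_status` is a hypothetical helper name, not a Spark command.

```shell
# master_status
# Read a Master's JSON state on stdin and print the value of its "status" field.
# Tolerates both compact and pretty-printed JSON ("status" : "ALIVE").
master_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}
```

Assuming the endpoint above, `curl -s http://master:8080/json | master_status` would print the state of the Master on that host; repeating it against another host on which a Master has been started shows which one is active and which is on standby.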