How to Install a Spark Environment
Published: 2025-01-25  Author: 千家信息网 editor
This article explains how to install a Spark environment. The steps described are simple, quick and practical, so interested readers are welcome to follow along.
1. Preparation
○ Download the tools
scala-2.9.3: the Scala programming language. Download: http://www.scala-lang.org/files/archive/scala-2.9.3.tgz (note that the prebuilt spark-1.4.0 package below ships with its own Scala 2.10.4 runtime, as the spark-shell banner in section 5 shows; the local Scala install is mainly for writing and testing your own code).
spark-1.4.0: this must be a prebuilt Spark distribution. If you download the source release instead, you have to compile it yourself for your environment with SBT or Maven before it can be used. Prebuilt download: http://mirror.bit.edu.cn/apache/spark/spark-1.4.0/spark-1.4.0-bin-hadoop2.4.tgz
○ Set up a Hadoop environment. How to do that is not covered here; please look it up yourself.
2. Install scala-2.9.3
Extract scala-2.9.3.tgz:
#tar -zxvf scala-2.9.3.tgz
Configure SCALA_HOME:
#vi /etc/profile
Add the following environment variables:
export SCALA_HOME=/home/apps/scala-2.9.3
export PATH=.:$SCALA_HOME/bin:$PATH
To test whether Scala was installed successfully, simply type scala.
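A quick sanity check (a minimal sketch, assuming the profile has been reloaded with source /etc/profile) is to open the REPL and evaluate a trivial expression:

# source /etc/profile
# scala
scala> 1 + 1
res0: Int = 2
scala> :quit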
3. Install spark-1.4.0
Extract spark-1.4.0.tgz:
#tar -zxvf spark-1.4.0.tgz
Configure SPARK_HOME:
#vi /etc/profile
Add the following environment variables:
export SPARK_HOME=/home/apps/spark-1.4.0
export PATH=.:$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
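As with Scala, reload the profile before testing. One way to confirm the new variable is visible to new shells (a minimal sketch; the path is the install location assumed above) is to query the environment from the Scala REPL:

# source /etc/profile
# scala
scala> sys.env.get("SPARK_HOME")
res0: Option[String] = Some(/home/apps/spark-1.4.0)
scala> :quit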
4. Modify the Spark configuration files
Make a copy of slaves.template and of spark-env.sh.template:
#cp spark-env.sh.template spark-env.sh
#cp slaves.template slaves
The slaves file specifies the worker hosts; just add each worker's hostname to it, one per line.
Append the following lines to the end of spark-env.sh:
#JDK installation path
export JAVA_HOME=/root/app/jdk
#Scala installation path
export SCALA_HOME=/root/app/scala-2.9.3
#IP address of the master node
export SPARK_MASTER_IP=192.168.1.200
#Memory allocated to each worker
export SPARK_WORKER_MEMORY=200m
#Directory containing the Hadoop configuration files
export HADOOP_CONF_DIR=/root/app/hadoop/etc/hadoop
#Number of CPU cores a worker may use
export SPARK_WORKER_CORES=1
#Number of worker instances per node; one is usually enough
export SPARK_WORKER_INSTANCES=1
#JVM options; since Spark 1.0 there is a spark-defaults.conf file and these defaults are normally configured there instead
export SPARK_JAVA_OPTS
spark-defaults.conf also contains the following parameters:
spark.master              # e.g. spark://hostname:7077 (the standalone master URL)
spark.local.dir           # Spark working directory (where shuffle data is written)
spark.executor.memory     # Spark 1.0 dropped the SPARK_MEM variable in favour of this parameter
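The spark-defaults.conf properties above apply as cluster-wide defaults. For a single application they can also be set programmatically through SparkConf, which takes precedence over the defaults file. The following is a minimal sketch; the master URL, local directory and memory size are illustrative values mirroring this guide's setup, not values the article prescribes:

import org.apache.spark.{SparkConf, SparkContext}

// Per-application equivalents of the spark-defaults.conf entries listed above.
val conf = new SparkConf()
  .setAppName("config-example")                 // shown on the master web UI
  .setMaster("spark://192.168.1.200:7077")      // spark.master
  .set("spark.local.dir", "/tmp/spark-work")    // spark.local.dir (shuffle directory)
  .set("spark.executor.memory", "200m")         // spark.executor.memory
val sc = new SparkContext(conf)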
5. Test whether Spark was installed successfully
Startup order on the master node:
1. Start HDFS first (./sbin/start-dfs.sh)
2. Start the Spark master (./sbin/start-master.sh)
3. Start the Spark workers (./sbin/start-slaves.sh)
4. Check the processes with jps. The master node should show namenode, secondarynamenode and master; the slave nodes should show datanode and worker.
5. Start spark-shell:

15/06/21 21:23:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/21 21:23:47 INFO spark.SecurityManager: Changing view acls to: root
15/06/21 21:23:47 INFO spark.SecurityManager: Changing modify acls to: root
15/06/21 21:23:47 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/06/21 21:23:47 INFO spark.HttpServer: Starting HTTP Server
15/06/21 21:23:47 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/06/21 21:23:47 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:38651
15/06/21 21:23:47 INFO util.Utils: Successfully started service 'HTTP class server' on port 38651.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.4.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) Client VM, Java 1.7.0_65)
Type in expressions to have them evaluated.
Type :help for more information.
15/06/21 21:23:54 INFO spark.SparkContext: Running Spark version 1.4.0
15/06/21 21:23:54 INFO spark.SecurityManager: Changing view acls to: root
15/06/21 21:23:54 INFO spark.SecurityManager: Changing modify acls to: root
15/06/21 21:23:54 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/06/21 21:23:56 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/06/21 21:23:56 INFO Remoting: Starting remoting
15/06/21 21:23:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.1.200:57658]
15/06/21 21:23:57 INFO util.Utils: Successfully started service 'sparkDriver' on port 57658.
15/06/21 21:23:58 INFO spark.SparkEnv: Registering MapOutputTracker
15/06/21 21:23:58 INFO spark.SparkEnv: Registering BlockManagerMaster
15/06/21 21:23:58 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-4f1badf6-1e92-47ca-98a2-6d82f4882f15/blockmgr-530e4335-9e59-45d4-b9fb-6014089f5a00
15/06/21 21:23:58 INFO storage.MemoryStore: MemoryStore started with capacity 267.3 MB
15/06/21 21:23:59 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-4f1badf6-1e92-47ca-98a2-6d82f4882f15/httpd-4b2cca3c-e8d4-4ab3-9c3d-38ec579ec873
15/06/21 21:23:59 INFO spark.HttpServer: Starting HTTP Server
15/06/21 21:23:59 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/06/21 21:23:59 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:51899
15/06/21 21:23:59 INFO util.Utils: Successfully started service 'HTTP file server' on port 51899.
15/06/21 21:23:59 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/06/21 21:23:59 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/06/21 21:23:59 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/06/21 21:23:59 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/06/21 21:23:59 INFO ui.SparkUI: Started SparkUI at http://192.168.1.200:4040
15/06/21 21:24:00 INFO executor.Executor: Starting executor ID driver on host localhost
15/06/21 21:24:00 INFO executor.Executor: Using REPL class URI: http://192.168.1.200:38651
15/06/21 21:24:01 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 59385.
15/06/21 21:24:01 INFO netty.NettyBlockTransferService: Server created on 59385
15/06/21 21:24:01 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/06/21 21:24:01 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:59385 with 267.3 MB RAM, BlockManagerId(driver, localhost, 59385)
15/06/21 21:24:01 INFO storage.BlockManagerMaster: Registered BlockManager
15/06/21 21:24:02 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.
15/06/21 21:24:03 INFO hive.HiveContext: Initializing execution hive, version 0.13.1
15/06/21 21:24:04 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/06/21 21:24:04 INFO metastore.ObjectStore: ObjectStore, initialize called
15/06/21 21:24:04 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/06/21 21:24:04 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/06/21 21:24:05 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/06/21 21:24:07 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/06/21 21:24:14 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/06/21 21:24:14 INFO metastore.MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5. Encountered: "@" (64), after : "".
15/06/21 21:24:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/06/21 21:24:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/06/21 21:24:18 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/06/21 21:24:18 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/06/21 21:24:19 INFO metastore.ObjectStore: Initialized ObjectStore
15/06/21 21:24:20 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
15/06/21 21:24:24 INFO metastore.HiveMetaStore: Added admin role in metastore
15/06/21 21:24:24 INFO metastore.HiveMetaStore: Added public role in metastore
15/06/21 21:24:24 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
15/06/21 21:24:25 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
15/06/21 21:24:25 INFO repl.SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.

6. Test with the WordCount example. Before starting spark-shell, first upload a data file to HDFS.
7. Code (understanding it requires some knowledge of the Scala language):

val file = sc.textFile("hdfs://hadoop.master:9000/data/intput/wordcount.data")
val count = file.flatMap(line=>(line.split(" "))).map(word=>(word,1)).reduceByKey(_+_)
count.collect()
count.saveAsTextFile("hdfs://hadoop.master:9000/data/output")

Print the result directly:

hadoop dfs -cat /data/output/p*
(im,1)
(are,1)
(yes,1)
(hi,2)
(do,1)
(no,3)
(to,1)
(lll,1)
(,3)
(hello,3)
(xiaoming,1)
(ga,1)
(world,1)
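The WordCount above is typed interactively into spark-shell, where sc already exists. The same job can also be packaged as a standalone Scala application and launched with spark-submit; the following is a minimal sketch under that assumption, reusing the HDFS input path from the example (the object name and the separate output path are only illustrative):

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Outside spark-shell the SparkContext has to be created explicitly.
    val conf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(conf)
    val file = sc.textFile("hdfs://hadoop.master:9000/data/intput/wordcount.data")
    val counts = file
      .flatMap(line => line.split(" "))   // split each line into words
      .map(word => (word, 1))             // pair each word with a count of 1
      .reduceByKey(_ + _)                 // sum the counts per word
    counts.saveAsTextFile("hdfs://hadoop.master:9000/data/output-app")
    sc.stop()
  }
}

Built into a jar, it would be submitted with something like ./bin/spark-submit --master spark://192.168.1.200:7077 --class WordCount wordcount.jar.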
The environment setup is now complete...
That concludes this walkthrough of how to install a Spark environment; the best next step is to try it out yourself.