Building a Hadoop Big Data Cluster
I. Environment
Server configuration:
CPU model: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
CPU count: 16 (logical CPUs)
Memory: 64 GB
Operating system:
Version: CentOS Linux release 7.5.1804 (Core)
Host list:
IP | Hostname |
---|---|
192.168.1.101 | node1 |
192.168.1.102 | node2 |
192.168.1.103 | node3 |
192.168.1.104 | node4 |
192.168.1.105 | node5 |
Installation package path: /data/tools/
JAVA_HOME path: /opt/java (java is a symlink that points to the chosen JDK version)
Hadoop cluster path: /data/bigdata/
Software versions and deployment layout:
Component | Package | Notes | node1 | node2 | node3 | node4 | node5 |
---|---|---|---|---|---|---|---|
JDK | jdk-8u162-linux-x64.tar.gz | Base environment | ✔ | ✔ | ✔ | ✔ | ✔ |
zookeeper | zookeeper-3.4.12.tar.gz | | ✔ | ✔ | ✔ | | |
Hadoop | hadoop-2.7.6.tar.gz | | ✔ | ✔ | ✔ | ✔ | ✔ |
spark | spark-2.1.2-bin-hadoop2.7.tgz | | ✔ | ✔ | ✔ | ✔ | ✔ |
scala | scala-2.11.12.tgz | | ✔ | ✔ | ✔ | ✔ | ✔ |
hbase | hbase-1.2.6-bin.tar.gz | | ✔ | ✔ | ✔ | | |
hive | apache-hive-2.3.3-bin.tar.gz | | ✔ | | | | |
kylin | apache-kylin-2.3.1-hbase1x-bin.tar.gz | | ✔ | | | | |
kafka | kafka_2.11-1.1.0.tgz | | ✔ | ✔ | | | |
hue | hue-3.12.0.tgz | | ✔ | | | | |
flume | apache-flume-1.8.0-bin.tar.gz | | ✔ | ✔ | ✔ | ✔ | ✔ |
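Note: symlinks cannot be transferred between servers; create them separately on each server. Otherwise the file or the whole directory that the symlink points to is copied instead.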
II. Common Commands
1. Check the basic system configuration:
    [root@localhost ~]# uname -a
    Linux node1 3.10.0-123.9.3.el7.x86_64 #1 SMP Thu Nov 6 15:06:03 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
    [root@localhost ~]# cat /etc/redhat-release
    CentOS Linux release 7.5.1804 (Core)
    [root@localhost ~]# free -m
                 total       used       free     shared    buffers     cached
    Mem:         64267       2111      62156         16        212       1190
    -/+ buffers/cache:        708      63559
    Swap:        32000          0      32000
    [root@localhost ~]# lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                16
    On-line CPU(s) list:   0-15
    Thread(s) per core:    2
    Core(s) per socket:    8
    Socket(s):             1
    NUMA node(s):          1
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 79
    Model name:            Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
    Stepping:              1
    CPU MHz:               2095.148
    BogoMIPS:              4190.29
    Hypervisor vendor:     KVM
    Virtualization type:   full
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              20480K
    NUMA node0 CPU(s):     0-15
    [root@localhost ~]# df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2       100G  3.1G   97G   4% /
    devtmpfs        7.7G     0  7.7G   0% /dev
    tmpfs           7.8G     0  7.8G   0% /dev/shm
    tmpfs           7.8G  233M  7.5G   3% /run
    tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
    /dev/sda1       500M  9.8M  490M   2% /boot/efi
    /dev/sda4       1.8T  9.3G  1.8T   1% /data
    tmpfs           1.6G     0  1.6G   0% /run/user/1000
2. Start the cluster
    start-dfs.sh
    start-yarn.sh
3. Stop the cluster
    stop-yarn.sh
    stop-dfs.sh
4. Monitor the cluster
    hdfs dfsadmin -report
5. Start or stop a single daemon
    hadoop-daemon.sh start|stop namenode|datanode|journalnode
    yarn-daemon.sh start|stop resourcemanager|nodemanager
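The report from step 4 is long, so for a quick health check it can help to pull out just the number of live DataNodes. The helper below is a minimal sketch: the function name live_datanodes is hypothetical, and it assumes the report contains a line of the form "Live datanodes (N):", which is what Hadoop 2.x prints as far as I know; adjust the pattern if your version differs.

```shell
#!/usr/bin/env bash
# Read `hdfs dfsadmin -report` output on stdin and print the live DataNode count.
# Assumption: the report contains a line such as "Live datanodes (5):".
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}
```

Typical use would be: hdfs dfsadmin -report | live_datanodes, then comparing the number with the five DataNodes planned for this cluster.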
III. Environment Preparation (All Servers)
1. Set the hostname (repeat on the other servers with their own names)
    [root@localhost ~]# hostnamectl set-hostname node1
2. Stop firewalld, disable it at boot, and disable SELinux
    [root@node1 ~]# systemctl disable firewalld.service
    [root@node1 ~]# systemctl stop firewalld.service
    [root@node1 ~]# systemctl status firewalld.service
    ● firewalld.service - firewalld - dynamic firewall daemon
       Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
       Active: inactive (dead)
         Docs: man:firewalld(1)
    [root@node1 ~]# setenforce 0
    [root@node1 ~]# sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
    [root@node1 ~]# grep SELINUX=disabled /etc/selinux/config
    SELINUX=disabled
3. Edit the hosts file
    [root@node1 ~]# vim /etc/hosts
    #127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    #::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    192.168.1.101 node1
    192.168.1.102 node2
    192.168.1.103 node3
    192.168.1.104 node4
    192.168.1.105 node5
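Because the five hosts use consecutive addresses and numbered names, the entries can also be generated rather than typed on every server. This is only a sketch: gen_hosts is a hypothetical helper, and it assumes the nodeN naming and the consecutive IP range from the host list.

```shell
#!/usr/bin/env bash
# Print hosts-file entries for node1..nodeN.
# Assumption: IPs are consecutive, starting at <prefix>.<first>.
gen_hosts() {
    local prefix=$1 first=$2 count=$3 i
    for ((i = 0; i < count; i++)); do
        printf '%s.%d node%d\n' "$prefix" "$((first + i))" "$((i + 1))"
    done
}
```

For this cluster, gen_hosts 192.168.1 101 5 >> /etc/hosts would append the same five lines shown in the hosts file.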
4. Set up passwordless SSH login (Ansible can also be used)
    [root@node1 ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    [root@node1 ~]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
    [root@node1 ~]# ll -d .ssh/
    drwx------ 2 root root 4096 Jun  5 08:50 .ssh/
    [root@node1 ~]# ll .ssh/
    total 12
    -rw-r--r-- 1 root root 599 Jun  5 08:50 authorized_keys
    -rw------- 1 root root 672 Jun  5 08:50 id_dsa
    -rw-r--r-- 1 root root 599 Jun  5 08:50 id_dsa.pub
    # Append the ~/.ssh/id_dsa.pub of every other server to ~/.ssh/authorized_keys on node1, then distribute the file
    [root@node1 ~]# scp -rp ~/.ssh/authorized_keys node2:~/.ssh/
    [root@node1 ~]# scp -rp ~/.ssh/authorized_keys node3:~/.ssh/
    [root@node1 ~]# scp -rp ~/.ssh/authorized_keys node4:~/.ssh/
    [root@node1 ~]# scp -rp ~/.ssh/authorized_keys node5:~/.ssh/
Alternatively, generate one key pair on node1 and distribute the whole ~/.ssh directory to the other servers, so that all of them share the same key.
5. Raise the file descriptor limits
    [root@node1 ~]# vim /etc/security/limits.conf
    #---------custom-----------------------#
    * soft nofile 240000
    * hard nofile 655350
    * soft nproc 240000
    * hard nproc 655350
    #-----------end-----------------------
    # limits.conf is read by PAM at login and is not a shell script, so log out and back in instead of sourcing it
    [root@node1 ~]# ulimit -n
    240000
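To confirm that a new login session really picked up the configured limit, the value in the file can be compared with what ulimit reports. The helper below is a sketch; limit_from_conf is a hypothetical name, and it only understands the simple four-column lines used above (domain, type, item, value).

```shell
#!/usr/bin/env bash
# Print the value configured for <domain> <type> <item> in a limits.conf-style file.
# Comment lines never match because their first field starts with '#'.
limit_from_conf() {
    local file=$1 domain=$2 type=$3 item=$4
    awk -v d="$domain" -v t="$type" -v i="$item" \
        '$1 == d && $2 == t && $3 == i { v = $4 } END { if (v != "") print v }' "$file"
}
```

After logging in again, [ "$(ulimit -n)" = "$(limit_from_conf /etc/security/limits.conf '*' soft nofile)" ] should succeed if the soft nofile limit is in effect.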
6. Time synchronization
NTP server setup
    # Set up one NTP server on the LAN and synchronize the other hosts with it; cloud servers are usually synchronized by default
    [root@node1 ~]# yum install ntp -y     # install the ntp service
    [root@node1 ~]# cp -a /etc/ntp.conf{,.bak}
    [root@node1 ~]# vim /etc/ntp.conf
    restrict default kod nomodify notrap nopeer noquery   # "restrict default" defines the default access rule; nomodify stops remote hosts from modifying the local server
    restrict 127.0.0.1                                    # queries here are about the server's own status
    restrict -6 ::1
    #server 0.centos.pool.ntp.org iburst                  # comment out the stock pool servers
    #server 1.centos.pool.ntp.org iburst
    #server 2.centos.pool.ntp.org iburst
    #server 3.centos.pool.ntp.org iburst
    server ntp1.aliyun.com                                # upstream time server
    server 127.127.1.0                                    # local clock: when the server cannot reach the public time server (no Internet access), it keeps serving its own time to LAN clients
    fudge 127.127.1.0 stratum 10
    # If a cron job also synchronizes the time, comment it out first; the two methods conflict.
    [root@node1 ~]# crontab -e
    #*/30 * * * * /usr/sbin/ntpdate ntp1.aliyun.com > /dev/null 2>&1;/sbin/hwclock -w
    # Start the service and enable it at boot:
    [root@node1 /]# systemctl start ntpd.service     # start the service
    [root@node1 /]# systemctl enable ntpd.service    # enable at boot
    [root@node1 ~]# systemctl status ntpd.service
    ● ntpd.service - Network Time Service
       Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
       Active: active (running) since Mon 2018-05-21 13:47:33 CST; 1 weeks 2 days ago
     Main PID: 17915 (ntpd)
       CGroup: /system.slice/ntpd.service
               └─17915 /usr/sbin/ntpd -u ntp:ntp -g
    May 23 11:41:40 node1 ntpd[17915]: Listen normally on 14 enp0s25 192.168.1.101 UDP 123
    May 23 11:41:40 node1 ntpd[17915]: new interface(s) found: waking up resolver
    May 23 11:41:42 node1 ntpd[17915]: Listen normally on 18 enp0s25 fe80::6a85:bbb1:ad57:f6ae UDP 123
    [root@node1 ~]# ntpq -p     # check that the time server is synchronizing correctly
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *time5.aliyun.co 10.137.38.86     2 u  468 1024  377   14.374   -4.292   6.377
If the jitter of every remote server (not the local one) is 4000 and both reach and delay are 0, time synchronization is failing. There are two likely causes:
1) The firewall on the server blocks port 123 (this can be fixed with iptables -t filter -A INPUT -p udp --destination-port 123 -j ACCEPT).
2) After each restart of the NTP server, clients need about 3-5 minutes to establish a connection with it, and synchronization only works after that. Until then a client reports "no server suitable for synchronization found"; this is harmless and clears up after a short wait.
Setup on the other hosts, using node2 as the example
    [root@node2 /]# systemctl stop ntpd.service      # stop the ntp service
    [root@node2 /]# systemctl disable ntpd.service   # disable it at boot
    [root@node2 ~]# yum install ntpdate -y
    [root@node2 ~]# /usr/sbin/ntpdate 192.168.1.101
    30 May 17:54:09 ntpdate[20937]: adjust time server 192.168.1.101 offset 0.000758 sec
    [root@node2 ~]# crontab -e
    */30 * * * * /usr/sbin/ntpdate 192.168.1.101 > /dev/null 2>&1;/sbin/hwclock -w
    [root@node2 ~]# systemctl restart crond.service
    [root@node2 ~]# systemctl status crond.service
    ● crond.service - Command Scheduler
       Loaded: loaded (/usr/lib/systemd/system/crond.service; enabled; vendor preset: enabled)
       Active: active (running) since Thu 2018-05-31 09:05:39 CST; 11s ago
     Main PID: 12162 (crond)
       CGroup: /system.slice/crond.service
               └─12162 /usr/sbin/crond -n
    May 31 09:05:39 node2 systemd[1]: Started Command Scheduler.
    May 31 09:05:39 node2 systemd[1]: Starting Command Scheduler...
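In ntpq -p output, the peer that ntpd has selected for synchronization is marked with a leading asterisk, as in the time5.aliyun.co line on the NTP server. A small filter can turn that into a pass/fail check. This is a sketch; ntp_sync_peer is a hypothetical name, and it relies on offset being the ninth column of the standard ntpq -p layout.

```shell
#!/usr/bin/env bash
# Read `ntpq -p` output on stdin. Print "<peer> <offset_ms>" for the selected
# peer (the line starting with '*'); return non-zero if no peer is selected.
ntp_sync_peer() {
    awk 'substr($0, 1, 1) == "*" { print substr($1, 2), $9; found = 1 }
         END { exit (found ? 0 : 1) }'
}
```

On node1 this could be run as ntpq -p | ntp_sync_peer || echo "not synchronized yet".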
7. Upload the installation packages to node1
    [root@node1 ~]# mkdir -pv /data/tools
    [root@node1 ~]# cd /data/tools
    [root@node1 tools]# ll
    total 1221212
    -rw-r--r-- 1 root root  58688757 May 24 10:23 apache-flume-1.8.0-bin.tar.gz
    -rw-r--r-- 1 root root 232229830 May 24 10:25 apache-hive-2.3.3-bin.tar.gz
    -rw-r--r-- 1 root root 286104833 May 24 10:26 apache-kylin-2.3.1-hbase1x-bin.tar.gz
    -rw-r--r-- 1 root root 216745683 May 24 10:28 hadoop-2.7.6.tar.gz
    -rw-r--r-- 1 root root 104659474 May 24 10:27 hbase-1.2.6-bin.tar.gz
    -rw-r--r-- 1 root root  47121634 May 24 10:29 hue-3.12.0.tgz
    -rw-r--r-- 1 root root  56969154 May 24 10:49 kafka_2.11-1.1.0.tgz
    -rw-r--r-- 1 root root 193596110 May 24 10:29 spark-2.1.2-bin-hadoop2.7.tgz
    -rw-r--r-- 1 root root  36667596 May 24 10:28 zookeeper-3.4.12.tar.gz
    [root@node1 tools]#
8. Install the JDK
    [root@node1 ~]# tar xf /data/tools/jdk-8u162-linux-x64.tar.gz -C /opt/
    [root@node1 ~]# [ -L "/opt/java" ] && rm -f /opt/java
    [root@node1 ~]# cd /opt/ && ln -s /opt/jdk1.8.0_162 /opt/java
    [root@node1 ~]# chown -R root:root /opt/jdk1.8.0_162
    [root@node1 ~]# echo -e "# java\nexport JAVA_HOME=/opt/java\nexport PATH=\${PATH}:\${JAVA_HOME}/bin:\${JAVA_HOME}/jre/bin\nexport CLASSPATH=.:\${JAVA_HOME}/lib:\${JAVA_HOME}/jre/lib" > /etc/profile.d/java_version.sh
    [root@node1 ~]# source /etc/profile.d/java_version.sh
    [root@node1 ~]# java -version
    java version "1.8.0_162"
    Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
    Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
IV. Install ZooKeeper
Official documentation
1. Unpack ZooKeeper
    [root@node1 ~]# mkdir -pv /data/bigdata/src
    [root@node1 ~]# tar -zxvf /data/tools/zookeeper-3.4.12.tar.gz -C /data/bigdata/src
    [root@node1 ~]# ln -s /data/bigdata/src/zookeeper-3.4.12 /data/bigdata/zookeeper
    # Add the environment variables
    [root@node1 ~]# echo -e "# zookeeper\nexport ZOOKEEPER_HOME=/data/bigdata/zookeeper\nexport PATH=\$ZOOKEEPER_HOME/bin:\$PATH" > /etc/profile.d/bigdata_path.sh
    [root@node1 ~]# cat /etc/profile.d/bigdata_path.sh
    # zookeeper
    export ZOOKEEPER_HOME=/data/bigdata/zookeeper
    export PATH=$ZOOKEEPER_HOME/bin:$PATH
    [root@node1 ~]#
2. Configure zoo.cfg
    [root@node1 ~]# cd /data/bigdata/zookeeper/conf/    # enter the conf directory
    [root@node1 conf]# cp zoo_sample.cfg zoo.cfg        # copy the template
    [root@node1 conf]# vim zoo.cfg
    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just
    # example sakes.
    # Added: data and transaction log directories. Keep comments on their own
    # lines; zoo.cfg treats a trailing "# ..." as part of the value.
    dataDir=/data/bigdata/zookeeper/data
    dataLogDir=/data/bigdata/zookeeper/dataLog
    # the port at which the clients will connect
    clientPort=2181
    # the maximum number of client connections.
    # increase this if you need to handle more clients
    #maxClientCnxns=60
    #
    # Be sure to read the maintenance section of the
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1
    # Added: the ensemble members
    server.1=node1:2888:3888
    server.2=node2:2888:3888
    server.3=node3:2888:3888
3. Add myid (use an odd number of ZooKeeper servers)
    # Create the directories and add a myid file under dataDir. myid holds the ID of the local ZooKeeper server;
    # node1 is declared as server.1=node1:2888:3888, so its ID is 1.
    [root@node1 conf]# mkdir -pv /data/bigdata/zookeeper/{data,dataLog}
    [root@node1 conf]# echo 1 > /data/bigdata/zookeeper/data/myid
4. Distribute:
    [root@node2 ~]# mkdir -pv /data/bigdata/
    [root@node3 ~]# mkdir -pv /data/bigdata/
    [root@node1 conf]# scp -rp /data/bigdata/src node2:/data/bigdata/
    [root@node1 conf]# scp -rp /data/bigdata/src node3:/data/bigdata/
    [root@node2 ~]# ln -s /data/bigdata/src/zookeeper-3.4.12 /data/bigdata/zookeeper
    [root@node3 ~]# ln -s /data/bigdata/src/zookeeper-3.4.12 /data/bigdata/zookeeper
    # Set myid on the other machines: 2 on node2 and 3 on node3, matching their server.N entries
    [root@node2 ~]# echo 2 > /data/bigdata/zookeeper/data/myid
    [root@node3 ~]# echo 3 > /data/bigdata/zookeeper/data/myid
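Since each myid must equal the N of that host's server.N line, the value can be derived from zoo.cfg instead of being typed by hand, which avoids mismatches after the copy from node1. The following is a sketch; zk_myid_for is a hypothetical helper and assumes the server.N=host:port:port layout used in this guide.

```shell
#!/usr/bin/env bash
# Print the N of the "server.N=<host>:..." line in a zoo.cfg for the given host.
# Prints nothing when the host is not part of the ensemble.
zk_myid_for() {
    local cfg=$1 host=$2
    sed -n "s/^server\.\([0-9][0-9]*\)=${host}:.*/\1/p" "$cfg"
}
```

On each ZooKeeper node this could be used as zk_myid_for /data/bigdata/zookeeper/conf/zoo.cfg "$(hostname)" > /data/bigdata/zookeeper/data/myid.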
V. Install Hadoop
Official documentation
- Production environments: the two master nodes run only the NameNode, not a DataNode.
1. Unpack Hadoop
    [root@node1 ~]# tar -zxvf /data/tools/hadoop-2.7.6.tar.gz -C /data/bigdata/src
    [root@node1 ~]# ln -s /data/bigdata/src/hadoop-2.7.6 /data/bigdata/hadoop
    # Add the environment variables
    [root@node1 ~]# echo -e "\n# hadoop\nexport HADOOP_HOME=/data/bigdata/hadoop\nexport PATH=\$HADOOP_HOME/bin:\$HADOOP_HOME/sbin:\$PATH" >> /etc/profile.d/bigdata_path.sh
    [root@node1 ~]# cat /etc/profile.d/bigdata_path.sh
    # zookeeper
    export ZOOKEEPER_HOME=/data/bigdata/zookeeper
    export PATH=$ZOOKEEPER_HOME/bin:$PATH

    # hadoop
    export HADOOP_HOME=/data/bigdata/hadoop
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
    [root@node1 ~]#
2. Configure hadoop-env.sh
    [root@node1 ~]# cd /data/bigdata/hadoop/etc/hadoop/
    [root@node1 hadoop]# vim hadoop-env.sh
    export JAVA_HOME=/opt/java            # added
    export HADOOP_SSH_OPTS="-p 22"
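3. Configure core-site.xml
    [root@node1 hadoop]# vim core-site.xml
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/bigdata/tmp</value>
      </property>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>node1:2181,node2:2181,node3:2181</value>
        <description>ZooKeeper client connection addresses</description>
      </property>
      <property>
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>10000</value>
      </property>
      <property>
        <name>fs.trash.interval</name>
        <value>1440</value>
        <description>Trash retention time in minutes. Data that stays in the trash longer than this is deleted. 0 disables the trash.</description>
      </property>
      <property>
        <name>fs.trash.checkpoint.interval</name>
        <value>1440</value>
        <description>Interval between trash checkpoints, in minutes.</description>
      </property>
      <property>
        <name>hadoop.security.authentication</name>
        <value>simple</value>
        <description>Either simple (no authentication) or kerberos (a secure authentication system)</description>
      </property>
      <property>
        <name>hadoop.proxyuser.hue.hosts</name>
        <value>*</value>
      </property>
      <property>
        <name>hadoop.proxyuser.hue.groups</name>
        <value>*</value>
      </property>
      <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
      </property>
      <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
      </property>
    </configuration>
Create the directory referenced above
    [root@node1 hadoop]# mkdir -p /data/bigdata/tmp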
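Hadoop's *-site.xml files store every setting as a property element with a name, a value and an optional description inside a configuration root element. When many settings have to be written, a tiny generator keeps the markup consistent. This is a sketch only: hadoop_property is a hypothetical helper, and it does not escape XML special characters, so values containing &, < or > would need escaping first.

```shell
#!/usr/bin/env bash
# Emit one Hadoop <property> element; the third argument (description) is optional.
# Assumption: name, value and description contain no XML special characters.
hadoop_property() {
    local name=$1 value=$2 desc=${3:-}
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n' "$name" "$value"
    if [ -n "$desc" ]; then
        printf '    <description>%s</description>\n' "$desc"
    fi
    printf '  </property>\n'
}
```

For example, hadoop_property fs.defaultFS hdfs://mycluster prints the first entry of core-site.xml; wrapping several calls between the opening and closing configuration tags yields a complete file.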
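4. Configure yarn-site.xml
    [root@node1 hadoop]# vim yarn-site.xml
    <configuration>
      <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
        <description>How long to wait for a lost scheduler connection to come back</description>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>Auxiliary service run on the NodeManager. It must be set to mapreduce_shuffle for MapReduce programs to run.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.connect.retry-interval.ms</name>
        <value>5000</value>
        <description>How often to try connecting to the ResourceManager.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
        <description>Whether to enable RM HA; the default is false (disabled).</description>
      </property>
      <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
        <description>Whether to enable automatic failover. By default it is enabled when HA is enabled.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
        <value>true</value>
        <description>Use the embedded automatic failover. By default it is used when HA is enabled.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster1</value>
        <description>Cluster ID. The elector uses it to make sure the RM does not become active for another cluster.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
        <description>Comma-separated list of logical RM IDs, for example rm1,rm2. Usually two RMs are configured: one active, the other standby.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>node3</value>
        <description>Hostname of the RM</description>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>${yarn.resourcemanager.hostname.rm1}:8030</value>
        <description>Address the RM exposes to ApplicationMasters, which use it to request and release resources.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>${yarn.resourcemanager.hostname.rm1}:8031</value>
        <description>Address the RM exposes to NodeManagers, which use it to send heartbeats and receive tasks.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>${yarn.resourcemanager.hostname.rm1}:8032</value>
        <description>Address the RM exposes to clients, which use it to submit applications.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>${yarn.resourcemanager.hostname.rm1}:8033</value>
        <description>Address the RM exposes to administrators, who use it to send management commands.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>${yarn.resourcemanager.hostname.rm1}:8088</value>
        <description>HTTP address of the RM web UI, where users can view cluster information in a browser.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.https.address.rm1</name>
        <value>${yarn.resourcemanager.hostname.rm1}:8090</value>
        <description>The https address of the RM web application.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>node4</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>${yarn.resourcemanager.hostname.rm2}:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>${yarn.resourcemanager.hostname.rm2}:8031</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>${yarn.resourcemanager.hostname.rm2}:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.admin.address.rm2</name>
        <value>${yarn.resourcemanager.hostname.rm2}:8033</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>${yarn.resourcemanager.hostname.rm2}:8088</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.https.address.rm2</name>
        <value>${yarn.resourcemanager.hostname.rm2}:8090</value>
        <description>The https address of the RM web application.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
        <description>The default is false, which means applications that were running when the ResourceManager failed cannot be resumed after it recovers.</description>
      </property>
      <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
        <description>Class used for the state store</description>
      </property>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>node1:2181,node2:2181,node3:2181</value>
      </property>
      <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>${ha.zookeeper.quorum}</value>
        <description>ZooKeeper server addresses (host:port), used both for the state store and for the embedded leader election.</description>
      </property>
      <property>
        <name>yarn.nodemanager.address</name>
        <value>${yarn.nodemanager.hostname}:8041</value>
        <description>The address of the container manager in the NM.</description>
      </property>
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>58000</value>
        <description>Total physical memory (MB) available to the NodeManager on this node</description>
      </property>
      <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>16</value>
        <description>Number of virtual CPU cores available to the NodeManager on this node</description>
      </property>
      <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>2</value>
        <description>Maximum virtual memory a task may use per 1 MB of physical memory; the default is 2.1.</description>
      </property>
      <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1024</value>
        <description>Minimum physical memory (MB) a single task can request</description>
      </property>
      <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>58000</value>
        <description>Maximum physical memory (MB) a single task can request</description>
      </property>
      <property>
        <name>yarn.scheduler.minimum-allocation-vcores</name>
        <value>1</value>
        <description>Minimum number of virtual CPU cores a single task can request</description>
      </property>
      <property>
        <name>yarn.scheduler.maximum-allocation-vcores</name>
        <value>16</value>
        <description>Maximum number of virtual CPU cores a single task can request</description>
      </property>
    </configuration>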
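5. Configure mapred-site.xml
    [root@node1 hadoop]# cp mapred-site.xml{.template,}
    [root@node1 hadoop]# vim mapred-site.xml
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>sjfx:10020</value>
      </property>
    </configuration>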
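6. Configure hdfs-site.xml
    [root@node1 hadoop]# vim hdfs-site.xml
    <configuration>
      <property>
        <name>dfs.permissions</name>
        <value>true</value>
        <description>If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.</description>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
        <description>Number of replicas to keep</description>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/bigdata/hdfs/name</value>
        <description>Local path where the namespace and transaction logs are stored persistently</description>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/bigdata/hdfs/data</value>
        <description>Paths where the DataNode stores data; configured per node, with multiple directories separated by commas</description>
      </property>
      <property>
        <name>dfs.datanode.max.transfer.threads</name>
        <value>16384</value>
        <description>Maximum number of threads used to transfer block data between DataNodes</description>
      </property>
      <property>
        <name>dfs.datanode.balance.bandwidthPerSec</name>
        <value>52428800</value>
        <description>Specifies the maximum amount of bandwidth that each datanode can utilize for the balancing purpose in term of the number of bytes per second.</description>
      </property>
      <property>
        <name>dfs.datanode.balance.max.concurrent.moves</name>
        <value>50</value>
        <description>Raises the upper limit on the number of Xceivers that move blocks on a DataNode.</description>
      </property>
      <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
      </property>
      <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>node1:8020</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>node2:8020</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>node1:50070</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>node2:50070</value>
      </property>
      <property>
        <name>dfs.namenode.journalnode</name>
        <value>node1:8485;node2:8485;node3:8485</value>
        <description>JournalNodes synchronize metadata between the NameNodes to remove the Hadoop single point of failure. Use an odd number, usually 3 or 5.</description>
      </property>
      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://${dfs.namenode.journalnode}/mycluster</value>
      </property>
      <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_dsa</value>
      </property>
      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>${hadoop.tmp.dir}/dfs/journal</value>
      </property>
      <property>
        <name>dfs.permissions.superusergroup</name>
        <value>root</value>
        <description>Name of the superuser group</description>
      </property>
      <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
        <description>Enable automatic failover</description>
      </property>
    </configuration>
Create the corresponding directory
    [root@node1 hadoop]# mkdir -pv /data/bigdata/tmp/dfs/journal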
7. Configure capacity-scheduler.xml
    [root@node1 hadoop]# vim capacity-scheduler.xml
    <configuration>
      <property>
        <name>yarn.scheduler.capacity.maximum-applications</name>
        <value>10000</value>
        <description>Maximum number of applications that can be pending and running.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
        <value>0.1</value>
        <description>Maximum percent of resources in the cluster which can be used to run application masters i.e. controls number of concurrent running applications.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.resource-calculator</name>
        <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
        <description>The ResourceCalculator implementation to be used to compare Resources in the scheduler. The default i.e. DefaultResourceCalculator only uses Memory while DominantResourceCalculator uses dominant-resource to compare multi-dimensional resources such as Memory, CPU etc.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>default</value>
        <description>The queues at this level (root is the root queue).</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.capacity</name>
        <value>100</value>
        <description>Default queue target capacity.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
        <value>1</value>
        <description>Default queue user limit a percentage from 0.0 to 1.0.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
        <value>100</value>
        <description>The maximum capacity of the default queue.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.state</name>
        <value>RUNNING</value>
        <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
        <value>*</value>
        <description>The ACL of who can submit jobs to the default queue.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
        <value>*</value>
        <description>The ACL of who can administer jobs on the default queue.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.node-locality-delay</name>
        <value>40</value>
        <description>Number of missed scheduling opportunities after which the CapacityScheduler attempts to schedule rack-local containers. Typically this should be set to number of nodes in the cluster, By default is setting approximately number of nodes in one rack which is 40.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.queue-mappings</name>
        <value></value>
        <description>A list of mappings that will be used to assign jobs to queues The syntax for this list is [u|g]:[name]:[queue_name][,next mapping]* Typically this list will be used to map users to queues, for example, u:%user:%user maps all users to queues with the same name as the user.</description>
      </property>
      <property>
        <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
        <value>false</value>
        <description>If a queue mapping is present, will it override the value specified by the user? This can be used by administrators to place jobs in queues that are different than the one specified by the user. The default is false.</description>
      </property>
    </configuration>
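With DominantResourceCalculator the scheduler accounts for both memory and vcores, so the number of minimum-size containers a node can run at once is bounded by whichever resource runs out first. The arithmetic below is a rough sketch under that assumption; max_min_containers is a hypothetical helper, and real placement also depends on queue limits and request sizes.

```shell
#!/usr/bin/env bash
# Rough upper bound on concurrently running minimum-size containers on one node.
# Args: node memory (MB), node vcores, minimum allocation (MB), minimum allocation (vcores).
max_min_containers() {
    local mem_mb=$1 vcores=$2 min_mb=$3 min_vcores=$4
    local by_mem=$(( mem_mb / min_mb ))
    local by_cpu=$(( vcores / min_vcores ))
    # With both dimensions enforced, the scarcer resource sets the limit.
    echo $(( by_mem < by_cpu ? by_mem : by_cpu ))
}
```

With the values from yarn-site.xml, max_min_containers 58000 16 1024 1 gives 16: memory alone would allow 56 containers of 1024 MB, but the 16 vcores are exhausted first.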
8. Configure slaves
    [root@node1 hadoop]# vim slaves
    node1
    node2
    node3
    node4
    node5
9. Edit $HADOOP_HOME/sbin/hadoop-daemon.sh
    [root@node1 hadoop]# cd /data/bigdata/hadoop/sbin/
    [root@node1 sbin]# vim hadoop-daemon.sh
    # Add:
    HADOOP_PID_DIR=/data/bigdata/hdfs/pids
    YARN_PID_DIR=/data/bigdata/hdfs/pids
    # Create the corresponding directories
    [root@node1 sbin]# mkdir -pv /data/bigdata/hdfs/{name,data,pids}
10. Edit $HADOOP_HOME/sbin/yarn-daemon.sh
    [root@node1 sbin]# vim yarn-daemon.sh
    # Add:
    HADOOP_PID_DIR=/data/bigdata/hdfs/pids
    YARN_PID_DIR=/data/bigdata/hdfs/pids
11. Distribute
    [root@node1 sbin]# scp -rp /data/bigdata/src/hadoop-2.7.6 node2:/data/bigdata/src/
    [root@node1 sbin]# scp -rp /data/bigdata/src/hadoop-2.7.6 node3:/data/bigdata/src/
    [root@node1 sbin]# scp -rp /data/bigdata/src/hadoop-2.7.6 node4:/data/bigdata/src/
    [root@node1 sbin]# scp -rp /data/bigdata/src/hadoop-2.7.6 node5:/data/bigdata/src/
    [root@node2 ~]# ln -s /data/bigdata/src/hadoop-2.7.6 /data/bigdata/hadoop
    [root@node3 ~]# ln -s /data/bigdata/src/hadoop-2.7.6 /data/bigdata/hadoop
    [root@node4 ~]# ln -s /data/bigdata/src/hadoop-2.7.6 /data/bigdata/hadoop
    [root@node5 ~]# ln -s /data/bigdata/src/hadoop-2.7.6 /data/bigdata/hadoop
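The copy and symlink steps repeat for every host, so they lend themselves to a loop. The sketch below only prints the commands so they can be reviewed before anything runs; plan_distribution is a hypothetical helper, and it assumes passwordless SSH as root and that the parent directory already exists on each target.

```shell
#!/usr/bin/env bash
# Print (not run) the copy and symlink commands for each target host.
# Args: source directory, symlink path, then the target hosts.
plan_distribution() {
    local src=$1 link=$2 host
    shift 2
    for host in "$@"; do
        # ${src%/*} is the parent directory of the source, e.g. /data/bigdata/src
        printf 'scp -rp %s %s:%s/\n' "$src" "$host" "${src%/*}"
        printf 'ssh %s ln -s %s %s\n' "$host" "$src" "$link"
    done
}
```

Running plan_distribution /data/bigdata/src/hadoop-2.7.6 /data/bigdata/hadoop node2 node3 node4 node5 lists the eight commands of this step; once they look right, the output can be piped to bash.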
VI. Startup Procedure
1. Start the ZooKeeper service, using either of the two methods below
(1) Start every ZooKeeper node by hand
    # node1
    [root@node1 ~]# cd /data/bigdata/zookeeper/bin
    [root@node1 bin]# zkServer.sh start
    # node2
    [root@node2 ~]# cd /data/bigdata/zookeeper/bin
    [root@node2 bin]# zkServer.sh start
    # node3
    [root@node3 ~]# cd /data/bigdata/zookeeper/bin
    [root@node3 bin]# zkServer.sh start
    # Resulting processes (similar on the other nodes)
    [root@node1 ~]# jps
    23993 QuorumPeerMain
    24063 Jps
    [root@node1 ~]#
(2) Start the whole ensemble
ZooKeeper does not ship a script that starts all nodes of an ensemble at once, and starting them one by one is tedious in production. A custom script such as the following handles all nodes:
    [root@node1 bigdata]# cat zookeeper_all_op.sh
    #!/bin/bash
    # Run a zkServer.sh action (start, stop, status, ...) on every ZooKeeper node
    zookeeperHome=/data/bigdata/src/zookeeper-3.4.12
    zookeeperArr=( "node1" "node2" "node3" )
    for znode in "${zookeeperArr[@]}"; do
        ssh -p 22 -q root@$znode \
            "
            export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
            source /etc/profile
            $zookeeperHome/bin/zkServer.sh $1
            "
        echo "$znode zookeeper $1 done"
    done
    # Start
    [root@node1 bigdata]# ./zookeeper_all_op.sh start
    # Check: exactly one node is the leader, the others are followers
    [root@node1 bigdata]# ./zookeeper_all_op.sh status
    ZooKeeper JMX enabled by default
    Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Mode: follower
    node1 zookeeper status done
    ZooKeeper JMX enabled by default
    Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Mode: leader
    node2 zookeeper status done
    ZooKeeper JMX enabled by default
    Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Mode: follower
    node3 zookeeper status done
    [root@node1 bigdata]#
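The status output can be checked mechanically: a healthy ensemble reports exactly one "Mode: leader" line and at least one "Mode: follower" line. The filter below is a sketch of that check; zk_ensemble_ok is a hypothetical name, and it only inspects the Mode lines, not whether every node answered.

```shell
#!/usr/bin/env bash
# Read combined `zkServer.sh status` output on stdin.
# Succeed only if there is exactly one leader and at least one follower.
zk_ensemble_ok() {
    awk '$1 == "Mode:" { n[$2]++ }
         END { exit ((n["leader"] == 1 && n["follower"] >= 1) ? 0 : 1) }'
}
```

It could be combined with the script above as ./zookeeper_all_op.sh status 2>&1 | zk_ensemble_ok && echo "ensemble healthy".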
(3) Run the client script
    [root@node1 bigdata]# ./zookeeper/bin/zkCli.sh -server node2:2181
    Connecting to node2:2181
    2018-06-13 17:28:21,115 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
    2018-06-13 17:28:21,119 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=node1
    ......(output omitted)......
    2018-06-13 17:28:21,220 [myid:] - INFO  [main-SendThread(node2:2181):ClientCnxn$SendThread@878] - Socket connection established to node2/192.168.1.102:2181, initiating session
    2018-06-13 17:28:21,229 [myid:] - INFO  [main-SendThread(node2:2181):ClientCnxn$SendThread@1302] - Session establishment complete on server node2/192.168.1.102:2181, sessionid = 0x20034776a7c000e, negotiated timeout = 30000
    WATCHER::
    WatchedEvent state:SyncConnected type:None path:null
    [zk: node2:2181(CONNECTED) 0] help
    ZooKeeper -server host:port cmd args
        stat path [watch]
        set path data [version]
        ls path [watch]
        delquota [-n|-b] path
        ls2 path [watch]
        setAcl path acl
        setquota -n|-b val path
        history
        redo cmdno
        printwatches on|off
        delete path [version]
        sync path
        listquota path
        rmr path
        get path [watch]
        create [-s] [-e] path data acl
        addauth scheme auth
        quit
        getAcl path
        close
        connect host:port
    [zk: node2:2181(CONNECTED) 1] quit
    Quitting...
    2018-06-13 17:28:37,984 [myid:] - INFO  [main:ZooKeeper@687] - Session: 0x20034776a7c000e closed
    2018-06-13 17:28:37,986 [myid:] - INFO  [main-EventThread:ClientCnxn$EventThread@521] - EventThread shut down for session: 0x20034776a7c000e
    [root@node1 bigdata]#
2. Start all JournalNodes
    # node1
    [root@node1 ~]# cd /data/bigdata/hadoop/
    [root@node1 hadoop]# ./sbin/hadoop-daemon.sh start journalnode
    starting journalnode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-journalnode-node1.out
    # Resulting processes (similar on the other nodes)
    [root@node1 ~]# jps
    23993 QuorumPeerMain
    24474 JournalNode     # newly started process
    24910 Jps
    [root@node1 ~]#
    # Three JournalNodes are configured in this setup
    # node2
    [root@node2 ~]# cd /data/bigdata/hadoop/
    [root@node2 hadoop]# ./sbin/hadoop-daemon.sh start journalnode
    # node3
    [root@node3 ~]# cd /data/bigdata/hadoop/
    [root@node3 hadoop]# ./sbin/hadoop-daemon.sh start journalnode
3. Format the NameNode directory (primary node, node1)
    [root@node1 hadoop]# cd /data/bigdata/hadoop
    [root@node1 hadoop]# ./bin/hdfs namenode -format
    18/06/04 14:24:03 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = node1/192.168.1.101
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 2.7.6
    ..........................................(output omitted)...........................................
    18/06/04 14:24:05 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    18/06/04 14:24:05 INFO util.ExitUtil: Exiting with status 0
    18/06/04 14:24:05 INFO namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at node1/192.168.1.101
    ************************************************************/
    [root@node1 hadoop]#
4. Start the NameNode that was just formatted (primary node, node1)
    [root@node1 hadoop]# ./sbin/hadoop-daemon.sh start namenode
    starting namenode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-namenode-node1.out
    [root@node1 ~]# jps     # resulting processes
    25155 Jps
    23993 QuorumPeerMain
    25050 NameNode          # the NameNode
    24474 JournalNode
    [root@node1 ~]#
5. Run the sync command on the NameNode that was not formatted (standby node, node2)
    [root@node2 hadoop]# ./bin/hdfs namenode -bootstrapStandby
    18/06/04 14:26:55 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = node2/192.168.1.102
    STARTUP_MSG:   args = [-bootstrapStandby]
    STARTUP_MSG:   version = 2.7.6
    ..........................................(output omitted)...........................................
    ************************************************************/
    18/06/04 14:26:55 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
    18/06/04 14:26:55 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
    18/06/04 14:26:55 WARN common.Util: Path /data/bigdata/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
    18/06/04 14:26:55 WARN common.Util: Path /data/bigdata/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
    =====================================================
    About to bootstrap Standby ID nn2 from:
               Nameservice ID: mycluster
            Other Namenode ID: nn1
      Other NN's HTTP address: http://node1:50070
       Other NN's IPC address: node1/192.168.1.101:8020
                 Namespace ID: 736429223
                Block pool ID: BP-1022667957-192.168.1.101-1528093445721
                   Cluster ID: CID-9d4854cd-7201-4e0d-9536-36e73195dc5a
               Layout version: -63
           isUpgradeFinalized: true
    =====================================================
    18/06/04 14:26:56 INFO common.Storage: Storage directory /data/bigdata/hdfs/name has been successfully formatted.
    18/06/04 14:26:56 WARN common.Util: Path /data/bigdata/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
    18/06/04 14:26:56 WARN common.Util: Path /data/bigdata/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
    18/06/04 14:26:57 INFO namenode.TransferFsImage: Opening connection to http://node1:50070/imagetransfer?getimage=1&txid=0&storageInfo=-63:736429223:0:CID-9d4854cd-7201-4e0d-9536-36e73195dc5a
    18/06/04 14:26:57 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
    18/06/04 14:26:57 INFO namenode.TransferFsImage: Transfer took 0.00s at 0.00 KB/s
    18/06/04 14:26:57 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 306 bytes.
    18/06/04 14:26:57 INFO util.ExitUtil: Exiting with status 0
    18/06/04 14:26:57 INFO namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at node2/192.168.1.102
    ************************************************************/
    # If this fails, copy the /data/bigdata/hdfs/name directory from node1 to node2 instead
    # Start the standby NameNode
    [root@node2 hadoop]# ./sbin/hadoop-daemon.sh start namenode
    starting namenode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-namenode-node2.out
6. Format ZKFC
Formatting ZKFC creates the HA znode in ZooKeeper. Run the following on the master (node1):
    [root@node1 hadoop]# ./bin/hdfs zkfc -formatZK
    18/06/04 16:53:15 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode NameNode at node1/192.168.1.101:8020
    18/06/04 16:53:16 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
    18/06/04 16:53:16 INFO zookeeper.ZooKeeper: Client environment:host.name=node1
    18/06/04 16:53:16 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_162
    18/06/04 16:53:16 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
    18/06/04 16:53:16 INFO zookeeper.ZooKeeper: Client environment:java.home=/opt/jdk1.8.0_162/jre
    ..........................................(output omitted)...........................................
    18/06/04 16:53:16 INFO ha.ActiveStandbyElector: Session connected.
    18/06/04 16:53:16 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.
    18/06/04 16:53:16 INFO zookeeper.ZooKeeper: Session: 0x20034776a7c0000 closed
    18/06/04 16:53:16 INFO zookeeper.ClientCnxn: EventThread shut down
    [root@node1 hadoop]# ./sbin/hadoop-daemon.sh start zkfc
    starting zkfc, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-zkfc-node1.out
    [root@node1 hadoop]# jps
    5443 DFSZKFailoverController     # new process
    4664 JournalNode
    23993 QuorumPeerMain
    5545 Jps
    4988 NameNode
    [root@node1 hadoop]#
    # Start zkfc on the other node as well; every node that runs a NameNode must also run ZKFC
    [root@node2 hadoop]# ./sbin/hadoop-daemon.sh start zkfc
7. Start HDFS (DataNodes)
    [root@node1 hadoop]# ./sbin/start-dfs.sh
    Starting namenodes on [node1 node2]
    node1: namenode running as process 25050. Stop it first.
    node2: namenode running as process 30976. Stop it first.
    node1: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node1.out
    node2: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node2.out
    node5: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node5.out
    node3: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node3.out
    node4: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node4.out
    Starting journal nodes [node1 node2 node3]
    node1: journalnode running as process 24474. Stop it first.
    node3: journalnode running as process 19893. Stop it first.
    node2: journalnode running as process 29871. Stop it first.
    Starting ZK Failover Controllers on NN hosts [node1 node2]
    node1: starting zkfc, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-zkfc-node1.out
    node2: starting zkfc, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-zkfc-node2.out
    [root@node1 hadoop]# jps
    25968 DataNode          # present on every node
    23993 QuorumPeerMain
    5443 DFSZKFailoverController
    25050 NameNode
    24474 JournalNode
    26525 Jps
    [root@node1 hadoop]#
8. Start YARN:
    [root@node1 hadoop]# ./sbin/start-yarn.sh
    starting yarn daemons
    starting resourcemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-resourcemanager-node1.out
    node1: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node1.out
    node2: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node2.out
    node4: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node4.out
    node3: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node3.out
    node5: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node5.out
    [root@node1 hadoop]# jps
    25968 DataNode
    23993 QuorumPeerMain
    5443 DFSZKFailoverController
    25050 NameNode
    24474 JournalNode
    27068 Jps
    26894 NodeManager       # present on every node
    [root@node1 hadoop]#
9. Start the ResourceManager on the two ResourceManager hosts
(1) Start each one individually
[root@node3 hadoop]# ./sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-resourcemanager-node3.out
[root@node4 hadoop]# ./sbin/yarn-daemon.sh start resourcemanager
# Resulting processes (the other host looks similar)
[root@node3 ~]# jps
21088 NodeManager
21297 ResourceManager # this process
19459 QuorumPeerMain
19893 JournalNode
20714 DataNode
21535 Jps
[root@node3 ~]#
(2) Start them from one place
In production a YARN cluster has two ResourceManager nodes, and starting them one at a time is tedious. The custom script below runs the given action (start or stop) on every ResourceManager node in the cluster:
[root@node1 bigdata]# pwd
/data/bigdata
[root@node1 bigdata]# cat yarn_all_resourcemanager.sh
#!/bin/bash
# resourcemanager management
hadoop_yarn_daemon_home=/data/bigdata/hadoop/sbin/
yarn_resourcemanager_node=( "node3" "node4" )
for renode in "${yarn_resourcemanager_node[@]}"; do
    ssh -p 22 -q root@$renode "
        export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
        source /etc/profile
        cd $hadoop_yarn_daemon_home/ && ./yarn-daemon.sh $1 resourcemanager
    "
    echo "$renode resourcemanager $1 done"
done
[root@node1 bigdata]#
The HDFS and YARN web consoles listen on ports 50070 and 8088 by default and can be opened in a browser to check how the cluster is running, for example:
http://192.168.1.101:50070/
http://192.168.1.103:8088/
# Stop commands (do not run them now)
$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/stop-yarn.sh
# If everything is working, jps lists the running Hadoop services. On this machine the output is:
[root@node1 hadoop]# jps
7312 Jps
1793 NameNode
2163 JournalNode
357 NodeManager
2696 QuorumPeerMain
14428 DFSZKFailoverController
1917 DataNode
Processes started so far:
\ | node1 | node2 | node3 | node4 | node5 |
---|---|---|---|---|---|
JDK | ✔ | ✔ | ✔ | ✔ | ✔ |
QuorumPeerMain | ✔ | ✔ | ✔ |  |  |
JournalNode | ✔ | ✔ | ✔ |  |  |
NameNode | ✔ | ✔ |  |  |  |
DFSZKFailoverController | ✔ | ✔ |  |  |  |
DataNode | ✔ | ✔ | ✔ | ✔ | ✔ |
NodeManager | ✔ | ✔ | ✔ | ✔ | ✔ |
ResourceManager |  |  | ✔ | ✔ |  |
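The process table can be checked mechanically on each host. The following is a minimal sketch, not part of the original setup: `missing_daemons` is a hypothetical helper name, and it assumes jps prints one `<pid> <MainClass>` pair per line, as in the transcripts here.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article):
# read jps output on stdin and print every expected daemon that is absent.
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the second field exactly.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Example for node1 (run manually on the host):
#   jps | missing_daemons QuorumPeerMain JournalNode NameNode \
#       DFSZKFailoverController DataNode NodeManager
```

An empty result means every listed daemon is running; any printed name is a daemon to start on that host.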
10. Verify HDFS HA
On either NameNode host, find the NameNode PID with jps, kill it with kill -9, and check whether the other NameNode switches from standby to active.
[root@node1 bigdata]# jps
16704 JournalNode
16288 NameNode
16433 DataNode
23993 QuorumPeerMain
17241 NodeManager
18621 Jps
16942 DFSZKFailoverController
[root@node1 bigdata]# kill -9 16288
Then watch the zkfc log on the host whose NameNode was in standby. If its last line looks like the following, the failover succeeded:
2018-05-31 16:14:41,114 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at hd0/192.168.1.102:53310 to active state
Now restart the NameNode process that was killed (run from the Hadoop installation directory):
[root@node1 hadoop]# ./sbin/hadoop-daemon.sh start namenode
The last line of the zkfc log for that process reads:
2018-05-31 16:14:55,683 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at hd2/192.168.1.101:53310 to standby state
Kill the NameNode process on each of the two NameNode hosts in turn to verify the HDFS HA configuration.
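Reading the last transition out of a zkfc log can be scripted. This is a sketch under assumptions: `last_zkfc_state` is a hypothetical helper, the log file path is passed in by the caller (the location under `$HADOOP_HOME/logs` is an assumption), and the message format is the "Successfully transitioned NameNode at ... to ... state" line used in this section.

```shell
#!/bin/sh
# Hypothetical helper: print "active" or "standby" according to the most
# recent successful transition recorded in the given zkfc log file.
last_zkfc_state() {
    sed -n 's/.*Successfully transitioned NameNode at .* to \([a-z]*\) state.*/\1/p' "$1" |
        tail -n 1
}

# Example (log path is an assumption; adjust to the real zkfc log file):
#   last_zkfc_state "$HADOOP_HOME/logs/hadoop-root-zkfc-node2.log"
```

After killing the active NameNode, the helper should report `active` on the former standby host; it prints nothing if the log contains no transition yet.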
VII. Scala
1. Prerequisites
Scala runs on the JVM, so a JDK must be configured first.
2. Unpack
[root@node1 sbin]# tar -zxvf /data/tools/scala-2.11.12.tgz -C /data/bigdata/src
[root@node1 sbin]# ln -s /data/bigdata/src/scala-2.11.12 /data/bigdata/scala
# Add environment variables
[root@node1 ~]# echo -e "\n# scala\nexport scala_HOME=/data/bigdata/scala\nexport PATH=\$scala_HOME/bin:\$PATH" >> /etc/profile.d/bigdata_path.sh
[root@node1 ~]# cat /etc/profile.d/bigdata_path.sh
# zookeeper
export ZOOKEEPER_HOME=/data/bigdata/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

# hadoop
export HADOOP_HOME=/data/bigdata/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# scala
export scala_HOME=/data/bigdata/scala
export PATH=$scala_HOME/bin:$PATH
[root@node1 ~]# source /etc/profile
3. Check the Scala version
[root@node1 ~]# scala -version
Scala code runner version 2.11.12 -- Copyright 2002-2016, LAMP/EPFL
4. Run the Scala REPL
[root@node1 ~]# scala
Welcome to Scala 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_162).
Type in expressions for evaluation. Or try :help.

scala> 1+1
res0: Int = 2

scala>
If both of the steps above work, Scala is installed and configured correctly.
VIII. Install Spark
Official documentation
1. Unpack Spark
[root@node1 conf]# tar -zxvf /data/tools/spark-2.1.2-bin-hadoop2.7.tgz -C /data/bigdata/src/
[root@node1 conf]# ln -s /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 /data/bigdata/spark
# Add environment variables
[root@node1 ~]# echo -e "\n# spark\nexport SPARK_HOME=/data/bigdata/spark\nexport PATH=\$SPARK_HOME/bin:\$PATH" >> /etc/profile.d/bigdata_path.sh
[root@node1 ~]# cat /etc/profile.d/bigdata_path.sh
# zookeeper
export ZOOKEEPER_HOME=/data/bigdata/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

# hadoop
export HADOOP_HOME=/data/bigdata/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# scala
export scala_HOME=/data/bigdata/scala
export PATH=$scala_HOME/bin:$PATH

# spark
export SPARK_HOME=/data/bigdata/spark
export PATH=$SPARK_HOME/bin:$PATH
[root@node1 ~]# source /etc/profile
[root@node1 ~]#
2. Configure spark-env.sh
[root@node1 conf]# cd /data/bigdata/spark/conf/
[root@node1 conf]# cp spark-env.sh{.template,}
# Add the following:
[root@node1 conf]# vim spark-env.sh
export SPARK_LOCAL_IP="192.168.1.101" # on the other nodes, change this to the node's own IP (or 127.0.0.1), or comment it out
export SPARK_MASTER_IP="192.168.1.101"
export JAVA_HOME=/opt/java
export SPARK_PID_DIR=/data/bigdata/hdfs/pids
export SPARK_LOCAL_DIRS=/data/bigdata/sparktmp
export PYSPARK_PYTHON=/usr/local/bin/python3 # when developing with Python 3, set the absolute path of python3
# Memory that the worker on this node may use
export SPARK_WORKER_MEMORY=58g
export SPARK_MASTER_PORT=7077
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://mycluster/directory"
# Default number of cores granted to an application that does not set spark.cores.max
export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=16"
export SPARK_SSH_OPTS="-p 22 -o StrictHostKeyChecking=no $SPARK_SSH_OPTS"
3. Configure spark-defaults.conf
[root@node1 conf]# cp spark-defaults.conf{.template,}
[root@node1 conf]# vim spark-defaults.conf
# Add:
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.eventLog.enabled true
spark.eventLog.dir hdfs://mycluster/directory
# Needed when developing with Python 3+
spark.executorEnv.PYTHONHASHSEED=0
# Create the corresponding directories
[root@node1 conf]# mkdir -pv /data/bigdata/sparktmp
[root@node1 conf]# hdfs dfs -mkdir /directory
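- Note on the spark.executorEnv.PYTHONHASHSEED=0 setting:
With Python 3+, calling distinct(), reduceByKey() or join() on a Spark cluster without this setting raises: Exception: Randomness of hash of string should be disabled via PYTHONHASHSEED
Python 3 randomizes string hashes per interpreter process by default. On a cluster, each executor would therefore compute a different hash for the same key, and records with equal keys would not be grouped consistently. Setting PYTHONHASHSEED pins the seed so that every node computes the same hash. See:
Spark cluster configuration (4): other pitfalls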
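The effect of PYTHONHASHSEED can be seen without Spark. This is a small sketch, assuming a `python3` binary is on the PATH; `seeded_hash` and `random_hash` are hypothetical helper names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: hash one string in a fresh Python 3 interpreter.

# Seed pinned to 0: hash randomization is disabled, so separate
# interpreter processes agree on the value.
seeded_hash() {
    PYTHONHASHSEED=0 python3 -c 'import sys; print(hash(sys.argv[1]))' "$1"
}

# Randomization explicitly enabled: each new process draws its own seed,
# so two calls will normally print different values.
random_hash() {
    PYTHONHASHSEED=random python3 -c 'import sys; print(hash(sys.argv[1]))' "$1"
}

# Example (run manually):
#   seeded_hash spark; seeded_hash spark    # same value twice
#   random_hash spark; random_hash spark    # usually two different values
```

Each call starts a new interpreter, which mirrors the way Spark executors run as separate Python worker processes.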
4. Configure slaves
[root@node1 conf]# cp slaves{.template,}
[root@node1 conf]# vim slaves
node1
node2
node3
node4
node5
5. Distribute
[root@node1 conf]# scp -rp /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 node2:/data/bigdata/src/
[root@node1 conf]# scp -rp /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 node3:/data/bigdata/src/
[root@node1 conf]# scp -rp /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 node4:/data/bigdata/src/
[root@node1 conf]# scp -rp /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 node5:/data/bigdata/src/
# Distribute the directories created earlier
[root@node1 bigdata]# scp -rp /data/bigdata/{hdfs,sparktmp,tmp} node2:/data/bigdata/
[root@node1 bigdata]# scp -rp /data/bigdata/{hdfs,sparktmp,tmp} node3:/data/bigdata/
[root@node1 bigdata]# scp -rp /data/bigdata/{hdfs,sparktmp,tmp} node4:/data/bigdata/
[root@node1 bigdata]# scp -rp /data/bigdata/{hdfs,sparktmp,tmp} node5:/data/bigdata/
# Create the symlinks
[root@node2 ~]# ln -s /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 /data/bigdata/spark
[root@node3 ~]# ln -s /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 /data/bigdata/spark
[root@node4 ~]# ln -s /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 /data/bigdata/spark
[root@node5 ~]# ln -s /data/bigdata/src/spark-2.1.2-bin-hadoop2.7 /data/bigdata/spark
# Change SPARK_LOCAL_IP in $SPARK_HOME/conf/spark-env.sh on each node
[root@node2 ~]# vim $SPARK_HOME/conf/spark-env.sh
export SPARK_LOCAL_IP="192.168.1.102"
[root@node3 ~]# vim $SPARK_HOME/conf/spark-env.sh
export SPARK_LOCAL_IP="192.168.1.103"
[root@node4 ~]# vim $SPARK_HOME/conf/spark-env.sh
export SPARK_LOCAL_IP="192.168.1.104"
[root@node5 ~]# vim $SPARK_HOME/conf/spark-env.sh
export SPARK_LOCAL_IP="192.168.1.105"
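Editing SPARK_LOCAL_IP by hand on four nodes is easy to get wrong, so the per-node change can be scripted. The following is a sketch only: `set_spark_local_ip` is a hypothetical helper, and it assumes the variable is written as a line starting with `export SPARK_LOCAL_IP=`, as in the spark-env.sh shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: set SPARK_LOCAL_IP in a spark-env.sh file.
# Replaces an existing "export SPARK_LOCAL_IP=..." line, or appends one.
# The function body runs in a subshell so its variables stay local.
set_spark_local_ip() (
    file=$1
    ip=$2
    tmp=$(mktemp) || exit 1
    if grep -q '^export SPARK_LOCAL_IP=' "$file"; then
        sed "s|^export SPARK_LOCAL_IP=.*|export SPARK_LOCAL_IP=\"$ip\"|" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        printf 'export SPARK_LOCAL_IP="%s"\n' "$ip" >> "$tmp"
    fi
    # Write back through the original file to keep its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
)

# Example, run on node2 (path as used in this guide):
#   set_spark_local_ip /data/bigdata/spark/conf/spark-env.sh 192.168.1.102
```

Any trailing comment on the replaced line is dropped, which is harmless here because the comment only describes this very edit.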
6. Start Spark
[root@node1 bin]# cd /data/bigdata/spark/sbin
[root@node1 sbin]# ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-node1.out
node1: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node1.out
node5: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node5.out
node3: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node3.out
node4: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node4.out
node2: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node2.out
[root@node1 sbin]# jps
5443 DFSZKFailoverController
5684 Master # Spark master
1092 HRegionServer
5846 Worker # Spark worker
904 HMaster
4664 JournalNode
23993 QuorumPeerMain
6266 Jps
7227 NodeManager
4988 NameNode
6495 DataNode
Web UI: http://192.168.1.101:8080/
[root@node1 sbin]# ./start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.history.HistoryServer-1-node1.out
[root@node1 sbin]# jps
5443 DFSZKFailoverController
5684 Master # Spark master
1092 HRegionServer
5846 Worker # Spark worker
904 HMaster
4664 JournalNode
29366 HistoryServer # serves the stored Spark event logs
23993 QuorumPeerMain
6266 Jps
7227 NodeManager
4988 NameNode
6495 DataNode
History server web UI: http://192.168.1.101:18080/
Processes started so far:
\ | node1 | node2 | node3 | node4 | node5 |
---|---|---|---|---|---|
JDK | ✔ | ✔ | ✔ | ✔ | ✔ |
QuorumPeerMain | ✔ | ✔ | ✔ |  |  |
JournalNode | ✔ | ✔ | ✔ |  |  |
NameNode | ✔ | ✔ |  |  |  |
DFSZKFailoverController | ✔ | ✔ |  |  |  |
DataNode | ✔ | ✔ | ✔ | ✔ | ✔ |
NodeManager | ✔ | ✔ | ✔ | ✔ | ✔ |
ResourceManager |  |  | ✔ | ✔ |  |
Master | ✔ |  |  |  |  |
Worker | ✔ | ✔ | ✔ | ✔ | ✔ |
HistoryServer | ✔ |  |  |  |  |
IX. Install HBase
The HBase Master runs on the same host as the Hadoop NameNode and talks to the DataNodes to read and write HDFS data. Each RegionServer runs on a host that also runs a Hadoop DataNode.
References:
Official documentation
HBase: introduction, installation and common commands
1. Unpack HBase
[root@node1 sbin]# tar -zxvf /data/tools/hbase-1.2.6-bin.tar.gz -C /data/bigdata/src
[root@node1 sbin]# ln -s /data/bigdata/src/hbase-1.2.6 /data/bigdata/hbase
# Add environment variables
[root@node1 ~]# echo -e "\n# hbase\nexport HBASE_HOME=/data/bigdata/hbase\nexport PATH=\$HBASE_HOME/bin:\$PATH" >> /etc/profile.d/bigdata_path.sh
[root@node1 ~]# cat /etc/profile.d/bigdata_path.sh
# zookeeper
export ZOOKEEPER_HOME=/data/bigdata/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

# hadoop
export HADOOP_HOME=/data/bigdata/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# scala
export scala_HOME=/data/bigdata/scala
export PATH=$scala_HOME/bin:$PATH

# spark
export SPARK_HOME=/data/bigdata/spark
export PATH=$SPARK_HOME/bin:$PATH

# hbase
export HBASE_HOME=/data/bigdata/hbase
export PATH=$HBASE_HOME/bin:$PATH
[root@node1 ~]#
2. Edit $HBASE_HOME/conf/hbase-env.sh and add:
[root@node1 sbin]# cd /data/bigdata/hbase/conf
[root@node1 conf]# vim hbase-env.sh
export JAVA_HOME=/opt/java
export HBASE_HOME=/data/bigdata/hbase
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/
export HBASE_LIBRARY_PATH=$HBASE_LIBRARY_PATH:$HBASE_HOME/lib/native/
# Pointing at Hadoop's etc/hadoop directory lets HBase find the Hadoop configuration, i.e. it ties HBase to Hadoop [required, otherwise HMaster will not start]
export HBASE_CLASSPATH=$HADOOP_HOME/etc/hadoop
export HBASE_MANAGES_ZK=false # do not use the ZooKeeper bundled with HBase
export HBASE_PID_DIR=/data/bigdata/hdfs/pids
export HBASE_SSH_OPTS="-o ConnectTimeout=1 -p 22" # SSH port
# On JDK 1.8 and later, comment out the following two lines
# Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8+
#export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
#export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
3. Edit the regionservers file
[root@node1 conf]# vim regionservers
node1
node2
node3
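4. Edit hbase-site.xml
[root@node1 conf]# vim hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://mycluster/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node1,node2,node3</value>
    <description>Host names of the ZooKeeper ensemble used by the cluster</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.master.info.port</name>
    <value>60010</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>Enable this when HBase runs on more than one host</description>
  </property>
</configuration>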
5. Distribute
[root@node1 conf]# scp -rp /data/bigdata/src/hbase-1.2.6 node2:/data/bigdata/src/
[root@node1 conf]# scp -rp /data/bigdata/src/hbase-1.2.6 node3:/data/bigdata/src/
# Create the symlinks
[root@node2 ~]# ln -s /data/bigdata/src/hbase-1.2.6 /data/bigdata/hbase
[root@node3 ~]# ln -s /data/bigdata/src/hbase-1.2.6 /data/bigdata/hbase
6. Start HBase
[root@node1 hadoop]# cd /data/bigdata/hbase/bin
[root@node1 bin]# ./start-hbase.sh
[root@node1 bin]# jps
5443 DFSZKFailoverController
1092 HRegionServer # HBase RegionServer
904 HMaster # HBase master
4664 JournalNode
23993 QuorumPeerMain
14730 Master
7227 NodeManager
4988 NameNode
14877 Worker
1917 Jps
6495 DataNode
[root@node1 bin]#
Processes started so far:
\ | node1 | node2 | node3 | node4 | node5 |
---|---|---|---|---|---|
JDK | ✔ | ✔ | ✔ | ✔ | ✔ |
QuorumPeerMain | ✔ | ✔ | ✔ |  |  |
JournalNode | ✔ | ✔ | ✔ |  |  |
NameNode | ✔ | ✔ |  |  |  |
DFSZKFailoverController | ✔ | ✔ |  |  |  |
DataNode | ✔ | ✔ | ✔ | ✔ | ✔ |
NodeManager | ✔ | ✔ | ✔ | ✔ | ✔ |
ResourceManager |  |  | ✔ | ✔ |  |
Master | ✔ |  |  |  |  |
Worker | ✔ | ✔ | ✔ | ✔ | ✔ |
HistoryServer | ✔ |  |  |  |  |
HMaster | ✔ |  |  |  |  |
HRegionServer | ✔ | ✔ | ✔ |  |  |
HBase RegionServer web UI: http://192.168.1.101:16030
HBase Master web UI: http://192.168.1.101:60010
# Test
[root@node1 bin]# ./hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/bigdata/src/hbase-1.2.6/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/bigdata/src/hadoop-2.7.6/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017

hbase(main):001:0> list # run the list command
TABLE
0 row(s) in 0.2670 seconds

=> []
hbase(main):002:0> quit
[root@node1 bin]#
X. Kafka
Official documentation
Kafka in practice: best practices
1. Unpack and set environment variables
[root@node4 conf]# tar -zxvf /data/tools/kafka_2.11-1.1.0.tgz -C /data/bigdata/src/
[root@node4 conf]# ln -s /data/bigdata/src/kafka_2.11-1.1.0 /data/bigdata/kafka
# Add environment variables
[root@node4 ~]# echo -e "\n# kafka\nexport KAFKA_HOME=/data/bigdata/kafka\nexport PATH=\$KAFKA_HOME/bin:\$PATH" >> /etc/profile.d/bigdata_path.sh
[root@node4 ~]# cat /etc/profile.d/bigdata_path.sh
# zookeeper
export ZOOKEEPER_HOME=/data/bigdata/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

# hadoop
export HADOOP_HOME=/data/bigdata/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# hbase
export HBASE_HOME=/data/bigdata/hbase
export PATH=$HBASE_HOME/bin:$PATH

# scala
export scala_HOME=/data/bigdata/scala
export PATH=$scala_HOME/bin:$PATH

# spark
export SPARK_HOME=/data/bigdata/spark
export PATH=$SPARK_HOME/bin:$PATH

# kafka
export KAFKA_HOME=/data/bigdata/kafka
export PATH=$KAFKA_HOME/bin:$PATH
# Apply
[root@node4 conf]# source /etc/profile
2. Edit server.properties:
[root@node4 ~]# cd /data/bigdata/kafka/config/ # enter the config directory
[root@node4 conf]# cp server.properties{,.bak} # back up the configuration file
[root@node4 conf]# vim server.properties
broker.id=1
listeners=PLAINTEXT://node4:9092
advertised.listeners=PLAINTEXT://node4:9092
log.dirs=/data/bigdata/kafka/logs
zookeeper.connect=node1:2181,node2:2181,node3:2181
[root@node4 conf]# mkdir -p /data/bigdata/kafka/logs
3. Distribute
[root@node4 conf]# scp -rp /data/bigdata/src/kafka_2.11-1.1.0 node5:/data/bigdata/src/
# On the other broker
[root@node5 ~]# ln -s /data/bigdata/src/kafka_2.11-1.1.0 /data/bigdata/kafka
[root@node5 ~]# mkdir -p /data/bigdata/kafka/logs
# Change broker.id (and the listener host) in each broker's server.properties
[root@node5 ~]# cd /data/bigdata/kafka/
[root@node5 kafka]# vim ./config/server.properties
broker.id=2
listeners=PLAINTEXT://node5:9092
advertised.listeners=PLAINTEXT://node5:9092
# Check
[root@node1 conf]# ansible kafka -m shell -a '$(which egrep) --color=auto "^broker.id|^listeners|^advertised.listeners" ${KAFKA_HOME}/config/server.properties'
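Only three lines of server.properties differ between brokers, so they can be generated from the broker id and host name. This is an illustrative sketch: `kafka_broker_overrides` is a hypothetical helper, and port 9092 with the PLAINTEXT listener is taken from the configuration in this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the per-broker lines of server.properties
# for a given broker id and host name.
kafka_broker_overrides() {
    printf 'broker.id=%s\n' "$1"
    printf 'listeners=PLAINTEXT://%s:9092\n' "$2"
    printf 'advertised.listeners=PLAINTEXT://%s:9092\n' "$2"
}

# Example (run manually): show what node5 should contain.
#   kafka_broker_overrides 2 node5
```

Comparing this output with the result of the ansible check is a quick way to spot a broker that still carries the id or listener of the node it was copied from.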
4. Start
[root@node4 kafka]# ./bin/kafka-server-start.sh config/server.properties # run a single broker in the foreground
[2018-06-05 13:32:35,323] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2018-06-05 13:32:35,672] INFO starting (kafka.server.KafkaServer)
[2018-06-05 13:32:35,673] INFO Connecting to zookeeper on node1:2181,node2:2181,node3:2181 (kafka.server.KafkaServer)
[2018-06-05 13:32:35,692] INFO [ZooKeeperClient] Initializing a new session to node1:2181,node2:2181,node3:2181. (kafka.zookeeper.ZooKeeperClient)
[2018-06-05 13:32:35,698] INFO Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
.......... (several lines omitted) ..........
[2018-06-05 13:33:38,920] INFO Terminating process due to signal SIGINT (kafka.Kafka$)
[2018-06-05 13:33:39,168] INFO [ThrottledRequestReaper-Produce]: Stopped (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2018-06-05 13:33:39,168] INFO [ThrottledRequestReaper-Produce]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2018-06-05 13:33:39,168] INFO [ThrottledRequestReaper-Request]: Shutting down (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2018-06-05 13:33:39,168] INFO [ThrottledRequestReaper-Request]: Stopped (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2018-06-05 13:33:39,168] INFO [ThrottledRequestReaper-Request]: Shutdown completed (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2018-06-05 13:33:39,169] INFO [SocketServer brokerId=1] Shutting down socket server (kafka.network.SocketServer)
[2018-06-05 13:33:39,193] INFO [SocketServer brokerId=1] Shutdown completed (kafka.network.SocketServer)
[2018-06-05 13:33:39,199] INFO [KafkaServer id=1] shut down completed (kafka.server.KafkaServer)
[root@node4 kafka]# ./bin/kafka-server-start.sh -daemon config/server.properties # start a single broker in the background; start the other brokers the same way. The process name is Kafka
Processes started so far:
\ | node1 | node2 | node3 | node4 | node5 |
---|---|---|---|---|---|
JDK | ✔ | ✔ | ✔ | ✔ | ✔ |
QuorumPeerMain | ✔ | ✔ | ✔ |  |  |
JournalNode | ✔ | ✔ | ✔ |  |  |
NameNode | ✔ | ✔ |  |  |  |
DFSZKFailoverController | ✔ | ✔ |  |  |  |
DataNode | ✔ | ✔ | ✔ | ✔ | ✔ |
NodeManager | ✔ | ✔ | ✔ | ✔ | ✔ |
ResourceManager |  |  | ✔ | ✔ |  |
HMaster | ✔ |  |  |  |  |
HRegionServer | ✔ | ✔ | ✔ |  |  |
HistoryServer | ✔ |  |  |  |  |
Master | ✔ |  |  |  |  |
Worker | ✔ | ✔ | ✔ | ✔ | ✔ |
Kafka |  |  |  | ✔ | ✔ |
Kafka does not ship a script that starts every broker in a cluster at once. A production Kafka cluster usually has several brokers, and starting them one by one is tedious, so the custom script below starts all of them:
[root@node1 bigdata]# cat kafka_cluster_start.sh
#!/bin/bash
brokers="node4 node5"
KAFKA_HOME="/data/bigdata/kafka"
for broker in $brokers
do
    ssh $broker -C "source /etc/profile; cd ${KAFKA_HOME}/bin && ./kafka-server-start.sh -daemon ../config/server.properties"
    if [ $? -eq 0 ]; then
        echo "INFO:[${broker}] Start successfully "
    fi
done
[root@node1 bigdata]#
5. Test
Create a topic (specifying the ZooKeeper ensemble to connect to). In this example the topic is named TestTopic:
[root@node4 kafka]# ./bin/kafka-topics.sh --create --zookeeper node1:2181,node2:2181,node3:2181 --replication-factor 1 --partitions 1 --topic TestTopic
Created topic "TestTopic". # the topic was created
List topics:
[root@node4 kafka]# ./bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181
TestTopic # every existing topic is listed
On either broker, start a producer (as the Kafka cluster user):
[root@node4 kafka]# ./bin/kafka-console-producer.sh --broker-list node4:9092,node5:9092 --topic TestTopic
>hi
>hello 1
>hello 2
>
On the other broker, start a consumer:
[root@node5 kafka]# ./bin/kafka-console-consumer.sh --bootstrap-server node4:9092,node5:9092 --from-beginning --topic TestTopic
hi
hello 1
hello 2
Type some messages into the producer and check that the consumer prints them.
6. Stop
[root@node4 kafka]# ./bin/kafka-server-stop.sh # stop this broker; the other brokers must be stopped as well
Sometimes this does not work and prints "No kafka server to stop". The cause is that the PID lookup in kafka-server-stop.sh, ps ax | grep -i 'kafka\.Kafka' | grep java | grep -v grep | awk '{print $1}', cannot find the Kafka PID on the operating system used here. The lookup in kafka-server-stop.sh is therefore changed as follows:
#PIDS=$(ps ax | grep -i 'kafka\.Kafka' | grep java | grep -v grep | awk '{print $1}') # commented out and replaced by the line below
PIDS=$(jps | grep -i 'Kafka' |awk '{print $1}')
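The jps-based lookup matches any process whose name contains "kafka". A stricter variant compares the class name exactly. This is a sketch, not the change made above: `kafka_pids` is a hypothetical helper that reads jps output on stdin and assumes the broker appears as `<pid> Kafka`, as stated in this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the PID of every jps entry whose main class
# is exactly "Kafka" (other Java tools with "kafka" in the name are skipped).
kafka_pids() {
    awk '$2 == "Kafka" { print $1 }'
}

# Example (run manually on a broker):
#   jps | kafka_pids
```

The exact match avoids signalling unrelated JVMs, at the cost of missing a broker that was started under a different main class name.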
Kafka likewise does not ship a script that stops the whole cluster. The script below stops every broker and can be placed on any node:
[root@node1 bigdata]# cat kafka_cluster_stop.sh
#!/bin/bash
brokers="node4 node5"
KAFKA_HOME="/data/bigdata/kafka"
for broker in $brokers
do
    ssh $broker -C "cd ${KAFKA_HOME}/bin && ./kafka-server-stop.sh"
    if [ $? -eq 0 ]; then
        echo "INFO:[${broker}] shut down completed "
    fi
done
[root@node1 bigdata]# chmod +x kafka_cluster_stop.sh
XI. Hive
Official documentation
1. Install MySQL
Reference: https://blog.51cto.com/moerjinrong/2092614
# Create the hive user and the metastore database
root@node2 14:37: [(none)]> grant all privileges on *.* to 'hive'@'192.168.1.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
root@node2 15:14: [(none)]> create database metastore; # tentative
Query OK, 1 row affected (0.01 sec)
2. Unpack and add environment variables
[root@node2 ~]# tar -zxvf /data/tools/apache-hive-2.3.3-bin.tar.gz -C /data/bigdata/src/
[root@node2 ~]# ln -s /data/bigdata/src/apache-hive-2.3.3-bin /data/bigdata/hive
# Add environment variables
[root@node2 ~]# echo -e "\n# hive\nexport HIVE_HOME=/data/bigdata/hive\nexport PATH=\$HIVE_HOME/bin:\$PATH" >> /etc/profile.d/bigdata_path.sh
[root@node2 ~]# cat /etc/profile.d/bigdata_path.sh
# zookeeper
export ZOOKEEPER_HOME=/data/bigdata/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

# hadoop
export HADOOP_HOME=/data/bigdata/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# hbase
export HBASE_HOME=/data/bigdata/hbase
export PATH=$HBASE_HOME/bin:$PATH

# scala
export scala_HOME=/data/bigdata/scala
export PATH=$scala_HOME/bin:$PATH

# spark
export SPARK_HOME=/data/bigdata/spark
export PATH=$SPARK_HOME/bin:$PATH

# kafka
export KAFKA_HOME=/data/bigdata/kafka
export PATH=$KAFKA_HOME/bin:$PATH

# hive
export HIVE_HOME=/data/bigdata/hive
export PATH=$HIVE_HOME/bin:$PATH
# Apply
[root@node2 ~]# source /etc/profile
3. Create directories in HDFS
Create /user/hive/warehouse in HDFS. Hadoop must be running first.
hdfs dfs -mkdir /tmp
hdfs dfs -mkdir /user
hdfs dfs -mkdir /user/hive
hdfs dfs -mkdir /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
# Copy the MySQL driver jar mysql-connector-java-5.1.46.jar into Hive's lib directory; download it if it is missing:
wget -P $HIVE_HOME/lib http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.46/mysql-connector-java-5.1.46.jar
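4. Edit hive-site.xml:
[root@node2 ~]# cd /data/bigdata/hive/conf/
[root@node2 conf]# cp hive-default.xml.template hive-site.xml
[root@node2 conf]# vim hive-site.xml
# Change the following properties (search with / in vim; press n for the next match if the first hit is not the right one)
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <!-- useSSL=false because SSL is not enabled on this MySQL server -->
  <value>jdbc:mysql://192.168.1.102:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
  <description>JDBC connect string for a JDBC metastore. The question mark starts the URL parameters and the ampersand joins them; inside this XML file the ampersand has to be written as an escaped entity.</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>Username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/data/bigdata/hive/warehouse</value>
</property>
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/data/bigdata/hive/tmp</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/data/bigdata/hive/tmp/resources</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/data/bigdata/hive/tmp</value>
  <description>Location of Hive run time structured log file</description>
</property>
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>/data/bigdata/hive/tmp/operation_logs</value>
  <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
- Note: because hive-site.xml is XML, the & in the JDBC URL above must be written as &amp; in the file.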
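The escaping needed for a value inside an XML configuration file can be produced mechanically. A minimal sketch follows; `xml_escape` is a hypothetical helper that covers only the three characters `&`, `<` and `>`.

```shell
#!/bin/sh
# Hypothetical helper: escape a string for use as XML element content.
# The ampersand is replaced first so the entities added afterwards
# are not escaped a second time.
xml_escape() {
    printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Example (run manually):
#   xml_escape 'jdbc:mysql://192.168.1.102:3306/hive?createDatabaseIfNotExist=true&useSSL=false'
```

The printed string is what goes between the value tags of javax.jdo.option.ConnectionURL; Hive sees the plain `&` again once the XML is parsed.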
# Create the directories referenced above
[root@node2 conf]# mkdir -pv /data/bigdata/hive/{tmp/{operation_logs,resources},warehouse}
5. Initialize and run
# Initialize the metastore schema with schematool:
[root@node2 conf]# schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/bigdata/src/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/bigdata/src/hadoop-2.7.6/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://127.0.0.1:3306/metastore?useSSL=false
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
[root@node2 conf]#
# Run Hive
[root@node2 conf]# hive # shows up as a RunJar process
hive> show databases;
OK
default
Time taken: 1.881 seconds, Fetched: 1 row(s)
hive> use default;
OK
Time taken: 0.081 seconds
hive> create table kylin_test(test_count int);
OK
Time taken: 2.9 seconds
hive> show tables;
OK
Time taken: 0.151 seconds, Fetched: 1 row(s)
hive> quit;
Processes started so far:
\ | node1 | node2 | node3 | node4 | node5 | Notes |
---|---|---|---|---|---|---|
JDK | ✔ | ✔ | ✔ | ✔ | ✔ |  |
QuorumPeerMain | ✔ | ✔ | ✔ |  |  |  |
JournalNode | ✔ | ✔ | ✔ |  |  |  |
NameNode | ✔ | ✔ |  |  |  |  |
DFSZKFailoverController | ✔ | ✔ |  |  |  |  |
DataNode | ✔ | ✔ | ✔ | ✔ | ✔ |  |
NodeManager | ✔ | ✔ | ✔ | ✔ | ✔ |  |
ResourceManager |  |  | ✔ | ✔ |  |  |
HMaster | ✔ |  |  |  |  |  |
HRegionServer | ✔ | ✔ | ✔ |  |  |  |
HistoryServer | ✔ |  |  |  |  |  |
Master | ✔ |  |  |  |  |  |
Worker | ✔ | ✔ | ✔ | ✔ | ✔ |  |
Kafka |  |  |  | ✔ | ✔ |  |
RunJar |  | ✔ |  |  |  | present only while Hive is running |
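XII. Kylin
Official documentation
1. Prerequisites
Before installing Kylin, make sure Hadoop 2.4+, HBase 0.98+ or 1.x, and Hive 0.13+ are installed and running.
Hive needs its metastore and hiveserver2 started.
Apache Kylin can also be deployed as a cluster, but that does not make cube builds faster, because the computation runs on the MapReduce engine and is independent of Kylin itself. A Kylin cluster mainly provides load balancing for queries. This guide uses a single node.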
2. Unpack and set environment variables
[root@node2 ~]# tar zxvf /data/tools/apache-kylin-2.3.1-hbase1x-bin.tar.gz -C /data/bigdata/src/
[root@node2 ~]# ln -s /data/bigdata/src/apache-kylin-2.3.1-bin/ /data/bigdata/kylin
# Add environment variables
[root@node2 ~]# echo -e "\n# kylin\nexport KYLIN_HOME=/data/bigdata/kylin\nexport PATH=\$KYLIN_HOME/bin:\$PATH" >> /etc/profile.d/bigdata_path.sh
[root@node2 ~]# cat /etc/profile.d/bigdata_path.sh
# zookeeper
export ZOOKEEPER_HOME=/data/bigdata/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

# hadoop
export HADOOP_HOME=/data/bigdata/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# hbase
export HBASE_HOME=/data/bigdata/hbase
export PATH=$HBASE_HOME/bin:$PATH

# scala
export scala_HOME=/data/bigdata/scala
export PATH=$scala_HOME/bin:$PATH

# spark
export SPARK_HOME=/data/bigdata/spark
export PATH=$SPARK_HOME/bin:$PATH

# kafka
export KAFKA_HOME=/data/bigdata/kafka
export PATH=$KAFKA_HOME/bin:$PATH

# kylin
export KYLIN_HOME=/data/bigdata/kylin
export PATH=$KYLIN_HOME/bin:$PATH
# Apply
[root@node2 ~]# source /etc/profile
3. Copy the Hive jars into Kylin
Copy every jar from the lib directory of the Hive installation into the lib directory of the Kylin installation.
[root@node2 ~]# cp -a /data/bigdata/hive/lib/* /data/bigdata/kylin/lib/
4. Configure the Hive database used by Kylin:
[root@node2 ~]# cd /data/bigdata/kylin/conf
[root@node2 conf]# vim kylin.properties
# Kylin cluster setting: change the host name (or IP) and port
kylin.server.cluster-servers=node2:7070
# Adjust the jar version and path
kylin.job.jar=$KYLIN_HOME/lib/kylin-job-2.3.1.jar
# Adjust the jar version and path
kylin.coprocessor.local.jar=$KYLIN_HOME/lib/kylin-coprocessor-2.3.1.jar
# List of web servers in use, this enables one web server instance to sync up with other servers
kylin.rest.servers=node2:7070
## Hive database used by Kylin: the schema used in Hive, set here to the current user
kylin.job.hive.database.for.intermediatetable=root
5. If HTTPS is not used, disable SSL on the Tomcat connector
[root@node2 ~]# cd /data/bigdata/kylin/tomcat/conf
[root@node2 conf]# cp -a server.xml{,_$(date +%F)}
[root@node2 conf]# vim server.xml
# Line 85, before:
maxThreads="150" SSLEnabled="true" scheme="https" secure="true"
# Line 85, after:
maxThreads="150" SSLEnabled="false" scheme="https" secure="false"
If it is left enabled, the following error is reported:
SEVERE: Failed to load keystore type JKS with path conf/.keystore due to /data/bigdata/kylin/tomcat/conf/.keystore (No such file or directory)
java.io.FileNotFoundException: /data/bigdata/kylin/tomcat/conf/.keystore (No such file or directory)
6. Edit $KYLIN_HOME/bin/kylin.sh
[root@node2 conf]# vim ../bin/kylin.sh
export KYLIN_HOME=/data/bigdata/kylin
export CATALINA_HOME=/data/bigdata/kylin/tomcat
export PATH=$CATALINA_HOME/bin:$PATH
export HCAT_HOME=$HIVE_HOME/hcatalog
export hive_dependency=$HIVE_HOME/conf:$HIVE_HOME/lib/*:$HCAT_HOME/share/hcatalog/hive-hcatalog-core-2.3.3.jar
export HBASE_CLASSPATH_PREFIX=$CATALINA_HOME/bin/bootstrap.jar:$CATALINA_HOME/bin/tomcat-juli.jar:$CATALINA_HOME/lib/*:$hive_dependency:$HBASE_CLASSPATH_PREFIX
# As the HDFS superuser, create Kylin's working directory in HDFS and give ownership to the server login user
#[root@node2 conf]# hdfs dfs -mkdir /kylin
#[root@node2 conf]# hdfs dfs -chown -R root:root /kylin
7. Check Kylin's dependencies
Run each of the following from the bin directory:
[root@node2 bin]# cd $KYLIN_HOME/bin
[root@node2 bin]# ./check-env.sh
Retrieving hadoop conf dir...
KYLIN_HOME is set to /data/bigdata/kylin
[root@node2 bin]# ./find-hive-dependency.sh
Retrieving hive dependency...
[root@node2 bin]# ./find-hbase-dependency.sh
Retrieving hbase dependency...
8. Start the Kylin service
Run from $KYLIN_HOME/bin:
[root@node2 bin]# ./kylin.sh start
Retrieving hadoop conf dir...
KYLIN_HOME is set to /data/bigdata/kylin
Retrieving hive dependency...
Retrieving hbase dependency...
Retrieving hadoop conf dir...
Retrieving kafka dependency...
Retrieving Spark dependency...
Start to check whether we need to migrate acl tables
.......... (several lines omitted) ..........
2018-06-05 17:12:10,111 INFO [Thread-6] zookeeper.ZooKeeper:684 : Session: 0x300346e6b9e000d closed
2018-06-05 17:12:10,111 INFO [main-EventThread] zookeeper.ClientCnxn:512 : EventThread shut down
2018-06-05 17:12:10,210 INFO [close-hbase-conn] client.ConnectionManager$HConnectionImplementation:2068 : Closing master protocol: MasterService
2018-06-05 17:12:10,211 INFO [close-hbase-conn] client.ConnectionManager$HConnectionImplementation:1676 : Closing zookeeper sessionid=0x20034776a7c0004
2018-06-05 17:12:10,214 INFO [close-hbase-conn] zookeeper.ZooKeeper:684 : Session: 0x20034776a7c0004 closed
2018-06-05 17:12:10,214 INFO [main-EventThread] zookeeper.ClientCnxn:512 : EventThread shut down
A new Kylin instance is started by root. To stop it, run 'kylin.sh stop'
Check the log at /data/bigdata/kylin/logs/kylin.log
Web UI is at http://:7070/kylin
[root@node2 bin]#
Processes started so far:
\ | node1 | node2 | node3 | node4 | node5 | Notes |
---|---|---|---|---|---|---|
JDK | ✔ | ✔ | ✔ | ✔ | ✔ |  |
QuorumPeerMain | ✔ | ✔ | ✔ |  |  |  |
JournalNode | ✔ | ✔ | ✔ |  |  |  |
NameNode | ✔ | ✔ |  |  |  |  |
DFSZKFailoverController | ✔ | ✔ |  |  |  |  |
DataNode | ✔ | ✔ | ✔ | ✔ | ✔ |  |
NodeManager | ✔ | ✔ | ✔ | ✔ | ✔ |  |
ResourceManager |  |  | ✔ | ✔ |  |  |
HMaster | ✔ |  |  |  |  |  |
HRegionServer | ✔ | ✔ | ✔ |  |  |  |
HistoryServer | ✔ |  |  |  |  |  |
Master | ✔ |  |  |  |  |  |
Worker | ✔ | ✔ | ✔ | ✔ | ✔ |  |
Kafka |  |  |  | ✔ | ✔ |  |
RunJar |  | ✔ |  |  |  | present only while Hive is running |
RunJar | ✔ | ✔ | # Kylin process |
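Once the service is up, open this address in a browser: http://IP:7070/kylin/
User name: ADMIN
Password: KYLIN
9. Configure the Hive data source
1. Configure the data source
(1) Select Model -> Data Source -> Load Hive Table
(2) Enter the Hive table name in the form database_name.table_name,
for example db_hiveTest.student, then click Sync.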
10. Common errors
1. The web UI cannot sync Hive table metadata
Fix, from the Kylin installation directory:
Run: vim ./bin/kylin.sh
and change the script as follows:
export HBASE_CLASSPATH_PREFIX=${tomcat_root}/bin/bootstrap.jar:${tomcat_root}/bin/tomcat-juli.jar:${tomcat_root}/lib/*:$hive_dependency:$HBASE_CLASSPATH_PREFIX
# The change is the addition of $hive_dependency to the path.
XIII. A routine startup, step by step
1. ZooKeeper:
[root@node1 ~]# cd /data/bigdata/
[root@node1 bigdata]# ./zookeeper_all_op.sh start
ZooKeeper JMX enabled by default
Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
node1 zookeeper start done
ZooKeeper JMX enabled by default
Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
node2 zookeeper start done
ZooKeeper JMX enabled by default
Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
node3 zookeeper start done
[root@node1 bigdata]# ./zookeeper_all_op.sh status
ZooKeeper JMX enabled by default
Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
Mode: follower
node1 zookeeper status done
ZooKeeper JMX enabled by default
Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
Mode: leader
node2 zookeeper status done
ZooKeeper JMX enabled by default
Using config: /data/bigdata/src/zookeeper-3.4.12/bin/../conf/zoo.cfg
Mode: follower
node3 zookeeper status done
The ensemble is healthy when exactly one node reports leader and the others report follower.
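The leader/follower check can be automated by counting the "Mode:" lines of the status output. This is a sketch under assumptions: `zk_quorum_ok` is a hypothetical helper, and it expects each node's status to contain a line starting with `Mode: leader` or `Mode: follower`.

```shell
#!/bin/sh
# Hypothetical helper: read ZooKeeper status output on stdin and succeed
# only if exactly one leader and at least one follower are reported.
zk_quorum_ok() {
    awk '/^Mode: leader/   { leaders++ }
         /^Mode: follower/ { followers++ }
         END { exit !(leaders == 1 && followers >= 1) }'
}

# Example (run manually from /data/bigdata):
#   ./zookeeper_all_op.sh status 2>&1 | zk_quorum_ok && echo "quorum OK"
```

A non-zero exit status covers both failure modes: no leader elected yet, or more than one node claiming leadership.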
2、hadoop:
[root@node1 bigdata]# cd /data/bigdata/src/hadoop-2.7.6/sbin/
[root@node1 sbin]# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [node1 node2]
node1: starting namenode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-namenode-node1.out
node2: starting namenode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-namenode-node2.out
node1: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node1.out
node2: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node2.out
node5: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node5.out
node3: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node3.out
node4: starting datanode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-datanode-node4.out
Starting journal nodes [node1 node2 node3]
node1: starting journalnode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-journalnode-node1.out
node2: starting journalnode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-journalnode-node2.out
node3: starting journalnode, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-journalnode-node3.out
Starting ZK Failover Controllers on NN hosts [node1 node2]
node2: starting zkfc, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-zkfc-node2.out
node1: starting zkfc, logging to /data/bigdata/src/hadoop-2.7.6/logs/hadoop-root-zkfc-node1.out
starting yarn daemons
starting resourcemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-resourcemanager-node1.out
node1: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node1.out
node3: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node3.out
node5: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node5.out
node2: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node2.out
node4: starting nodemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-nodemanager-node4.out
[root@node1 sbin]# jps
16418 QuorumPeerMain
18196 Jps
17047 NameNode
17194 DataNode
17709 DFSZKFailoverController
17469 JournalNode
17999 NodeManager
[root@node1 sbin]#
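The start-all.sh output above reports that the script is deprecated in favor of start-dfs.sh and start-yarn.sh. A minimal sketch of the equivalent two-step start follows, assuming the same sbin directory as in the session above. The function name start_hadoop and the variable HADOOP_SBIN are illustrative only and are not part of Hadoop.

```shell
#!/usr/bin/env bash
# Assumption: the sbin directory used in the session above.
HADOOP_SBIN="${HADOOP_SBIN:-/data/bigdata/src/hadoop-2.7.6/sbin}"

# start_hadoop is a hypothetical helper name, not a Hadoop command.
start_hadoop() {
    # Start HDFS first (in this cluster: NameNodes, DataNodes,
    # JournalNodes and ZKFC), then YARN (ResourceManager, NodeManagers).
    # The && skips the YARN start if the HDFS start script fails.
    "$HADOOP_SBIN/start-dfs.sh" && "$HADOOP_SBIN/start-yarn.sh"
}
```

Calling start_hadoop on node1 would then replace the single ./start-all.sh call; the block itself only defines the function and starts nothing.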
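Any daemon that did not start must be started individually on the server where it is supposed to run.
ResourceManager: must be started separately, here on node3 and node4: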
[root@node3 sbin]# ./yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-resourcemanager-node3.out
[root@node3 sbin]# jps
15968 Jps
14264 QuorumPeerMain
14872 NodeManager
14634 DataNode
15723 ResourceManager
14749 JournalNode
[root@node3 sbin]#

[root@node4 sbin]# ./yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /data/bigdata/src/hadoop-2.7.6/logs/yarn-root-resourcemanager-node4.out
[root@node4 sbin]# jps
2995 NodeManager
4004 ResourceManager
4091 Jps
2813 DataNode
[root@node4 sbin]#
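Spotting a daemon that failed to start means comparing the jps output of each node against the daemons expected there. The sketch below shows one way to do that comparison. The helper name missing_daemons is hypothetical (it is not a Hadoop tool), and the expected daemon list is whatever you pass in as arguments.

```shell
#!/usr/bin/env bash
# missing_daemons is a hypothetical helper, not part of Hadoop.
# It reads `jps` output ("<pid> <ClassName>" per line) on stdin and
# prints every expected daemon name (given as arguments) that is absent.
missing_daemons() {
    running=$(awk '{print $2}')   # keep only the class-name column
    for name in "$@"; do
        # -x: whole-line match, -F: fixed string, -q: no output
        printf '%s\n' "$running" | grep -qxF "$name" || printf '%s\n' "$name"
    done
}

# Example using part of the node1 jps output from the start-all.sh run:
printf '%s\n' '16418 QuorumPeerMain' '17047 NameNode' '17194 DataNode' '17999 NodeManager' \
    | missing_daemons NameNode DataNode ResourceManager
# prints: ResourceManager
```

On a live cluster the input could come from something like `ssh node3 jps | missing_daemons DataNode NodeManager ResourceManager`, assuming passwordless SSH between the nodes and that jps is on the PATH of the remote non-interactive shell.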
3. Spark
[root@node1 sbin]# cd /data/bigdata/src/spark-2.1.2-bin-hadoop2.7/sbin/
[root@node1 sbin]# ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-node1.out
node5: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node5.out
node1: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node1.out
node4: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node4.out
node2: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node2.out
node3: starting org.apache.spark.deploy.worker.Worker, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-node3.out
[root@node1 sbin]# ./start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /data/bigdata/spark/logs/spark-root-org.apache.spark.deploy.history.HistoryServer-1-node1.out
[root@node1 sbin]#
4. HBase
[root@node1 ~]# cd /data/bigdata/src/hbase-1.2.6/bin/
[root@node1 bin]# ./start-hbase.sh
starting master, logging to /data/bigdata/hbase/logs/hbase-root-master-node1.out
node3: starting regionserver, logging to /data/bigdata/hbase/logs/hbase-root-regionserver-node3.out
node2: starting regionserver, logging to /data/bigdata/hbase/logs/hbase-root-regionserver-node2.out
node1: starting regionserver, logging to /data/bigdata/hbase/logs/hbase-root-regionserver-node1.out
[root@node1 bin]#