
Hadoop 2.7.4 + ZooKeeper 3.4.10 Setup in HA Mode


I. Overview

This experiment uses VMware virtual machines running CentOS 7.

Since the five machines needed for the experiment are configured almost identically, one machine is configured first and then cloned four times, after which each clone gets its machine-specific changes.

Some of the steps have been covered before, so they are only mentioned here without detailed instructions; see the earlier blog posts for the details.


II. Lab Environment

1. Disable SELinux and the firewall (a command sketch follows this list)

2. Software: hadoop-2.7.4.tar.gz; zookeeper-3.4.10.tar.gz; jdk-8u131-linux-x64.tar.gz
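
The original post does not show the commands for step 1; a minimal sketch for CentOS 7 (assuming the stock firewalld and SELinux setup) is:

# turn SELinux off for the current boot and persistently
[root@hadoop1 ~]# setenforce 0
[root@hadoop1 ~]# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# stop firewalld and keep it from starting on boot
[root@hadoop1 ~]# systemctl stop firewalld
[root@hadoop1 ~]# systemctl disable firewalld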


III. Host Plan

IP              Host     Processes
192.168.100.11  hadoop1  NameNode, ResourceManager, DFSZKFailoverController
192.168.100.12  hadoop2  NameNode, ResourceManager, DFSZKFailoverController
192.168.100.13  hadoop3  DataNode, NodeManager, JournalNode, QuorumPeerMain
192.168.100.14  hadoop4  DataNode, NodeManager, JournalNode, QuorumPeerMain
192.168.100.15  hadoop5  DataNode, NodeManager, JournalNode, QuorumPeerMain


IV. Environment Preparation

1. Set the IP address: 192.168.100.11

2. Set the hostname: hadoop1 (one way to do steps 1 and 2 is sketched below)
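
The exact commands for steps 1 and 2 are not shown in the original post; one possible way on CentOS 7, assuming the NIC/connection is named ens33 and the VMware NAT gateway is 192.168.100.2 (both assumptions, adjust to your VM), is:

# assign a static IP to the (assumed) connection ens33
[root@hadoop1 ~]# nmcli connection modify ens33 ipv4.method manual ipv4.addresses 192.168.100.11/24 ipv4.gateway 192.168.100.2
[root@hadoop1 ~]# nmcli connection up ens33
# set the hostname
[root@hadoop1 ~]# hostnamectl set-hostname hadoop1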

3. Map the IP addresses to hostnames

[root@hadoop1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.100.11 hadoop1
192.168.100.12 hadoop2
192.168.100.13 hadoop3
192.168.100.14 hadoop4
192.168.100.15 hadoop5

4. Set up the SSH key-distribution script (a minimal sketch of the key generation follows)
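
The distribution script itself is not shown in the original; at minimum a password-less root key pair is needed on hadoop1 (the same /root/.ssh/id_rsa later referenced by dfs.ha.fencing.ssh.private-key-files), for example:

# generate an RSA key pair with an empty passphrase
[root@hadoop1 ~]# ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa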

5. Unpack the JDK

[root@hadoop1 ~]# tar -zxf jdk-8u131-linux-x64.tar.gz
[root@hadoop1 ~]# cp -r jdk1.8.0_131/ /usr/local/jdk

6. Unpack Hadoop

[root@hadoop1 ~]# tar -zxf hadoop-2.7.4.tar.gz
[root@hadoop1 ~]# cp -r hadoop-2.7.4 /usr/local/hadoop

7. Unpack ZooKeeper

[root@hadoop1 ~]# tar -zxf zookeeper-3.4.10.tar.gz
[root@hadoop1 ~]# cp -r zookeeper-3.4.10 /usr/local/hadoop/zookeeper
[root@hadoop1 ~]# cd /usr/local/hadoop/zookeeper/conf/
[root@hadoop1 conf]# cp zoo_sample.cfg zoo.cfg
[root@hadoop1 conf]# vim zoo.cfg
# change dataDir
dataDir=/usr/local/hadoop/zookeeper/data
# add the following three lines
server.1=hadoop3:2888:3888
server.2=hadoop4:2888:3888
server.3=hadoop5:2888:3888
[root@hadoop1 conf]# cd ..
[root@hadoop1 zookeeper]# mkdir data
# one more step (writing myid) is still needed, but ZooKeeper does not run on hadoop1, so it is done later on hadoop3-hadoop5

8. Configure environment variables

[root@hadoop1 ~]# tail -4 /etc/profile
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export ZOOKEEPER_HOME=/usr/local/hadoop/zookeeper
export PATH=.:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
[root@hadoop1 ~]# source /etc/profile

9. Verify that the environment variables work

[root@hadoop1 ~]# java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
[root@hadoop1 ~]# hadoop version
Hadoop 2.7.4
Subversion Unknown -r Unknown
Compiled by root on 2017-08-28T09:30Z
Compiled with protoc 2.5.0
From source with checksum 50b0468318b4ce9bd24dc467b7ce1148
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.4.jar


V. Configure Hadoop

1. core-site.xml

<configuration>
    <property><name>fs.defaultFS</name><value>hdfs://master/</value></property>
    <property><name>hadoop.tmp.dir</name><value>/usr/local/hadoop/tmp</value></property>
    <property><name>ha.zookeeper.quorum</name><value>hadoop3:2181,hadoop4:2181,hadoop5:2181</value></property>
</configuration>

2. hdfs-site.xml

<configuration>
    <property><name>dfs.namenode.name.dir</name><value>/usr/local/hadoop/dfs/name</value></property>
    <property><name>dfs.datanode.data.dir</name><value>/usr/local/hadoop/dfs/data</value></property>
    <property><name>dfs.replication</name><value>2</value></property>
    <property><name>dfs.nameservices</name><value>master</value></property>
    <property><name>dfs.ha.namenodes.master</name><value>nn1,nn2</value></property>
    <property><name>dfs.namenode.rpc-address.master.nn1</name><value>hadoop1:9000</value></property>
    <property><name>dfs.namenode.rpc-address.master.nn2</name><value>hadoop2:9000</value></property>
    <property><name>dfs.namenode.http-address.master.nn1</name><value>hadoop1:50070</value></property>
    <property><name>dfs.namenode.http-address.master.nn2</name><value>hadoop2:50070</value></property>
    <property><name>dfs.journalnode.http-address</name><value>0.0.0.0:8480</value></property>
    <property><name>dfs.journalnode.rpc-address</name><value>0.0.0.0:8485</value></property>
    <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://hadoop3:8485;hadoop4:8485;hadoop5:8485/master</value></property>
    <property><name>dfs.journalnode.edits.dir</name><value>/usr/local/hadoop/dfs/journal</value></property>
    <property><name>dfs.client.failover.proxy.provider.master</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
    <property><name>dfs.ha.fencing.methods</name><value>sshfence
        shell(/bin/true)</value></property>
    <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/root/.ssh/id_rsa</value></property>
    <property><name>dfs.ha.fencing.ssh.connect-timeout</name><value>30000</value></property>
    <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
    <property><name>ha.zookeeper.quorum</name><value>hadoop3:2181,hadoop4:2181,hadoop5:2181</value></property>
    <property><name>ha.zookeeper.session-timeout.ms</name><value>2000</value></property>
</configuration>

3. yarn-site.xml

<configuration>
    <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
    <property><name>yarn.resourcemanager.connect.retry-interval.ms</name><value>2000</value></property>
    <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
    <property><name>yarn.resourcemanager.cluster-id</name><value>yrc</value></property>
    <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
    <property><name>yarn.resourcemanager.hostname.rm1</name><value>hadoop1</value></property>
    <property><name>yarn.resourcemanager.hostname.rm2</name><value>hadoop2</value></property>
    <property><name>yarn.resourcemanager.ha.automatic-failover.enabled</name><value>true</value></property>
    <property><name>yarn.resourcemanager.recovery.enabled</name><value>true</value></property>
    <property><name>yarn.resourcemanager.store.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value></property>
    <property><name>yarn.resourcemanager.zk-address</name><value>hadoop3:2181,hadoop4:2181,hadoop5:2181</value></property>
    <property><name>yarn.resourcemanager.scheduler.address.rm1</name><value>hadoop1:8030</value></property>
    <property><name>yarn.resourcemanager.scheduler.address.rm2</name><value>hadoop2:8030</value></property>
    <property><name>yarn.resourcemanager.resource-tracker.address.rm1</name><value>hadoop1:8031</value></property>
    <property><name>yarn.resourcemanager.resource-tracker.address.rm2</name><value>hadoop2:8031</value></property>
    <property><name>yarn.resourcemanager.address.rm1</name><value>hadoop1:8032</value></property>
    <property><name>yarn.resourcemanager.address.rm2</name><value>hadoop2:8032</value></property>
    <property><name>yarn.resourcemanager.admin.address.rm1</name><value>hadoop1:8033</value></property>
    <property><name>yarn.resourcemanager.admin.address.rm2</name><value>hadoop2:8033</value></property>
    <property><name>yarn.resourcemanager.webapp.address.rm1</name><value>hadoop1:8088</value></property>
    <property><name>yarn.resourcemanager.webapp.address.rm2</name><value>hadoop2:8088</value></property>
</configuration>

4. mapred-site.xml

<configuration>
    <property><name>mapreduce.framework.name</name><value>yarn</value></property>
    <property><name>mapreduce.jobhistory.address</name><value>hadoop1:10020</value></property>
    <property><name>mapreduce.jobhistory.webapp.address</name><value>hadoop1:19888</value></property>
</configuration>

5. slaves

[root@hadoop1 hadoop]# cat slaves
hadoop3
hadoop4
hadoop5

6. hadoop-env.sh

export JAVA_HOME=/usr/local/jdk    # append this line at the end
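
If you prefer a one-liner instead of editing the file, something like this works (assuming the default Hadoop 2.x config directory under /usr/local/hadoop):

[root@hadoop1 ~]# echo 'export JAVA_HOME=/usr/local/jdk' >> /usr/local/hadoop/etc/hadoop/hadoop-env.sh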


VI. Clone the Virtual Machines

1. Using hadoop1 as the template, clone four virtual machines and regenerate each clone's NIC MAC address

2. Change the hostnames to hadoop2 through hadoop5

3. Change the IP addresses

4. Set up passwordless SSH login between all machines (distribute the SSH public keys; see the sketch after this list)
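
One way to distribute the keys (a sketch; it assumes the key pair generated in section IV exists on every clone and that the root password is typed interactively for each ssh-copy-id) is to run this on every host:

# push the local public key to every node, including the local one
[root@hadoop1 ~]# for h in hadoop1 hadoop2 hadoop3 hadoop4 hadoop5; do ssh-copy-id -i /root/.ssh/id_rsa.pub root@$h; done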


VII. Configure ZooKeeper

[root@hadoop3 ~]# echo 1 > /usr/local/hadoop/zookeeper/data/myid    # on hadoop3
[root@hadoop4 ~]# echo 2 > /usr/local/hadoop/zookeeper/data/myid    # on hadoop4
[root@hadoop5 ~]# echo 3 > /usr/local/hadoop/zookeeper/data/myid    # on hadoop5


VIII. Start the Cluster

1. Start ZooKeeper on hadoop3 through hadoop5

[root@hadoop3 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/hadoop/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hadoop3 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/hadoop/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[root@hadoop3 ~]# jps
2184 QuorumPeerMain
2237 Jps
# do the same on hadoop4 and hadoop5

2. Format the ZooKeeper cluster from hadoop1

[root@hadoop1 ~]# hdfs zkfc -formatZK
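
To confirm the format worked, one option (a sketch, assuming zkCli.sh from the ZooKeeper install is on the PATH) is to look for the HA parent znode:

[root@hadoop1 ~]# zkCli.sh -server hadoop3:2181
# inside the interactive shell, "ls /" should now list a hadoop-ha znode next to the zookeeper znode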

3. Start the JournalNodes on hadoop3 through hadoop5

[root@hadoop3 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-hadoop3.out
[root@hadoop3 ~]# jps
2244 JournalNode
2293 Jps
2188 QuorumPeerMain

4. Format the NameNode on hadoop1

[root@hadoop1 ~]# hdfs namenode -format
...
17/08/29 22:53:30 INFO util.ExitUtil: Exiting with status 0
17/08/29 22:53:30 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop1/192.168.100.11
************************************************************/

5. Start the freshly formatted NameNode on hadoop1

[root@hadoop1 ~]# hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-hadoop1.out
[root@hadoop1 ~]# jps
2422 Jps
2349 NameNode

6. On hadoop2, sync the metadata from nn1 (hadoop1) to nn2 (hadoop2)

[root@hadoop2 ~]# hdfs namenode -bootstrapStandby
...
17/08/29 22:55:45 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
17/08/29 22:55:45 INFO namenode.TransferFsImage: Transfer took 0.00s at 0.00 KB/s
17/08/29 22:55:45 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 321 bytes.
17/08/29 22:55:45 INFO util.ExitUtil: Exiting with status 0
17/08/29 22:55:45 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop2/192.168.100.12
************************************************************/

7. Start the NameNode on hadoop2

[root@hadoop2 ~]# hadoop-daemon.sh start namenode

8. Start all the remaining cluster services

[root@hadoop1 ~]# start-all.sh
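
At this point it can be worth confirming the NameNode HA states with the HA admin tool (nn1/nn2 as defined in hdfs-site.xml; which of the two comes up active can vary):

[root@hadoop1 ~]# hdfs haadmin -getServiceState nn1
[root@hadoop1 ~]# hdfs haadmin -getServiceState nn2
# one should report "active", the other "standby"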

9. Start YARN's second ResourceManager on hadoop2

[root@hadoop2 ~]# yarn-daemon.sh start resourcemanager
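
The ResourceManager HA state can be checked the same way (rm1/rm2 as defined in yarn-site.xml):

[root@hadoop1 ~]# yarn rmadmin -getServiceState rm1
[root@hadoop1 ~]# yarn rmadmin -getServiceState rm2
# again, one "active" and one "standby"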

10. Start the JobHistory server

[root@hadoop1 ~]# mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /usr/local/hadoop/logs/mapred-root-historyserver-hadoop1.out
[root@hadoop1 ~]# jps
3026 DFSZKFailoverController
3110 ResourceManager
3894 JobHistoryServer
3927 Jps
2446 NameNode

11. Check the processes on each node

[root@hadoop3 ~]# jps
2480 DataNode
2722 Jps
2219 JournalNode
2174 QuorumPeerMain
2606 NodeManager
[root@hadoop4 ~]# jps
2608 NodeManager
2178 QuorumPeerMain
2482 DataNode
2724 Jps
2229 JournalNode
[root@hadoop5 ~]# jps
2178 QuorumPeerMain
2601 NodeManager
2475 DataNode
2717 Jps
2223 JournalNode


IX. Testing

1. Connect to the two NameNode web UIs (hadoop1:50070 and hadoop2:50070) and check which NameNode is currently active

2. Kill the NameNode on hadoop2

[root@hadoop2 ~]# jps
2742 NameNode
3016 DFSZKFailoverController
4024 JobHistoryServer
4057 Jps
3133 ResourceManager
[root@hadoop2 ~]# kill -9 2742
[root@hadoop2 ~]# jps
3016 DFSZKFailoverController
3133 ResourceManager
4205 Jps
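
To verify that automatic failover actually happened, a quick check (a sketch using the same HA admin command as above) should now show the surviving NameNode on hadoop1 as active:

[root@hadoop1 ~]# hdfs haadmin -getServiceState nn1
active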


