
How to install Hadoop 2.7.3 cluster on CentOS 7.3

Published: 2025-01-23 | Author: 千家信息网 editorial team
##############################
# ENV
# spark01 192.168.51.6
# spark02 192.168.51.18
# spark03 192.168.51.19
# spark04 192.168.51.21
# spark05 192.168.51.24
##############################

## Raise the open-file and process limits on every node
echo "ulimit -SHn 204800" >> /etc/rc.local
echo "ulimit -SHu 204800" >> /etc/rc.local
cat >> /etc/security/limits.conf << EOF
*          soft   nofile    204800
*          hard   nofile    204800
*          soft   nproc     204800
*          hard   nproc     204800
EOF

## Disable IPv6 and tune kernel parameters on every node
echo 'net.ipv6.conf.all.disable_ipv6 = 1' >> /etc/sysctl.conf
echo 'net.ipv6.conf.default.disable_ipv6 = 1' >> /etc/sysctl.conf
echo 'vm.swappiness = 0' >> /etc/sysctl.conf
sysctl -p
echo 'echo never > /sys/kernel/mm/transparent_hugepage/defrag' >> /etc/rc.local
chmod +x /etc/rc.d/rc.local

#1) Edit the /etc/hosts file on every node
cat > /etc/hosts << EOF
192.168.51.6   spark01
192.168.51.18  spark02
192.168.51.19  spark03
192.168.51.21  spark04
192.168.51.24  spark05
EOF

#2) Create the hadoop user on every node
useradd hadoop
passwd hadoop

#3) Give the hadoop user sudo rights on every node
echo 'hadoop ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers

#4) Set the ownership of the /opt directory on every node
chown -R hadoop.hadoop /opt/

#5) Set up key-based (passwordless) login:
# just do it on spark01
su - hadoop
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@spark01
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@spark02
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@spark03
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@spark04
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@spark05

#6) Install Hadoop on spark01 and propagate /opt/hadoop-2.7.3 to the other nodes:
cd /home/tools
sudo wget http://god.nongdingbang.net/downloads/hadoop-2.7.3.tar.gz
sudo tar zxvf hadoop-2.7.3.tar.gz -C /opt/
sudo chown -R hadoop.hadoop /opt/hadoop-2.7.3
scp -r /opt/hadoop-2.7.3 hadoop@spark02:/opt
scp -r /opt/hadoop-2.7.3 hadoop@spark03:/opt
scp -r /opt/hadoop-2.7.3 hadoop@spark04:/opt
scp -r /opt/hadoop-2.7.3 hadoop@spark05:/opt

#7) Set the Hadoop environment variables on every node
sudo su -
cat > /etc/profile.d/hadoop.sh << 'EOF'
export HADOOP_HOME=/opt/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF

#8) Edit /opt/hadoop-2.7.3/etc/hadoop/core-site.xml on every node
##############################################################
cat > /opt/hadoop-2.7.3/etc/hadoop/core-site.xml << EOF
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://spark01:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.7.3/tmp/</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
EOF

#9) Create the HDFS data directories on every node and change their ownership
mkdir -p /opt/storage/{datanode,namenode}
chown -R hadoop.hadoop /opt/storage

#10) Edit /opt/hadoop-2.7.3/etc/hadoop/hdfs-site.xml on every node - set up the NameNode and DataNodes:
##############################################################
cat > /opt/hadoop-2.7.3/etc/hadoop/hdfs-site.xml << EOF
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/storage/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/storage/namenode</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>spark01:50090</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>spark01:50070</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
EOF

#11) Edit /opt/hadoop-2.7.3/etc/hadoop/mapred-site.xml on spark01
##############################################################
cat > /opt/hadoop-2.7.3/etc/hadoop/mapred-site.xml << EOF
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>spark01:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>spark01:19888</value>
  </property>
</configuration>
EOF

#12) Set up the ResourceManager on spark01 and the NodeManagers on spark02-05
##############################################################
cat > /opt/hadoop-2.7.3/etc/hadoop/yarn-site.xml << EOF
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>spark01</value>
  </property>
  <property>
    <name>yarn.nodemanager.hostname.nm1</name>
    <value>spark02</value>
  </property>
  <property>
    <name>yarn.nodemanager.hostname.nm2</name>
    <value>spark03</value>
  </property>
  <property>
    <name>yarn.nodemanager.hostname.nm3</name>
    <value>spark04</value>
  </property>
  <property>
    <name>yarn.nodemanager.hostname.nm4</name>
    <value>spark05</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

#13) Edit /opt/hadoop-2.7.3/etc/hadoop/slaves on spark01
## (so that the master can start all necessary services on the slaves automatically):
##############################################################
cat > /opt/hadoop-2.7.3/etc/hadoop/slaves << EOF
spark02
spark03
spark04
spark05
EOF
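A minimal sketch of the usual wrap-up, not part of the numbered steps above: it assumes a JDK is already installed and JAVA_HOME is set in /opt/hadoop-2.7.3/etc/hadoop/hadoop-env.sh on every node. The NameNode is formatted once on spark01, then HDFS, YARN and the MapReduce JobHistory server are started:

# On spark01, as the hadoop user
su - hadoop
hdfs namenode -format                          # format HDFS once, before the first start
start-dfs.sh                                   # NameNode/SecondaryNameNode on spark01, DataNodes on the slaves
start-yarn.sh                                  # ResourceManager on spark01, NodeManagers on the slaves
mr-jobhistory-daemon.sh start historyserver    # serves job history on spark01:19888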
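To check the result, the commands below are a reasonable smoke test; the example jar path is the one shipped inside the Hadoop 2.7.3 binary tarball:

jps                        # expect NameNode, SecondaryNameNode, ResourceManager on spark01;
                           # DataNode and NodeManager on spark02-spark05
hdfs dfsadmin -report      # the DataNodes listed in the slaves file should show up as live
# Run the bundled pi example to exercise MapReduce on YARN
hadoop jar /opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 5 10
# Web UIs: http://spark01:50070 (HDFS NameNode) and http://spark01:8088 (YARN ResourceManager)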

