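How to Deploy a Hadoop Environment
This article walks through deploying a Hadoop (CDH) environment with Cloudera Manager, from preparing the hosts to installing the cluster and, if needed, uninstalling it.
Preparation
Perform the following steps on every node.
1.1 Set the hostname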
vi /etc/sysconfig/network
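1.2 Disable SELinux
Check the SELinux status with getenforce.
If SELinux is not disabled, disable it as follows:
vi /etc/selinux/config
Set SELINUX=disabled. The change takes effect after a reboot; you can reboot the host once all the remaining settings are done.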
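The manual edit above can also be scripted. The following is a minimal sketch, assuming GNU sed; set_selinux_disabled is a hypothetical helper name, and the demo runs against a scratch file rather than the real /etc/selinux/config.

```shell
# Hypothetical helper: rewrite the SELINUX= line of a config file in place.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is left alone.
set_selinux_disabled() {
    local cfg="$1"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}

# Demonstrate on a scratch copy so nothing on the system changes.
demo_cfg="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_cfg"
set_selinux_disabled "$demo_cfg"
cat "$demo_cfg"
```

On a real node the call would be set_selinux_disabled /etc/selinux/config, run as root. Running setenforce 0 additionally switches the running system to permissive mode until the reboot.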
1.3 Disable the firewall
service iptables stop
chkconfig iptables off
chkconfig iptables --list
1.4 Network configuration
vim /etc/sysconfig/network-scripts/ifcfg-eth0
1.5 Edit /etc/hosts
127.0.0.1 localhost # required
# CDH Cluster
192.168.88.11 hadoop1
192.168.88.12 hadoop2
192.168.88.13 hadoop3
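Because the same entries have to exist on every node, it can help to add them with a script that is safe to run more than once. This is a sketch only: add_host_entry is a hypothetical helper, and the demo writes to a scratch file instead of /etc/hosts.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless that exact
# mapping is already present, so repeated runs do not create duplicates.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    if ! awk -v ip="$ip" -v n="$name" \
            '$1 == ip && $2 == n { found = 1 } END { exit !found }' "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Demo on a scratch file; the loop runs twice to show idempotence.
demo_hosts="$(mktemp)"
printf '127.0.0.1 localhost\n' > "$demo_hosts"
for pass in 1 2; do
    add_host_entry "$demo_hosts" 192.168.88.11 hadoop1
    add_host_entry "$demo_hosts" 192.168.88.12 hadoop2
    add_host_entry "$demo_hosts" 192.168.88.13 hadoop3
done
cat "$demo_hosts"
```

Pointing the helper at /etc/hosts (as root) on each node would apply the same three mappings used in this article.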
1.6 Configure passwordless SSH login from hadoop1 to hadoop2
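The article does not list the commands for this step. One common way to do it, shown here as a sketch under the assumption that the OpenSSH client tools (ssh-keygen, ssh-copy-id) are installed, is to generate a key pair on hadoop1 and copy the public key to hadoop2. The function name setup_passwordless_ssh and the choice of the root user are illustrative assumptions.

```shell
# Hypothetical helper, meant to be run on hadoop1.
setup_passwordless_ssh() {
    local user="$1" host="$2"
    local key="$HOME/.ssh/id_rsa"
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    # Create a key pair with an empty passphrase only if none exists yet.
    [ -f "$key" ] || ssh-keygen -t rsa -N '' -f "$key"
    # Append the public key to the remote authorized_keys
    # (this asks for the remote password one last time).
    ssh-copy-id -i "${key}.pub" "${user}@${host}"
    # BatchMode makes ssh fail instead of prompting, so a successful
    # run here confirms that key-based login works.
    ssh -o BatchMode=yes "${user}@${host}" hostname
}

# Example call (not executed here): setup_passwordless_ssh root hadoop2
```

The block only defines the function; the commented call at the end shows how it would be invoked for the hadoop1 to hadoop2 case described above.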
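1.7 Configure the NTP service on all nodes
All hosts in the cluster must keep their clocks synchronized; a large time difference between hosts causes a variety of problems. The approach is as follows:
The master node acts as the NTP server and synchronizes with an external time source, then provides time synchronization to all datanode nodes. All datanode nodes synchronize against the master node.
Install the package on all nodes: yum install ntp
Then enable it at boot: chkconfig ntpd on
Check that the setting took effect: chkconfig --list ntpd (runlevels 2-5 showing "on" means success).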
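The "runlevels 2-5 are on" check can be automated by parsing the chkconfig output. A minimal sketch follows; runlevels_2_to_5_on is a hypothetical helper, and the sample line stands in for real chkconfig --list ntpd output so the demo does not depend on chkconfig being installed.

```shell
# Hypothetical helper: read one `chkconfig --list <service>` line on stdin
# and succeed only if runlevels 2, 3, 4 and 5 are all "on".
runlevels_2_to_5_on() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^[2-5]:on$/) n++ }
         END { exit (n == 4) ? 0 : 1 }'
}

# Sample line in the tab-separated format that chkconfig prints.
sample="$(printf 'ntpd\t0:off\t1:off\t2:on\t3:on\t4:on\t5:on\t6:off')"
if printf '%s\n' "$sample" | runlevels_2_to_5_on; then
    echo "ntpd is enabled at boot"
else
    echo "ntpd is NOT enabled at boot"
fi
```

On a node, the real check would be chkconfig --list ntpd | runlevels_2_to_5_on.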
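Master node configuration
Before configuring, synchronize the time manually once with ntpdate, so that the local clock is not so far from the time source that ntpd cannot synchronize normally. Here 202.112.10.36 is used as the time source:
ntpdate -u 202.112.10.36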
vi /etc/ntp.conf
# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).
driftfile /var/lib/ntp/drift
# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
# Permit all access over the loopback interface. This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict -6 ::1
# Hosts on local network are less restricted.
# Allow other machines on the internal network to synchronize time
restrict 192.168.88.0 mask 255.255.255.0 nomodify notrap
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
# Most active time servers for China: http://www.pool.ntp.org/zone/cn
server 210.72.145.44 prefer # National Time Service Center of China
server 202.112.10.36 # 1.cn.pool.ntp.org
server 59.124.196.83 # 0.asia.pool.ntp.org
#broadcast 192.168.1.255 autokey # broadcast server
#broadcastclient # broadcast client
#broadcast 224.0.1.1 autokey # multicast server
#multicastclient 224.0.1.1 # multicast client
#manycastserver 239.255.254.254 # manycast server
#manycastclient 239.255.254.254 autokey # manycast client
# Allow the upstream time servers to adjust the time on this host
restrict 210.72.145.44 nomodify notrap noquery
restrict 202.112.10.36 nomodify notrap noquery
restrict 59.124.196.83 nomodify notrap noquery
# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.
# When the external time servers are unavailable, serve time from the local clock
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
# Enable public key cryptography.
#crypto
includefile /etc/ntp/crypto/pw
# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys
# Specify the key identifiers which are trusted.
#trustedkey 4 8 42
# Specify the key identifier to use with the ntpdc utility.
#requestkey 8
# Specify the key identifier to use with the ntpq utility.
#controlkey 8
# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats
service ntpd start
ntpstat
It usually takes 5-10 minutes before the connection and synchronization succeed.
[root@hadoop1 ~]# netstat -tlunp | grep ntp
udp 0 0 192.168.88.11:123 0.0.0.0:* 17339/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 17339/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 17339/ntpd
udp 0 0 fe80::20c:29ff:fe7c:123 :::* 17339/ntpd
udp 0 0 ::1:123 :::* 17339/ntpd
udp 0 0 :::123 :::* 17339/ntpd
[root@hadoop1 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 202.118.1.130   .INIT.          16 u    -   64    0    0.000    0.000   0.000
# ntpstat
unsynchronised
time server re-starting
polling server every 64 s
After connecting and synchronizing:
synchronised to NTP server (202.112.10.36) at stratum 3
time correct to within 275 ms
polling server every 256 s
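Since synchronization can take several minutes, a script may prefer to poll rather than check once. The sketch below relies on the documented ntpstat exit status (0 when synchronised, non-zero otherwise); wait_for_ntp_sync and its default of 20 checks at 30-second intervals are assumptions chosen to cover roughly the 5-10 minute window mentioned above.

```shell
# Hypothetical helper: poll ntpstat until the clock is synchronised.
# $1 = maximum number of checks, $2 = seconds to wait between checks.
wait_for_ntp_sync() {
    local attempts="${1:-20}" delay="${2:-30}" i=1
    while [ "$i" -le "$attempts" ]; do
        # ntpstat exits 0 only when the clock is synchronised.
        if ntpstat >/dev/null 2>&1; then
            echo "synchronised after $i check(s)"
            return 0
        fi
        # Do not sleep after the final failed check.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "still unsynchronised after $attempts check(s)" >&2
    return 1
}

# Example call (not executed here): wait_for_ntp_sync 20 30
```

The block only defines the function, so it is safe to paste into a setup script and call after service ntpd start.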
Configuration on the other nodes
# yum install ntp
# chkconfig ntpd on
# vim /etc/ntp.conf
driftfile /var/lib/ntp/drift
restrict 127.0.0.1
restrict -6 ::1
# Use the cluster's own time server (the master node)
server 192.168.88.11
restrict 192.168.88.11 nomodify notrap noquery
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
[root@hadoop2 soft]# ntpdate -u hadoop1
2. Install Cloudera (all nodes)
2.1 Download cloudera-manager.repo
wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo
Copy the cloudera-manager.repo file to the /etc/yum.repos.d/ directory on every node:
mv cloudera-manager.repo /etc/yum.repos.d/
vi /etc/yum.conf
timeout=50000
yum list|grep cloudera
If the version listed is not the one you intend to install, run the following commands and try again:
yum clean all
yum list | grep cloudera
2.2 Download CDH
Download the three Parcel files below and copy them to the /opt/cloudera/parcel-repo directory (create the directory if it does not exist). Rename the .sha1 file so that its suffix is .sha, and keep only the hash itself in its contents.
wget http://archive-primary.cloudera.com/cdh6/parcels/5.2.1/CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel
wget http://archive-primary.cloudera.com/cdh6/parcels/5.2.1/CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha1
wget http://archive-primary.cloudera.com/cdh6/parcels/5.2.1/manifest.json
2.4 On the master node [hadoop1], install daemons, server, and agent (daemons first)
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-daemons-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-server-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-agent-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-daemons-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-server-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-agent-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
Note: installing the agent requires Internet access. The file names passed to yum must match the files you actually downloaded (el5 or el6 build, depending on your OS release).
2.5 On the slave-1 [hadoop2] and slave-2 [hadoop3] nodes, install daemons and agent (daemons first)
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-daemons-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-agent-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-daemons-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-agent-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
Note: installing the agent requires Internet access.
2.6 Install the Oracle JDK on the master, slave-1, and slave-2 nodes
rpm -ivh jdk-6u31-linux-amd64.rpm
3. Install MySQL on the master node and configure the database that CDH needs
yum install mysql-server mysql mysql-devel
chkconfig mysqld on
service mysqld start
mysql -u root
use mysql;
update user set password=password('1234') where user='root';
update user set password=password('1234') where host='localhost';
update user set password=password('1234') where host='hadoop1';
service mysqld restart
mysql -u root -p1234
create database cloudera;
4. On the master node, configure the Cloudera Manager database and start the CM server and agent
1. Copy mysql-connector-java-5.1.7-bin.jar to /usr/share/java and rename it to mysql-connector-java.jar
2. Run /usr/share/cmf/schema/scm_prepare_database.sh -h hadoop1 mysql cloudera root 1234
3. Start the CM server: service cloudera-scm-server start
4. Enable the CM server at boot: chkconfig cloudera-scm-server on
5. Enable the CM agent at boot: chkconfig cloudera-scm-agent on
6. Start the CM agent: service cloudera-scm-agent start
5. Edit the agent configuration file on all nodes
In /etc/cloudera-scm-agent/config.ini, change server_host to the hostname of the master node (hadoop1 in this cluster).
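Editing the same key on every node is easy to script. This is a minimal sketch assuming GNU sed and a server_host= line with no surrounding spaces, as in the stock agent config; set_agent_server_host is a hypothetical helper, and the demo edits a scratch file.

```shell
# Hypothetical helper: point the agent at the given Cloudera Manager host.
set_agent_server_host() {
    local ini="$1" master="$2"
    sed -i "s/^server_host=.*/server_host=${master}/" "$ini"
}

# Demo on a scratch file that mimics the relevant part of config.ini.
demo_ini="$(mktemp)"
printf '[General]\nserver_host=localhost\nserver_port=7182\n' > "$demo_ini"
set_agent_server_host "$demo_ini" hadoop1
cat "$demo_ini"
```

On a real node the call would be set_agent_server_host /etc/cloudera-scm-agent/config.ini hadoop1, followed by a restart of the agent so it picks up the change.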
6. Configure the Cloudera Manager agent on the slave nodes
1. Enable the CM agent at boot: chkconfig cloudera-scm-agent on
2. Start the CM agent: service cloudera-scm-agent start
7. Check that the agent and server can communicate
service cloudera-scm-server status
service cloudera-scm-agent status
netstat -anp | grep 7182
# The server listens on port 7182 for communication with the agents
If startup fails, check the logs:
Server log: /var/log/cloudera-scm-server
Agent log: /var/log/cloudera-scm-agent
8. Set up the parcel [master]
mv CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel /opt/cloudera/parcel-repo
[root@hadoop1 parcel-repo]# tail -5 manifest.json
"replaces": "IMPALA, SOLR, SPARK",
"hash": "7dcb31e557a7da951bfb6337e02b0b884aa3d2a2\n"
}
]
[root@hadoop1 parcel-repo]# tail -1 CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha1
7dcb31e557a7da951bfb6337e02b0b884aa3d2a2\n
[root@hadoop1 parcel-repo]# mv CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha1 CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha
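The rename above only helps if the hash really matches the parcel. The following sketch combines both steps: it checks the downloaded parcel against its .sha1 file with sha1sum and writes a .sha file that contains nothing but the hash. make_parcel_sha is a hypothetical helper, and the demo uses a dummy file in a scratch directory instead of a real parcel.

```shell
# Hypothetical helper: verify PARCEL against PARCEL.sha1 and, on success,
# write PARCEL.sha containing only the hex digest.
make_parcel_sha() {
    local parcel="$1" expected actual
    # Keep only the first whitespace-delimited field (the digest).
    expected="$(awk 'NR == 1 { print $1 }' "${parcel}.sha1")"
    actual="$(sha1sum "$parcel" | awk '{ print $1 }')"
    if [ "$expected" != "$actual" ]; then
        echo "hash mismatch for $parcel" >&2
        return 1
    fi
    printf '%s\n' "$expected" > "${parcel}.sha"
}

# Demo with a dummy "parcel" so nothing has to be downloaded.
work="$(mktemp -d)"
printf 'dummy parcel payload\n' > "$work/demo.parcel"
sha1sum "$work/demo.parcel" > "$work/demo.parcel.sha1"   # "<hash>  <path>"
make_parcel_sha "$work/demo.parcel" && cat "$work/demo.parcel.sha"
```

A mismatch usually means the download was truncated, in which case fetching the parcel again is safer than forcing the hash file to agree.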
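9. Install the JDK on all nodes
[root@hadoop1 soft]# rpm -ivh oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
CDH cluster installation
Once CM is installed, open http://ip:7180 in a browser, where ip is the IP address or hostname of the host running CM. On the login page, enter admin as both the user name and the password to reach the web management console.
Choose the free edition -> Continue -> search for and select the machines on which to install CDH (192.168.88.[11-13]) and click "Continue" ->
II. Uninstallation steps
This section records the uninstallation process and the problems encountered. The existing environment is Cloudera Manager plus a CDH cluster of (1 + 2) nodes.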
1. Remove all services from the Manager admin console first.
2. Remove the Manager Server
Run on the Manager node:
/usr/share/cmf/uninstall-cloudera-manager.sh
If that script does not exist, remove the server manually. Stop the services first:
service cloudera-scm-server stop
service cloudera-scm-server-db stop
Then remove the packages:
yum remove cloudera-manager-server
sudo yum remove cloudera-manager-server-db
3. Remove the CDH services on all CDH nodes. Stop the service first:
service cloudera-scm-agent hard_stop
Then uninstall the installed software:
yum remove 'cloudera-manager-*' hadoop hue-common 'bigtop-*'
4. Delete leftover data:
rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera*
5. Kill all Manager and Hadoop processes (optional; not needed if you stopped Cloudera Manager and all services correctly)
$ for u in hdfs mapred cloudera-scm hbase hue zookeeper oozie hive impala flume; do sudo kill $(ps -u $u -o pid=); done
6. Delete the Manager lock file
Run on the Manager node:
rm /tmp/.scm_prepare_node.lock
The removal is now complete.
/var/log/cloudera-manager-installer/3.install-cloudera-manager-server.log
http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/
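Stuck on acquiring the lock: uninstall and reinstall.
"Couldn't resolve host 'archive.cloudera.com'": set the DNS server to 8.8.8.8.
The hostname must match the entry in the hosts file; if it does not, remove the host and search again.
If the host stays in the "searching" state, uninstall and then install again.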
[root@h02 soft]# service cloudera-scm-agent status
cloudera-scm-agent dead but pid file exists
[root@client ~]# cd /var/run
[root@client ~]# rm -f cloudera-scm-agent.pid
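Deleting a pid file is only safe when the process it names is really gone. A cautious variant is sketched below; remove_stale_pidfile is a hypothetical helper, and it assumes it runs as root, since kill -0 cannot probe another user's process without permission.

```shell
# Hypothetical helper: remove a pid file only if its process no longer exists.
remove_stale_pidfile() {
    local pidfile="$1" pid
    [ -f "$pidfile" ] || return 0
    pid="$(cat "$pidfile")"
    # kill -0 sends no signal; it only reports whether the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running; leaving $pidfile alone" >&2
        return 1
    fi
    rm -f "$pidfile"
}

# Demo: record the PID of a background job that has already exited.
demo_pid="$(mktemp)"
true &
wait $!
echo "$!" > "$demo_pid"
remove_stale_pidfile "$demo_pid" && echo "stale pid file removed"
```

For the case above the call would be remove_stale_pidfile /var/run/cloudera-scm-agent.pid, followed by service cloudera-scm-agent start.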
The log contains this error message:
ERROR ENGINE Error in HTTP server: shutting down Traceback (most recent call last)
IOError: [Errno 2] No such file or directory: '/var/lib/cloudera-scm-agent/uuid'
[root@h02 cloudera-scm-agent]# mkdir /var/lib/cloudera-scm-agent/
[root@h02 cloudera-scm-agent]# chmod 777 /var/lib/cloudera-scm-agent/