千家信息网

Big Data: The Ultimate Guide to Deploying a CDH Cluster (work in progress)

Published: 2024-11-25 Author: 千家信息网 editors

1. Software environment and IP plan

OS: RHEL6. Software and node roles:

jdk-8u45

hive-1.1.0-cdh5.7.1-src.tar.gz

hadoop-2.8.1.tar.gz

mysql-connector-java-6.0.6.tar.gz

apache-maven-3.3.9

cloudera-manager-el6-cm5.9.3_x86_64.tar

mysql-5.7

CDH-5.9.3-1.cdh5.9.3.p0.4-el6


IP address       Roles                   Hostname
172.16.18.133    NN, SN, JobTracker      hadoop01
172.16.18.134    DN, TaskTracker         hadoop02
172.16.18.136    DN, TaskTracker         hadoop03
172.16.18.143    DN, TaskTracker         hadoop04
172.16.18.145    DN, TaskTracker         hadoop05

NN = NameNode, SN = SecondaryNameNode, DN = DataNode

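Cluster overview:

There are three main free Hadoop distributions, all from vendors outside China: Apache (the original release, on which every other distribution is based), Cloudera (Cloudera's Distribution Including Apache Hadoop, or CDH), and Hortonworks (Hortonworks Data Platform, or HDP). In China, the vast majority of users choose CDH.

CDH (Cloudera's Distribution Including Apache Hadoop) is one of the many Hadoop distributions. It is maintained by Cloudera, built on stable Apache Hadoop releases with many patches integrated, and can be used directly in production.

Cloudera Manager is the component that installs, monitors, and manages Hadoop and related big data services across the cluster. It greatly simplifies installing and configuring hosts and services such as Hadoop, Hive, and Spark.

This article installs CDH offline.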

https://www.cloudera.com/downloads/cdh/5-9-0.html

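The official site describes CDH and lists the supported operating system, JDK, and database versions.

The official site lists JDK 1.8 as supported, but some 1.8 builds throw errors, so we use JDK 1.7.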

2. Package preparation

Cloudera Manager package

http://archive.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.9.3_x86_64.tar.gz

CDH parcel (download the one that matches your Linux release)

http://archive.cloudera.com/cdh5/parcels/5.9.3/CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel

http://archive.cloudera.com/cdh5/parcels/5.9.3/CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel.sha1

MySQL JDBC driver:

http://download.softagency.net/MySQL/Downloads/Connector-J/mysql-connector-java-6.0.6.tar.gz

3. System configuration

Do the same on every host: install the JDK, disable SELinux and iptables, configure /etc/hosts, and configure yum.

[root@hadoop01 ~]# vim /etc/profile

export JAVA_HOME=/usr/java/jdk1.7.0_79

export PATH=$JAVA_HOME/bin:$ORACLE_HOME/bin:$R_HOME/bin:$PATH

[root@hadoop01 ~]# getenforce

Disabled

[root@hadoop01 ~]# iptables -L

Chain INPUT (policy ACCEPT)

target prot opt source destination

Chain FORWARD (policy ACCEPT)

target prot opt source destination

Chain OUTPUT (policy ACCEPT)

target prot opt source destination

[root@hadoop01 ~]# cat /etc/hosts

127.0.0.1 localhost

172.16.18.133 hadoop01

172.16.18.134 hadoop02

172.16.18.136 hadoop03

172.16.18.143 hadoop04

172.16.18.145 hadoop05

[root@hadoop01 ~]# hostname

hadoop01
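
The /etc/hosts entries above must be identical on all five hosts. The following is a minimal sketch of adding them idempotently. The helper name cluster_hosts and the HOSTS_FILE variable are assumptions made for illustration, and the script writes to a scratch file unless HOSTS_FILE is set explicitly.

```shell
#!/bin/sh
# Hypothetical helper: prints the host entries from the IP plan in this article.
cluster_hosts() {
    cat <<'EOF'
172.16.18.133 hadoop01
172.16.18.134 hadoop02
172.16.18.136 hadoop03
172.16.18.143 hadoop04
172.16.18.145 hadoop05
EOF
}

# HOSTS_FILE defaults to a scratch file so the sketch is safe to try.
# On a real node, run as root with HOSTS_FILE=/etc/hosts.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# Append each entry only if the hostname is not already listed.
cluster_hosts | while read -r ip name; do
    grep -qw -- "$name" "$HOSTS_FILE" || echo "$ip $name" >> "$HOSTS_FILE"
done

cat "$HOSTS_FILE"
```

Running the script a second time against the same file adds nothing, because every hostname is already present.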

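4. Configure passwordless SSH trust

See the SSH trust setup from the pseudo-distributed installation.

Verify from every node that the commands below run without an interactive "yes" prompt.

useradd hadoop    # create the user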

ssh hadoop01 date

ssh hadoop02 date

ssh hadoop03 date

ssh hadoop04 date

ssh hadoop05 date
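
The five checks above can also be run as one loop. This is only a sketch: check_ssh_trust is a hypothetical helper name, and it relies on the standard OpenSSH options BatchMode (fail instead of prompting) and ConnectTimeout.

```shell
#!/bin/sh
# Hypothetical helper: report which hosts accept a non-interactive SSH login.
# Returns non-zero if at least one host fails.
check_ssh_trust() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail rather than ask for a password or "yes".
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example call (run it on each node in turn):
#   check_ssh_trust hadoop01 hadoop02 hadoop03 hadoop04 hadoop05
```

A host reported as FAIL usually means its key has not been accepted yet or the public key was not copied to it.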

5. Set vm.swappiness to 0

cat /proc/sys/vm/swappiness

sysctl vm.swappiness=0

echo 0 > /proc/sys/vm/swappiness

Silence the transparent huge page warning: echo never > /sys/kernel/mm/transparent_hugepage/defrag
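
The sysctl and echo commands above change only the running kernel, so the value is lost at reboot. Below is a sketch of persisting it through a sysctl configuration file. The names set_sysctl and SYSCTL_CONF are assumptions for illustration, and the script edits a scratch file unless SYSCTL_CONF is set.

```shell
#!/bin/sh
# SYSCTL_CONF defaults to a scratch file; on a real node use
# SYSCTL_CONF=/etc/sysctl.conf as root, then run "sysctl -p".
SYSCTL_CONF="${SYSCTL_CONF:-$(mktemp)}"

# Hypothetical helper: set "key = value", replacing an existing entry if present.
set_sysctl() {
    key=$1
    value=$2
    if grep -q "^$key[[:space:]]*=" "$SYSCTL_CONF"; then
        sed -i "s|^$key[[:space:]]*=.*|$key = $value|" "$SYSCTL_CONF"
    else
        echo "$key = $value" >> "$SYSCTL_CONF"
    fi
}

set_sysctl vm.swappiness 0
cat "$SYSCTL_CONF"
```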

6. Configure the NTP server

Pick one host as the master time server; all other hosts synchronize their clocks to it.

# hwclock -w

Enable ntpd at boot:

[root@hadoop01 ~]# chkconfig ntpd on

[root@hadoop01 ~]# chkconfig --list ntpd

[root@hadoop01 ~]# vi /etc/ntp.conf

(Find this line, uncomment the restrict entry, and change the IP address to match your network.)

# Hosts on local network are less restricted.

restrict 192.168.128.1 mask 255.255.255.0 nomodify notrap

(Find this line and comment out the server entries below it.)

# Please consider joining the pool (http://www.pool.ntp.org/join.html).

#server 0.rhel.pool.ntp.org iburst

#server 1.rhel.pool.ntp.org iburst

#server 2.rhel.pool.ntp.org iburst

#server 3.rhel.pool.ntp.org iburst

Add the following two lines:

server 127.127.1.0 # local clock

fudge 127.127.1.0 stratum 10

Configure the other servers:

vi /etc/ntp.conf

# Hosts on local network are less restricted.

restrict 192.168.128.1 nomodify notrap noquery

(Comment out the server entries below.)

# Please consider joining the pool (http://www.pool.ntp.org/join.html).

#server 0.rhel.pool.ntp.org iburst

#server 1.rhel.pool.ntp.org iburst

#server 2.rhel.pool.ntp.org iburst

#server 3.rhel.pool.ntp.org iburst

Point to the master time server:

server 192.168.128.51

Restart the ntp service on all hosts:

[root@hadoop01 ~]# service ntpd restart

[root@hadoop01 ~]# ntpstat

synchronised to NTP server (172.16.18.33) at stratum 12

time correct to within 18 ms

polling server every 64 s

[root@hadoop01 ~]# date

7. Disable IPv6 and transparent huge pages

[root@hadoop01 ~]# echo "alias ipv6 off" >> /etc/modprobe.d/dist.conf

[root@hadoop01 ~]# echo "alias net-pf-10 off" >> /etc/modprobe.d/dist.conf

[root@hadoop01 ~]# echo never > /sys/kernel/mm/transparent_hugepage/defrag

[root@hadoop01 ~]# echo 'echo never > /sys/kernel/mm/transparent_hugepage/defrag' >> /etc/rc.local
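
To confirm that the setting took effect, read the file back: the active value is the one shown in square brackets, for example "always madvise [never]". A small sketch follows. thp_current is a hypothetical helper name, and the second path (enabled) is an assumption that does not appear in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print the active value of a transparent huge page
# setting file, i.e. the word inside [brackets].
thp_current() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Report every setting file that exists on this host.
for f in /sys/kernel/mm/transparent_hugepage/defrag \
         /sys/kernel/mm/transparent_hugepage/enabled; do
    if [ -r "$f" ]; then
        echo "$f: $(thp_current "$f")"
    fi
done
```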


8. Prepare the MySQL database

Adjust the MySQL privileges:

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123' WITH GRANT OPTION;

flush privileges;

delete from user where host !='%';

[root@hadoop01 software]# mysql -h 172.16.18.133 -uroot -p

###############################Preparation########################################

hadoop01 Server || Agent

hadoop02 Agent

hadoop03 Agent

CDH is installed on three servers; the remaining two are used later to practice adding nodes to the cluster.

########################################################################

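10. Install Cloudera Manager

Install the Cloudera Manager Server and Agent.

Every CDH cluster node needs the software unpacked and the service account created.

Master node: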

[root@hadoop01 software]# ls cloudera-manager-el6-cm5.9.3_x86_64.tar.gz

cloudera-manager-el6-cm5.9.3_x86_64.tar.gz

[root@hadoop01 software]# pwd

/opt/software

[root@hadoop01 software]# mkdir /opt/cloudera-manager

[root@hadoop01 software]# tar zxvf cloudera-manager-el6-cm5.9.3_x86_64.tar.gz -C /opt/cloudera-manager/

Agent configuration

/opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-agent/config.ini

server_host=hadoop01

(Set server_host to the hostname of the CM server.)

[root@hadoop01 software]# useradd --system --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

[root@hadoop01 software]# id cloudera-scm

uid=495(cloudera-scm) gid=492(cloudera-scm) groups=492(cloudera-scm)

On hadoop02, hadoop03, and every other worker node:

useradd --system --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

mkdir /opt/cloudera-manager


[root@hadoop01 opt]# scp -r /opt/cloudera-manager/cm-5.9.3 hadoop02:/opt/cloudera-manager/

[root@hadoop01 opt]# scp -r /opt/cloudera-manager/cm-5.9.3 hadoop03:/opt/cloudera-manager/

11. Configure the CM Server database

Now set up the MySQL database.

[root@hadoop01 ~]# mysql -h172.16.18.133 -uroot -p

mysql> GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123' WITH GRANT OPTION;

mysql> flush privileges;

[root@hadoop01 schema]# pwd

/opt/cloudera-manager/cm-5.9.3/share/cmf/schema

[root@hadoop01 schema]# ./scm_prepare_database.sh mysql -hhadoop01 -uroot -p123 --scm-host hadoop01 cmdb root 123

JAVA_HOME=/usr/java/jdk1.7.0_79

Verifying that we can write to /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-server

Sat Apr 28 14:20:38 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

Creating SCM configuration file in /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-server

Executing: /usr/java/jdk1.7.0_79/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/opt/cloudera-manager/cm-5.9.3/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.

Sat Apr 28 14:20:39 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

[ main] DbCommandExecutor INFO Successfully connected to database.

All done, your SCM database is configured correctly!

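Note: this script creates and configures the database that the Cloudera Manager Server needs. The arguments are:

mysql: the database type. If you use Oracle instead, change this argument to oracle.

-hhadoop01: the database runs on host hadoop01, which is the master node.

-uroot: connect to MySQL as root.

-p123: the MySQL root password is 123.

--scm-host hadoop01: the host running the Cloudera Manager Server, usually the same host as MySQL.

The last three arguments are the database name, the database user name, and that user's password.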
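
Because the argument order is easy to get wrong, it can help to name each value before assembling the command. The sketch below only prints the command line and does not run it. The variable names and build_prepare_cmd are illustrative assumptions; the values are the ones used in this step.

```shell
#!/bin/sh
# Values from the step above; adjust to your environment.
SCHEMA_DIR=/opt/cloudera-manager/cm-5.9.3/share/cmf/schema
DB_TYPE=mysql        # database type
DB_HOST=hadoop01     # host running MySQL
ADMIN_USER=root      # MySQL account used to create the database
ADMIN_PASS=123       # its password
SCM_HOST=hadoop01    # host running the Cloudera Manager Server
SCM_DB=cmdb          # database name
SCM_USER=root        # database user for Cloudera Manager
SCM_PASS=123         # that user's password

# Hypothetical helper: print the full command line in the documented order.
build_prepare_cmd() {
    printf '%s %s -h%s -u%s -p%s --scm-host %s %s %s %s\n' \
        "$SCHEMA_DIR/scm_prepare_database.sh" "$DB_TYPE" "$DB_HOST" \
        "$ADMIN_USER" "$ADMIN_PASS" "$SCM_HOST" "$SCM_DB" "$SCM_USER" "$SCM_PASS"
}

build_prepare_cmd
```

Review the printed line, then paste it into a root shell on the Server node to run it.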

12. Build the local CDH parcel repository

Server node

mkdir -p /opt/cloudera/parcel-repo

chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo

Agent nodes

mkdir -p /opt/cloudera/parcels

chown cloudera-scm:cloudera-scm /opt/cloudera/parcels

Upload CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel and manifest.json to /opt/cloudera/parcel-repo/ on the master node.

[root@hadoop01 CDH]# cd /opt/cloudera/parcel-repo/

[root@hadoop01 parcel-repo]# ls

CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel manifest.json

[root@hadoop01 parcel-repo]# mv manifest.json CDH-5.9.3-1.cdh5.9.3.p0.4-el6.parcel.sha

manifest.json is renamed so that its file name matches your parcel file name, with a .sha suffix appended.
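
As far as I know, Cloudera Manager compares the parcel against the digest stored in the .sha file, which is the same value the .parcel.sha1 download from step 2 contains. The sketch below checks whether a .sha file really holds the SHA-1 digest of its parcel. verify_parcel_sha is a hypothetical helper name, and the paths are passed as arguments rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the first field of the .sha file
# equals the SHA-1 digest of the parcel.
# Usage: verify_parcel_sha PARCEL [SHA_FILE]   (SHA_FILE defaults to PARCEL.sha)
verify_parcel_sha() {
    parcel=$1
    sha_file=${2:-$parcel.sha}
    expected=$(awk '{print $1; exit}' "$sha_file")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    [ "$expected" = "$actual" ]
}

# When called with arguments, report the result for that parcel.
if [ "$#" -ge 1 ]; then
    if verify_parcel_sha "$@"; then
        echo "hash matches"
    else
        echo "hash MISMATCH"
    fi
fi
```

A mismatch means the .sha file does not describe this parcel, for example because the download was truncated or the wrong file was renamed.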

13. Startup

Make sure MySQL is started first.

Server: hadoop01

[root@hadoop01 init.d]# pwd

/opt/cloudera-manager/cm-5.9.3/etc/init.d

[root@hadoop01 init.d]# ./cloudera-scm-server start

Starting cloudera-scm-server:

Agents: hadoop01 hadoop02 hadoop03

/opt/cloudera-manager/cm-5.9.3/etc/init.d

./cloudera-scm-agent start

Starting cloudera-scm-agent: [  OK  ]

2018-04-28 14:43:37,022 INFO WebServerImpl:org.mortbay.log: jetty-6.1.26.cloudera.4

2018-04-28 14:43:37,024 INFO WebServerImpl:org.mortbay.log: Started SelectChannelConnector@0.0.0.0:7180

2018-04-28 14:43:37,024 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.

When the server log shows lines like the ones above, the server has started successfully.
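
The server can take a while to reach this point, so polling the log is easier than watching it by hand. The following is a sketch only: wait_for_line is a hypothetical helper, and the default SERVER_LOG path is an assumption about where this tarball layout writes its log, so adjust it to your installation.

```shell
#!/bin/sh
# Hypothetical helper: poll FILE once per second until PATTERN appears,
# giving up after TRIES attempts (default 60).
wait_for_line() {
    file=$1
    pattern=$2
    tries=${3:-60}
    while [ "$tries" -gt 0 ]; do
        if grep -q -- "$pattern" "$file" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Assumed log location; override with SERVER_LOG=... if yours differs.
SERVER_LOG="${SERVER_LOG:-/opt/cloudera-manager/cm-5.9.3/log/cloudera-scm-server/cloudera-scm-server.log}"

if [ -f "$SERVER_LOG" ]; then
    if wait_for_line "$SERVER_LOG" 'Started Jetty server' 120; then
        echo "CM server is up"
    else
        echo "CM server did not report startup in time"
    fi
fi
```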

14. Web UI access

Troubleshooting:

Problem 1: JDBC driver not found

[root@hadoop01 schema]# ./scm_prepare_database.sh mysql cmdb -h hadoop01 -uroot -p123456 --scm-host hadoop01 scm scm scm

JAVA_HOME=/usr/java/jdk1.7.0_79

Verifying that we can write to /opt/cloudera-manager/cm-5.9.3/etc/cloudera-scm-server

[ main] DbProvisioner ERROR Unable to find the MySQL JDBC driver. Please make sure that you have installed it as per instruction in the installation guide.

[ main] DbProvisioner ERROR Stack Trace:

java.lang.ClassNotFoundException: com.mysql.jdbc.Driver

at java.net.URLClassLoader$1.run(URLClassLoader.java:366)[:1.7.0_79]

at java.net.URLClassLoader$1.run(URLClassLoader.java:355)[:1.7.0_79]

at java.security.AccessController.doPrivileged(Native Method)[:1.7.0_79]

at java.net.URLClassLoader.findClass(URLClassLoader.java:354)[:1.7.0_79]

Solution:

[root@hadoop01 software]# ls mysql-connector-java-5.1.46.zip

mysql-connector-java-5.1.46.zip

[root@hadoop01 software]# unzip mysql-connector-java-5.1.46.zip

[root@hadoop01 software]# cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46.jar /usr/share/java/

[root@hadoop01 software]# mv /usr/share/java/mysql-connector-java-5.1.46.jar /usr/share/java/mysql-connector-java.jar

Problem 2:

dbc url 'jdbc:mysql://hadoop01/?useUnicode=true&characterEncoding=UTF-8'

java.sql.SQLException: Access denied for user 'root'@'hadoop01' (using password: YES)

at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)[mysql-connector-java.jar:5.1.46]

