千家信息网

How to deploy Hadoop

Published: 2025-02-03 Author: 千家信息网 editor

This article walks through how to deploy Hadoop as a single-node, pseudo-distributed installation.

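Hadoop deployment

Introduction to Hadoop:

Broad sense: the ecosystem built around the apache hadoop software (hive, zookeeper, spark, hbase)

Narrow sense: the apache hadoop software alone

Official sites: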

hadoop.apache.org

hive.apache.org

spark.apache.org

cdh-hadoop:http://archive.cloudera.com/cdh6/cdh/5/hadoop-2.6.0-cdh6.7.0.tar.gz

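Hadoop versions:

1.x: not used in production

2.x: mainstream

3.x: companies have been reluctant to adopt it, for two reasons:

a. Early adopters run into pitfalls.

b. Many companies deploy their big-data environment on CDH5.x (www.cloudera.com), which bundles the components of the ecosystem into a single distribution.

The Hadoop shipped as the base of that environment is 2.6.0-cdh6.7.0. Note that this build is not identical to apache hadoop 2.6.0, because the CDH build of hadoop carries additional bug fixes.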

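Hadoop components:

hdfs: storage, a distributed file system

mapreduce: computation. Jobs (job1, job2, ...) are written in Java, but companies rarely write raw MapReduce in Java because development is hard and the code is verbose.

yarn: resource and job scheduling (CPU and memory allocation), i.e. deciding which job is scheduled onto which node.

-- If ssh needs to be installed

Ubuntu Linux: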

$ sudo apt-get install ssh

$ sudo apt-get install rsync

----------------------------------------------------------------------------------------------------

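Installation:

Environment: CentOS, pseudo-distributed installation (i.e. a single node)

Hadoop version: hadoop-2.6.0-cdh6.7.0.tar.gz

JDK version: jdk-8u45-linux-x64.gz

Installation principle: each piece of software runs under its own dedicated user

linux: root user

mysql: mysqladmin user

hadoop: hadoop user

1. Create the hadoop user and upload the hadoop package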

******************************

useradd hadoop

su - hadoop

mkdir app

cd app/

Upload the hadoop package into this directory.

The result looks like this:

[hadoop@hadoop app]$ pwd

/home/hadoop/app

[hadoop@hadoop app]$ ls -l

total 304288

drwxr-xr-x 15 hadoop hadoop 4096 Feb 14 23:37 hadoop-2.6.0-cdh6.7.0

-rw-r--r-- 1 root root 311585484 Feb 14 17:32 hadoop-2.6.0-cdh6.7.0.tar.gz

***********************************

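2. Deploy the JDK (use the JDK version that CDH requires)

***********************************

Create the JDK directory and upload the JDK package (again, the version that CDH requires)

su - root

mkdir /usr/java # upload the JDK package into this directory

mkdir /usr/share/java # when deploying a CDH environment, the JDBC jar must be placed here, otherwise errors occur

cd /usr/java

tar -xzvf jdk-8u45-linux-x64.gz # extract the JDK

drwxr-xr-x 8 uucp 143 4096 Apr 11 2015 jdk1.8.0_45 # note: the owner and group are wrong after extraction and must be changed to root:root

chown -R root:root jdk1.8.0_45

drwxr-xr-x 8 root root 4096 Apr 11 2015 jdk1.8.0_45

The result looks like this: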

[root@hadoop java]# pwd

/usr/java

[root@hadoop java]# ll

total 169216

drwxr-xr-x 8 root root 4096 Apr 11 2015 jdk1.8.0_45

-rw-r--r-- 1 root root 173271626 Jan 26 18:35 jdk-8u45-linux-x64.gz

*****************************************

3. Set the Java environment variables

su - root

vi /etc/profile

export JAVA_HOME=/usr/java/jdk1.8.0_45

export JRE_HOME=$JAVA_HOME/jre

export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH

export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

source /etc/profile

[root@hadoop java]# which java

/usr/java/jdk1.8.0_45/bin/java
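A misspelled or unset variable in /etc/profile expands to an empty string without any error, so it can be worth checking the result rather than trusting `which java` alone. The sketch below is one way to do that; the function name `check_java_home` is hypothetical and not part of any JDK or Hadoop tooling.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory looks like a usable JDK.
# The argument is whatever JAVA_HOME points at, e.g. /usr/java/jdk1.8.0_45.
check_java_home() {
    jdk_dir="$1"
    if [ -z "$jdk_dir" ]; then
        echo "JAVA_HOME is empty"
        return 1
    fi
    if [ ! -x "$jdk_dir/bin/java" ]; then
        echo "no executable java under $jdk_dir/bin"
        return 1
    fi
    echo "ok: $jdk_dir/bin/java"
}
```

After `source /etc/profile`, running `check_java_home "$JAVA_HOME"` should print the `ok:` line if the exports took effect.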

**********************

4. Extract hadoop

su - hadoop

cd /home/hadoop/app

[hadoop@hadoop002 app]$ tar -xzvf hadoop-2.6.0-cdh6.7.0.tar.gz

[hadoop@hadoop002 app]$ cd hadoop-2.6.0-cdh6.7.0

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ ll

total 76

drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 bin # executable scripts

drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 bin-mapreduce1

drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 cloudera

drwxr-xr-x 6 hadoop hadoop 4096 Mar 24 2016 etc # configuration directory (conf)

drwxr-xr-x 5 hadoop hadoop 4096 Mar 24 2016 examples

drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 examples-mapreduce1

drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 include

drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 lib # jar directory

drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 libexec

drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 sbin # start and stop scripts for the hadoop components

drwxr-xr-x 4 hadoop hadoop 4096 Mar 24 2016 share

drwxr-xr-x 17 hadoop hadoop 4096 Mar 24 2016 src

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$

*********************************************************

4 (continued). Configure hadoop

The archive was already extracted above, so only the configuration remains.


cd /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/etc/hadoop

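vi core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

vi hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>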
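For a repeatable setup, the two settings above can also be written from a script instead of typed into vi. This is only a sketch: the function name `write_pseudo_conf` is hypothetical, and it overwrites any existing core-site.xml and hdfs-site.xml in the target directory, so it suits a fresh install rather than a tuned one.

```shell
#!/bin/sh
# Write the two pseudo-distributed settings into a config directory.
# conf_dir is an assumption: pass the etc/hadoop directory of your own install.
write_pseudo_conf() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1

    # Default file system: a NameNode on localhost, port 9000.
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
EOF

    # One replica per block, since there is only one DataNode.
    cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
EOF
}
```

On the host used in this article the call would be `write_pseudo_conf /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/etc/hadoop`.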

Set the environment variables for hadoop, otherwise startup fails with an error

vi /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/etc/hadoop/hadoop-env.sh

export HADOOP_CONF_DIR=/home/hadoop/app/hadoop-2.6.0-cdh6.7.0/etc/hadoop

export JAVA_HOME=/usr/java/jdk1.8.0_45
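A plausible reason this step matters is that the start scripts launch each daemon over ssh, and such a non-login shell may not read /etc/profile, so a JAVA_HOME set only there can be invisible to the daemons. The sketch below sets an export line in place instead of editing by hand; `set_export` is a hypothetical helper, not a Hadoop tool, and it assumes the value contains no backslashes.

```shell
#!/bin/sh
# Set "export NAME=VALUE" in an env file: rewrite the line where it already
# exists (keeping its position), or append it when it is absent.
set_export() {
    env_file="$1"
    var_name="$2"
    var_value="$3"
    tmp_file=$(mktemp) || return 1
    awk -v n="$var_name" -v v="$var_value" '
        index($0, "export " n "=") == 1 { print "export " n "=" v; seen = 1; next }
        { print }
        END { if (!seen) print "export " n "=" v }
    ' "$env_file" > "$tmp_file" || return 1
    # Copy the content back so the original file keeps its permissions.
    cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}
```

For example, `set_export etc/hadoop/hadoop-env.sh JAVA_HOME /usr/java/jdk1.8.0_45` run from the install directory. Running it twice leaves a single export line, which hand-appending would not.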

*****************************

5. Configure passwordless ssh trust for localhost

su - hadoop

ssh-keygen # press Enter at every prompt

cd .ssh # two files are visible here


cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys # create the authorized_keys trust file

ssh localhost date

The authenticity of host 'localhost (127.0.0.1)' can't be established.

RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'localhost' (RSA) to the list of known hosts.

Wed Feb 13 22:41:17 CST 2019

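chmod 600 authorized_keys # Very important, and best done before testing with ssh localhost date: without this permission change ssh asks for a password, yet the hadoop user has no password at all. The file permissions are the culprit.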
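The permission requirement comes from sshd, which in its default strict mode tends to ignore an authorized_keys file that is writable by group or others. A minimal sketch that appends the key and tightens the permissions in one step follows; `authorize_key` is a hypothetical name, and both arguments are illustrative.

```shell
#!/bin/sh
# Append a public key to authorized_keys and apply restrictive permissions.
# On the host in this article the arguments would be ~/.ssh/id_rsa.pub and ~/.ssh.
authorize_key() {
    pub_key="$1"
    ssh_dir="$2"
    [ -f "$pub_key" ] || { echo "missing $pub_key" >&2; return 1; }
    mkdir -p "$ssh_dir" || return 1
    touch "$ssh_dir/authorized_keys"
    # Skip the append if this exact key line is already present.
    if ! grep -qxF -f "$pub_key" "$ssh_dir/authorized_keys"; then
        cat "$pub_key" >> "$ssh_dir/authorized_keys"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

Because the append is skipped when the key is already listed, the function can be rerun safely, unlike a bare `cat ... >>`.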

**********************************

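6. Format the NameNode

cd /home/hadoop/app/hadoop-2.6.0-cdh6.7.0

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ bin/hdfs namenode -format # Running plain "hdfs namenode -format" from inside bin reports that the hdfs command cannot be found, because the current directory is not on PATH. Use ./hdfs there, or bin/hdfs from the install directory as shown.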
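The lookup behaviour behind that "command not found" message can be reproduced without Hadoop. The sketch below uses a stand-in script named `hdfs-demo` (hypothetical, created in a temporary directory) and assumes the current directory is not on PATH, which is the usual default.

```shell
#!/bin/sh
# Show how the shell resolves a bare command name versus an explicit path.
# The whole demo runs in a subshell so the caller's directory and PATH are untouched.
demo_lookup() (
    demo_dir=$(mktemp -d) || exit 1
    printf '#!/bin/sh\necho "stand-in hdfs called with: $*"\n' > "$demo_dir/hdfs-demo"
    chmod +x "$demo_dir/hdfs-demo"
    cd "$demo_dir" || exit 1

    # 1. Bare name: only the directories listed in PATH are searched.
    if command -v hdfs-demo >/dev/null 2>&1; then
        bare="found"
    else
        bare="not found"
    fi

    # 2. Explicit path: no PATH lookup is involved.
    explicit=$(./hdfs-demo namenode -format)

    # 3. Once the directory is on PATH, the bare name resolves too.
    PATH="$demo_dir:$PATH"
    with_path=$(hdfs-demo namenode -format)

    echo "bare name:     $bare"
    echo "explicit path: $explicit"
    echo "after PATH:    $with_path"
    rm -rf "$demo_dir"
)

demo_lookup
```

Case 3 is what adding the hadoop bin directory to PATH achieves: after that, `hdfs` works from any directory.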

***************************************

7. Start the hadoop services

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ sbin/start-dfs.sh

19/02/13 22:47:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting namenodes on [localhost]

localhost: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/logs/hadoop-hadoop-namenode-hadoop002.out

localhost: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/logs/hadoop-hadoop-datanode-hadoop002.out

Starting secondary namenodes [0.0.0.0]

The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.

RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.

Are you sure you want to continue connecting (yes/no)? yes # answer yes: the ssh trust set up earlier was for localhost, not for 0.0.0.0

0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.

0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out

19/02/13 22:49:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ sbin/stop-dfs.sh

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ sbin/start-dfs.sh

19/02/13 22:57:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting namenodes on [localhost]

localhost: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/logs/hadoop-hadoop-namenode-hadoop002.out

localhost: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/logs/hadoop-hadoop-datanode-hadoop002.out

Starting secondary namenodes [0.0.0.0]

0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh6.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out

19/02/13 22:57:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

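[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ jps # check that startup succeeded: the three daemons below should be listed (Jps is the tool itself)

15059 Jps

14948 SecondaryNameNode # secondary name node, the deputy

14783 DataNode # data node, the worker

14655 NameNode # name node, the boss; handles read and write requests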

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$

open http://ip:50070 # after a successful install, the hadoop web management UI is reachable at this address
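The jps check can be automated roughly as follows. This is a sketch, not part of Hadoop: `check_hdfs_daemons` is a hypothetical function that reads `jps` output on standard input and assumes the two-column "pid name" layout shown above.

```shell
#!/bin/sh
# Read jps output on stdin and report any HDFS daemon that is missing.
# Returns 0 only when NameNode, DataNode and SecondaryNameNode are all listed.
check_hdfs_daemons() {
    jps_output=$(cat)
    missing=0
    for daemon in NameNode DataNode SecondaryNameNode; do
        # Compare against the whole second field, so that
        # "SecondaryNameNode" is not mistaken for "NameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "not running: $daemon"
            missing=1
        fi
    done
    return "$missing"
}
```

Typical use would be `jps | check_hdfs_daemons && echo "HDFS is up"`. A plain `grep NameNode` would not be enough here, since it also matches the SecondaryNameNode line.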


8. Add the hadoop commands to the environment

***************************************************************

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ cat ~/.bash_profile

# .bash_profile

# Get the aliases and functions

if [ -f ~/.bashrc ]; then

. ~/.bashrc

fi

# User specific environment and startup programs

export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh6.7.0

export PATH=$HADOOP_PREFIX/bin:$PATH

source ~/.bash_profile

/home/hadoop/app/hadoop-2.6.0-cdh6.7.0

***************************************************************

9. Working with hadoop: the hdfs dfs commands closely mirror their Linux counterparts

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ bin/hdfs dfs -ls /

19/02/13 23:08:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ ls /

bin dev home lib64 media opt root sbin srv tmp var

boot etc lib lost+found mnt proc run selinux sys usr


[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ bin/hdfs dfs -mkdir /ruozedata

19/02/13 23:11:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ bin/hdfs dfs -ls /

19/02/13 23:11:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Found 1 items

drwxr-xr-x - hadoop supergroup 0 2019-02-13 23:11 /ruozedata


[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$
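To extend the comparison with Linux commands, here is a sketch of a small round trip using other standard `hdfs dfs` subcommands (`-put`, `-cat`, `-rm`). The `run` wrapper and the `hdfs_round_trip` function are hypothetical helpers; the sketch assumes `hdfs` is on PATH as configured in step 8, and with DRY_RUN=1 it only prints the commands, so it can be read through without a running cluster.

```shell
#!/bin/sh
# Print commands instead of executing them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Upload a local file, list it, read it back, then remove it again.
# /ruozedata is the directory created above; the local file is the caller's choice.
hdfs_round_trip() {
    local_file="$1"
    name=$(basename "$local_file")
    run hdfs dfs -mkdir -p /ruozedata &&
    run hdfs dfs -put "$local_file" /ruozedata/ &&
    run hdfs dfs -ls /ruozedata &&
    run hdfs dfs -cat "/ruozedata/$name" &&
    run hdfs dfs -rm "/ruozedata/$name"
}
```

For instance, `DRY_RUN=1; export DRY_RUN; hdfs_round_trip /tmp/demo.txt` lists the five commands; unsetting DRY_RUN runs them against HDFS.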

10. View help

[hadoop@hadoop002 hadoop-2.6.0-cdh6.7.0]$ bin/hdfs --help

Homework:

1. Read the ssh blog posts below and take notes

http://blog.itpub.net/30089851/viewspace-1992210/

http://blog.itpub.net/30089851/viewspace-2127102/

2. Deploy hdfs in pseudo-distributed mode

3. Write a blog post covering the hdfs pseudo-distributed deployment

Tip:

If su - zookeeper fails to switch users

Fix:

In /etc/passwd, change the zookeeper user's login shell from /sbin/nologin to /bin/bash
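Before editing /etc/passwd it helps to confirm that the login shell really is the cause. The sketch below reads the seventh colon-separated field for a user; `login_shell_of` is a hypothetical helper, and the optional second argument exists only so it can be tried against a copy of the file.

```shell
#!/bin/sh
# Print the login shell recorded for a user in a passwd-format file.
# The file defaults to /etc/passwd.
login_shell_of() {
    user="$1"
    passwd_file="${2:-/etc/passwd}"
    awk -F: -v u="$user" '$1 == u { print $7 }' "$passwd_file"
}
```

If `login_shell_of zookeeper` prints /sbin/nologin, running `usermod -s /bin/bash zookeeper` as root changes the same field without hand-editing the file.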
