
Configuring LZO on hadoop-2.6.2

Published: 2025-01-24. Author: 千家信息网 editor

This article walks through configuring LZO compression on hadoop-2.6.2, a setup that many people get stuck on in practice. Follow the steps below in order.

Hadoop cluster overview

The cluster has three hosts, named bi10, bi12, and bi13. All operations below are performed on bi10.

Installing dependencies

Building LZO requires a few dependency packages. If they are already installed, you can skip this step. First switch to the root user.

yum install gcc gcc-c++ kernel-devel
yum install git
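Before moving on, it can help to confirm that the build tools really are on PATH. The following is only a minimal sketch: `missing_tools` is a hypothetical helper name introduced here for illustration, not part of any package.

```shell
# Hypothetical helper: print every named command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example call (prints nothing when all three are installed):
# missing_tools gcc g++ git
```

Any name it prints still needs to be installed before the LZO build is attempted.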

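Besides the two steps above, you also need a Maven environment. After downloading Maven, extract it and set the environment variables; no further installation is needed.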

wget http://apache.fayea.com/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
tar -xzf apache-maven-3.3.9-bin.tar.gz

Configure the Maven environment variables. Here the Maven package is placed at /home/hadoop/work/apache-maven-3.3.9

[hadoop@bi10 hadoop-2.6.2]$ vim ~/.bash_profile
#init maven environment
export MAVEN_HOME=/home/hadoop/work/apache-maven-3.3.9
export PATH=$PATH:$MAVEN_HOME/bin
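If you prefer to script this edit rather than open an editor, one possible approach is an idempotent append, so that running the setup twice does not duplicate the lines. This is a sketch under that assumption; `append_line_once` is a hypothetical helper name.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already there.
# $1 = file to modify, $2 = the line to add
append_line_once() {
  file=$1
  line=$2
  touch "$file"
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example calls (commented out so that sourcing this sketch changes nothing):
# append_line_once ~/.bash_profile 'export MAVEN_HOME=/home/hadoop/work/apache-maven-3.3.9'
# append_line_once ~/.bash_profile 'export PATH=$PATH:$MAVEN_HOME/bin'
```

After editing the profile either way, run `source ~/.bash_profile` and then `mvn -v` to confirm that Maven is found.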

Installing LZO

Download the LZO source package

[hadoop@bi10 apps]$ wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.09.tar.gz

Extract the archive, then build and install LZO into /usr/local/hadoop/lzo/. Switch to the root user for the installation.

[hadoop@bi10 apps]$ tar -xzf lzo-2.09.tar.gz
[hadoop@bi10 apps]$ cd lzo-2.09
[hadoop@bi10 lzo-2.09]$ su root
[root@bi10 lzo-2.09]# ./configure --enable-shared --prefix=/usr/local/hadoop/lzo/
[root@bi10 lzo-2.09]# make && make test && make install

Check the installation directory

[hadoop@bi10 lzo-2.09]$ ls /usr/local/hadoop/lzo/
include  lib  share
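Listing the directory shows that something was installed, but not whether the header and the shared library that the later hadoop-lzo build needs are present. The sketch below checks for both. The file names `include/lzo/lzo1x.h` and `lib/liblzo2.so*` are assumptions about what an LZO 2.x install typically produces, and `check_lzo_prefix` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the prefix holds the LZO header and a shared library.
# $1 = install prefix, e.g. /usr/local/hadoop/lzo
check_lzo_prefix() {
  prefix=$1
  if [ ! -f "$prefix/include/lzo/lzo1x.h" ]; then
    echo "missing header: $prefix/include/lzo/lzo1x.h" >&2
    return 1
  fi
  # An unmatched glob stays literal, so the -e test simply fails in that case.
  for lib in "$prefix"/lib/liblzo2.so*; do
    if [ -e "$lib" ]; then
      return 0
    fi
  done
  echo "no liblzo2.so* found under $prefix/lib" >&2
  return 1
}

# Example: check_lzo_prefix /usr/local/hadoop/lzo && echo "LZO looks installed"
```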

Installing hadoop-lzo

Download hadoop-lzo

git clone https://github.com/twitter/hadoop-lzo.git

Set the environment variables and build with Maven

[hadoop@bi10 hadoop-lzo]$ export CFLAGS=-m64
[hadoop@bi10 hadoop-lzo]$ export CXXFLAGS=-m64
[hadoop@bi10 hadoop-lzo]$ export C_INCLUDE_PATH=/usr/local/hadoop/lzo/include
[hadoop@bi10 hadoop-lzo]$ export LIBRARY_PATH=/usr/local/hadoop/lzo/lib
[hadoop@bi10 hadoop-lzo]$ mvn clean package -Dmaven.test.skip=true

Copy the build artifacts into the Hadoop installation directory

[hadoop@bi10 hadoop-lzo]$ tar -cBf - -C target/native/Linux-amd64-64/lib . | tar -xBvf - -C $HADOOP_HOME/lib/native/
[hadoop@bi10 hadoop-lzo]$ cp target/hadoop-lzo-0.4.20-SNAPSHOT.jar $HADOOP_HOME/share/hadoop/common/
[hadoop@bi10 hadoop-lzo]$ scp target/hadoop-lzo-0.4.20-SNAPSHOT.jar bi12:$HADOOP_HOME/share/hadoop/common/
[hadoop@bi10 hadoop-lzo]$ scp target/hadoop-lzo-0.4.20-SNAPSHOT.jar bi13:$HADOOP_HOME/share/hadoop/common/
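A quick way to confirm that the copy landed where Hadoop looks for it is to check for both the jar and the native library. This is a sketch only: `check_hadoop_lzo_install` is a hypothetical helper, and it assumes the native library built by hadoop-lzo is named `libgplcompression.*`.

```shell
# Hypothetical helper: succeed only if both the hadoop-lzo jar and its native library are in place.
# $1 = Hadoop installation directory (the value of $HADOOP_HOME)
check_hadoop_lzo_install() {
  home=$1
  found_jar=no
  for jar in "$home"/share/hadoop/common/hadoop-lzo-*.jar; do
    if [ -e "$jar" ]; then found_jar=yes; fi
  done
  found_lib=no
  for lib in "$home"/lib/native/libgplcompression.*; do
    if [ -e "$lib" ]; then found_lib=yes; fi
  done
  [ "$found_jar" = yes ] || echo "hadoop-lzo jar not found in share/hadoop/common" >&2
  [ "$found_lib" = yes ] || echo "libgplcompression not found in lib/native" >&2
  [ "$found_jar" = yes ] && [ "$found_lib" = yes ]
}

# Example: check_hadoop_lzo_install "$HADOOP_HOME"
```

The same check can be run on bi12 and bi13 once the files have been distributed.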

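Copy the build artifacts to the corresponding directories on the other machines in the cluster. The native directory must be packed into an archive first, copied to the other machines, and then extracted there.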

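tar -czf hadoop-native.tar.gz -C $HADOOP_HOME/lib native
scp hadoop-native.tar.gz bi12:$HADOOP_HOME/lib
scp hadoop-native.tar.gz bi13:$HADOOP_HOME/lib
# then extract on each host, for example over ssh:
ssh bi12 "tar -xzf $HADOOP_HOME/lib/hadoop-native.tar.gz -C $HADOOP_HOME/lib"
ssh bi13 "tar -xzf $HADOOP_HOME/lib/hadoop-native.tar.gz -C $HADOOP_HOME/lib"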

Editing the Hadoop configuration files

Edit hadoop-env.sh and add the following entry

# The lzo library
export LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib

Edit core-site.xml

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
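A typo in the codec list is easy to miss, so a simple text check on the file can be worthwhile. The sketch below only looks for the class name as a fixed string and does not parse the XML; `codec_listed` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the config file mentions the given codec class name.
# $1 = path to the config file, $2 = fully qualified class name
codec_listed() {
  grep -qF -e "$2" "$1"
}

# Example:
# codec_listed "$HADOOP_HOME/etc/hadoop/core-site.xml" com.hadoop.compression.lzo.LzoCodec \
#   && echo "LzoCodec is listed"
```

To see the value Hadoop itself resolves, `hdfs getconf -confKey io.compression.codecs` prints the effective setting.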

Edit mapred-site.xml

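<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
  <name>mapred.child.env</name>
  <value>LD_LIBRARY_PATH=/usr/local/hadoop/lzo/lib</value>
</property>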
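The first two property names above are the older Hadoop 1.x spellings. Hadoop 2.x still accepts them but reports them as deprecated in favor of `mapreduce.*` names. The sketch below records that mapping for the two names used here; `new_property_name` is a hypothetical helper, and names it does not know are printed unchanged.

```shell
# Hypothetical helper: print the Hadoop 2.x name for an older property name used in this article.
new_property_name() {
  case $1 in
    mapred.compress.map.output)          echo mapreduce.map.output.compress ;;
    mapred.map.output.compression.codec) echo mapreduce.map.output.compress.codec ;;
    *)                                   echo "$1" ;;
  esac
}

# Example: new_property_name mapred.compress.map.output
```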

Copy the three configuration files to the other machines in the cluster

scp etc/hadoop/hadoop-env.sh bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/hadoop-env.sh bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/core-site.xml bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/core-site.xml bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/mapred-site.xml bi12:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
scp etc/hadoop/mapred-site.xml bi13:/home/hadoop/work/hadoop-2.6.2/etc/hadoop/
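With more hosts or more files, writing each scp line by hand gets error-prone. One possible alternative is to generate the commands in a loop and review them before running anything. The sketch below only prints the commands; `print_sync_commands` is a hypothetical helper name.

```shell
# Hypothetical helper: print (not run) one scp command per file and host.
# $1 = local config dir, $2 = remote config dir, $3 = space-separated host names,
# remaining arguments = file names to copy
print_sync_commands() {
  src=$1
  dest=$2
  hosts=$3
  shift 3
  for name in "$@"; do
    for host in $hosts; do
      printf 'scp %s/%s %s:%s/\n' "$src" "$name" "$host" "$dest"
    done
  done
}

# Example, reproducing the six commands for this cluster:
# print_sync_commands etc/hadoop /home/hadoop/work/hadoop-2.6.2/etc/hadoop "bi12 bi13" \
#   hadoop-env.sh core-site.xml mapred-site.xml
```

Once the printed list looks right, piping it to `sh` executes it.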

Testing hadoop lzo

Install lzop. This requires switching to the root user.

yum install lzop

Go to the Hadoop installation directory and compress LICENSE.txt with lzop. This produces the LZO-compressed file LICENSE.txt.lzo

lzop LICENSE.txt

Upload the compressed file to HDFS

[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -mkdir /user/hadoop/wordcount/lzoinput
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -put LICENSE.txt.lzo /user/hadoop/wordcount/lzoinput
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls /user/hadoop/wordcount/lzoinput
Found 1 items
-rw-r--r--   2 hadoop supergroup       7773 2016-02-16 20:59 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo

Build an index for the LZO-compressed file

hadoop jar ./share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar com.hadoop.compression.lzo.DistributedLzoIndexer /user/hadoop/wordcount/lzoinput/
[hadoop@bi10 hadoop-2.6.2]$ hdfs dfs -ls /user/hadoop/wordcount/lzoinput/
Found 2 items
-rw-r--r--   2 hadoop supergroup       7773 2016-02-16 20:59 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo
-rw-r--r--   2 hadoop supergroup          8 2016-02-16 21:02 /user/hadoop/wordcount/lzoinput/LICENSE.txt.lzo.index
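With a single file the listing is easy to read, but in a directory holding many .lzo files it is harder to see which ones still lack an index. The sketch below reads paths on standard input and prints every .lzo path that has no matching .lzo.index entry. `unindexed_lzo` is a hypothetical helper name.

```shell
# Hypothetical helper: read one path per line on stdin and print, sorted,
# each *.lzo path that has no companion *.lzo.index path in the same input.
unindexed_lzo() {
  awk '
    /\.lzo\.index$/ { indexed[substr($0, 1, length($0) - 6)] = 1; next }  # drop ".index"
    /\.lzo$/        { lzo[$0] = 1 }
    END { for (p in lzo) if (!(p in indexed)) print p }
  ' | sort
}

# Example, taking the last column of the listing as the path:
# hdfs dfs -ls /user/hadoop/wordcount/lzoinput | awk '{print $NF}' | unindexed_lzo
```

An empty result means every .lzo file in the listing has an index next to it.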

Run wordcount on the LZO-compressed file

hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /user/hadoop/wordcount/lzoinput/ /user/hadoop/wordcount/output2

