Sqoop: Introduction, Installation, and Deployment

Published: 2024-10-07. Author: 千家信息网 editorial staff

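Introduction:

Apache Sqoop is a tool designed for efficiently transferring data between Apache Hadoop and structured data stores such as relational databases. You can use Sqoop to import data from external structured data stores into the Hadoop Distributed File System (HDFS) or related systems such as Hive and HBase. Conversely, Sqoop can extract data from Hadoop and export it to external structured data stores such as relational databases and enterprise data warehouses.
Sqoop is built for bulk transfer of large datasets: it splits a dataset into chunks and creates Hadoop tasks to process each chunk.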
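To make the idea of splitting a dataset into chunks concrete, here is a minimal sketch that divides a numeric key range into roughly equal slices, one per task. It is only an illustration and not Sqoop's actual splitting code; `split_range` and the column name `id` are hypothetical. In Sqoop itself, the number of parallel tasks is set with `-m`/`--num-mappers` and the split column with `--split-by`.

```shell
#!/bin/sh
# Illustrative only: divide the key range [min, max] into n contiguous
# slices of nearly equal size, one per parallel task.
# split_range is a hypothetical helper, not part of Sqoop.
split_range() (
  min=$1
  max=$2
  n=$3
  total=$(( max - min + 1 ))
  base=$(( total / n ))      # minimum rows per slice
  extra=$(( total % n ))     # the first `extra` slices get one more row
  lo=$min
  i=0
  while [ "$i" -lt "$n" ]; do
    size=$base
    if [ "$i" -lt "$extra" ]; then
      size=$(( base + 1 ))
    fi
    hi=$(( lo + size - 1 ))
    echo "task $i: id >= $lo AND id <= $hi"
    lo=$(( hi + 1 ))
    i=$(( i + 1 ))
  done
)

split_range 1 100 4
```

With the arguments shown, the four slices cover ids 1 to 25, 26 to 50, 51 to 75, and 76 to 100.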

The installation and deployment steps are described below:

1. Download and extract the installation package

The author used sqoop-1.4.6-cdh6.7.0.tar.gz; the download URL appears in the wget command below.

# Download the package with wget (or download it locally and upload it to the Linux host)
[hadoop@hadoop000 software]$ pwd
/home/hadoop/software
[hadoop@hadoop000 software]$ wget http://archive.cloudera.com/cdh6/cdh/5/sqoop-1.4.6-cdh6.7.0.tar.gz
# Extract
[hadoop@hadoop000 software]$ tar -xzvf sqoop-1.4.6-cdh6.7.0.tar.gz -C /home/hadoop/app/
# Inspect the extracted directory
[hadoop@hadoop000 sqoop-1.4.6-cdh6.7.0]$ ls -lh
total 1.9M
drwxr-xr-x 2 hadoop hadoop  4.0K Jul  3 16:00 bin  -- executable scripts
-rw-rw-r-- 1 hadoop hadoop   60K Mar 24  2016 build.xml
-rw-rw-r-- 1 hadoop hadoop  1.1K Mar 24  2016 cdh.build.properties
-rw-rw-r-- 1 hadoop hadoop   35K Mar 24  2016 CHANGELOG.txt
drwxr-xr-x 4 hadoop hadoop  4.0K Jul  3 16:00 cloudera
-rw-rw-r-- 1 hadoop hadoop  6.8K Mar 24  2016 cloudera-pom.xml
-rw-rw-r-- 1 hadoop hadoop  9.7K Mar 24  2016 COMPILING.txt
drwxr-xr-x 2 hadoop hadoop  4.0K Jul  3 16:00 conf  -- configuration files
drwxr-xr-x 5 hadoop hadoop  4.0K Jul  3 16:00 docs  -- documentation
drwxr-xr-x 2 hadoop hadoop  4.0K Jul  3 16:00 ivy
-rw-rw-r-- 1 hadoop hadoop   17K Mar 24  2016 ivy.xml
drwxr-xr-x 2 hadoop hadoop  4.0K Jul  3 16:00 lib  -- dependency jars
-rw-rw-r-- 1 hadoop hadoop   15K Mar 24  2016 LICENSE.txt
-rw-rw-r-- 1 hadoop hadoop   505 Mar 24  2016 NOTICE.txt
-rw-rw-r-- 1 hadoop hadoop   19K Mar 24  2016 pom-old.xml
-rw-rw-r-- 1 hadoop hadoop  1.1K Mar 24  2016 README.txt
-rw-rw-r-- 1 hadoop hadoop 1012K Mar 24  2016 sqoop-1.4.6-cdh6.7.0.jar  -- the complete Sqoop jar
-rw-rw-r-- 1 hadoop hadoop  6.5K Mar 24  2016 sqoop-patch-review.py
-rw-rw-r-- 1 hadoop hadoop  641K Mar 24  2016 sqoop-test-1.4.6-cdh6.7.0.jar
drwxr-xr-x 7 hadoop hadoop  4.0K Mar 24  2016 src  -- source code
drwxr-xr-x 4 hadoop hadoop  4.0K Jul  3 16:00 testdata
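The extraction step relies on the target directory already existing, because `tar -C` fails otherwise. A small sketch of a wrapper that creates the directory first is shown here; `extract_to` is a hypothetical helper, not something the original steps use.

```shell
#!/bin/sh
# Hypothetical helper: extract a .tar.gz archive into a target directory,
# creating the directory first because `tar -C` fails when it is missing.
extract_to() (
  archive=$1
  target=$2
  if [ ! -f "$archive" ]; then
    echo "missing archive: $archive" >&2
    exit 1
  fi
  mkdir -p "$target" && tar -xzf "$archive" -C "$target"
)

# Example call with the paths used in this article:
# extract_to sqoop-1.4.6-cdh6.7.0.tar.gz /home/hadoop/app
```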

2. Configure environment variables

# Add the Sqoop environment variables; they can be set globally or only for the current user
[hadoop@hadoop000 ~]$ sudo vi /etc/profile
export SQOOP_HOME=/home/hadoop/app/sqoop-1.4.6-cdh6.7.0
export PATH=$SQOOP_HOME/bin:$PATH
[hadoop@hadoop000 ~]$ source /etc/profile
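For the per-user variant, the same two export lines can be appended to a personal profile file instead of /etc/profile. The sketch below adds a line only when it is not already present, so running the setup again does not duplicate entries. `append_once` is a hypothetical helper, and the choice of `~/.bash_profile` is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line is
# not already there, so repeated runs do not duplicate the exports.
append_once() (
  file=$1
  line=$2
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
)

# Example for a per-user profile (the file name is an assumption):
# append_once "$HOME/.bash_profile" 'export SQOOP_HOME=/home/hadoop/app/sqoop-1.4.6-cdh6.7.0'
# append_once "$HOME/.bash_profile" 'export PATH=$SQOOP_HOME/bin:$PATH'
```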

3. Edit the Sqoop configuration file

[hadoop@hadoop000 conf]$ pwd
/home/hadoop/app/sqoop-1.4.6-cdh6.7.0/conf
[hadoop@hadoop000 conf]$ cp sqoop-env-template.sh sqoop-env.sh
# Add the Hadoop and Hive directories
[hadoop@hadoop000 conf]$ vi sqoop-env.sh
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/app/hadoop-2.6.0-cdh6.7.0
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/app/hadoop-2.6.0-cdh6.7.0
#set the path to where bin/hbase is available
#export HBASE_HOME=
#Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/app/hive-1.1.0-cdh6.7.0
#Set the path for where zookeper config dir is
#export ZOOCFGDIR=
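A mistyped path in sqoop-env.sh is easy to miss, so it can help to confirm that each configured variable points at a real directory. The following is a rough sketch under that assumption; `check_dirs` is a hypothetical helper and is not shipped with Sqoop.

```shell
#!/bin/sh
# Hypothetical check: for each named environment variable, report whether
# its value is an existing directory. Exits non-zero if any is missing.
check_dirs() (
  status=0
  for name in "$@"; do
    eval "value=\${$name:-}"   # look up the variable by name
    if [ -d "$value" ]; then
      echo "ok: $name=$value"
    else
      echo "missing: $name=$value"
      status=1
    fi
  done
  exit "$status"
)

# After sourcing conf/sqoop-env.sh, one might run:
# check_dirs HADOOP_COMMON_HOME HADOOP_MAPRED_HOME HIVE_HOME
```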

4. Copy the JDBC driver jar into the sqoop/lib directory

# Copy the MySQL driver jar over from Hive's lib directory
[hadoop@hadoop000 lib]$ pwd
/home/hadoop/app/sqoop-1.4.6-cdh6.7.0/lib
[hadoop@hadoop000 lib]$ cp /home/hadoop/app/hive-1.1.0-cdh6.7.0/lib/mysql-connector-java-5.1.46.jar .
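After the copy, a quick check that a connector jar is actually present in the lib directory can save a confusing driver error later. This is a minimal sketch; `has_mysql_driver` is a hypothetical helper, and the file name pattern simply follows the jar name used above.

```shell
#!/bin/sh
# Hypothetical check: succeed if at least one MySQL connector jar exists
# in the given lib directory.
has_mysql_driver() (
  dir=$1
  for jar in "$dir"/mysql-connector-java-*.jar; do
    # If the glob matches nothing, $jar is the literal pattern and -f fails.
    if [ -f "$jar" ]; then
      exit 0
    fi
  done
  exit 1
)

# Example with the path from this article:
# has_mysql_driver /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/lib && echo "driver found"
```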

5. A simple Sqoop smoke test

# Show the command help
[hadoop@hadoop000 ~]$ sqoop help
Warning: /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/03 16:23:05 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh6.7.0
usage: sqoop COMMAND [ARGS]
Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information
See 'sqoop help COMMAND' for information on a specific command.
# Show the Sqoop version
[hadoop@hadoop000 ~]$ sqoop version
Warning: /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/hadoop/app/sqoop-1.4.6-cdh6.7.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/03 16:23:30 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh6.7.0
Sqoop 1.4.6-cdh6.7.0
git commit id
Compiled by jenkins on Wed Mar 23 11:30:51 PDT 2016
# The warnings appear because HBase, ZooKeeper, and HCatalog are not configured in this environment
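A natural next check is whether Sqoop can reach MySQL through the driver copied in step 4, for example with the `list-databases` command from the help output. The sketch below only prints such a command rather than running it, since the host, port, and user are placeholders that are not given in this article; `list_databases_cmd` is a hypothetical helper. The `-P` option makes Sqoop prompt for the password on the console.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a `sqoop list-databases` command
# for a MySQL server. Host, port, and user are placeholders to adapt.
list_databases_cmd() (
  host=$1
  port=$2
  user=$3
  printf 'sqoop list-databases --connect jdbc:mysql://%s:%s --username %s -P\n' \
    "$host" "$port" "$user"
)

list_databases_cmd localhost 3306 root
```

Copying the printed line into the shell runs the actual check against your own server.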