千家信息网

Hive Deployment

Published: 2025-01-23  Author: 千家信息网 editors

Install MySQL, then create the hive database and grant privileges on it as follows:

[root@oversea-stable mysql]# systemctl start mysqld
[root@oversea-stable mysql]# grep temporary /var/log/mysqld.log
2018-06-15T06:51:27.814464Z 1 [Note] A temporary password is generated for root@localhost: rdhC?1g+NSH,
2018-06-15T06:51:31.666047Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
[root@oversea-stable mysql]# mysql -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.22
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> alter user 'root'@'localhost' identified by '********';
Query OK, 0 rows affected (0.00 sec)
mysql> create database hive character set utf8;
Query OK, 1 row affected (0.00 sec)
mysql> grant all privileges on hive.* to 'hive'@'localhost' identified by '********';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> grant all privileges on hive.* to 'hive'@'%' identified by '********';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
mysql> quit
Bye
[root@oversea-stable ~]# systemctl enable mysqld
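The `grant ... identified by` form is deprecated in MySQL 5.7 and was removed in MySQL 8.0, where the account has to be created with `CREATE USER` first. A minimal sketch of the same setup in that two-step form, assuming the database and account are both named hive as above. `hive_metastore_sql` is a hypothetical helper that only prints SQL, and the password `change-me` is a placeholder that must not contain a single quote.

```shell
#!/bin/sh
# Print SQL that creates the metastore database and a dedicated account.
# hive_metastore_sql is a hypothetical helper; it only writes to stdout.
hive_metastore_sql() {
    db=$1; user=$2; pass=$3
    printf "CREATE DATABASE IF NOT EXISTS %s CHARACTER SET utf8;\n" "$db"
    # Same two hosts as in the session above: localhost and the % wildcard.
    for host in localhost %; do
        printf "CREATE USER IF NOT EXISTS '%s'@'%s' IDENTIFIED BY '%s';\n" "$user" "$host" "$pass"
        printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%s';\n" "$db" "$user" "$host"
    done
    printf "FLUSH PRIVILEGES;\n"
}

hive_metastore_sql hive hive 'change-me'
```

The output can be reviewed first and then piped into `mysql -u root -p`.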

Download Hive 2.3.3 and extract it as follows:

[hadoop@bus-stable ~]$ wget http://mirrors.hust.edu.cn/apache/hive/stable-2/apache-hive-2.3.3-bin.tar.gz
--2018-06-15 15:00:41--  http://mirrors.hust.edu.cn/apache/hive/stable-2/apache-hive-2.3.3-bin.tar.gz
Resolving mirrors.hust.edu.cn (mirrors.hust.edu.cn)... 202.114.18.160
Connecting to mirrors.hust.edu.cn (mirrors.hust.edu.cn)|202.114.18.160|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 232229830 (221M) [application/octet-stream]
Saving to: 'apache-hive-2.3.3-bin.tar.gz'
100%[==========================================================================>] 232,229,830  635KB/s   in 12m 2s
2018-06-15 15:12:43 (314 KB/s) - 'apache-hive-2.3.3-bin.tar.gz' saved [232229830/232229830]
[hadoop@bus-stable ~]$ tar xfz apache-hive-2.3.3-bin.tar.gz  -C /opt/
[hadoop@bus-stable ~]$ cd /opt/
[hadoop@bus-stable opt]$ ln -s apache-hive-2.3.3-bin hive
[hadoop@bus-stable opt]$ cd hive/conf/
hive-site.xml is not provided by default. Only the configuration template hive-default.xml.template is shipped, so it has to be copied to hive-site.xml:
[hadoop@bus-stable conf]$ cp hive-env.sh{.template,}
[hadoop@bus-stable conf]$ cp hive-default.xml.template hive-site.xml
[hadoop@bus-stable conf]$ vim hive-env.sh
[hadoop@bus-stable conf]$ tail -8 hive-env.sh
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/opt/hadoop
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/hive/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/opt/hive/lib
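The download above comes from a plain-HTTP mirror, so it may be worth comparing the tarball against the SHA-256 checksum that Apache publishes for the release before extracting it. A minimal sketch, assuming `sha256sum` (or `shasum` as a fallback) is installed. `sha256_of` and `verify_sha256` are hypothetical helper names, and the demo checks an empty file only because its SHA-256 digest is a well-known constant.

```shell
#!/bin/sh
# Print the SHA-256 hex digest of a file, using whichever tool is available.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# Succeed only when the file's digest equals the expected hex string.
verify_sha256() {
    [ "$(sha256_of "$1")" = "$2" ]
}

# Demo on an empty file, whose SHA-256 digest is a fixed, well-known value.
empty=$(mktemp)
if verify_sha256 "$empty" e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855; then
    echo "checksum matches"
fi
rm -f "$empty"
```

For the real archive the call would look like `verify_sha256 apache-hive-2.3.3-bin.tar.gz <digest>`, with the digest copied from the checksum file published alongside the release.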

Configure JDBC
Copy the JDBC connector jar into hive/lib as follows:

[hadoop@bus-stable hive]$ cd /tmp/
[hadoop@bus-stable tmp]$ tar xfz mysql-connector-java-5.1.46.tar.gz
[hadoop@bus-stable tmp]$ cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar /opt/hive/lib/
[root@bus-stable ~]# vim /etc/profile
export HIVE_HOME=/opt/hive
PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH
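Before moving on, the two results of this step (the connector jar in hive/lib and hive/bin on PATH) can be checked from a fresh login shell. A minimal sketch, assuming the /opt/hive layout used above. `has_mysql_connector` and `path_contains` are hypothetical helper names.

```shell
#!/bin/sh
# Succeed when at least one MySQL connector jar is present in the given lib directory.
has_mysql_connector() {
    lib_dir=$1
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        [ -f "$jar" ] && return 0
    done
    return 1
}

# Succeed when the given directory appears as an entry of PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

HIVE_HOME=${HIVE_HOME:-/opt/hive}
has_mysql_connector "$HIVE_HOME/lib" || echo "no MySQL connector jar in $HIVE_HOME/lib"
path_contains "$HIVE_HOME/bin" || echo "$HIVE_HOME/bin is not on PATH"
```

The script prints nothing when both checks pass.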

Create the directories Hive uses on HDFS
hive-site.xml contains the following default setting:

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>

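So change to the Hadoop installation directory and use the hadoop command to create /user/hive/warehouse on HDFS, then open up its permissions so that it can be used for file storage, as follows: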

[hadoop@bus-stable opt]$ cd hadoop
[hadoop@bus-stable hadoop]$ bin/hadoop fs -mkdir -p /user/hive/warehouse
[hadoop@bus-stable hadoop]$ bin/hadoop fs -mkdir -p /user/hive/tmp
[hadoop@bus-stable hadoop]$ bin/hadoop fs -mkdir -p /user/hive/log
[hadoop@bus-stable hadoop]$ bin/hadoop fs -chmod -R 777 /user/hive/warehouse
[hadoop@bus-stable hadoop]$ bin/hadoop fs -chmod -R 777 /user/hive/tmp
[hadoop@bus-stable hadoop]$ bin/hadoop fs -chmod -R 777 /user/hive/log
Check that the directories were created with the following command:
[hadoop@bus-stable hadoop]$ hadoop fs -ls /user/hive
Found 3 items
drwxrwxrwx   - hadoop supergroup          0 2018-06-15 16:13 /user/hive/log
drwxrwxrwx   - hadoop supergroup          0 2018-06-15 16:13 /user/hive/tmp
drwxrwxrwx   - hadoop supergroup          0 2018-06-15 16:13 /user/hive/warehouse
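The six commands above follow one pattern per directory, so they can be written as a loop. A minimal sketch, assuming the same three directories and the same world-writable 777 mode used above. `run` and `make_hive_dirs` are hypothetical helper names, and the script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create each Hive directory on HDFS and open up its permissions.
make_hive_dirs() {
    for dir in warehouse tmp log; do
        run hadoop fs -mkdir -p "/user/hive/$dir"
        run hadoop fs -chmod -R 777 "/user/hive/$dir"
    done
}

make_hive_dirs
```

Running it with `DRY_RUN=0` executes the commands, which requires `hadoop` to be on PATH.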

Edit hive-site.xml
(1) Search for hive.exec.scratchdir and change the value for that name to /user/hive/tmp

<property>
  <name>hive.exec.scratchdir</name>
  <value>/user/hive/tmp</value>
</property>

(2) Search for hive.querylog.location and change the value for that name to /user/hive/log/hadoop

<property>
  <name>hive.querylog.location</name>
  <value>/user/hive/log/hadoop</value>
  <description>Location of Hive run time structured log file</description>
</property>

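(3) Search for javax.jdo.option.ConnectionURL and change the value for that name to the MySQL address. Inside the XML file, every & in the URL must be written as &amp;.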

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://oversea-stable:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
  <description>
    JDBC connect string for a JDBC metastore.
    To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
    For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
  </description>
</property>

(4) Search for javax.jdo.option.ConnectionDriverName and change the value for that name to the MySQL driver class name

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

(5) Search for javax.jdo.option.ConnectionUserName and change the value to the MySQL login name

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>Username to use against metastore database</description>
</property>

(6) Search for javax.jdo.option.ConnectionPassword and change the value to the MySQL login password

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive_db_password</value>
  <description>password to use against metastore database</description>
</property>

(7) Specify the tmp directory Hive uses at run time
Create the tmp directory:
[hadoop@bus-stable hive]$ mkdir tmp
Then, in hive-site.xml:
replace every ${system:java.io.tmpdir} with /opt/hive/tmp
replace every ${system:user.name} with ${user.name}
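The template contains many occurrences of the `${system:java.io.tmpdir}` and `${system:user.name}` placeholders, so the replacement in step (7) is easier to do with sed than by hand. A minimal sketch, assuming the target directory contains neither `|` nor `&`. `fix_tmpdir_placeholders` is a hypothetical helper that keeps a `.bak` copy of the file, and the demo runs on a one-line sample file, not on the real hive-site.xml.

```shell
#!/bin/sh
# Rewrite the two placeholders in a hive-site.xml-style file.
# A backup of the original is kept next to it as <file>.bak.
fix_tmpdir_placeholders() {
    file=$1; tmp_dir=$2
    cp "$file" "$file.bak"
    sed -e 's|\${system:java.io.tmpdir}|'"$tmp_dir"'|g' \
        -e 's|\${system:user.name}|${user.name}|g' \
        "$file.bak" > "$file"
}

# Demo on a sample line that uses both placeholders.
demo=$(mktemp)
printf '%s\n' '<value>${system:java.io.tmpdir}/${system:user.name}</value>' > "$demo"
fix_tmpdir_placeholders "$demo" /opt/hive/tmp
cat "$demo"    # prints <value>/opt/hive/tmp/${user.name}</value>
rm -f "$demo" "$demo.bak"
```

For the real file the call would look like `fix_tmpdir_placeholders /opt/hive/conf/hive-site.xml /opt/hive/tmp`.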

Initialize the metastore schema in MySQL as follows:

[hadoop@bus-stable hive]$ bin/schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://oversea-stable:3306/hive?createDatabaseIfNotExist=true&characterEncoding=UTF-8&useSSL=false
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       hive
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
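When this step is scripted, the SLF4J noise makes it easy to miss a failure. A minimal sketch that scans the captured output for the `schemaTool completed` line shown in the transcript above. `schema_init_ok` is a hypothetical helper, and treating that line as the success marker is an assumption drawn only from this output.

```shell
#!/bin/sh
# Read schematool output on stdin; succeed only if the completion marker
# from the transcript above ("schemaTool completed") appears.
schema_init_ok() {
    grep '^schemaTool completed' >/dev/null
}

# Demo with the last two lines of the transcript as sample input.
if printf 'Initialization script completed\nschemaTool completed\n' | schema_init_ok; then
    echo "schema initialized"
fi
```

With a real run the output can be saved and checked in one go, for example `bin/schematool -initSchema -dbType mysql 2>&1 | tee init.log | schema_init_ok`.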

Run hive and test it as follows:

[hadoop@bus-stable hive]$ hive
which: no hbase in (/usr/java/latest/bin:/opt/hadoop/bin:/opt/hive/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hadoop/.local/bin:/home/hadoop/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/opt/apache-hive-2.3.3-bin/lib/hive-common-2.3.3.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>  create database inspiry;
OK
Time taken: 5.999 seconds
hive> use inspiry;
OK
Time taken: 0.032 seconds
hive>  create table test (mykey string,myval string);
OK
Time taken: 0.657 seconds
hive> insert into test values("1","www.inspiry.cn");
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20180615164443_64b15698-8522-45d0-8867-a0271f2c88be
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1529036356241_0005, Tracking URL = http://oversea-stable:8088/proxy/application_1529036356241_0005/
Kill Command = /opt/hadoop/bin/hadoop job  -kill job_1529036356241_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2018-06-15 16:45:08,423 Stage-1 map = 0%,  reduce = 0%
2018-06-15 16:45:17,910 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.58 sec
MapReduce Total cumulative CPU time: 2 seconds 580 msec
Ended Job = job_1529036356241_0005
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://inspiryhdfs/user/hive/warehouse/inspiry.db/test/.hive-staging_hive_2018-06-15_16-44-43_063_5287772895297314577-1/-ext-10000
Loading data to table inspiry.test
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 2.58 sec   HDFS Read: 4081 HDFS Write: 85 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 580 msec
OK
Time taken: 37.688 seconds
hive>  select * from test;
OK
1       www.inspiry.cn
Time taken: 0.235 seconds, Fetched: 1 row(s)
hive> quit;
[hadoop@bus-stable hive]$
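The same smoke test can be run non-interactively by putting the statements in a file and passing it to `hive -f`. A minimal sketch, assuming the inspiry database and test table from the session above. `write_smoke_test` and the file name hive_smoke.hql are hypothetical, and `IF NOT EXISTS` is added here (it is not in the original session) so that the script can be re-run, although each run still inserts another row.

```shell
#!/bin/sh
# Write the statements from the interactive session to a HiveQL file.
write_smoke_test() {
    cat > "$1" <<'EOF'
CREATE DATABASE IF NOT EXISTS inspiry;
USE inspiry;
CREATE TABLE IF NOT EXISTS test (mykey STRING, myval STRING);
INSERT INTO test VALUES ('1', 'www.inspiry.cn');
SELECT * FROM test;
EOF
}

hql=${TMPDIR:-/tmp}/hive_smoke.hql
write_smoke_test "$hql"

# Run the file only when the hive launcher is actually on PATH.
if command -v hive >/dev/null 2>&1; then
    hive -f "$hql"
else
    echo "hive not found on PATH; statements written to $hql"
fi
```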

The test table of the inspiry database can then be viewed on HDFS in a browser.