
Hive on Spark Deployment

Published: 2025-01-31 · Author: 千家信息网 editor · Last updated: 2025-01-31

I. Environment

1. ZooKeeper cluster

10.10.103.144:2181,10.10.103.246:2181,10.10.103.62:2181

2. Metastore database

10.10.103.246:3306

II. Installation

1. Install and configure the database

yum -y install mysql55-server mysql55
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'10.10.103.246' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'127.0.0.1' IDENTIFIED BY 'hive';
CREATE DATABASE IF NOT EXISTS metastore;
USE metastore;
SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-1.1.0.mysql.sql;
# The script above reports an error; after it runs, also execute the SQL below:
SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-txn-schema-0.13.0.mysql.sql;
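The GRANT statements above differ only in the client host, so they are easy to mistype when copied by hand. As a minimal sketch (host list and credentials taken from this article; adjust for your cluster), they can be generated instead:

```python
# Sketch: generate the per-host GRANT statements used above.
# Hosts and credentials mirror this article's setup; change them for your cluster.
HOSTS = ["localhost", "10.10.103.246", "127.0.0.1"]

def grant_statements(db="metastore", user="hive", password="hive", hosts=HOSTS):
    """Return one MySQL 5.5-style GRANT statement per client host."""
    return [
        f"GRANT ALL PRIVILEGES ON {db}.* TO '{user}'@'{h}' IDENTIFIED BY '{password}';"
        for h in hosts
    ]

for stmt in grant_statements():
    print(stmt)
```

Piping the output into the `mysql` client then applies all grants in one step.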

2. Install Hive

yum -y install hive hive-jdbc hive-metastore hive-server2

3. Configuration

vim /etc/hive/conf/hive-site.xml

<configuration>
  <property>
    <name>hive.execution.engine</name>
    <value>spark</value>
  </property>
  <property>
    <name>hive.enable.spark.execution.engine</name>
    <value>true</value>
  </property>
  <property>
    <name>spark.master</name>
    <value>yarn-client</value>
  </property>
  <property>
    <name>spark.eventLog.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>spark.eventLog.dir</name>
    <value>hdfs://mycluster:8020/spark-log</value>
  </property>
  <property>
    <name>spark.serializer</name>
    <value>org.apache.spark.serializer.KryoSerializer</value>
  </property>
  <property>
    <name>spark.executor.memory</name>
    <value>1g</value>
  </property>
  <property>
    <name>spark.driver.memory</name>
    <value>1g</value>
  </property>
  <property>
    <name>spark.executor.extraJavaOptions</name>
    <value>-XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://10.10.103.246:9083</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://10.10.103.246/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>false</value>
  </property>
  <property>
    <name>datanucleus.fixedDatastore</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.autoStartMechanism</name>
    <value>SchemaTable</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>10.10.103.144:2181,10.10.103.246:2181,10.10.103.62:2181</value>
  </property>
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///usr/lib/hive/lib/zookeeper.jar</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
</configuration>
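Hadoop-style XML configuration fails quietly: a misspelled or missing property name is simply ignored (the `HiveConf of name ... does not exist` warnings in the verification step below are the only hint). A minimal sketch, assuming the standard `<configuration>`/`<property>` layout; the required-property list is drawn from the config in this article and is not exhaustive:

```python
# Sketch: sanity-check a hive-site.xml for properties this deployment relies on.
import xml.etree.ElementTree as ET

# Subset of the properties set in this article; extend as needed.
REQUIRED = {
    "hive.execution.engine",
    "spark.master",
    "spark.eventLog.enabled",
    "hive.metastore.uris",
    "javax.jdo.option.ConnectionURL",
}

def load_properties(xml_text):
    """Parse a Hadoop-style <configuration> document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }

def missing_properties(xml_text, required=REQUIRED):
    """Return the required property names absent from the config, sorted."""
    return sorted(required - set(load_properties(xml_text)))
```

Running `missing_properties(open("/etc/hive/conf/hive-site.xml").read())` before restarting services catches typos such as `spark.enentLog.enabled` for `spark.eventLog.enabled`.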

4. Start the metastore service

/etc/init.d/hive-metastore start
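Once started, the metastore listens on the Thrift port from `hive.metastore.uris` (9083 here). A quick TCP probe, sketched below, confirms the service is accepting connections without needing a Hive client:

```python
# Sketch: check whether the metastore Thrift port is accepting TCP connections.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("10.10.103.246", 9083) should return True once the service is up
```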

5. Verification

[root@ip-10-10-103-246 conf]# hive
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
17/05/12 15:04:47 WARN conf.HiveConf: HiveConf of name hive.metastore.local does not exist
17/05/12 15:04:47 WARN conf.HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> create table navy1(ts BIGINT, line STRING);
OK
Time taken: 0.925 seconds
hive> select count(*) from navy1;
Query ID = root_20170512150505_8f7fb28e-cf32-4efc-bb95-6add37f13fb6
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = f045ab15-baaa-40e7-9641-d821fa313abe
Running with YARN Application = application_1494472050574_0014
Kill Command = /usr/lib/hadoop/bin/yarn application -kill application_1494472050574_0014
Query Hive on Spark job[0] stages:
0
1
Status: Running (Hive on Spark job[0])
Job Progress Format
CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
2017-05-12 15:05:30,835 Stage-0_0: 0(+1)/1      Stage-1_0: 0/1
2017-05-12 15:05:33,853 Stage-0_0: 1/1 Finished Stage-1_0: 1/1 Finished
Status: Finished successfully in 16.05 seconds
OK
0
Time taken: 19.325 seconds, Fetched: 1 row(s)
hive>

6. Problems encountered

Error:

hive> select count(*) from test;
Query ID = root_20170512143232_48d9f363-7b60-4414-9310-e6348104f476
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.initiateSparkConf(HiveSparkClientFactory.java:74)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.setup(SparkSessionManagerImpl.java:81)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:102)
        at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111)
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:99)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1979)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1692)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1424)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1208)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1198)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:775)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 24 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. org/apache/hadoop/hbase/HBaseConfiguration

Fix:

yum -y install hbase
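Installing the hbase package works because it places the jar containing org.apache.hadoop.hbase.HBaseConfiguration where Hive picks it up. When diagnosing a `ClassNotFoundException` like the one above, it can help to find which jar actually provides a class; a minimal sketch (the directory path is an example, not from the original article):

```python
# Sketch: find which jars under a directory contain a given class,
# useful for diagnosing NoClassDefFoundError / ClassNotFoundException.
import pathlib
import zipfile

def jars_providing(class_name, search_dir):
    """Yield paths of jars whose entries include the fully-qualified class."""
    entry = class_name.replace(".", "/") + ".class"
    for jar in pathlib.Path(search_dir).rglob("*.jar"):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    yield jar
        except zipfile.BadZipFile:
            continue  # skip corrupt or non-jar files

# e.g. list(jars_providing("org.apache.hadoop.hbase.HBaseConfiguration", "/usr/lib/hbase"))
```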
