Spark SQL: Accessing the Hive Warehouse with Beeline
I. Add hive-site.xml
Add a hive-site.xml configuration file under $SPARK_HOME/conf so that Spark SQL can reach Hive's metastore:
vim hive-site.xml
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.1.201:3306/hiveDB?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
</configuration>
Note: Hive itself does not need to be deployed on this node; all that is required is that the node can connect to the Hive metastore.
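If Hive is already deployed on another node in the cluster, the simplest way to get a consistent hive-site.xml is to copy it over rather than writing one by hand. A minimal sketch, assuming Hive lives under /home/hadoop/app/hive on a node named hadoop001 (both the host and the path are assumptions; adjust to your layout):

# Copy the metastore config from an existing Hive node (host/path are assumptions)
scp hadoop001:/home/hadoop/app/hive/conf/hive-site.xml $SPARK_HOME/conf/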
II. Start the thriftserver service
[hadoop@hadoop003 spark]$ ./sbin/start-thriftserver.sh --jars ~/softwares/mysql-connector-java-5.1.47.jar
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /home/hadoop/app/spark/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-hadoop003.out
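HiveThriftServer2 listens on port 10000 by default. If that port is already occupied, the standard HiveServer2 settings can be passed through with --hiveconf; a sketch where port 10001 and the bind host are illustrative values:

# Port 10001 and the bind host below are illustrative values, not requirements
./sbin/start-thriftserver.sh \
  --jars ~/softwares/mysql-connector-java-5.1.47.jar \
  --hiveconf hive.server2.thrift.port=10001 \
  --hiveconf hive.server2.thrift.bind.host=hadoop003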
Check the log to confirm that the thriftserver started normally:
[hadoop@hadoop003 spark]$ tail -50f /home/hadoop/app/spark/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-hadoop003.out
19/05/21 09:39:14 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
19/05/21 09:39:15 INFO metastore.ObjectStore: Initialized ObjectStore
19/05/21 09:39:15 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
19/05/21 09:39:15 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
19/05/21 09:39:15 INFO metastore.HiveMetaStore: Added admin role in metastore
19/05/21 09:39:15 INFO metastore.HiveMetaStore: Added public role in metastore
19/05/21 09:39:15 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
19/05/21 09:39:15 INFO metastore.HiveMetaStore: 0: get_all_databases
19/05/21 09:39:15 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_all_databases
19/05/21 09:39:15 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
19/05/21 09:39:15 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_functions: db=default pat=*
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:16 INFO session.SessionState: Created local directory: /tmp/73df82dd-1fd3-4dd5-97f1-680d53bd44bc_resources
19/05/21 09:39:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc
19/05/21 09:39:16 INFO session.SessionState: Created local directory: /tmp/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc
19/05/21 09:39:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc/_tmp_space.db
19/05/21 09:39:16 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is file:/home/hadoop/app/spark-2.4.2-bin-hadoop-2.6.0-cdh6.7.0/spark-warehouse
19/05/21 09:39:16 INFO session.SessionManager: Operation log root directory is created: /tmp/hadoop/operation_logs
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread pool size: 100
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread wait queue size: 100
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread keepalive time: 10 seconds
19/05/21 09:39:16 INFO service.AbstractService: Service:OperationManager is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:SessionManager is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:CLIService is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:ThriftBinaryCLIService is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:HiveServer2 is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:OperationManager is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:SessionManager is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:CLIService is started.
19/05/21 09:39:16 INFO metastore.ObjectStore: ObjectStore, initialize called
19/05/21 09:39:16 INFO DataNucleus.Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
19/05/21 09:39:16 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
19/05/21 09:39:16 INFO metastore.ObjectStore: Initialized ObjectStore
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: get_databases: default
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_databases: default
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: Shutting down the object store...
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=Shutting down the object store...
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: Metastore shutdown complete.
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=Metastore shutdown complete.
19/05/21 09:39:16 INFO service.AbstractService: Service:ThriftBinaryCLIService is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:HiveServer2 is started.
19/05/21 09:39:16 INFO thriftserver.HiveThriftServer2: HiveThriftServer2 started
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4a7c72af{/sqlserver,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3e24bab6{/sqlserver/json,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@391e2a{/sqlserver/session,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@641cea11{/sqlserver/session/json,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO thrift.ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads   # this last line indicates the server started successfully
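Besides tailing the log, a quick sanity check is to confirm that something is listening on the thrift port (this assumes netstat is available on the node):

netstat -tlnp 2>/dev/null | grep 10000   # a LISTEN entry means the thriftserver is accepting connections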
III. Start beeline
[hadoop@hadoop003 spark]$ ./bin/beeline -u jdbc:hive2://localhost:10000 -n hadoop
Connecting to jdbc:hive2://localhost:10000
19/05/21 09:46:19 INFO jdbc.Utils: Supplied authorities: localhost:10000
19/05/21 09:46:19 INFO jdbc.Utils: Resolved authority: localhost:10000
19/05/21 09:46:19 INFO jdbc.HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://localhost:10000
Connected to: Spark SQL (version 2.4.2)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive
0: jdbc:hive2://localhost:10000> select * from student.student limit 5;
+---------+-----------+-----------------+--------------------------------------------+--+
| stu_id  | stu_name  | stu_phone_num   | stu_email                                  |
+---------+-----------+-----------------+--------------------------------------------+--+
| 1       | Burke     | 1-300-746-8446  | ullamcorper.velit.in@ametnullaDonec.co.uk  |
| 2       | Kamal     | 1-668-571-5046  | pede.Suspendisse@interdumenim.edu          |
| 3       | Olga      | 1-956-311-1686  | Aenean.eget.metus@dictumcursusNunc.edu     |
| 4       | Belle     | 1-246-894-6340  | vitae.aliquet.nec@neque.co.uk              |
| 5       | Trevor    | 1-300-527-4967  | dapibus.id@acturpisegestas.net             |
+---------+-----------+-----------------+--------------------------------------------+--+
5 rows selected (3.275 seconds)
0: jdbc:hive2://localhost:10000>
Beeline connected successfully.
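Beeline can also run a statement non-interactively with its -e option, which is convenient for scripting; a small sketch reusing the same connection string:

./bin/beeline -u jdbc:hive2://localhost:10000 -n hadoop -e "show databases;"   # run one statement and exit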
IV. Notes
1. Launch beeline from the spark/bin directory
If the machine where you launch beeline also has Hive deployed, and Hive's bin directory happens to precede Spark's in your PATH, then an unqualified beeline command will most likely launch Hive's beeline rather than Spark's.
For example:
[hadoop@hadoop003 spark]$ beeline
ls: cannot access /home/hadoop/app/spark/lib/spark-assembly-*.jar: No such file or directory
which: no hbase in (/home/hadoop/app/hive/bin:/home/hadoop/app/spark/bin:/home/hadoop/app/hadoop-2.6.0-cdh6.7.0//bin:/home/hadoop/app/hadoop-2.6.0-cdh6.7.0//sbin:/home/hadoop/app/zookeeper/bin:/usr/java/jdk1.8.0_131/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hadoop/bin)
Beeline version 1.1.0-cdh6.7.0 by Apache Hive   # this is Hive's beeline, not Spark's
beeline>
Now check the environment variables:
[hadoop@hadoop003 spark]$ cat ~/.bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

#####JAVA_HOME#####
export JAVA_HOME=/usr/java/jdk1.8.0_131
####ZOOKEEPER_HOME####
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
#####HADOOP_HOME######
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh6.7.0/
export SPARK_HOME=/home/hadoop/app/spark
#####HIVE_HOME#####
export HIVE_HOME=/home/hadoop/app/hive
export PATH=$HIVE_HOME/bin:$SPARK_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$JAVA_HOME/bin:$PATH
Sure enough: $HIVE_HOME/bin comes before $SPARK_HOME/bin in PATH, so if you do not give an explicit path, the shell resolves beeline to Hive's version first.
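To see which binary an unqualified beeline resolves to, and to sidestep the PATH ordering problem entirely, invoke Spark's beeline by its full path:

which beeline                                                       # shows the first beeline found on PATH
$SPARK_HOME/bin/beeline -u jdbc:hive2://localhost:10000 -n hadoop   # unambiguous: always Spark's beeline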