Spark SQL: Accessing the Hive Warehouse with Beeline
Published: 2024-10-06  Author: 千家信息网 editor
1. Add hive-site.xml
Add a hive-site.xml configuration file under $SPARK_HOME/conf so that Spark can reach the Hive metastore:
vim hive-site.xml
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.1.201:3306/hiveDB?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
</configuration>
Note: Hive itself does not need to be deployed on this node; all that matters is that the node can connect to the Hive metastore.
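If you manage several nodes, the same configuration can be generated programmatically. The sketch below is illustrative only (the `write_hive_site` helper is not part of any Spark or Hive API); it writes the same six properties shown above using only the Python standard library:

```python
import xml.etree.ElementTree as ET

# Properties mirroring the hive-site.xml shown above.
HIVE_PROPS = {
    "javax.jdo.option.ConnectionURL":
        "jdbc:mysql://192.168.1.201:3306/hiveDB?createDatabaseIfNotExist=true",
    "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
    "javax.jdo.option.ConnectionUserName": "root",
    "javax.jdo.option.ConnectionPassword": "123456",
    "hive.cli.print.header": "true",
    "hive.cli.print.current.db": "true",
}

def write_hive_site(path, props):
    """Write a Hadoop-style <configuration> file, one <property> per entry."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.ElementTree(root).write(path)

write_hive_site("hive-site.xml", HIVE_PROPS)
```

The output follows the standard Hadoop configuration layout, so it can be dropped into $SPARK_HOME/conf as-is.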
2. Start the thriftserver service
[hadoop@hadoop003 spark]$ ./sbin/start-thriftserver.sh --jars ~/softwares/mysql-connector-java-5.1.47.jar
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /home/hadoop/app/spark/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-hadoop003.out
Check the log to confirm that the thriftserver started correctly:
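Rather than eyeballing the whole log, you can scan it for the lines that mark a successful start. A minimal sketch (the `thriftserver_started` helper and the sample text are illustrative, not part of Spark):

```python
def thriftserver_started(log_text: str) -> bool:
    """Return True if the log contains both thriftserver success markers."""
    markers = (
        "HiveThriftServer2 started",
        "Starting ThriftBinaryCLIService on port",
    )
    return all(m in log_text for m in markers)

# Sample lines copied from the log output below.
sample = (
    "19/05/21 09:39:16 INFO thriftserver.HiveThriftServer2: HiveThriftServer2 started\n"
    "19/05/21 09:39:16 INFO thrift.ThriftCLIService: "
    "Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads\n"
)
print(thriftserver_started(sample))  # True
```

In practice you would read the log file (e.g. `open(path).read()`) instead of a sample string.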
[hadoop@hadoop003 spark]$ tail -50f /home/hadoop/app/spark/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-hadoop003.out
19/05/21 09:39:14 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
19/05/21 09:39:15 INFO metastore.ObjectStore: Initialized ObjectStore
19/05/21 09:39:15 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
19/05/21 09:39:15 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
19/05/21 09:39:15 INFO metastore.HiveMetaStore: Added admin role in metastore
19/05/21 09:39:15 INFO metastore.HiveMetaStore: Added public role in metastore
19/05/21 09:39:15 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
19/05/21 09:39:15 INFO metastore.HiveMetaStore: 0: get_all_databases
19/05/21 09:39:15 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_all_databases
19/05/21 09:39:15 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
19/05/21 09:39:15 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_functions: db=default pat=*
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:16 INFO session.SessionState: Created local directory: /tmp/73df82dd-1fd3-4dd5-97f1-680d53bd44bc_resources
19/05/21 09:39:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc
19/05/21 09:39:16 INFO session.SessionState: Created local directory: /tmp/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc
19/05/21 09:39:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc/_tmp_space.db
19/05/21 09:39:16 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is file:/home/hadoop/app/spark-2.4.2-bin-hadoop-2.6.0-cdh6.7.0/spark-warehouse
19/05/21 09:39:16 INFO session.SessionManager: Operation log root directory is created: /tmp/hadoop/operation_logs
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread pool size: 100
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread wait queue size: 100
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread keepalive time: 10 seconds
19/05/21 09:39:16 INFO service.AbstractService: Service:OperationManager is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:SessionManager is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service: CLIService is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:ThriftBinaryCLIService is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service: HiveServer2 is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:OperationManager is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:SessionManager is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:CLIService is started.
19/05/21 09:39:16 INFO metastore.ObjectStore: ObjectStore, initialize called
19/05/21 09:39:16 INFO DataNucleus.Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
19/05/21 09:39:16 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
19/05/21 09:39:16 INFO metastore.ObjectStore: Initialized ObjectStore
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: get_databases: default
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_databases: default
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: Shutting down the object store...
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=Shutting down the object store...
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: Metastore shutdown complete.
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=Metastore shutdown complete.
19/05/21 09:39:16 INFO service.AbstractService: Service:ThriftBinaryCLIService is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:HiveServer2 is started.
19/05/21 09:39:16 INFO thriftserver.HiveThriftServer2: HiveThriftServer2 started
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4a7c72af{/sqlserver,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3e24bab6{/sqlserver/json,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@391e2a{/sqlserver/session,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@641cea11{/sqlserver/session/json,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO thrift.ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads   # indicates a successful start
3. Start beeline
[hadoop@hadoop003 spark]$ ./bin/beeline -u jdbc:hive2://localhost:10000 -n hadoop
Connecting to jdbc:hive2://localhost:10000
19/05/21 09:46:19 INFO jdbc.Utils: Supplied authorities: localhost:10000
19/05/21 09:46:19 INFO jdbc.Utils: Resolved authority: localhost:10000
19/05/21 09:46:19 INFO jdbc.HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://localhost:10000
Connected to: Spark SQL (version 2.4.2)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive
0: jdbc:hive2://localhost:10000> select * from student.student limit 5;
+---------+-----------+-----------------+--------------------------------------------+--+
| stu_id  | stu_name  | stu_phone_num   | stu_email                                  |
+---------+-----------+-----------------+--------------------------------------------+--+
| 1       | Burke     | 1-300-746-8446  | ullamcorper.velit.in@ametnullaDonec.co.uk  |
| 2       | Kamal     | 1-668-571-5046  | pede.Suspendisse@interdumenim.edu          |
| 3       | Olga      | 1-956-311-1686  | Aenean.eget.metus@dictumcursusNunc.edu     |
| 4       | Belle     | 1-246-894-6340  | vitae.aliquet.nec@neque.co.uk              |
| 5       | Trevor    | 1-300-527-4967  | dapibus.id@acturpisegestas.net             |
+---------+-----------+-----------------+--------------------------------------------+--+
5 rows selected (3.275 seconds)
0: jdbc:hive2://localhost:10000>
Beeline connected successfully.
4. Notes
1. It is best to start beeline from the spark/bin directory.
If the machine where you start beeline also has Hive deployed, and Hive's bin directory happens to come before Spark's in your PATH, then an unqualified `beeline` will launch Hive's beeline instead of Spark's.
For example:
[hadoop@hadoop003 spark]$ beeline
ls: cannot access /home/hadoop/app/spark/lib/spark-assembly-*.jar: No such file or directory
which: no hbase in (/home/hadoop/app/hive/bin:/home/hadoop/app/spark/bin:/home/hadoop/app/hadoop-2.6.0-cdh6.7.0//bin:/home/hadoop/app/hadoop-2.6.0-cdh6.7.0//sbin:/home/hadoop/app/zookeeper/bin:/usr/java/jdk1.8.0_131/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hadoop/bin)
Beeline version 1.1.0-cdh6.7.0 by Apache Hive    # this is Hive's beeline
beeline>
Now check the environment variables:
[hadoop@hadoop003 spark]$ cat ~/.bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

#####JAVA_HOME#####
export JAVA_HOME=/usr/java/jdk1.8.0_131
####ZOOKEEPER_HOME####
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
#####HADOOP_HOME######
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh6.7.0/
export SPARK_HOME=/home/hadoop/app/spark
#####HIVE_HOME#####
export HIVE_HOME=/home/hadoop/app/hive
export PATH=$HIVE_HOME/bin:$SPARK_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$JAVA_HOME/bin:$PATH
Sure enough: $HIVE_HOME/bin is listed before $SPARK_HOME/bin, so when beeline is invoked without an explicit path, Hive's beeline is found first.
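The PATH behaviour can be illustrated with a small sketch: the shell walks the PATH entries in order and runs the first directory that provides the command, so whichever of $HIVE_HOME/bin and $SPARK_HOME/bin appears first wins. (The `first_match` helper and the directory map below are illustrative only, mimicking `which`; they are not how the shell is implemented.)

```python
def first_match(command, path_dirs, available):
    """Return the first directory in PATH order that provides `command`,
    mimicking how the shell resolves an unqualified command name."""
    for d in path_dirs:
        if command in available.get(d, ()):
            return d
    return None

# PATH order taken from the .bash_profile above: hive/bin precedes spark/bin.
path_dirs = ["/home/hadoop/app/hive/bin", "/home/hadoop/app/spark/bin"]
available = {
    "/home/hadoop/app/hive/bin": {"beeline", "hive"},
    "/home/hadoop/app/spark/bin": {"beeline", "spark-shell"},
}
print(first_match("beeline", path_dirs, available))
# -> /home/hadoop/app/hive/bin : Hive's beeline shadows Spark's
```

Swapping the two entries in the PATH (or invoking `./bin/beeline` with an explicit path, as in section 3) avoids the shadowing.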