导航：首页 > 互联网科技 >

0010-Hive多分隔符支持示例

发表于：2024-11-24 作者：千家信息网编辑

千家信息网最后更新 2024年11月24日，温馨提示：要看高清无码套图，请使用手机打开并单击图片放大查看。1.问题描述如何将多个字符作为字段分割符的数据文件加载到Hive表中，事例数据如下：字段分隔符为"@#$"test1@#$test1nam

千家信息网最后更新 2024年11月24日0010-Hive多分隔符支持示例

温馨提示：要看高清无码套图，请使用手机打开并单击图片放大查看。

1.问题描述

如何将多个字符作为字段分割符的数据文件加载到Hive表中，事例数据如下：

字段分隔符为"@#$"

test1@#$test1name@#$test2valuetest2@#$test2name@#$test2valuetest3@#$test3name@#$test4value

如何将上述事例数据加载到Hive表(multi_delimiter_test)中，表结构如下：

字段名	字段类型
s1	String
s2	String
s3	String

2.Hive多分隔符支持

Hive在0.14及以后版本支持字段的多分隔符，参考https://cwiki.apache.org/confluence/display/Hive/MultiDelimitSerDe

3.实现方式

测试环境说明
测试环境为CDH5.11.1Hive版本为1.1.0操作系统为RedHat6.5
操作步骤

1.准备多分隔符文件并装载到HDFS对应目录

[ec2-user@ip-172-31-8-141  ~]$ cat multi_delimiter_test.dattest1@#$test1name@#$test2valuetest2@#$test2name@#$test2valuetest3@#$test3name@#$test4value  [ec2-user@ip-172-31-8-141  ~]$ hadoop dfs -put multi_delimiter_test.dat /fayson/multi_delimiter_test[ec2-user@ip-172-31-8-141  ~]$ hadoop dfs -ls /fayson/multi_delimiter_testDEPRECATED: Use of this  script to execute hdfs command is deprecated.Instead use the hdfs  command for it.Found 1 items-rw-r--r--   3 user_r supergroup         93 2017-08-23 03:24  /fayson/multi_delimiter_test/multi_delimiter_test.dat[ec2-user@ip-172-31-8-141  ~]$

2.基于准备好的多分隔符文件建表

create  external table multi_delimiter_test(s1 string,s2 string,s3 string)ROW FORMAT  SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH  SERDEPROPERTIES ("field.delim"="@#$")stored as  textfile location '/fayson/multi_delimiter_test';

3.测试

2: jdbc:hive2://localhost:10000/default>  select * from multi_delimiter_test;+--------------------------+--------------------------+--------------------------+--+|  multi_delimiter_test.s1  |  multi_delimiter_test.s2  |  multi_delimiter_test.s3  |+--------------------------+--------------------------+--------------------------+--+| test1                    | test1name                | test2value               || test2                    | test2name                | test2value               || test3                    | test3name                | test4value               |+--------------------------+--------------------------+--------------------------+--+

2:  jdbc:hive2://localhost:10000/default> select count(*) from  multi_delimiter_test;INFO  : Ended Job = job_1503469952834_0006INFO  : MapReduce Jobs Launched:INFO  : Stage-Stage-1: Map: 1  Reduce: 1    Cumulative CPU: 3.25 sec   HDFS  Read: 6755 HDFS Write: 2 SUCCESSINFO  : Total MapReduce CPU Time Spent: 3 seconds  250 msecINFO  : Completed executing command(queryId=hive_20170823041818_ce58aae2-e6db-4eed-b6af-652235a6e66a);  Time taken: 33.286 secondsINFO  : OK+------+--+| _c0  |+------+--+| 3    |+------+--+1 row selected (33.679  seconds)2:  jdbc:hive2://localhost:10000/def

4.常见问题

1.执行count查询时报错

异常日志

通过beeline执行count查询时报错

2:  jdbc:hive2://localhost:10000/default> select count(*) from  multi_delimiter_test;INFO  : Compiling  command(queryId=hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97):  select count(*) from multi_delimiter_testINFO  : Semantic Analysis CompletedINFO  : Returning Hive schema:  Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)],  properties:null)INFO  : Completed compiling  command(queryId=hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97);  Time taken: 0.291 secondsINFO  : Executing  command(queryId=hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97):  select count(*) from multi_delimiter_testINFO  : Query ID =  hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97INFO  : Total jobs = 1INFO  : Launching Job 1 out of 1INFO  : Starting task [Stage-1:MAPRED] in serial  modeINFO  : Number of reduce tasks determined at  compile time: 1INFO  : In order to change the average load for a  reducer (in bytes):INFO  :    set hive.exec.reducers.bytes.per.reducer=INFO  : In order to limit the maximum number of  reducers:INFO  :    set hive.exec.reducers.max=INFO  : In order to set a constant number of  reducers:INFO  :    set mapreduce.job.reduces=INFO  : number of splits:1INFO  : Submitting tokens for job:  job_1503469952834_0002INFO  : Kind: HDFS_DELEGATION_TOKEN, Service:  ha-hdfs:nameservice1, Ident: (token for hive: HDFS_DELEGATION_TOKEN  owner=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM,  renewer=yarn, realUser=, issueDate=1503475160778, maxDate=1504079960778,  sequenceNumber=27, masterKeyId=9)INFO  : The url to track the job:  http://ip-172-31-9-186.ap-southeast-1.compute.internal:8088/proxy/application_1503469952834_0002/INFO  : Starting Job = job_1503469952834_0002,  Tracking URL =  http://ip-172-31-9-186.ap-southeast-1.compute.internal:8088/proxy/application_1503469952834_0002/INFO  : Kill Command =  /opt/cloudera/parcels/CDH-5.10.2-1.cdh6.10.2.p0.5/lib/hadoop/bin/hadoop  job  -kill job_1503469952834_0002INFO  : Hadoop job information for Stage-1:  number of mappers: 1; number of reducers: 1INFO  : 2017-08-23 03:59:32,039 Stage-1 map =  0%,  reduce = 0%INFO  : 2017-08-23 04:00:08,106 Stage-1 map =  100%,  reduce = 100%ERROR : Ended Job =  job_1503469952834_0002 with errorsERROR : FAILED: Execution  Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTaskINFO  : MapReduce Jobs Launched:INFO  : Stage-Stage-1: Map: 1  Reduce: 1    HDFS Read: 0 HDFS Write: 0 FAILINFO  : Total MapReduce CPU Time Spent: 0 msecINFO  : Completed executing  command(queryId=hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97);  Time taken: 48.737 secondsError: Error while  processing statement: FAILED: Execution Error, return code 2 from  org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)

使用Hive的shell操作报错如下

Error:  java.lang.RuntimeException: Error in configuring object        at  org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)        at  org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)        at  org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)        at  org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)        at  java.security.AccessController.doPrivileged(Native Method)        at  javax.security.auth.Subject.doAs(Subject.java:415)        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)        at  org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)Caused by:  java.lang.reflect.InvocationTargetException        at  sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)        at  sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)        at  sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)        at  java.lang.reflect.Method.invoke(Method.java:606)        at  org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)        ... 9 moreCaused by:  java.lang.RuntimeException: Error in configuring object        at  org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)        at  org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)        at  org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)        at  org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)        ... 14 moreCaused by:  java.lang.reflect.InvocationTargetException        at  sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)        at  sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)        at  sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)        at  java.lang.reflect.Method.invoke(Method.java:606)        at  org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)        ... 17 moreCaused by:  java.lang.RuntimeException: Map operator initialization failed        at  org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:147)        ... 22 moreCaused by:  org.apache.hadoop.hive.ql.metadata.HiveException:  java.lang.ClassNotFoundException: Class  org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found        at  org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:323)        at  org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:333)        at  org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:116)        ... 22 moreCaused by:  java.lang.ClassNotFoundException: Class  org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found        at  org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2105)        at  org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:140)        at  org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:297)        ... 24 moreFAILED: Execution Error,  return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTaskMapReduce Jobs Launched:Stage-Stage-1: Map:  1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAILTotal MapReduce CPU Time  Spent: 0 ms

问题原因分析

org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe类是hive-contrib.jar包里。

在执行非聚合类操作查询时，sql能正常执行，在进行聚合类函数操作时报错，说明在执行MapReduce任务时缺少jar依赖包；MapReduce属于yarn作业，所以yarn运行环境缺少hive-contrib.jar的依赖包。

解决方法

在CDH集群的所有节点一下操作，将hive-contrib-1.1.0-cdh6.10.2.jar包拷贝到yarn的lib目录下

sudo scp -r  /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib-1.1.0-cdh6.10.2.jar  /opt/cloudera/parcels/CDH/lib/hadoop-yarn/lib/

重新运行count语句，执行成功

醉酒鞭名马，少年多浮夸！岭南浣溪沙，呕吐酒肆下！挚友不肯放，数据玩的花！
温馨提示：要看高清无码套图，请使用手机打开并单击图片放大查看。

很赞哦！

分隔符字段数据文件时报环境问题查询测试支持温馨事例图片手机版本目录高清无码准备提示数据库的安全要保护哪些东西数据库安全各自的含义是什么生产安全数据库录入数据库的安全性及管理数据库安全策略包含哪些海淀数据库安全审计系统建立农村房屋安全信息数据库易用的数据库客户端支持安全管理连接数据库失败ssl安全错误数据库的锁怎样保障安全公安网络安全保密遵守魔兽世界怀旧服各服务器战斗网络技术工程师工资数据库技术基础实验报告 ncre网络安全技术软件开发兴趣爱好及特长互联网科技感名片徐汇区上门软件开发厂家售后保障 unity 游戏服务器雅马哈373音乐数据库怎么用三国志战略版服务器满了进不去数据库purge是什么抵赖行为是网络安全威胁吗兰州博瑞四季软件开发有限公司谷歌手机服务器怎么删除云服务器学习网络安全教育内容讲座商务数据库如何连编项目方舟搭服务器主机网络安全等级保护定级流程三级网络技术怎么准备滴滴打车软件开发的代码网游服务器延时 erp软件开发合同正式免费版网络安全红蓝对抗总结软件开发方法可以独立开发数据库分区存储过程每天执行设计一个学号数据库代码服务器桌面有两个启动项鹤壁市网络安全宣传周

千家信息网

千家信息网

0010-Hive多分隔符支持示例

Laravel 5中怎么实现多个视图共享数据

RocketMQ延迟消息的实现方法

相关文章