Pitfalls with Federated HDFS + beeline + HiveServer2

Published: 2024-12-12 Author: 千家信息网 editor

千家信息网, last updated 2024-12-12.

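Pitfalls encountered:

1. A Hive job moves data from a temporary directory into the data warehouse directory. By default Hive uses /tmp as the temporary directory, and users typically use /user/hive/warehouse/ as the warehouse directory. Under Federated HDFS, /tmp and /user are treated as two different mount points in the ViewFS mount table, so the Hive job ends up moving (renaming) data from one mount point to the other. Federated HDFS does not support renames across mount points, so the job fails.

Error message: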

ERROR : Failed with exception Unable to move source viewfs://cluster9/tmp/.hive-staging_hive_2015-07-29_12-34-11_306_6082682065011532871-5/-ext-10002 to destination viewfs://cluster9/user/hive/warehouse/tandem.db/cust_loss_alarm_unit
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source viewfs://cluster9/tmp/warehouse/.hive-staging_hive_2015-07-29_12-34-11_306_6082682065011532871-5/-ext-10002 to destination viewfs://cluster9/user/hive/warehouse/tandem.db/cust_loss_alarm_unit
    at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2521)
    at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144)
    at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
    at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Renames across Mount points not supported
    at org.apache.hadoop.fs.viewfs.ViewFileSystem.rename(ViewFileSystem.java:444)
    at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2509)
    ... 21 more

Relevant code:

org.apache.hadoop.fs.viewfs.ViewFileSystem

/**
// Alternate 1: renames within same file system - valid but we disallow
// Alternate 2: (as described in next para - valid but we have disallowed it
//
// Note we compare the URIs. the URIs include the link targets.
// hence we allow renames across mount links as long as the mount links
// point to the same target.
if (!resSrc.targetFileSystem.getUri().equals(
        resDst.targetFileSystem.getUri())) {
    throw new IOException("Renames across Mount points not supported");
}
*/

// Alternate 3 : renames ONLY within the same mount links.
if (resSrc.targetFileSystem != resDst.targetFileSystem) {
    throw new IOException("Renames across Mount points not supported");
}
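To make the rule concrete, here is a minimal sketch that imitates it without depending on Hadoop. The class MountTableSketch, its methods, and the namenode URIs hdfs://nn1 and hdfs://nn2 are all hypothetical. The sketch compares mount point names, whereas the real ViewFileSystem compares the resolved target file system objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a ViewFS mount table; not Hadoop's actual implementation. */
class MountTableSketch {
    // Mount point -> target URI. Both sides are invented for illustration.
    private final Map<String, String> links = new LinkedHashMap<>();

    void addLink(String mountPoint, String targetUri) {
        links.put(mountPoint, targetUri);
    }

    /** Returns the longest mount point that is a path prefix of the given path, or null. */
    String resolveMountPoint(String path) {
        String best = null;
        for (String mp : links.keySet()) {
            String prefix = mp.endsWith("/") ? mp : mp + "/";
            boolean matches = path.equals(mp) || path.startsWith(prefix);
            if (matches && (best == null || mp.length() > best.length())) {
                best = mp;
            }
        }
        return best;
    }

    /** Imitates the "Alternate 3" rule: a rename is allowed only within one mount link. */
    boolean canRename(String src, String dst) {
        String srcMount = resolveMountPoint(src);
        String dstMount = resolveMountPoint(dst);
        return srcMount != null && srcMount.equals(dstMount);
    }

    public static void main(String[] args) {
        MountTableSketch table = new MountTableSketch();
        table.addLink("/tmp", "hdfs://nn1/tmp");   // assumed namenode URIs
        table.addLink("/user", "hdfs://nn2/user");

        // Default staging dir under /tmp, warehouse under /user: rejected.
        System.out.println(table.canRename(
                "/tmp/.hive-staging_x", "/user/hive/warehouse/t"));
        // Staging dir moved under the warehouse mount point: allowed.
        System.out.println(table.canRename(
                "/user/hive/warehouse/staging/.hive-staging_x", "/user/hive/warehouse/t"));
    }
}
```

Under these assumptions, a staging directory below /tmp fails the check, while one below /user/hive/warehouse passes it. That is why a fix has to place the staging and warehouse directories under the same mount point.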

Workaround:

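a. Create the directory /user/hive/warehouse/staging in HDFS and give it 777 permissions, then add the following configuration:

<property>
  <name>hive.exec.stagingdir</name>
  <value>/user/hive/warehouse/staging/.hive-staging</value>
</property>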

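b. Create only a single mount point, e.g. /cluser, create /tmp, /user and the other directories under that mount point, and finally change the default values of the Hive-related directories to match.

2. When a query returns a very large result set, the beeline client hangs or runs out of memory.

Error message: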

org.apache.thrift.TException: Error in calling method FetchResults
    at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1271)
    at com.sun.proxy.$Proxy0.FetchResults(Unknown Source)
    at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:363)
    at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:42)
    at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1756)
    at org.apache.hive.beeline.Commands.execute(Commands.java:806)
    at org.apache.hive.beeline.Commands.sql(Commands.java:665)
    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:974)
    at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:810)
    at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:767)
    at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480)
    at org.apache.hive.beeline.BeeLine.main(BeeLine.java:463)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.lang.Double.valueOf(Double.java:521)

Workaround:

Reading the source shows that beeline fetches a result set in one of two modes: incremental mode or buffered mode.

org.apache.hive.beeline.BeeLine

int print(ResultSet rs) throws SQLException {
    String format = getOpts().getOutputFormat();
    OutputFormat f = (OutputFormat) formats.get(format);
    if (f == null) {
        error(loc("unknown-format", new Object[] {
            format, formats.keySet()}));
        f = new TableOutputFormat(this);
    }
    Rows rows;
    if (getOpts().getIncremental()) {
        rows = new IncrementalRows(this, rs); // incremental mode
    } else {
        rows = new BufferedRows(this, rs); // buffered mode
    }
    return f.print(rows);
}

org.apache.hive.beeline.BeeLineOpts

private boolean incremental = false; // buffered mode is the default

However, beeline --help does not list any option for this:

beeline --help
Usage: java org.apache.hive.cli.beeline.BeeLine
   -u <database url>               the JDBC URL to connect to
   -n <username>                   the username to connect as
   -p <password>                   the password to connect as
   -d <driver class>               the driver class to use
   -i <init file>                  script file for initialization
   -e <query>                      query that should be executed
   -f <exec file>                  script file that should be executed
   -w (or) --password-file <password file>  the password file to read password from
   --hiveconf property=value       Use value for given property
   --hivevar name=value            hive variable name and value
                                   This is Hive specific settings in which variables
                                   can be set at session level and referenced in Hive
                                   commands or queries.
   --color=[true/false]            control whether color is used for display
   --showHeader=[true/false]       show column names in query results
   --headerInterval=ROWS;          the interval between which heades are displayed
   --fastConnect=[true/false]      skip building table/column list for tab-completion
   --autoCommit=[true/false]       enable/disable automatic transaction commit
   --verbose=[true/false]          show verbose error messages and debug info
   --showWarnings=[true/false]     display connection warnings
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]        format numbers using DecimalFormat pattern
   --force=[true/false]            continue running script even after errors
   --maxWidth=MAXWIDTH             the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH    the maximum width to use when displaying columns
   --silent=[true/false]           be more silent
   --autosave=[true/false]         automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                   Note that csv, and tsv are deprecated - use csv2, tsv2 instead
   --truncateTable=[true/false]    truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER     specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL               set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of printing null as empty string
   --help                          display this message
Beeline version 1.1.0-cdh6.4.3 by Apache Hive

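That does not matter, though. Passing --incremental=true on the command line still puts beeline into incremental mode:

beeline -u jdbc:hive2://10.17.28.173:10000 -n xxxx -pxxxx --incremental=true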
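The sketch below is a simplified illustration, not beeline's code, of why the two modes differ so much in memory use. The class RowModesSketch and its methods are hypothetical. It assumes, as the BufferedRows.<init> frame in the stack trace suggests, that buffered mode reads every row before printing any of them.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;

/** Hypothetical illustration of buffered vs. incremental row handling. */
class RowModesSketch {
    /** Simulated result set that produces `total` rows lazily, like a cursor. */
    static Iterator<String> rows(final int total) {
        return new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < total;
            }

            @Override
            public String next() {
                if (next >= total) {
                    throw new NoSuchElementException();
                }
                return "row-" + next++;
            }
        };
    }

    /** Buffered mode: collect every row first, then print. Returns the peak number of rows held. */
    static int printBuffered(Iterator<String> rs, Consumer<String> out) {
        List<String> buffer = new ArrayList<>();
        while (rs.hasNext()) {
            buffer.add(rs.next()); // memory grows with the size of the result set
        }
        for (String row : buffer) {
            out.accept(row);
        }
        return buffer.size();
    }

    /** Incremental mode: print each row as it is fetched. Holds at most one row at a time. */
    static int printIncremental(Iterator<String> rs, Consumer<String> out) {
        int peak = 0;
        while (rs.hasNext()) {
            String row = rs.next();
            peak = 1;
            out.accept(row);
        }
        return peak;
    }

    public static void main(String[] args) {
        Consumer<String> discard = row -> { };
        System.out.println("buffered peak rows:    " + printBuffered(rows(100_000), discard));
        System.out.println("incremental peak rows: " + printIncremental(rows(100_000), discard));
    }
}
```

In this toy model the buffered variant holds as many rows as the query returns, while the incremental variant holds one row at a time. That is consistent with the OutOfMemoryError occurring only for very large result sets.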
