
Example analysis of local MR in apache-hive-1.2.1


This article walks through an example of local MR in apache-hive-1.2.1. It is quite practical, so it is shared here as a reference; I hope you get something useful out of it.

Many of the SQL statements run in Hive are small: little data and little computation. Executing such small queries in the fully distributed way is not worth it, because the SQL itself may take only about 10 s to run, while the other steps of generating and launching the distributed job can take around 1 min. For such small jobs Hive can use local MR, i.e. local execution: the input data is pulled back to the client and the job runs there.

Whether local MR is used is decided by three parameters:

hive.exec.mode.local.auto=true : whether automatic local MR mode is enabled

hive.exec.mode.local.auto.input.files.max=4 : maximum number of input files; the default is 4

hive.exec.mode.local.auto.inputbytes.max=134217728 : maximum total size of the input files; the default is 128 MB

Note:

hive.exec.mode.local.auto is the precondition: only when it is set to true can local MR mode be enabled at all.

hive.exec.mode.local.auto.input.files.max and hive.exec.mode.local.auto.inputbytes.max are combined with AND: a query is run as local MR only when both conditions are satisfied.
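Putting the three switches together, a minimal session-level sketch looks like this (the property names and values are exactly the ones listed above; t_2, the 2-file test table used below, is small enough to qualify):

set hive.exec.mode.local.auto=true;                        ==> precondition: allow local MR at all
set hive.exec.mode.local.auto.input.files.max=4;           ==> at most 4 input files
set hive.exec.mode.local.auto.inputbytes.max=134217728;    ==> at most 128 MB of input
select * from t_2 order by id;                             ==> both limits hold, so Hive picks local MR

The same properties can also be set once for all sessions in hive-site.xml instead of per session.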

The two test tables:

t_1 ==> 5 files

t_2 ==> 2 files
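(To verify the file counts yourself, the files backing each table can be listed from the Hive CLI; the path below is only an assumption, the actual location depends on hive.metastore.warehouse.dir:)

dfs -ls /user/hive/warehouse/t_1;    ==> should list 5 files
dfs -ls /user/hive/warehouse/t_2;    ==> should list 2 files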

hive> set hive.exec.mode.local.auto=false
hive> select * from t_2 order by id;
Query ID = hadoop_20160125132157_d767beb0-f674-4962-ac3c-8fbdd2949d01
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Starting Job = job_1453706740954_0006, Tracking URL = http://hftest0001.webex.com:8088/proxy/application_1453706740954_0006/
Kill Command = /home/hadoop/hadoop-2.7.1/bin/hadoop job  -kill job_1453706740954_0006
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-01-25 13:22:19,210 Stage-1 map = 0%,  reduce = 0%
2016-01-25 13:22:26,497 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.47 sec
2016-01-25 13:22:40,207 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.68 sec
MapReduce Total cumulative CPU time: 3 seconds 680 msec
Ended Job = job_1453706740954_0006
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.68 sec   HDFS Read: 5465 HDFS Write: 32 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 680 msec
OK
......

hive> set hive.exec.mode.local.auto=true
hive> select * from t_2 order by id;
Automatically selecting local only mode for query                 ==> local mode is selected
Query ID = hadoop_20160125132322_9649b904-ad87-47fa-89ad-5e5f67315ac8
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Job running in-process (local Hadoop)
2016-01-25 13:23:27,192 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_local1850780899_0002
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 1464 HDFS Write: 1618252652 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
......

hive> set hive.exec.mode.local.auto=true
hive> select * from t_1 order by id;
Query ID = hadoop_20160125132411_3ecd7ee9-8ccb-4bcc-8582-6d797c13babd
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Cannot run job locally: Number of Input Files (= 5) is larger than hive.exec.mode.local.auto.input.files.max(= 4)    ==> 5 > 4, so the distributed mode is still used
Starting Job = job_1453706740954_0007, Tracking URL = http://hftest0001.webex.com:8088/proxy/application_1453706740954_0007/
Kill Command = /home/hadoop/hadoop-2.7.1/bin/hadoop job  -kill job_1453706740954_0007
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-01-25 13:24:38,775 Stage-1 map = 0%,  reduce = 0%
2016-01-25 13:24:52,115 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.55 sec
2016-01-25 13:24:59,548 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.84 sec
MapReduce Total cumulative CPU time: 3 seconds 840 msec
Ended Job = job_1453706740954_0007
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.84 sec   HDFS Read: 5814 HDFS Write: 56 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 840 msec
OK
......

hive> set hive.exec.mode.local.auto=true
hive> set hive.exec.mode.local.auto.input.files.max=5;             ==> raise the input-file limit to 5
hive> select * from t_1 order by id;
Automatically selecting local only mode for query                  ==> local mode is selected
Query ID = hadoop_20160125132558_db2f4fca-f6bf-4b91-9569-c779a3b13386
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Job running in-process (local Hadoop)
2016-01-25 13:26:03,232 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_local264155444_0003
MapReduce Jobs Launched:
Stage-Stage-1:  HDFS Read: 1920 HDFS Write: 1887961792 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
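The session above only exercises hive.exec.mode.local.auto.input.files.max; the byte threshold is checked the same way (AND relation). An analogous experiment, not taken from the logs above, could look like this (the 1024-byte limit is an illustrative value, assuming t_1's input is larger than 1 KB):

set hive.exec.mode.local.auto=true;
set hive.exec.mode.local.auto.inputbytes.max=1024;         ==> illustrative limit of 1 KB
select * from t_1 order by id;                             ==> input bytes exceed the limit, so the job goes distributed again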

关于"apache-hive-1.2.1中local mr的示例分析"这篇文章就分享到这里了,希望以上内容可以对大家有一定的帮助,使各位可以学到更多知识,如果觉得文章不错,请把它分享出去让更多的人看到。
