
Hive on Spark EXPLAIN statement PDF Download


Date: 2020-03-28 18:11  Source: http://www.java1234.com  Author: 小锋

Download (compiled by this site):
Extraction code: l2nu
 
 
 
Main content:
Hive on Spark EXPLAIN statement

In Hive, the EXPLAIN command can be used to show the execution plan of a query; the language manual has lots of good information. For Hive on Spark, the command itself is not changed and behaves the same as before: it still shows the dependency graph and the plan for each stage. However, if the query engine (hive.execution.engine) is set to "spark", it shows the execution plan produced by the Spark query engine instead of the default ("mr") MapReduce query engine.

Dependency Graph

The dependency graph shows the dependency relationships among stages. For Hive on Spark there are Spark stages instead of MapReduce stages; other stages, for example the Move stage and the StatsAggr stage, are unchanged. For most queries there is just one Spark stage, since many map and reduce works can be done in one Spark work. Therefore, for the same query, Hive on Spark may produce fewer stages. Some queries produce multiple Spark stages, for example queries with map joins or skew joins.

One thing to point out is that a stage here means a Hive stage, which is very different from the stage concept in Spark. A Hive stage can correspond to multiple stages in Spark: in Spark, a stage usually means a group of tasks that can be processed in one executor, while in Hive a stage contains a list of operations that can be processed in one job.

Spark Stage Plan

Besides the dependency graph, the EXPLAIN command shows the plan for each stage. For Hive on Spark, the Spark stage is new; it replaces the MapReduce stage used by Hive on MapReduce. The Spark stage shows the Spark work graph, which is a DAG (directed acyclic graph). It contains:

● DAG name: the name of the Spark work DAG;
● Edges: the dependency relationships among the works in this DAG;
● Vertices: the operator tree of each work.

For each individual operator tree there is no change under Hive on Spark; the difference is in the dependency graph. With MapReduce you cannot have a reducer without a mapper; with Spark that is not a problem, so Hive on Spark can optimize the plan and get rid of mappers that are not needed. The edge information is new for Hive on Spark; there is no such information for MapReduce. Different edge types indicate different shuffle requirements. For example,
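To make the engine switch concrete, here is a minimal sketch of how the plan would be requested. SET hive.execution.engine and EXPLAIN are standard HiveQL; the table and query below are hypothetical stand-ins, not taken from the original document.

    -- Switch the session's query engine from the default ("mr") to Spark,
    -- then ask for the execution plan of a simple aggregation query.
    -- `employees` is a hypothetical table used only for illustration.
    SET hive.execution.engine=spark;

    EXPLAIN
    SELECT dept, COUNT(*) AS cnt
    FROM employees
    GROUP BY dept;

For a query like this, the Spark stage plan printed by EXPLAIN has roughly the shape sketched below. This is an abridged, illustrative outline only; the exact stage numbers, work names, edge types, and operator trees depend on the query and the Hive version.

    STAGE DEPENDENCIES:
      Stage-1 is a root stage
      Stage-0 depends on stages: Stage-1

    STAGE PLANS:
      Stage: Stage-1
        Spark
          Edges:
            Reducer 2 <- Map 1 (GROUP, 2)    -- edge type and parallelism
          Vertices:
            Map 1
                Map Operator Tree: ...       -- table scan, map-side aggregation
            Reducer 2
                Reduce Operator Tree: ...    -- final aggregation, file sink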

 