What's Wrong With Hue Oozie Editor?

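Original source of this article: http://blog.csdn.net/bluishglc/article/details/47021019 Reproduction in any form is strictly prohibited; otherwise CSDN will be asked to enforce the author's rights.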

First, let’s make the topic clear:

Compared with writing raw Oozie workflow/coordinator XML files, what are the disadvantages of creating workflows/coordinators with the Hue Oozie Editor? (The Hue Oozie Editor discussed in this article is the version shipped with HDP 2.2.4.)

Without a deep understanding of the Hue Oozie Editor, everybody likes it at first glance. Why not? It is easy to use, what you see is what you get, and who wants to write ugly XML by hand?

But the truth is that the Hue Oozie Editor is not as good as it looks. It is far from being a stable and powerful tool for creating and managing workflows.

Here are the problems:

  1. As the core source code of a workflow, the raw XML files should be kept under version control. The Hue Oozie Editor has little or no version control capability.
  2. If we maintain the raw XML files in a project, a build tool can fill in the environment-related parameters (namenode, input/output data locations and so on), so we can easily build the project for the dev, test or production environment.

By contrast, what happens with the Hue Oozie Editor? Congratulations! You get to do the same job twice: re-create the workflows/coordinators manually on the production cluster. There is an import/export feature in the Hue Oozie Editor, but it only covers workflows, not coordinators, and even for workflows you still have to change all environment-related parameters by hand.
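
To illustrate the per-environment build described above, here is a minimal Python sketch that stands in for the build tool and renders a job.properties file for each environment. The environment names, hosts, ports, paths and the function name are all hypothetical placeholders, not values from a real cluster.

```python
# Hypothetical per-environment settings; every host, port and path is a placeholder.
ENVIRONMENTS = {
    "dev": {
        "nameNode": "hdfs://dev-namenode:8020",
        "jobTracker": "dev-resourcemanager:8050",
        "inputDir": "/data/dev/input",
        "outputDir": "/data/dev/output",
    },
    "prod": {
        "nameNode": "hdfs://prod-namenode:8020",
        "jobTracker": "prod-resourcemanager:8050",
        "inputDir": "/data/prod/input",
        "outputDir": "/data/prod/output",
    },
}


def render_job_properties(env, app_path="/apps/demo-workflow"):
    """Return the text of a job.properties file for the given environment."""
    params = ENVIRONMENTS[env]
    lines = [f"{name}={value}" for name, value in sorted(params.items())]
    # The literal ${nameNode} is left for Oozie to resolve from the property above.
    lines.append("oozie.wf.application.path=${nameNode}" + app_path)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_job_properties("prod"), end="")
```

The workflow XML itself stays identical across environments and lives in version control; only the generated properties file differs.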

  3. It does not support some advanced features, so we have to edit the raw XML anyway. For example, you cannot put an expression such as ${coord:current(-1)} between input-events and a dataset; you can only map them directly.
  4. It cannot import or export coordinators.

Well, at least, could we import our raw workflow file into the Hue Oozie Editor?

Let’s look at how weak the current Hue Oozie Editor is:

  1. Schema versions: the Hue Oozie Editor supports workflow schemas no higher than 0.4 and hive-action schemas no higher than 0.2; otherwise you cannot import your raw file.
  2. It is hard to believe, but the property names jobTracker and nameNode are HARD-CODED! If you do not use these two property names, again, you cannot import your raw file.
  3. Some fields accept embedded parameters, e.g. ${nameNode}/data/${year}/${month}, but some do not. Which ones do and which do not? You have to try them one by one; otherwise you still cannot import your raw file.
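
The first two restrictions can be checked before attempting an import. The following is a rough Python sketch, not part of Hue or Oozie: it assumes the usual Oozie namespace form `uri:oozie:<kind>:<major>.<minor>`, and it reads restriction 2 as "the job-tracker and name-node elements must contain exactly ${jobTracker} and ${nameNode}", which is my interpretation. The function name and the sample workflow are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Upper limits reported above for the Hue Oozie Editor in HDP 2.2.4.
MAX_VERSIONS = {"workflow": (0, 4), "hive-action": (0, 2)}
NS_PATTERN = re.compile(r"^\{uri:oozie:(workflow|hive-action):(\d+)\.(\d+)\}")
# Assumed reading of the hard-coded property names.
EXPECTED_VALUES = {"job-tracker": "${jobTracker}", "name-node": "${nameNode}"}


def check_importable(workflow_xml):
    """Return a list of problems likely to block an import into the editor."""
    problems = []
    root = ET.fromstring(workflow_xml)
    seen = set()
    for elem in root.iter():
        if not isinstance(elem.tag, str):
            continue
        match = NS_PATTERN.match(elem.tag)
        if match:
            seen.add((match.group(1), int(match.group(2)), int(match.group(3))))
        local_name = elem.tag.rsplit("}", 1)[-1]
        expected = EXPECTED_VALUES.get(local_name)
        if expected is not None and (elem.text or "").strip() != expected:
            problems.append(f"<{local_name}> should be {expected}")
    for kind, major, minor in sorted(seen):
        limit = MAX_VERSIONS[kind]
        if (major, minor) > limit:
            problems.append(
                f"{kind} schema {major}.{minor} is above {limit[0]}.{limit[1]}"
            )
    return problems


SAMPLE = """
<workflow-app xmlns="uri:oozie:workflow:0.5" name="demo">
  <start to="hive-node"/>
  <action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${resourceManager}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>query.hql</script>
    </hive>
    <ok to="end"/>
    <error to="end"/>
  </action>
  <end name="end"/>
</workflow-app>
"""

if __name__ == "__main__":
    for problem in check_importable(SAMPLE):
        print(problem)
```

Restriction 3 is not covered, because the article gives no list of which fields accept embedded parameters.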

Nobody hates UI design tools, but they have to be good enough. For now, I would say that building workflows on top of the Hue Oozie Editor is unwise.

Obviously, we should choose raw XML files over the Hue Oozie Editor.

There is only one small inconvenience: Hue can only start a workflow/coordinator that was created in the Hue Oozie Editor. Note: once a workflow/coordinator has been started, you can monitor and stop it from Hue even if it is defined by raw XML.

First, I don’t think this is a real problem: we can start a workflow/coordinator from the command line. Remember that a workflow/coordinator is normally a long-running background service, and we rarely start or stop it. So the command line is enough for operation and maintenance.
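
As a small sketch of the command-line route, the Python snippet below assembles the usual `oozie job ... -run` invocation. The server URL and the properties file name are placeholders, and the helper name is hypothetical; the snippet only prints the command rather than running it.

```python
import shlex


def build_oozie_run_command(oozie_url, properties_file):
    """Build the argv list for submitting and starting a job with the Oozie CLI."""
    return ["oozie", "job", "-oozie", oozie_url, "-config", properties_file, "-run"]


if __name__ == "__main__":
    # Placeholder URL and file name; adjust to the target cluster.
    cmd = build_oozie_run_command("http://oozie-host:11000/oozie", "job.properties")
    print(shlex.join(cmd))
    # To actually submit, something like subprocess.run(cmd, check=True)
    # could be used on a machine where the oozie client is installed.
```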

Besides the command line, you can also start a workflow/coordinator remotely via the Oozie RESTful API.
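
A remote start could look roughly like the sketch below. As far as I understand the Oozie Web Services API, a job is created by POSTing an XML configuration to `/v1/jobs`, and `action=start` asks Oozie to start it immediately; please verify this against the documentation for your Oozie version. The host, user name and application path are placeholders, and the helper names are hypothetical. The code only builds the request and does not send it.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_config_xml(props):
    """Serialize job properties as a Hadoop-style <configuration> document."""
    items = "".join(
        f"<property><name>{escape(name)}</name><value>{escape(value)}</value></property>"
        for name, value in props.items()
    )
    return f'<?xml version="1.0" encoding="UTF-8"?><configuration>{items}</configuration>'


def build_start_request(oozie_url, props):
    """Build (but do not send) a POST request that submits and starts a job."""
    return urllib.request.Request(
        url=oozie_url.rstrip("/") + "/v1/jobs?action=start",
        data=build_config_xml(props).encode("utf-8"),
        headers={"Content-Type": "application/xml;charset=UTF-8"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; a coordinator would use oozie.coord.application.path instead.
    request = build_start_request(
        "http://oozie-host:11000/oozie",
        {
            "user.name": "etl",
            "oozie.wf.application.path": "hdfs://prod-namenode:8020/apps/demo-workflow",
        },
    )
    print(request.get_method(), request.full_url)
    # Sending would be: urllib.request.urlopen(request)
```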

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
