Oozie shell workflow - shell, hadoop, hdfs, oozie, oozie-coordinator

I am trying to write a simple shell action in Oozie that copies files from a remote machine to HDFS, but I am getting an error.

Here is my workflow.xml:

<workflow-app name="WorkFlowCopyLocalTohdfs" xmlns="uri:oozie:workflow:0.1">
    <start to="sshAction"/>
    <action name="sshAction">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>/user/root5/Oozie/Workflow/WorkFlowCopyLocalTohdfs/uploadFile.sh</exec>
            <file>/user/root5/Oozie/Workflow/WorkFlowCopyLocalTohdfs/uploadFile.sh#uploadFile.sh</file>
            <capture-output/>
        </shell>
        <ok to="end" />
        <error to="killAction"/>
    </action>
    <kill name="killAction">
        <message>"Killed job due to error"</message>
    </kill>
    <end name="end"/>
</workflow-app>

My uploadFile.sh is:

#!/bin/bash -e

hadoop fs -copyFromLocal /home/root5/Desktop/Avinash_sampleData/DataFolder/Data_04-05-2016 /user/root5/Oozie/DataFolder

My job.properties is:

nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default

oozie.libpath=${nameNode}/user/root/oozie-workflows/lib
oozie.use.system.libpath=true
oozie.wf.rerun.failnodes=true

oozieProjectRoot=${nameNode}/user/root5/Oozie
appPath=${oozieProjectRoot}/Workflow/WorkFlowCopyLocalTohdfs
oozie.wf.application.path=${appPath}

#inputDir=${oozieProjectRoot}/data
focusNodeLogin=root@localhost

And the trace from the Oozie log is:

2016-05-04 16:09:36,023  INFO ActionStartXCommand:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@:start:] Start action [0000012-160425173341619-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-05-04 16:09:36,023  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@:start:] [***0000012-160425173341619-oozie-oozi-W@:start:***]Action status=DONE
2016-05-04 16:09:36,023  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@:start:] [***0000012-160425173341619-oozie-oozi-W@:start:***]Action updated in DB!
2016-05-04 16:09:36,209  INFO ActionStartXCommand:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] Start action [0000012-160425173341619-oozie-oozi-W@sshAction] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-05-04 16:09:36,353  WARN ShellActionExecutor:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] credentials is null for the action
2016-05-04 16:09:37,441  INFO ShellActionExecutor:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] checking action, external ID [job_201604251732_0160] status [RUNNING]
2016-05-04 16:09:37,544  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] [***0000012-160425173341619-oozie-oozi-W@sshAction***]Action status=RUNNING
2016-05-04 16:09:37,544  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] [***0000012-160425173341619-oozie-oozi-W@sshAction***]Action updated in DB!
2016-05-04 16:09:53,082  INFO CallbackServlet:539 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] callback for action [0000012-160425173341619-oozie-oozi-W@sshAction]
2016-05-04 16:09:53,317  INFO ShellActionExecutor:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] action completed, external ID [job_201604251732_0160]
2016-05-04 16:09:53,346  WARN ShellActionExecutor:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
2016-05-04 16:09:53,576  INFO ActionEndXCommand:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@sshAction] ERROR is considered as FAILED for SLA
2016-05-04 16:09:53,754  INFO ActionStartXCommand:539 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@killAction] Start action [0000012-160425173341619-oozie-oozi-W@killAction] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-05-04 16:09:53,755  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@killAction] [***0000012-160425173341619-oozie-oozi-W@killAction***]Action status=DONE
2016-05-04 16:09:53,755  WARN ActionStartXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[0000012-160425173341619-oozie-oozi-W@killAction] [***0000012-160425173341619-oozie-oozi-W@killAction***]Action updated in DB!
2016-05-04 16:09:53,943  WARN CoordActionUpdateXCommand:542 - USER[root5] GROUP[-] TOKEN[] APP[WorkFlowCopyLocalTohdfs] JOB[0000012-160425173341619-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100

Please help me figure out how to proceed.

Hive-workflow.xml
<workflow-app name="WorkFlowCopyLocalTohdfs" xmlns="uri:oozie:workflow:0.1">
    <start to="hive-node"/>
    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <job-xml>hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>default</value>
                </property>
                <property>
                    <name>oozie.hive.defaults</name>
                    <value>hive-site.xml</value>
                </property>
            </configuration>
            <script>Hive_script.hql</script>
        </hive>
        <ok to="end"/>
        <error to="killAction"/>
    </action>
    <kill name="killAction">
        <message>"Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]"</message>
    </kill>
    <end name="end"/>
</workflow-app>
And the Hive_script.hql
.# LOAD DATA inpath "/user/root5/Oozie/DataFolder/Data_04_05_2016.txt" INTO TABLE OOZIE_TABLE1;

Answers:

0 for answer № 1

I think you are running into a fundamental problem here. Once a workflow has been submitted, you never know on which node of the cluster Oozie will execute it, so a path that exists on your local machine may not exist where the action actually runs. That is why you should never refer to the local file system in Oozie.

What you can do instead:

  • manually put the file into an HDFS path
  • reference that path in the workflow
  • let Oozie copy the file from there
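
The first step above could be sketched roughly as follows. This is only an illustration, not a tested fix: the two paths are taken from the question, while the DRY_RUN switch and the run and stage_to_hdfs helpers are hypothetical names introduced here so the commands can be previewed without a cluster. It also assumes the hdfs command and the -p option of -mkdir are available, which may not hold on older Hadoop releases.

```shell
#!/bin/bash
# Stage a local file into HDFS by hand, before submitting the workflow.
# Paths come from the question; DRY_RUN, run and stage_to_hdfs are
# hypothetical helpers added for this sketch.

LOCAL_SRC="/home/root5/Desktop/Avinash_sampleData/DataFolder/Data_04-05-2016"
HDFS_DEST="/user/root5/Oozie/DataFolder"

# In dry-run mode (the default here) only print the command;
# with DRY_RUN=0 actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the target directory, then upload the local file into it.
stage_to_hdfs() {
    run hdfs dfs -mkdir -p "$HDFS_DEST" &&
    run hdfs dfs -put "$LOCAL_SRC" "$HDFS_DEST"
}

stage_to_hdfs
```

Run it with DRY_RUN=0 on the machine that actually holds the local file. After that, the workflow only needs to refer to the HDFS path.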

Also make sure you are using the right hadoop shell commands for the version you have installed. I am used to something like hdfs dfs -put, but you may be working with a different version.
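
One way to cope with differing versions is to check which command is present before using it. The sketch below assumes that a machine without the hdfs command still provides the older hadoop fs front end. The fs_cmd function name is made up for this example.

```shell
#!/bin/bash
# Pick an HDFS shell front end that exists on this machine.
# fs_cmd is a hypothetical helper: it prefers "hdfs dfs" and
# falls back to "hadoop fs" when the hdfs command is not on the PATH.
fs_cmd() {
    if command -v hdfs >/dev/null 2>&1; then
        echo "hdfs dfs"
    else
        echo "hadoop fs"
    fi
}

FS="$(fs_cmd)"
echo "Using: $FS"
# Example (not executed here): $FS -put <local file> <HDFS directory>
```

Running hadoop version on the same machine shows which release is installed, which helps when checking the matching command reference.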