Talend Open Studio: Scheduling and command line execution
In this tutorial we will look at how to export a Talend Open Studio ETL job to an autonomous folder and schedule it via crontab. To follow along, you should be familiar with the basic functionality of Talend Open Studio for Data Integration.
How to export a job
In the export settings:
- define the export folder and file name
- define the Job Version
- set the Export type to Autonomous Job
- tick Export dependencies
- define the Context and tick Apply to children
How to execute the job from the command line
The exported job is located in the following folder:
<jobname>_<version>/<jobname>
Within this folder you will find an executable shell script (<jobname>_run.sh) and/or a batch file (<jobname>_run.bat). Open this file in a text editor.
Note that the context is defined as a command line argument. It is currently set to the value you specified on export, but you can change it to another value here at any time.
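If you just want to check which context a launcher uses without opening it, a small helper like the one below can pull it out. This is only a sketch: it assumes the launcher passes the context as --context=<name> (for example --context=Default), and show_context is a made-up function name, not something Talend provides.

```shell
#!/bin/sh
# show_context FILE
# Prints the name of the context that the launcher script passes to the job.
# Assumption: the launcher contains an argument of the form --context=<name>.
show_context() {
    # -n suppresses normal output; the trailing p prints only lines where
    # the substitution matched, reduced to the captured context name.
    sed -n 's/.*--context=\([A-Za-z0-9_]*\).*/\1/p' "$1"
}
```

Call it as show_context ./<jobname>_run.sh from inside the job folder.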
To execute the job from the command line, navigate to this folder and run:
sh ./<jobname>_run.sh
How to execute a job with specific context variables
As you might have guessed, the approach is very similar to the one shown above; we just add command line arguments:
sh ./<jobname>_run.sh --context_param variable1=value1 --context_param variable2=value2
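If you pass many variables, typing --context_param over and over gets tedious. The following is a minimal sketch of a wrapper that adds the flag for you. run_job is a hypothetical helper name, and it assumes the launcher forwards its arguments to the job.

```shell
#!/bin/sh
# run_job SCRIPT name=value [name=value ...]
# Prefixes every name=value pair with --context_param and runs the launcher.
run_job() {
    script=$1
    shift
    # The word list of the for loop is expanded once up front, so we can
    # rebuild the positional parameters while iterating: drop the original
    # pair from the front and append it, prefixed with the flag, at the end.
    for pair in "$@"; do
        shift
        set -- "$@" --context_param "$pair"
    done
    sh "$script" "$@"
}
```

For example, run_job ./<jobname>_run.sh variable1=value1 variable2=value2 ends up calling the launcher with the same arguments as the command shown above.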
How to change the default context variables
If you ever need to change the value of any of your context variables, you can find the property file for each context in:
<jobname>_<version>/<jobname>/<projectname>/<jobname>_<version>/contexts/
Open one of them to understand how they are structured. It is extremely easy to change these values.
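You can also change a value from a script rather than by hand. The sketch below assumes the context files are plain property files with one name=value pair per line. set_context_value is a hypothetical helper, and variable1 is just the placeholder name used earlier.

```shell
#!/bin/sh
# set_context_value FILE KEY VALUE
# Replaces the value of KEY in a name=value property file and leaves
# every other line (including comments) untouched.
# Note: awk -v interprets backslash escapes in VALUE.
set_context_value() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "="; OFS = "=" }
        $1 == k { print k, v; next }   # rewrite the matching line
        { print }                      # copy all other lines as they are
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

A call such as set_context_value contexts/Default.properties variable1 newvalue would then update that one entry. The file name Default.properties is an assumption based on a context called Default.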
How to schedule a job
If you make use of context variables regularly, it is best to include them directly in the *_run.sh or *_run.bat file. Just open the file with your favourite text editor and add the variables after the context argument. Ideally though, especially if you are dealing with dates, you want to make this more dynamic, so that the value is worked out each time the job runs.
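As a rough illustration of the dynamic approach, the snippet below computes yesterday's date and builds the matching argument. load_date is a hypothetical context variable name. The -d "yesterday" form is GNU date syntax, so a BSD/macOS fallback is included.

```shell
#!/bin/sh
# Yesterday's date in YYYY-MM-DD format.
# GNU date understands -d "yesterday"; BSD/macOS date uses -v-1d instead.
LOAD_DATE=$(date -d "yesterday" +%Y-%m-%d 2>/dev/null || date -v-1d +%Y-%m-%d)

# Argument to place after the context argument in the launcher.
# load_date is a made-up variable name; use one defined in your job.
CONTEXT_ARGS="--context_param load_date=$LOAD_DATE"
echo "$CONTEXT_ARGS"
```

In the launcher itself you would put the LOAD_DATE line near the top and add --context_param load_date=$LOAD_DATE after the context argument.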
On Linux, use crontab to schedule a job:
crontab -e
Then add an entry that runs the job's *_run.sh file at the desired time.
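A crontab entry for a nightly run could look like the one built below. All paths and the job name myjob are hypothetical placeholders, and the schedule (every day at 02:00) is only an example. The script just prints the line so that you can paste it into the editor opened by crontab -e.

```shell
#!/bin/sh
# Hypothetical locations; adjust them to wherever you exported the job.
JOB_DIR="/opt/etl/myjob_0.1/myjob"
LOG_FILE="/var/log/etl/myjob.log"

# Crontab fields: minute hour day-of-month month day-of-week command
# "0 2 * * *" means every day at 02:00.
# The entry changes into the job folder first, then runs the launcher and
# appends both standard output and errors to the log file.
CRON_ENTRY="0 2 * * * cd $JOB_DIR && sh ./myjob_run.sh >> $LOG_FILE 2>&1"

echo "$CRON_ENTRY"
```

Redirecting the output to a log file is optional, but it makes it much easier to find out later why a scheduled run failed.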
On Windows you can use the Windows Task Scheduler. As it has a GUI, setting it up is quite straightforward and hence will not be explained here.