Now that you know how to create basic workflows, let's try something more serious. Two of the most useful advanced concepts are loops and conditions.
Here we'll explain how to create dynamic loops, based on a task output. Here is our scenario:
We have a directory containing .jpg files and want to resize all of them. We could write everything in one big monolithic script, but we would have less visibility into each step. Moreover, the script might not be reusable for anything else.
Instead, we will create two basic tasks, one that lists files, and one that resizes a single image. This way we have two elementary tasks that can be used anywhere else.
The workflow can be downloaded here: resize_jpg.xml.
The first task of the workflow will take a directory as input and list all '.jpg' files in the directory. It will then output an XML string listing all the files.
Let's have a look at this simple shell script:
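#!/bin/bash
# List every .jpg file of the directory given as first argument, as XML.
echo "<files>"
for file in "$1"/*.jpg
do
    # Skip the unexpanded pattern when the directory contains no .jpg file.
    [ -e "$file" ] || continue
    echo "<file>$(basename "$file")</file>"
done
echo "</files>"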
It takes a directory path as its first parameter and generates an XML output listing all the .jpg files found. Here is a sample execution:
And the associated output:
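As an illustration, here is a sketch of such a run. The listing logic is wrapped in a function (named list_jpg here, a name chosen for this example) so that the snippet is self-contained, and the temporary directory and file names are hypothetical:

```shell
# Function version of the listing script, with quoting added.
list_jpg() {
    echo "<files>"
    for file in "$1"/*.jpg
    do
        [ -e "$file" ] || continue
        echo "<file>$(basename "$file")</file>"
    done
    echo "</files>"
}

# Hypothetical test directory: three empty .jpg files and one unrelated file.
dir=$(mktemp -d)
touch "$dir/cat.jpg" "$dir/dog.jpg" "$dir/tree.jpg" "$dir/notes.txt"

list_jpg "$dir"
# Prints:
# <files>
# <file>cat.jpg</file>
# <file>dog.jpg</file>
# <file>tree.jpg</file>
# </files>

# Remove the temporary directory.
rm -rf "$dir"
```

The unrelated notes.txt file is ignored because only names matching *.jpg are listed.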
The second task we need is a resizing tool. For this purpose we will use the ImageMagick tool and call its command line directly. Ensure that you have the package installed:
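One possible way to check is sketched below. It only tests whether a "convert" binary can be found. The install command in the comment assumes a Debian or Ubuntu system and is not taken from the original text:

```shell
# Report whether the ImageMagick "convert" binary can be found in the PATH.
if command -v convert >/dev/null 2>&1; then
    status="installed"
else
    # Assumption: on Debian or Ubuntu the package can be installed with
    #   sudo apt-get install imagemagick
    status="missing"
fi
echo "ImageMagick convert: $status"
```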
Below is a simple command line call to resize an image:
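A minimal sketch of such a call follows. The file names are hypothetical, and the resize_cmd helper exists only in this example to print the command before running it. The -resize 100x100 option fits the image inside a 100x100 pixel box while preserving its aspect ratio:

```shell
# Print the resize command for a source file and a destination file.
resize_cmd() {
    printf 'convert %s -resize 100x100 %s\n' "$1" "$2"
}

# Hypothetical file names; replace them with your own.
src="photo.jpg"
dst="small-photo.jpg"

resize_cmd "$src" "$dst"
# Prints: convert photo.jpg -resize 100x100 small-photo.jpg

# Run the real command only when ImageMagick and the source file are both present.
if command -v convert >/dev/null 2>&1 && [ -f "$src" ]; then
    convert "$src" -resize 100x100 "$dst"
fi
```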
Creating the workflow
Ok, so now let's put it all together. Create a new workflow, name it "resize_jpg" and add one parameter named "directory".
Create a new task in the empty job and change its type from "binary" to "script".
In the "Name" parameter, enter a name for your task. We'll use "find_jpg".
Edit the script source and paste the content of the shell script we wrote above.
Close the script editor and go to the 'Inputs' tab. Add an input (name it as you like) and add one 'XPath value' part, pointing to the workflow parameter 'directory'.
Go to the "Output" tab and change "Output method" to XML.
This is mandatory if you want to use loops or conditions as they are based on the XPath language that only supports XML.
We're done with this task. Close the dialog and add a new job below this one. Create a new task in this job.
Edit the task and set its path to: /usr/bin/convert. Close the dialog.
Now it is time to set the loop up. What we want to do is read the previous task's output and execute the resize step for each file found in the previous step. Open the job (the task container) by clicking on the grey title bar above the convert task. Go to the 'Conditions & loops' tab.
Open the XPath helper next to the 'Loop' input.
In the 'Choose task' box, select the 'find_jpg' task under the 'Parent job 1' (i.e. your first ancestor) group. In the 'Choose output node', type 'files/file'. This part depends on the XML format of the previous task (find_jpg). If you change this format, you'll have to adapt this input.
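To picture what 'files/file' selects, the sketch below emulates the selection outside evQueue. It is not how evQueue evaluates XPath: it uses a simple sed expression that only works for the one-element-per-line layout produced by our script, and the sample XML is hypothetical. Each matching node corresponds to one loop iteration:

```shell
# Hypothetical output of the find_jpg task.
xml='<files>
<file>cat.jpg</file>
<file>dog.jpg</file>
</files>'

# Extract the text of each <file> element, one per line.
names=$(printf '%s\n' "$xml" | sed -n 's|^<file>\(.*\)</file>$|\1|p')

# One iteration per selected node.
count=0
while IFS= read -r name
do
    count=$((count + 1))
    echo "iteration $count: $name"
done <<EOF
$names
EOF
# Prints:
# iteration 1: cat.jpg
# iteration 2: dog.jpg
```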
Close the dialog with "Select this node". Close the job edition dialog. You can see that the workflow graph now indicates that the job has a loop.
Note the icon on the job, indicating it contains a loop.
Edit the convert task again and go to the 'Inputs' tab. This time we need several inputs, exactly four:
- The source file
- The resize flag
- The new size
- The destination file
Create the first input (name it 'source') and add one part of type 'XPath value'. Choose the 'directory' parameter of the workflow.
Add a new part, of type 'Simple text' and type '/'.
Add the last part, of type 'XPath value' and in the 'Choose task' field, select 'Loop context'. This will point to the filename of the current loop iteration.
Add the next input (name it flag_resize), of type text: '-resize'.
Add the next input (name it size), of type text: '100x100'.
The last input is the destination file (name it destination). It is essentially the same as the source input, but the middle text part ('/') will be '/small-'.
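The four inputs are passed to convert as four command line arguments. The sketch below shows how they are assembled for one loop iteration. The build_convert_args function and the sample values are hypothetical and only mimic the concatenation of input parts described above:

```shell
# Print the four arguments of one iteration, one per line.
# $1 is the 'directory' workflow parameter, $2 the file name of the current iteration.
build_convert_args() {
    directory=$1
    file=$2
    # source, flag_resize, size, destination
    printf '%s\n' "$directory/$file" "-resize" "100x100" "$directory/small-$file"
}

build_convert_args /tmp/pictures cat.jpg
# Prints:
# /tmp/pictures/cat.jpg
# -resize
# 100x100
# /tmp/pictures/small-cat.jpg
```

This is equivalent to running /usr/bin/convert /tmp/pictures/cat.jpg -resize 100x100 /tmp/pictures/small-cat.jpg for that iteration.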
That's it! Exit the edition interface (don't forget to save your changes).
Launch your brand new workflow and point it to some directory with some jpg files. Once terminated, you should see something like this:
As you can see, although our workflow only has 2 tasks (find_jpg and convert), the resulting instance has 4 tasks (find_jpg and 3 converts). The target directory now also contains the resized jpg files.
Using the loop concept, we now have a fully generic workflow that can resize jpg files in any local directory. Another advantage is that the resize tasks are now automatically parallelized by evQueue. This parallelization can be controlled through specific queues, which will be covered later.