Writing a Workflow with FireWorks Workflow Management System

CommandLineTask and FileTransferTask Explanation

FireWorks Architecture

FireWorks is a widely used workflow management system for running workflows in high-performance computing environments. Although its source code is thoroughly documented, there is very little information on its new workflow definition format. For a newcomer to FireWorks, figuring out this format can be overwhelming because no schema for it is readily available. This article is based on my own experience and understanding of the FireWorks workflow definition format, in the hope that it will help beginners get up to speed with FireWorks quickly. The workflow used in the example is taken from https://gitlab.com/ikondov/gridka-school-fireworks, and I would like to thank its creators for the list of exercises available there.

Please note that this article assumes a basic understanding of FireWorks and of workflows in general.

Input Image used in the Workflow

Workflow

The above workflow file swirls an image of the letter ‘A’ by different angles and creates a GIF animation from the resulting images. Although it looks overwhelming at first, it's quite an easy task once you get the hang of it. We will learn two types of firetasks in this article: FileTransferTask and CommandLineTask. Let's go over the fireworks in the workflow one by one.

FileTransferTask

As you may already know, each firework has its own fw_id, name, and spec. Each firework contains one or more firetasks, which are listed under the _tasks section of the spec. The type of a firetask is given by the _fw_name key of its entry in Firework.spec._tasks. FileTransferTask firetasks are used for file manipulation. If you define a FileTransferTask, you have to set three parameters: mode (the operation to perform), files (the source files), and dest (the destination).
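As a minimal sketch, the firework below is written as a plain Python dictionary that mirrors the structure of a workflow file and is then printed as JSON. The fw_id, name, and file paths are illustrative assumptions, not values taken from the original workflow.

```python
import json

# One firework containing a single FileTransferTask.
# The name and the paths are placeholders chosen for illustration.
copy_firework = {
    "fw_id": 1,
    "name": "Copy the input image",
    "spec": {
        "_tasks": [
            {
                "_fw_name": "FileTransferTask",    # type of the firetask
                "mode": "copy",                    # operation to perform
                "files": ["/path/to/letter.png"],  # source files
                "dest": "/tmp",                    # destination directory
            }
        ]
    },
}

print(json.dumps(copy_firework, indent=2))
```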

A FileTransferTask whose mode is copy copies the files listed in the files array to the destination given in dest. You can try out other modes as well, such as move and copytree. The operation behind each mode is described in the official Python documentation for shutil at https://docs.python.org/3/library/shutil.html
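To get a feel for what the modes do, the sketch below calls the shutil functions directly inside a temporary directory. It assumes that each mode name corresponds to the shutil function of the same name; the transfer helper and the file names are hypothetical and are not part of FireWorks.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed mapping from mode names to shutil functions (illustration only).
MODES = {
    "copy": shutil.copy,
    "move": shutil.move,
    "copytree": shutil.copytree,
}

def transfer(mode, files, dest):
    """Apply the shutil function selected by `mode` to each source path."""
    for src in files:
        MODES[mode](src, dest)

# Try it out in a throwaway directory.
workdir = Path(tempfile.mkdtemp())
src = workdir / "letter.png"
src.write_bytes(b"fake image data")
dest = workdir / "out"
dest.mkdir()

# copy: the source stays in place and a copy appears in dest.
transfer("copy", [str(src)], str(dest))
copied = src.exists() and (dest / "letter.png").exists()

# move: the file disappears from its old location.
transfer("move", [str(dest / "letter.png")], str(workdir / "moved.png"))
moved = (workdir / "moved.png").exists() and not (dest / "letter.png").exists()

print(copied, moved)
```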

CommandLineTask

CommandLineTasks are used to define shell commands. They are the most important type of task when running a workflow in a high-performance computing environment. In fact, you can write a whole workflow using only CommandLineTasks, since every task we perform can be expressed as a shell command. The main parameters of a CommandLineTask are command_spec, inputs, and outputs.

We can define labels for the files or variables used by the command in the command_spec. Optionally, we can label the input and output files of the firetasks in the spec and refer to those labels in the command_spec as well.

Here, I have labeled the files and variables using 3 methods.

  1. Using the {source: letter} format: for this to work, the letter variable must already be labeled in the spec, as shown in the gist: letter: {type: path, value: /tmp/letter.png}.
  2. Using the source: {type: data, value: 90} format: here the variable is defined directly in the command_spec. The type data marks it as a variable, and it is assigned the value 90.
  3. Using the target: {type: path, value: /tmp} format: this is used to define output files. The type path shows that this is a file, and the value /tmp gives its location.

In addition, you can use the {binding: {}} option when labeling the files.
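The three labeling methods can be sketched together as follows, again as a Python dictionary mirroring the workflow file. The label names "input image", "angle", and "swirled image" are hypothetical; only the letter label, the command, and the values 90 and /tmp come from the text above.

```python
import json

spec = {
    # A file labeled in the spec so that tasks can refer to it by name.
    "letter": {"type": "path", "value": "/tmp/letter.png"},
    "_tasks": [
        {
            "_fw_name": "CommandLineTask",
            "command_spec": {
                "command": ["convert", "-swirl"],
                # Method 1: a string source refers to a label defined in the spec.
                "input image": {"source": "letter"},
                # Method 2: the value is defined inline as data.
                "angle": {"source": {"type": "data", "value": 90}},
                # Method 3: an output file located under /tmp.
                "swirled image": {"target": {"type": "path", "value": "/tmp"}},
            },
            # Labels from command_spec used as inputs and outputs of the task.
            "inputs": ["angle", "input image"],
            "outputs": ["swirled image"],
        }
    ],
}

print(json.dumps(spec, indent=2))
```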

Then we can use the labeled variables as the inputs and outputs of our CommandLineTask.

Please note that the labels of the output files created in a firework can be used directly by the fireworks below it without defining them again.

Finally, the command entry in command_spec holds the shell command to run. At execution time, the command given in the gist, command: [convert, -swirl], is combined with the labeled inputs and outputs, giving a shell command along the lines of convert -swirl 90 followed by the input file (letter) and the output file, which is created in the /tmp location.
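The way the pieces fit together can be illustrated with a small sketch. This is a simplified illustration of the idea, not FireWorks' actual implementation: the resolve and build_argv helpers, the label names, and the output file name /tmp/swirled.png are all hypothetical.

```python
def resolve(label_def, spec):
    """Return the value behind a label's source, following a reference into the spec."""
    source = label_def["source"]
    if isinstance(source, str):  # e.g. {"source": "letter"}
        source = spec[source]
    return str(source["value"])

def build_argv(command_spec, inputs, spec, output_path):
    """Append the resolved input values and the output path to the base command."""
    argv = list(command_spec["command"])
    argv += [resolve(command_spec[label], spec) for label in inputs]
    argv.append(output_path)
    return argv

spec = {"letter": {"type": "path", "value": "/tmp/letter.png"}}
command_spec = {
    "command": ["convert", "-swirl"],
    "angle": {"source": {"type": "data", "value": 90}},
    "input image": {"source": "letter"},
}

argv = build_argv(command_spec, ["angle", "input image"], spec, "/tmp/swirled.png")
print(" ".join(argv))
# prints: convert -swirl 90 /tmp/letter.png /tmp/swirled.png
```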

Final Output

Now we can combine multiple fireworks built from these two types of firetasks, as in the workflow given above, to create a nice GIF.
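As a rough sketch of how several fireworks are tied together, the code below builds one copy firework and one swirl firework per angle, and connects them through a links section that maps a parent fw_id to its children. The angles, names, labels, and paths are assumptions made for illustration, and the sketch covers only the copy and swirl steps, not the step that assembles the GIF.

```python
import json

def swirl_firework(fw_id, angle):
    """Build a firework that swirls the labeled input image by `angle` degrees."""
    return {
        "fw_id": fw_id,
        "name": f"Swirl by {angle} degrees",
        "spec": {
            "letter": {"type": "path", "value": "/tmp/letter.png"},
            "_tasks": [{
                "_fw_name": "CommandLineTask",
                "command_spec": {
                    "command": ["convert", "-swirl"],
                    "angle": {"source": {"type": "data", "value": angle}},
                    "input image": {"source": "letter"},
                    "swirled image": {"target": {"type": "path", "value": "/tmp"}},
                },
                "inputs": ["angle", "input image"],
                "outputs": ["swirled image"],
            }],
        },
    }

copy_firework = {
    "fw_id": 1,
    "name": "Copy the input image",
    "spec": {"_tasks": [{
        "_fw_name": "FileTransferTask",
        "mode": "copy",
        "files": ["/path/to/letter.png"],
        "dest": "/tmp",
    }]},
}

angles = [90, 180, 270]  # hypothetical swirl angles
swirls = [swirl_firework(fw_id, angle) for fw_id, angle in enumerate(angles, start=2)]

workflow = {
    "name": "Swirl the letter A",
    "fws": [copy_firework] + swirls,
    # Every swirl firework runs after the copy firework (fw_id 1).
    "links": {"1": [fw["fw_id"] for fw in swirls]},
    "metadata": {},
}

print(json.dumps(workflow, indent=2))
```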

Output Animation created from the Workflow

Software Engineering Intern @WSO2 @CERN| GSoC Participant | Undergrad @UOM | Computer Science and Engineering