Contents

  1. Introduction
  2. Definitions
  3. Queues
  4. Tasks
  5. Workflows
  6. Scheduled workflows

Introduction

evQueue is a task queueing engine and a fast and scalable job scheduler. It can handle simple tasks or tasks combination (called “Workflows”).

The engine is entirely based on kernel events. There is no active wait in any parts of the process. When chaining tasks, the next task starts when the kernel signals the end of the preceding process. This allows very low overhead and effective CPU management.

The web control interface allows simple monitoring of tasks and workflows and help system administrators keeping a trace of what happens on a platform. User management system also provides a simple way to restrict rights so you can give some users the rights to launch defined tasks when they need.

Beside web interface, evQueue also provides a full XML based API which allows launching and state gathering of workflows. This can be used by web processes to launch time taking jobs (eventually on remote machine) and monitoring status asynchronously (with AJAX for example).

Definitions

Queues

Queues are used to manage tasks concurrency. Each task must be associated to a queue within a workflow. The same task can be associated to different queues in separate workflows. When a task is executed, the counter of running tasks of the queue gets incremented. When the task is terminated, the counter gets decremented. The running tasks counter cannot be higher than the maximum concurrency, specified when creating the queue.

When the running tasks counter is below the maximum concurrency, tasks in the queue are executed immediately. If the maximum concurrency is reached, tasks are queued (not executed). They get executed when one or more tasks are terminated.

Queues are global, they do not depend of a workflow or workflow instance. The tasks are placed in the same queue, even if they are launched from different workflows. This allows a good control of CPU occupation and can prevent from overloading the system.

Example

Let's take the example of pictures resizing, which is a quite CPU intensive task. Suppose we have a queue named “photo_resize” with a maximum concurrency of 4. The resize workflow contains one (or more) tasks to resize pictures that are bound to this queue. No matter how many workflow instances are launched, the number of concurrent photo_resize will never exceed 4.

Utility of queues

Create a queue

To create a queue, go to “Settings → Queues”. Use the blue “+” at the right of the title bar. Enter queue name and maximum concurrency and click “submit”.

Note that you have to restart evQueue engine after creating or editing a queue.

Tasks

A task is the smallest unit of control in evQueue. A task is mainly composed of a name, used to call the task in a workflow, and a path to a binary file that will be executed. When a task is used in a workflow, it must be bound to a queue. The same task can be bound to different queues within different workflows.

A task can be executed remotely using SSH. In this case, you have to give a user and host for remote execution. The path to the binary is then relative to the destination machine. Specifying a remote host in the task definition means that the task will always be executed on this host, regardless of the workflow execution host (task definition has the highest priority).

Parameters can be passed as command line arguments (the traditional way of passing parameters) or by environment variables. Environment variables are very useful when the number of arguments begins to increase. Is is much more easy to read and maintain. On the other hand, system binaries (like “ls” or “bzip”) only accept command line arguments.

Task output will be imported in the workflow instance. Two output formats are supported: text or XML. It is only possible to use the result of task with an XML output. The result can be used to create loops, conditions, or as arguments for sub-tasks. Text output is generally used with system binaries.

Workflows

A workflow is a set of tasks that are executed in a defined order. This order is controlled by specific control structures used when defining workflow. The workflow itself is described by an XML structure, which is also used by evQueue for its internal representation.

The workflow contains the description of the tasks that must be executed. It can also contain loops, conditions or variables. When the workflow is launched, a workflow instance is created. In this instance, all variables gets replaced by their values and tasks are expended with their result.

The workflow instance is a well formed XML. A task declaring XML output will be considered failed if its XML output is not well formed (regardless of its return value). For text output, evQueue will create an “output” node to encapsulated the task output.

A task can refer to any of the values returned by its ancestors by using XPath expressions. These expressions can be used in any of the control structures that are explained below. These values can be used as input parameters or in loops and conditions.

Control structures

In order to create a workflow, you have to understand how control structures works:

Workflow instances

Workflow instances (executing or terminated) can be found in “System state → Workflow instances”. Here you can see which workflows are currently executing as well as terminated workflows. You can also see the output and status of tasks (executing, terminated, skiped...).

Scheduled workflows

A scheduled workflow is a workflow that will be launched by evQueue at given time(s). The execution of their workflows are also event driven. There is no active polling to determine when to launch a workflow. Instead, the evQueue engine will compute at which time the next execution should be made and ask the kernel to be woken up at this time. This allows a very small resolution for task launching. The smallest amount of time is set to the second for practical reasons, although it could be reduced to about 100ms.

A schedule will never be triggered again while it is still running. That is, if you plan every minute a task running for 60 minutes, it will be launched 1 minute after the previous execution's ending (61 minutes in this case).

Scheduled workflow status

The status of scheduled workflows is summarized is “System state → Scheduled workflows”. Here you can see all scheduled workflows with their last execution time and status, and their next execution time. The "eye" can be used to see all the instances that have been launched for this scheduled workflow.