Task Scheduling
While Nodes are all about data, Tasks are for actions. The definition of the actual work to accomplish is entirely up to the user; the API only provides the mechanism that enables orchestration.
Tasks are essentially transitory objects. They first get created by sending a
message to a particular queue via the /task/schedule endpoint and they last
until they're done. During this time, they are stored internally in Redis and
may optionally be saved persistently in a Node object by the user.
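As a sketch, scheduling a task could boil down to one HTTP call; the payload shape and field names below are assumptions for illustration, not the documented schema:

```python
import json
import urllib.request

# Hypothetical payload for the /task/schedule endpoint: the queue name
# selects which Scheduler consumes the message, and the attributes
# describe how the task should be run (field names are assumed).
payload = {
    "queue": "gpu-workers",
    "attributes": {"priority": "high", "variant": "fast"},
}

def build_request(base_url: str) -> urllib.request.Request:
    """Build (but don't send) a POST request to the assumed endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/task/schedule",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("http://localhost:8000")
```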
On the receiving end of the message queues are the Schedulers: consumer services in charge of executing the tasks and completing them. As each Scheduler has particular characteristics, with access to specific runtime environments and different provisioning methods, it needs to be registered with the API using a dedicated queue. There can be multiple identical instances of a Scheduler waiting for messages on the same queue, as each message is delivered to only one of them. This is a typical design pattern in distributed systems that makes it easy to scale up or down dynamically.
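This competing-consumers pattern can be sketched with an in-process queue standing in for the message broker; each message is taken by exactly one of the identical consumer threads (the names here are illustrative):

```python
import queue
import threading

broker = queue.Queue()   # stands in for a message queue
results = []
lock = threading.Lock()

def scheduler_instance(name: str) -> None:
    """One of several identical Scheduler instances on the same queue."""
    while True:
        task = broker.get()
        if task is None:  # sentinel telling the worker to stop
            broker.task_done()
            return
        with lock:
            results.append((name, task))
        broker.task_done()

workers = [
    threading.Thread(target=scheduler_instance, args=(f"scheduler-{i}",))
    for i in range(3)
]
for w in workers:
    w.start()

# Publish ten task messages, then one stop sentinel per worker.
for task_id in range(10):
    broker.put(task_id)
for _ in workers:
    broker.put(None)
for w in workers:
    w.join()

# Each message was delivered to exactly one scheduler instance.
delivered = sorted(task for _, task in results)
```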
Object Model
Task objects follow a model defined by the API. Like Node objects, Tasks are read-only: their state changes only as they transition from a message queue to the in-memory database and finally get deleted. Any application data produced by the task should be stored in Node objects.
Let's take a closer look at each field of the Task model:
The task identifier is a classic UUID. That means it won't collide with identifiers generated by other API instances, and it remains agnostic of whatever database engine it may be stored in.
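Generating such an identifier takes one line with the standard library; a random (version 4) UUID is collision-safe across independent API instances without any coordination:

```python
import uuid

task_id = uuid.uuid4()  # random UUID, unique without coordination
canonical = str(task_id)  # 36-character string form, e.g. "xxxxxxxx-xxxx-..."
```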
Attributes are arbitrary, a bit like Node.data. They are set by the user creating the task to describe how it should be run. For example, a Scheduler receiving the task as a message could use its attributes to set the priority, provision a particular type of resource, or run a certain variant of the task.
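As a sketch, a Scheduler might derive its decisions from the attributes like this (the attribute keys are illustrative, not part of the model):

```python
# Hypothetical attributes set by the user when scheduling the task.
attributes = {"priority": 5, "gpu": True, "variant": "full-precision"}

def plan_execution(attrs: dict) -> dict:
    """Derive scheduling decisions from arbitrary task attributes."""
    return {
        # Higher number = more urgent; default to a low priority.
        "priority": attrs.get("priority", 0),
        # Provision a particular type of resource when requested.
        "resource": "gpu-node" if attrs.get("gpu") else "cpu-node",
        # Run a certain variant of the task.
        "variant": attrs.get("variant", "default"),
    }

plan = plan_execution(attributes)
```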
The task also records the name of the message queue where it was posted. In most cases, this won't be used by the Scheduler, since it's already listening on that queue. It's mostly useful for special cases, such as when a task failed to get accepted by a Scheduler, or after the fact, when the task is done and stored in a Node, to keep a trace of where it was run.
timeout: datetime = Field(
    default_factory=DefaultTimeout(hours=6).get_timeout,
    description="Task expiry timestamp",
)
If a task is still not done (i.e. neither aborted nor complete) when the timeout is reached, it gets removed automatically by the Timeout daemon running on the server side of the API.
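One way `DefaultTimeout` could be implemented (a sketch of the idea, not the actual class) is as a small factory whose `get_timeout` method returns a timestamp offset from the current time, evaluated at task-creation time:

```python
from datetime import datetime, timedelta, timezone

class DefaultTimeout:
    """Sketch of a default-timeout factory: expiry = now + offset."""

    def __init__(self, hours: int = 6) -> None:
        self.delta = timedelta(hours=hours)

    def get_timeout(self) -> datetime:
        # Evaluated when the task is created, so each task gets
        # its own expiry timestamp rather than a shared constant.
        return datetime.now(timezone.utc) + self.delta

expiry = DefaultTimeout(hours=6).get_timeout()
```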
Runtime Environments
Work in progress
The Scheduler services are in charge of receiving tasks from message queues and scheduling them in concrete runtime environments. Those are based on the Runtime abstract class; currently only Inline is available, which runs tasks directly as Python methods in the current process. Arbitrary runtimes can be implemented, such as Docker, Kubernetes, on-demand VMs and various kinds of hardware infrastructure.
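The Runtime abstraction could look roughly like this (a sketch; the real interface may differ), with Inline simply invoking the task in the current process:

```python
from abc import ABC, abstractmethod
from typing import Any, Callable

class Runtime(ABC):
    """Abstract runtime environment in which a Scheduler executes tasks."""

    @abstractmethod
    def run(self, task: Callable[..., Any], **attributes: Any) -> Any:
        """Execute the task; concrete subclasses decide how and where."""

class Inline(Runtime):
    """Run the task directly as a Python call in the current process."""

    def run(self, task: Callable[..., Any], **attributes: Any) -> Any:
        return task(**attributes)

# A Docker- or Kubernetes-based Runtime would instead package the task
# and submit it to the corresponding infrastructure.
result = Inline().run(lambda x: x * 2, x=21)
```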