Workflow Error Handling
Task Retries
When an error is raised from the workflow itself, the workflow execution will fail - it will end with failed status, and should have an error message under its error field. There is no built-in retry mechanism for the entire workflow.
However, there’s a retry mechanism for task execution within a workflow. Two types of errors can occur during task execution: Recoverable and NonRecoverable. By default, all errors originating from tasks are Recoverable. The maximum number of retries for workflow operations is 60, with retries occurring at 15-second intervals, for a maximum of 15 minutes.
If a NonRecoverable error occurs, the workflow execution will fail, in the same way as when an error is raised from the workflow itself.
If a Recoverable error occurs, the task execution might be attempted again from its start. This depends on the configuration of the task_retries and max_retries parameters, which determine how many retry attempts will be given by default to any failed task execution.
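The retry behavior described above can be sketched as a small loop. This is only an illustration, not the actual workflow engine: the two exception classes are local stand-ins defined here for the example, and run_task is a hypothetical helper. It assumes task_retries counts the attempts made after the first one.

```python
class RecoverableError(Exception):
    """Local stand-in for a Recoverable task error (defined for this sketch)."""


class NonRecoverableError(Exception):
    """Local stand-in for a NonRecoverable task error (defined for this sketch)."""


def run_task(task, task_retries):
    """Run `task`, retrying from its start after Recoverable errors.

    Hypothetical helper: returns (result, attempts_made).
    Assumes `task_retries` is the number of retries after the first attempt.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except NonRecoverableError:
            # Never retried: the failure propagates immediately.
            raise
        except Exception:
            # By default, every other task error is treated as Recoverable.
            if attempts > task_retries:
                raise  # retries exhausted


# Usage: a task that fails twice with a Recoverable error, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RecoverableError("temporary failure")
    return "ok"


result, attempts = run_task(flaky, task_retries=5)
print(result, attempts)
```

With task_retries=5 the flaky task succeeds on its third attempt. With task_retries=1 the same task would exhaust its retries and the error would propagate.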
The task_retries and max_retries parameters can be set in one of the following manners:
- If the operation max_retries parameter has been set for a certain operation, it will be used.
- task_retries, task_retry_interval and subgraph_retries can also all be set using the CLI (cfy config).
If the parameter is not set, it will default to the value of -1, which means maximum retries (i.e. 60).
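One way to read the rules above is as a small resolution function. The sketch below is an assumption-laden illustration, not the product's code: effective_retries is a hypothetical name, the precedence (operation-level max_retries first, then the global task_retries) is inferred from the list above, and capping explicit values at 60 is an assumption based on the stated maximum.

```python
MAX_RETRIES = 60  # maximum number of retries stated in the text
UNSET = -1        # default value, meaning "use the maximum"


def effective_retries(operation_max_retries=None, task_retries=UNSET):
    """Resolve how many retries a failed task gets (hypothetical helper).

    Assumed precedence: an operation-level max_retries wins when it is set;
    otherwise the global task_retries value applies.
    """
    if operation_max_retries is not None:
        chosen = operation_max_retries
    else:
        chosen = task_retries
    if chosen == UNSET:
        return MAX_RETRIES
    # Assumption: explicit values above the maximum are capped at it.
    return min(chosen, MAX_RETRIES)


print(effective_retries())                         # nothing set
print(effective_retries(task_retries=3))           # global value only
print(effective_retries(operation_max_retries=5,   # operation value wins
                        task_retries=3))
```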
In addition to the task_retries parameter, there’s also the task_retry_interval parameter, which determines the minimum amount of wait time (in seconds) after a task execution fails before it is retried. It can be set in the same way as task_retries and max_retries. If it isn’t set, it will default to the value of 15.
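The default values are where the 15-minute figure mentioned earlier comes from: 60 retries with 15 seconds between them. A minimal arithmetic sketch, using a hypothetical helper name and counting only the waits between attempts (not the time the task itself takes to run):

```python
def total_retry_wait_seconds(retries=60, task_retry_interval=15):
    """Total time spent waiting between attempts (hypothetical helper).

    task_retry_interval is a minimum wait, so this is a lower bound on
    the waiting time once all retries have been used.
    """
    return retries * task_retry_interval


total = total_retry_wait_seconds()
print(total, "seconds =", total / 60, "minutes")  # 900 seconds = 15.0 minutes
```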