A central concern of a workflow engine is the handling of tasks that fail during batch processing.
Luigi does not provide a predefined strategy for exception handling; where handling is possible, it has to be
implemented individually by the user. However, Luigi offers a mechanism for so-called 'idempotent' tasks, which are
run only once for a given set of parameters. As a result, Luigi skips tasks that are already complete when a pipeline
is run a second time after encountering an unresolvable dependency (i.e. a predecessor task that failed during execution).
A more detailed strategy for exception handling is given in the following sections.
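The skipping behaviour described above can be illustrated with a minimal sketch in plain Python. It is not Luigi's API: `FileTask`, `build` and `demo` are hypothetical names, and the only idea borrowed from Luigi is that a task counts as complete once its output exists.

```python
import tempfile
from pathlib import Path


class FileTask:
    """Hypothetical task that counts as complete once its output file exists."""

    def __init__(self, name, out_dir, requires=(), fail=False):
        self.name = name
        self.output = Path(out_dir) / f"{name}.done"
        self.requires = list(requires)
        self.fail = fail  # set to True to simulate a failing task

    def complete(self):
        return self.output.exists()

    def run(self):
        if self.fail:
            raise RuntimeError(f"{self.name} failed")
        self.output.write_text("ok")


def build(task, log):
    """Run dependencies first and skip every task that is already complete."""
    if task.complete():
        log.append(f"skip {task.name}")
        return
    for dep in task.requires:
        build(dep, log)
    task.run()
    log.append(f"run {task.name}")


def demo():
    """Run a two-task pipeline twice; the first run fails in the second task."""
    with tempfile.TemporaryDirectory() as out_dir:
        extract = FileTask("extract", out_dir)
        transform = FileTask("transform", out_dir, requires=[extract], fail=True)
        first = []
        try:
            build(transform, first)
        except RuntimeError:
            first.append("failed transform")
        transform.fail = False  # assume the cause of the failure was fixed
        second = []
        build(transform, second)
        return first, second
```

In the second run, `extract` is skipped because its output already exists, and only `transform` is executed again.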
Exception: Workflow Engine
Description:
Exceptions that occur in the workflow engine itself (e.g. after a power failure, ...)
Approaches:
ID | Process | Exception Handling |
---|---|---|
1 | None. | None. |
2 | Tasks are stored within a database and updated whenever a task terminates. | While starting up Luigi, check for open pipeline tasks and re-run them. |
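
Approach 2 could look roughly like the following sketch. It assumes an SQLite database as the task store; the table layout, the status values (`open`, `done`, `failed`) and all function names are hypothetical and are not provided by Luigi.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the hypothetical task store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "task_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn


def register(conn, task_id):
    """Record a task as open before it starts."""
    with conn:  # commits on success
        conn.execute(
            "INSERT OR IGNORE INTO tasks (task_id, status) VALUES (?, 'open')",
            (task_id,),
        )


def mark_terminated(conn, task_id, succeeded):
    """Update the stored status whenever a task terminates."""
    status = "done" if succeeded else "failed"
    with conn:
        conn.execute(
            "UPDATE tasks SET status = ? WHERE task_id = ?", (status, task_id)
        )


def open_tasks(conn):
    """Return the ids of all tasks that never reported termination."""
    rows = conn.execute(
        "SELECT task_id FROM tasks WHERE status = 'open' ORDER BY task_id"
    )
    return [row[0] for row in rows]


def recover(conn, rerun):
    """On start-up, re-run every open task and store the outcome."""
    recovered = []
    for task_id in open_tasks(conn):
        succeeded = rerun(task_id)  # rerun is a caller-supplied callable
        mark_terminated(conn, task_id, succeeded)
        recovered.append(task_id)
    return recovered
```

A task that was registered but still has the status `open` after a restart was interrupted (e.g. by a power failure), so `recover` would be called once during start-up, with `rerun` triggering the corresponding pipeline task again.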
A successful task (with a Docker container as target) can be defined in different ways:
- A result file with a specified format exists (con: additional effort for data handling; error-prone due to file formatting).
- The exit code of the Docker container is equal to 0.
The second approach seems more suitable due to its simplicity.
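The exit-code check can be sketched as follows. The helper name `container_succeeded` is an assumption; the sketch relies on the fact that `docker run` exits with the exit code of the container's command, so the check reduces to inspecting the return code of the child process.

```python
import subprocess
import sys


def container_succeeded(command):
    """Run a command and report success if and only if its exit code is 0."""
    result = subprocess.run(command)
    return result.returncode == 0


def demo():
    """Use the Python interpreter as a stand-in for a container run."""
    ok = container_succeeded([sys.executable, "-c", "raise SystemExit(0)"])
    failed = container_succeeded([sys.executable, "-c", "raise SystemExit(3)"])
    return ok, failed
```

For a real container, the command would be something like `["docker", "run", "--rm", "my-algorithm:latest"]`, where the image name is a placeholder.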
Exception: Algorithm
Description:
Exceptions that occur inside algorithm containers (e.g. invalid parameters, unreachable database ...)
Approaches:
ID | Process | Exception Handling |
---|---|---|
1 | None. | None. |
2 | An algorithm continuously saves results, which are deleted if an exception is thrown. If an exception occurs: clean up results, terminate with an error. Else: terminate without an error. | When an exception is encountered, clean up inconsistent results and terminate the container with an error. |
3 | An algorithm persistently saves either all results or none. If an exception occurs: terminate with an error. Else: save results, terminate without an error. | Results are only saved if no exception was thrown. Each algorithm is responsible for saving consistent results. |
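
Approaches 2 and 3 from the table above can be contrasted in a short sketch. Everything here is an assumption made for illustration: an algorithm is modelled as a list of callables that each return a string, results are plain text files, and the returned integer stands for the exit code of the container.

```python
import os
import tempfile
from pathlib import Path


def run_with_cleanup(steps, out_dir):
    """Approach 2: save results continuously, delete them on an exception."""
    written = []
    try:
        for index, step in enumerate(steps):
            result = step()  # may raise
            path = Path(out_dir) / f"part-{index}.txt"
            written.append(path)
            path.write_text(result)
    except Exception:
        for path in written:
            path.unlink(missing_ok=True)  # clean up inconsistent results
        return 1  # terminate with an error
    return 0  # terminate without an error


def run_all_or_nothing(steps, out_file):
    """Approach 3: compute everything first, then publish in one step."""
    out_file = Path(out_file)
    tmp_name = None
    try:
        results = [step() for step in steps]  # nothing is saved yet
        fd, tmp_name = tempfile.mkstemp(dir=out_file.parent)
        with os.fdopen(fd, "w") as tmp:
            tmp.write("\n".join(results))
        # Rename within the same directory, so the target file is
        # either fully present or absent.
        os.replace(tmp_name, out_file)
    except Exception:
        if tmp_name is not None and os.path.exists(tmp_name):
            os.remove(tmp_name)
        return 1
    return 0
```

In a container entry point, the return value would typically be passed to `sys.exit`, so that the workflow engine can rely on the exit code alone to decide whether the task succeeded.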