
Avoid running an algorithm twice when its results are already available.

Approaches:

ID  Process
1   Use Luigi's mechanism of idempotent algorithms (see preliminary).
2   Each algorithm is responsible for fulfilling the idempotence property by itself. Luigi's mechanism is omitted.
    Drawback: more complex algorithms, error-prone due to incorrect implementations.

Given the mechanism of idempotent tasks and the fact that we can verify an algorithm's exit code, Luigi could simply infer whether an algorithm
has to be run or has already completed. A problem appears after restarting Luigi, which discards all containers and their exit codes.
A recommended solution is to save every container's exit code and standard streams (stdout/stderr) in Luigi's database. This solution supports
traceability for historical batch processes and gives Luigi a way to identify completed tasks.
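
The recommended solution could be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Luigi integration: a local SQLite database stands in for "Luigi's database", a plain subprocess stands in for a container, and the table name `container_runs` as well as the helpers `open_store`, `record_run`, `is_complete` and `run_once` are hypothetical names introduced only for this example.

```python
import sqlite3
import subprocess
import sys

# Hypothetical schema: one row per task, holding the last exit code and streams.
SCHEMA = """
CREATE TABLE IF NOT EXISTS container_runs (
    task_id   TEXT PRIMARY KEY,
    exit_code INTEGER NOT NULL,
    stdout    TEXT NOT NULL,
    stderr    TEXT NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open the store (assumption: SQLite stands in for Luigi's database)."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_run(conn, task_id, exit_code, stdout, stderr):
    """Persist exit code and standard streams so they survive a restart."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO container_runs VALUES (?, ?, ?, ?)",
            (task_id, exit_code, stdout, stderr),
        )


def is_complete(conn, task_id):
    """A task counts as complete only if a run with exit code 0 is stored."""
    row = conn.execute(
        "SELECT exit_code FROM container_runs WHERE task_id = ?", (task_id,)
    ).fetchone()
    return row is not None and row[0] == 0


def run_once(conn, task_id, command):
    """Run the command unless a successful run is already recorded.

    Returns True if the command was executed, False if it was skipped.
    """
    if is_complete(conn, task_id):
        return False
    proc = subprocess.run(command, capture_output=True, text=True)
    record_run(conn, task_id, proc.returncode, proc.stdout, proc.stderr)
    return True


if __name__ == "__main__":
    store = open_store()
    cmd = [sys.executable, "-c", "print('done')"]
    print(run_once(store, "algorithm-a", cmd))  # first call executes the command
    print(run_once(store, "algorithm-a", cmd))  # second call is skipped
```

In a Luigi setup, a check like `is_complete` could back a task's `complete()` method, so that the decision to skip a task no longer depends on the container still existing. A failed run (non-zero exit code) is stored as well, which keeps the history traceable while still allowing the task to be re-run.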


Deciding not to re-run an algorithm based on the presence of a valid result file is inadequate: the container may have been deleted while its results are already present in a database (e.g. the property database).