Distributed Plans
When your project grows, defining the entire workflow
in a single plan.py file may become inconvenient.
Especially when working with nested directories for different parts of the project,
it may be convenient to distribute the workflow over multiple plan.py files.
Example
Example source files: docs/getting_started/distributed_plans/
Create a simple example with a top-level plan.py as follows:
#!/usr/bin/env python3
from stepup.core.api import plan, static
static("sub/plan.py", "part1.txt", "sub/part2.txt")
plan("./plan.py", workdir="sub")
The top-level plan defines a few static files and then calls another plan in sub/.
Create a file sub/plan.py as follows:
#!/usr/bin/env python3
from stepup.core.api import run
run("cat part2.txt", inp="part2.txt")
run("cat ../part1.txt", inp="../part1.txt")
Also create two files part1.txt and sub/part2.txt with a bit of text.
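Note that sub/plan.py refers to the top-level file as ../part1.txt, because its paths are written relative to its working directory sub, while the top-level plan declares the same file as part1.txt. The following minimal sketch uses only the Python standard library (not the StepUp API) to show how the two spellings relate; the helper name resolve is hypothetical and exists only for this illustration.

```python
import posixpath


def resolve(workdir: str, path: str) -> str:
    """Rewrite `path`, given relative to `workdir`, relative to the project root."""
    return posixpath.normpath(posixpath.join(workdir, path))


# Paths as they are written in sub/plan.py, which runs with workdir="sub".
print(resolve("sub", "part2.txt"))     # sub/part2.txt
print(resolve("sub", "../part1.txt"))  # part1.txt
```

Both results match the paths passed to static() in the top-level plan.py.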
Make both plans executable and run StepUp as follows:
You will get the following output:
DIRECTOR │ Listening on /tmp/stepup-########/director (StepUp Core 3.2.3.post54)
STARTUP │ (Re)initialized boot script
PHASE │ build
START │ ./plan.py
SUCCESS │ ./plan.py
START │ ./plan.py # wd=sub
SUCCESS │ ./plan.py # wd=sub
START │ cat ../part1.txt # wd=sub
SUCCESS │ cat ../part1.txt # wd=sub
─────────────────────────────── Standard output ────────────────────────────────
This is part 1.
────────────────────────────────────────────────────────────────────────────────
START │ cat part2.txt # wd=sub
SUCCESS │ cat part2.txt # wd=sub
─────────────────────────────── Standard output ────────────────────────────────
This is part 2.
────────────────────────────────────────────────────────────────────────────────
DIRECTOR │ Trying to delete 0 outdated output(s)
DIRECTOR │ See you!
The main advantage of plan() over launching the nested plan with run() is that plan() gives
the additional plan.py steps a higher priority than non-plan steps.
As a result, the full workflow is known earlier, which improves scheduling efficiency.
Using plan() also makes the intent of the step explicit, which improves the readability of your workflow.
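To illustrate why prioritizing plan steps helps, here is a toy model of a priority queue. It is a sketch under stated assumptions, not StepUp's actual scheduler: the priority values, the schedule function, and the step names are all hypothetical.

```python
import heapq

# Hypothetical priorities for this toy model: lower values start first.
PLAN, OTHER = 0, 1


def schedule(steps):
    """Return step names in the order a priority-based toy scheduler starts them.

    `steps` is a list of (name, is_plan) tuples in submission order.
    """
    heap = []
    for index, (name, is_plan) in enumerate(steps):
        # The submission index breaks ties, so equal-priority steps keep their order.
        heapq.heappush(heap, (PLAN if is_plan else OTHER, index, name))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


queued = [
    ("cat part1.txt", False),
    ("./plan.py  # wd=sub", True),
    ("cat other.txt", False),
]
order = schedule(queued)
print(order)
```

In this model, the nested plan starts first even though it was submitted second, so the steps it defines become known before the remaining non-plan steps occupy the workers.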
Practical Considerations
- The main benefit of having multiple plan.py files is to improve the logical structure of your project. It may also be helpful when a part of your plan.py is computationally demanding, in which case it can be factored out so that it does not slow down the rest of the build. However, ideally, the plan.py scripts execute quickly, leaving the hard work to other steps.
- When there are multiple plan.py files, keep in mind that their order of execution cannot be relied upon. They are executed in parallel, and their relative starting times depend on factors unknown a priori, such as system load and number of parallel jobs.
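The point about execution order can be mimicked with a small standard-library simulation. This is only a toy model with threads, not the StepUp API: toy_plan, the plan names, and the step names are hypothetical. It shows that when each plan registers its steps independently of the others, the resulting set of steps is the same no matter which plan happens to start first.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def toy_plan(name, steps, registry):
    """Stand-in for a plan.py: wait a random moment, then register its steps."""
    time.sleep(random.uniform(0, 0.01))  # mimics unpredictable starting times
    for step in steps:
        registry.append((name, step))


plans = {
    "plan_a": ["step_a1", "step_a2"],
    "plan_b": ["step_b1"],
}
registry = []
with ThreadPoolExecutor(max_workers=2) as pool:
    for name, steps in plans.items():
        pool.submit(toy_plan, name, steps, registry)

# The arrival order varies between runs, but the sorted set of steps does not.
print(sorted(registry))
```

A plan that only works if another plan has already run would break this property, which is why such assumptions are best avoided.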