Dependencies¶
This tutorial demonstrates how StepUp tracks dependencies.
Example¶
Example source files: getting_started/dependencies/
The following plan.py defines two steps, where the second uses the output of the first:
#!/usr/bin/env python
from stepup.core.api import step
from stepup.core.interact import graph
step("echo First line. > ${out}; echo Second line. >> ${out}", out="story.txt")
step("grep First ${inp}", inp="story.txt")
graph("graph")
The placeholders ${inp} and ${out} are replaced by the inp and out keyword arguments.
(This happens early, before the steps are sent to the director process.)
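The substitution can be pictured with Python's standard string.Template class, which happens to use the same ${name} syntax. This is only an illustration of the idea: StepUp's own substitution code may work differently, and fill_placeholders is a hypothetical helper, not part of the StepUp API.

```python
from string import Template


def fill_placeholders(command: str, **kwargs: str) -> str:
    """Replace ${name} placeholders with keyword arguments (hypothetical helper)."""
    # Template.substitute raises KeyError if a placeholder has no matching keyword.
    return Template(command).substitute(**kwargs)


print(fill_placeholders("grep First ${inp}", inp="story.txt"))
# prints: grep First story.txt
```

Because the substitution is plain string formatting, the resulting command contains only literal file names.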
The graph() function writes the graph in a few formats, which are used for visualization below.
Now run StepUp with two workers:
stepup -n -w2
You will see the following output:
DIRECTOR │ Listening on /tmp/stepup-########/director
DIRECTOR │ Launched worker 0
DIRECTOR │ Launched worker 1
PHASE │ run
START │ ./plan.py
START │ echo First line. > story.txt; echo Second line. >> story.txt
SUCCESS │ echo First line. > story.txt; echo Second line. >> story.txt
START │ grep First story.txt
SUCCESS │ grep First story.txt
─────────────────────────────── Standard output ────────────────────────────────
First line.
────────────────────────────────────────────────────────────────────────────────
SUCCESS │ ./plan.py
WORKFLOW │ Dumped to .stepup/workflow.mpk.xz
DIRECTOR │ Stopping workers.
DIRECTOR │ See you!
Although StepUp has launched two workers, it runs the steps sequentially because it knows that the second step uses the output of the first.
Note, however, that the echo commands are already started before ./plan.py has finished.
This is the expected behavior: even without a complete overview of all the build steps, StepUp will start the steps for which it has sufficient information.
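The scheduling rule described above can be sketched in a few lines. The sketch assumes that a step is ready as soon as all of its inputs exist, regardless of whether more steps will be declared later. The Step class and the ready_steps function are hypothetical names for illustration and are not taken from StepUp's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """A command with the files it reads and writes (hypothetical model)."""

    command: str
    inputs: frozenset = frozenset()
    outputs: frozenset = frozenset()


def ready_steps(declared, available_files):
    """Return the declared steps whose inputs are all available."""
    return [s for s in declared if s.inputs <= available_files]


echo = Step(
    "echo First line. > story.txt; echo Second line. >> story.txt",
    outputs=frozenset({"story.txt"}),
)
grep = Step("grep First story.txt", inputs=frozenset({"story.txt"}))

# Only the echo step is known so far: it has no inputs, so it can start
# even though plan.py may still declare more steps.
print(ready_steps([echo], frozenset()) == [echo])

# The grep step must wait until story.txt has been produced.
print(ready_steps([echo, grep], frozenset()) == [echo])
print(ready_steps([grep], frozenset({"story.txt"})) == [grep])
```

In this model, a second worker only helps when two declared steps are ready at the same time, which never happens in this example.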
Graphs¶
The plan.py script writes a few files to analyze and visualize the graphs StepUp uses internally.
The file graph.txt is a detailed human-readable version of .stepup/workflow.mpk.xz:
root:
    version = v1
    creates file:./
    creates file:plan.py
    creates step:./plan.py

file:plan.py
    path = plan.py
    state = STATIC
    created by root:
    consumes file:./
    supplies step:./plan.py

file:./
    path = ./
    state = STATIC
    created by root:
    supplies file:plan.py
    supplies file:story.txt
    supplies step:./plan.py
    supplies step:echo First line. > story.txt; echo Second line. >> story.txt
    supplies step:grep First story.txt

step:./plan.py
    workdir = ./
    command = ./plan.py
    state = RUNNING
    created by root:
    consumes file:./
    consumes file:plan.py
    creates step:echo First line. > story.txt; echo Second line. >> story.txt
    creates step:grep First story.txt

step:echo First line. > story.txt; echo Second line. >> story.txt
    workdir = ./
    command = echo First line. > story.txt; echo Second line. >> story.txt
    state = QUEUED
    created by step:./plan.py
    consumes file:./
    creates file:story.txt
    supplies file:story.txt

file:story.txt
    path = story.txt
    state = PENDING
    created by step:echo First line. > story.txt; echo Second line. >> story.txt
    consumes file:./
    consumes step:echo First line. > story.txt; echo Second line. >> story.txt
    supplies step:grep First story.txt

step:grep First story.txt
    workdir = ./
    command = grep First story.txt
    state = PENDING
    created by step:./plan.py
    consumes file:./
    consumes file:story.txt
This text format may not always be the most convenient way to understand how StepUp connects all the steps and files.
A more intuitive picture can be created with Graphviz using the .dot files as input.
The figures below were created using the following commands:
dot -v graph_supplier.dot -Tsvg -o graph_supplier.svg
dot -v graph_creator.dot -Tsvg -o graph_creator.svg
The workflow in StepUp consists of two graphs involving (a subset of) the same set of nodes: the supplier graph and the creator graph.
Supplier Graph¶
This graph shows how information is passed from one node to the next as the steps are executed.
This is an intuitive graph showing the execution flow. A similar graph is used by most other build tools. Not shown in this diagram are the directories, which StepUp treats in the same way as files.
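As a rough illustration, the supplier relations listed in graph.txt can be written as a mapping from each node to the nodes it consumes. A valid execution order then follows from a topological sort. This sketch uses Python's standard graphlib module (Python 3.9 or newer) and is not a description of how StepUp stores or schedules its graph internally. Directories are left out, as in the diagram.

```python
from graphlib import TopologicalSorter

ECHO = "step:echo First line. > story.txt; echo Second line. >> story.txt"
GREP = "step:grep First story.txt"

# Each key consumes the nodes in its value set (taken from graph.txt,
# with the ./ directory left out).
consumes = {
    "file:plan.py": set(),
    "step:./plan.py": {"file:plan.py"},
    ECHO: set(),
    "file:story.txt": {ECHO},
    GREP: {"file:story.txt"},
}

# static_order() yields every node after all the nodes it consumes.
order = list(TopologicalSorter(consumes).static_order())
for node in order:
    print(node)
```

Several orders are valid here, but in every one of them the echo step precedes story.txt, which in turn precedes the grep step.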
Creator Graph¶
This one shows who created each node in the graph:
This diagram is a little less intuitive and requires more explanation. Each node in StepUp’s workflow is created by exactly one other node, except for the root node, which is its own creator (arrow not shown). In this example, there are three nodes that create other nodes:
- The root node is an internal node controlled by StepUp. Upon startup, StepUp creates root and a few other nodes by default:
  - The initial plan.py file.
  - The initial ./plan.py step (with working directory ./).
  - The working directory ./, which is created just like any other directory that is used.
  - The vacuum node, a special node that holds all the nodes to be deleted. In most visualizations it will not have any children as they are usually removed rather quickly.
- The ./plan.py step creates two nodes (see the two step() function calls in the plan.py script above):
  - The grep step.
  - The echo step.
- The echo step creates one output file: story.txt.
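The creator relations from graph.txt can be captured as a mapping from each node to its single creator. The products function below is a hypothetical helper for illustration (not part of StepUp) that collects everything a node created, directly or indirectly. The vacuum node is omitted because it does not appear in graph.txt.

```python
ECHO = "step:echo First line. > story.txt; echo Second line. >> story.txt"
GREP = "step:grep First story.txt"

# node -> creator, following the "created by" lines in graph.txt.
creator = {
    "root:": "root:",  # root is its own creator
    "file:./": "root:",
    "file:plan.py": "root:",
    "step:./plan.py": "root:",
    ECHO: "step:./plan.py",
    GREP: "step:./plan.py",
    "file:story.txt": ECHO,
}


def products(node):
    """Return all nodes created directly or indirectly by `node`."""
    found = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child, parent in creator.items():
            # Skip the root self-loop and nodes that were already collected.
            if parent == current and child != current and child not in found:
                found.add(child)
                stack.append(child)
    return found


for name in sorted(products("step:./plan.py")):
    print(name)
```

For the ./plan.py step, this yields the echo step, the grep step, and story.txt.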
This creator graph is used by StepUp to decide which steps to vacuum.
For example, when plan.py is modified, all nodes created by the ./plan.py step are transferred to the vacuum node.
If the new plan.py recreates these nodes in the same way, they are taken back from the vacuum node (including known file and step hashes).
However, if the new plan.py defines different steps, vacuumed nodes that are no longer defined by the new plan.py are removed after all steps have completed successfully.
At this stage, any output files owned by the vacuum node are removed from disk (if their last recorded hash still matches the hash of the file being deleted).
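The vacuum mechanism can be sketched as simple set bookkeeping. This is a simplified model under stated assumptions: nodes are plain strings in a flat set, hashes are SHA-256 digests of file contents, and the names rerun_plan, remove_if_unchanged, and step:wc are hypothetical. StepUp's real implementation tracks more state than this.

```python
import hashlib
import tempfile
from pathlib import Path


def rerun_plan(old_products, new_products):
    """Split the products of a re-executed plan into kept, added, and orphaned."""
    vacuum = set(old_products)         # everything is first moved to vacuum
    kept = vacuum & set(new_products)  # redefined nodes are taken back
    vacuum -= kept                     # what remains is no longer defined
    added = set(new_products) - kept
    return kept, added, vacuum


def remove_if_unchanged(path, recorded_hash):
    """Delete an orphaned output only if it still matches its recorded hash."""
    path = Path(path)
    if not path.exists():
        return False
    current = hashlib.sha256(path.read_bytes()).hexdigest()
    if current != recorded_hash:
        return False  # the file changed since it was recorded: leave it alone
    path.unlink()
    return True


# Hypothetical edit of plan.py: the grep step is replaced by a wc step.
kept, added, orphaned = rerun_plan(
    old_products={"step:echo", "step:grep", "file:story.txt"},
    new_products={"step:echo", "file:story.txt", "step:wc"},
)
print(sorted(kept), sorted(added), sorted(orphaned))

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "old_output.txt"
    out.write_text("stale\n")
    recorded = hashlib.sha256(out.read_bytes()).hexdigest()
    out.write_text("edited by hand\n")
    # The hash no longer matches, so the file is kept.
    print(remove_if_unchanged(out, recorded))
```

The hash check in this sketch protects files that were modified after StepUp last recorded them.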
Try the Following¶
- Run stepup -n -w2 again. As expected, the steps are now skipped.
- Modify the grep command to select the second line and run stepup -n -w2 again. The echo commands are skipped as they have not changed.
- Change the order of the two steps in plan.py and run stepup -n -w2. The step ./plan.py is executed because the file has changed, but the echo and grep steps are skipped. This shows that plan.py is nothing but a plan, not its execution. When it is executed, it sends the plan to the director process.