
Sections in this Page:

Execution Cache
Access Cache




A cache is the result of a process, saved to avoid re-executing that process unless needed. Typically caches are checked against the process inputs to decide whether to re-execute.

There are several reasons to cache a file:

  • Execution is expensive and needs to be avoided if possible.

  • Dependencies are burdensome to manage. When upstream processes are changing constantly, it's useful to capture an output once and keep it.

  • Access is expensive. For example, the input may live in a remote location and be slow to fetch due to network latency. In this case, the input can be cached locally.
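As a rough illustration of the third case, here is a minimal sketch of an access cache that keeps a local copy of a remote file. The name `fetch_with_cache` and the use of file copies to stand in for network access are assumptions for the example, not part of the patterns described here:

```python
import os
import shutil

def fetch_with_cache(remote_path, local_cache):
    """Return a local copy of remote_path, refreshing it only when
    the cached copy is missing or older than the remote original."""
    if (os.path.exists(local_cache)
            and os.path.getmtime(local_cache) >= os.path.getmtime(remote_path)):
        return local_cache  # cache hit: skip the expensive access
    shutil.copy2(remote_path, local_cache)  # cache miss: fetch and store a copy
    return local_cache
```

Note that `shutil.copy2` preserves the source's modification time, so the freshness comparison keeps working across repeated calls.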

Execution Cache

This is how we draw execution or evaluation caches.

[Diagram: an execution cache]

This is interpreted as:

  • check the timestamp of the Cache against that of the Input, and if the Cache is not out of date:

  • optimize the execution path inside Process in some way, perhaps by eliding some or all of the Process.
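The check above can be sketched in a few lines of Python. The name `run_if_stale` and the `process` callable (which reads the input and writes the cache) are hypothetical, chosen only to make the timestamp comparison concrete:

```python
import os

def run_if_stale(input_path, cache_path, process):
    """Run `process` only when the cache is missing or out of date
    relative to the input; otherwise elide it and reuse the cache."""
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(input_path)):
        return cache_path  # cache is current: elide the process entirely
    process(input_path, cache_path)  # re-execute and refresh the cache
    return cache_path
```

This is the same freshness rule that build tools like make apply: a target newer than all of its prerequisites is not rebuilt.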


Here's a way to think of this:

[Diagram: Input File → Process1 → Intermediate File → Process2]

In this diagram there is an Intermediate File between Process1 and Process2. If its timestamp shows it is newer than the Input File, we can elide Process1 entirely.

Or, to draw it another way, we can combine Process1 and Process2 into a single Composite Process, and move the Intermediate File down below.

[Diagram: a Composite Process containing Process1 and Process2, with the Intermediate File drawn below it as a cache]

Note: This is an example of how we can think of many of the patterns as graph transformations. Any time we find a file between two processes, we have an opportunity to cache.
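To make the graph transformation concrete, here is a hedged sketch of that composite process, where the internal Intermediate File doubles as a cache for Process1. All the names (`composite_process`, `process1`, `process2`) are illustrative:

```python
import os

def composite_process(input_path, intermediate_path, output_path,
                      process1, process2):
    """Run Process1 and Process2 as one composite step, eliding
    Process1 whenever the Intermediate File is up to date."""
    if not (os.path.exists(intermediate_path)
            and os.path.getmtime(intermediate_path) >= os.path.getmtime(input_path)):
        process1(input_path, intermediate_path)  # refresh the intermediate
    process2(intermediate_path, output_path)     # always produce the output
```

From the outside this looks like one process from Input File to output, yet internally the Intermediate File still short-circuits the expensive first stage.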

Access Cache

Coming soon.