ploomber.executors.Parallel

class ploomber.executors.Parallel(processes=None, print_progress=False)

Runs the tasks in a DAG in parallel using multiprocessing.

Parameters
  • processes (int, default=None) – The number of processes to use. If None, uses the value returned by os.cpu_count()

  • print_progress (bool, default=False) – Whether to print progress to stdout. If False, progress is only logged
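
The documented default for processes can be sketched with the standard library alone. The helper name resolve_processes is hypothetical and is not part of ploomber; it only mirrors the fallback to os.cpu_count() described above.

```python
import os


def resolve_processes(processes=None):
    """Hypothetical helper: mirror the documented default for `processes`."""
    # When no value is given, fall back to the number of CPUs reported by the OS
    if processes is None:
        return os.cpu_count()
    # Otherwise use the explicit value unchanged
    return processes


explicit = resolve_processes(2)   # caller picks the number of processes
default = resolve_processes()     # falls back to os.cpu_count()
```

Passing an explicit value, as in Parallel(processes=2), is useful when the machine is shared or when each task is memory-hungry.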

Examples

Spec API:

# add at the top of your pipeline.yaml
executor: parallel

Python API:

>>> from ploomber import DAG
>>> from ploomber.executors import Parallel
>>> dag = DAG(executor='parallel')  # use with default values
>>> dag = DAG(executor=Parallel(processes=2))  # customize

Notes

If any task crashes, execution of its downstream tasks is aborted. The build continues until no more tasks can be executed.
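
The failure behavior in this note can be illustrated with a toy model. This is a minimal sketch in plain Python, not ploomber's implementation: simulate_build, the status labels, and the task names are all hypothetical, and the loop runs sequentially rather than in separate processes. It only shows which tasks end up executed, failed, or aborted.

```python
def simulate_build(upstream, failing):
    """Toy model of the documented failure behavior (hypothetical helper).

    upstream: dict mapping each task name to a list of its upstream tasks
    failing: set of task names that crash when executed
    """
    status = {}
    pending = set(upstream)

    while pending:
        progressed = False
        # sorted() returns a new list, so removing from `pending` is safe here
        for task in sorted(pending):
            deps = upstream[task]
            if any(status.get(d) in ("failed", "aborted") for d in deps):
                # An upstream task crashed or was aborted: abort this one too
                status[task] = "aborted"
            elif all(status.get(d) == "executed" for d in deps):
                # All upstream tasks finished: run (or crash) this task
                status[task] = "failed" if task in failing else "executed"
            else:
                # Upstream tasks are still pending: try again on the next pass
                continue
            pending.discard(task)
            progressed = True
        if not progressed:
            # No task can make progress (for example, a cycle): stop
            break

    return status


# 'clean' crashes, so 'plot' is aborted; 'load' and 'stats' still run
result = simulate_build(
    upstream={"load": [], "clean": ["load"], "plot": ["clean"], "stats": []},
    failing={"clean"},
)
```

In this example, the independent task stats still runs even though clean crashes, which matches the statement that the build continues until no more tasks can be executed.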

See also

ploomber.executors.Serial

Serial executor

Methods