Command-line interface

Note

This is an introductory tutorial to the command line interface; for a complete API description, see: Command line interface.

Entry points

By default, the CLI looks for a pipeline.yaml file in certain standard locations (Default locations). If your pipeline lives in a non-standard location, pass its path with the --entry-point argument.

The pipeline.yaml file is known as an “entry point”. However, it is not the only type of entry point (to learn more, see this guide: Spec API vs. Python API).

Basic commands

Build pipeline (skips up-to-date tasks):

ploomber build

Forced build (runs all tasks, regardless of status):

ploomber build --force
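When the build runs from a script or a CI job, the flags shown so far can be assembled programmatically. The following is a minimal sketch, not part of Ploomber: build_command and run_build are illustrative helpers, and the path pipelines/pipeline.yaml is a hypothetical non-standard location.

```python
import subprocess


def build_command(entry_point=None, force=False):
    """Return the argument list for `ploomber build`."""
    cmd = ["ploomber", "build"]
    if entry_point is not None:
        # Point the CLI at a pipeline outside the default locations
        cmd += ["--entry-point", str(entry_point)]
    if force:
        # Run all tasks, regardless of status
        cmd.append("--force")
    return cmd


def run_build(entry_point=None, force=False):
    """Run the build; requires the ploomber CLI to be installed."""
    return subprocess.run(build_command(entry_point, force), check=True)


# Hypothetical path, used only for illustration
cmd = build_command(entry_point="pipelines/pipeline.yaml", force=True)
print(" ".join(cmd))
```

Keeping the arguments in a list (instead of one shell string) avoids quoting issues when the path contains spaces.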

Generate pipeline plot:

ploomber plot

New in Ploomber 0.18.2: You can plot the pipeline without installing extra dependencies. pygraphviz is still supported but optional.

Interactive sessions

Interactive sessions allow you to access the structure of your pipeline to help you test and debug:

ploomber interact

The command above starts a Python session, parses your pipeline, and exposes a dag variable (an instance of the ploomber.DAG class).

For example, to generate the plot:

dag.plot()

Get task names:

list(dag)

You can also interact with specific tasks:

task = dag['task_name']

Tip: If using IPython or Jupyter, press Tab to get autocompletion when typing the task name: dag['some_task']

Get a task’s product:

dag['some_task'].product

If the product is a dictionary:

dag['some_task'].product['product_name']

You can use this to avoid hardcoding paths to load products:

import pandas as pd

df = pd.read_csv(dag['some_task'].product)
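Since a product can be either a single object or a dictionary, a small helper can hide the difference. This is a sketch under assumptions: get_product is an illustrative helper (not part of Ploomber), and the stand-in objects and file paths below are hypothetical, used only so the snippet runs without a real pipeline.

```python
from types import SimpleNamespace


def get_product(dag, task_name, key=None):
    """Return a task's product, or one entry of it when `key` is given."""
    product = dag[task_name].product
    return product if key is None else product[key]


# Stand-in for the `dag` variable exposed by `ploomber interact`
# (hypothetical task names and paths)
fake_dag = {
    "load": SimpleNamespace(product="output/raw.csv"),
    "split": SimpleNamespace(
        product={"train": "output/train.csv", "test": "output/test.csv"}
    ),
}

single = get_product(fake_dag, "load")
train = get_product(fake_dag, "split", key="train")
print(single, train)
```

Inside an interactive session, the same helper would be called with the real dag variable, for example get_product(dag, 'some_task', key='product_name').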

If you are working with Python tasks (functions, scripts, or notebooks), you can start a line-by-line debugging session:

dag['some_task'].debug()

Enter quit to exit the debugging session. Refer to The Python Debugger documentation for details.

To print the source code of a given task:

dag['some_task'].source

To find the source code location of a given task:

dag['some_task'].source.loc

Get upstream dependencies:

dag['some_task'].upstream

Get downstream tasks:

dag.get_downstream('some_task')
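The attributes above can be combined to get an overview of the whole pipeline. The sketch below assumes that iterating over dag yields task names (as list(dag) suggests) and that .upstream behaves like a mapping keyed by task name; both are assumptions to check against your Ploomber version. upstream_map is an illustrative helper, and the stand-in objects replace a real pipeline so the snippet runs on its own.

```python
from types import SimpleNamespace


def upstream_map(dag):
    """Map each task name to the sorted names of its upstream tasks."""
    # Assumption: `.upstream` is a mapping whose keys are task names
    return {name: sorted(dag[name].upstream) for name in dag}


# Stand-in for the `dag` variable (hypothetical task names)
fake_dag = {
    "load": SimpleNamespace(upstream={}),
    "clean": SimpleNamespace(upstream={"load": None}),
    "plot": SimpleNamespace(upstream={"clean": None, "load": None}),
}

deps = upstream_map(fake_dag)
for name, parents in deps.items():
    print(name, "<-", parents)
```

In an interactive session, upstream_map(dag) would produce the same kind of summary for the real pipeline.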

Other commands

Some commands are not covered here:

  • examples: Download examples

  • install: Install dependencies

  • nb (short for notebook): Manage notebooks and scripts

  • report: Generate a pipeline report

  • scaffold: Create a new project

  • status: Pipeline status summary

  • task: Execute a single task

See the CLI API documentation (Command line interface) for a detailed overview of each command.

Enabling Completion

To configure autocompletion for the CLI, you need to configure your shell.

If using bash, add this to ~/.bashrc:

eval "$(_PLOOMBER_COMPLETE=bash_source ploomber)"

If using zsh, add this to ~/.zshrc:

eval "$(_PLOOMBER_COMPLETE=zsh_source ploomber)"

If using fish, add this to ~/.config/fish/completions/ploomber.fish:

eval (env _PLOOMBER_COMPLETE=fish_source ploomber)
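The per-shell settings can be summarized in a small lookup, which is handy when provisioning several machines. This is a sketch, not part of Ploomber: completion_setup is an illustrative helper, and it assumes the completion variable takes the value "<shell>_source" for bash and zsh.

```python
import os

# Shell name -> (file to edit, line to add)
# Assumption: the variable value follows the "<shell>_source" pattern
COMPLETION = {
    "bash": ("~/.bashrc", 'eval "$(_PLOOMBER_COMPLETE=bash_source ploomber)"'),
    "zsh": ("~/.zshrc", 'eval "$(_PLOOMBER_COMPLETE=zsh_source ploomber)"'),
    "fish": (
        "~/.config/fish/completions/ploomber.fish",
        "eval (env _PLOOMBER_COMPLETE=fish_source ploomber)",
    ),
}


def completion_setup(shell_path):
    """Return (config file, line) for a shell given its executable path."""
    shell = os.path.basename(shell_path)
    if shell not in COMPLETION:
        raise ValueError(f"No completion setup known for shell: {shell!r}")
    return COMPLETION[shell]


# Example with an explicit path; os.environ.get("SHELL") could be used instead
config_file, line = completion_setup("/usr/bin/zsh")
print(f"Add to {config_file}:\n{line}")
```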