Command line interface¶
This document summarizes commonly used commands. To get full details, execute ploomber --help or ploomber {command_name} --help.
When applicable, we use this sample pipeline to demonstrate which tasks will be executed after issuing a given command:
Assume yellow tasks are outdated and green tasks are up-to-date.
Executed tasks are shown in blue and skipped tasks are shown in white in the diagrams below.
Build pipeline¶
ploomber build
Execute your pipeline end-to-end and speed it up by skipping tasks whose source code has not changed.
(Skips B2 because it’s up-to-date)
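As a rough illustration of the skipping behavior, here is a minimal sketch (not Ploomber's implementation) that decides which tasks to execute. It assumes a task runs when its own source code changed or when any of its upstream tasks runs. The function `tasks_to_run`, the task names, and the `upstream` mapping are all hypothetical.

```python
def tasks_to_run(upstream, outdated):
    """Return the tasks to execute, in dependency order.

    upstream: dict mapping each task name to the tasks it depends on
    outdated: set of task names whose source code changed
    """
    to_run = []
    decided = {}

    def needs_run(task):
        if task not in decided:
            # Build a list (not a generator) so every upstream task is
            # evaluated, and ordered, before the current one
            upstream_runs = [needs_run(dep) for dep in upstream[task]]
            decided[task] = task in outdated or any(upstream_runs)
            if decided[task]:
                to_run.append(task)
        return decided[task]

    for task in upstream:
        needs_run(task)
    return to_run


# Hypothetical pipeline: load -> clean -> fit, and load -> stats
upstream = {
    "load": [],
    "clean": ["load"],
    "stats": ["load"],
    "fit": ["clean"],
}

incremental = tasks_to_run(upstream, outdated={"clean"})
nothing_changed = tasks_to_run(upstream, outdated=set())
print(incremental)
print(nothing_changed)
```

In this sketch, "load" and "stats" are skipped because neither they nor their dependencies changed, while "fit" runs because it depends on the outdated "clean" task.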
Build pipeline (forced)¶
ploomber build --force
Execute all tasks regardless of status.
Build pipeline partially¶
ploomber build --partially C
Builds your pipeline until it reaches the task named C.
(Skips B2 because it’s up-to-date)
(Skips D because it’s not needed to build C)
To force the execution of tasks regardless of status, use the --force/-f option.
You can also select several tasks at the same time using wildcards:
ploomber build --partially 'fit-*' # note the single quotes
The previous command executes all tasks whose names match fit-* (that is, tasks with the fit- prefix) and all their upstream dependencies.
You may skip building upstream dependencies using the --skip-upstream option:
ploomber build --partially 'fit-*' --skip-upstream # note the single quotes
Note that the previous command fails if the upstream products of the fit-* tasks do not exist yet.
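To make the selection rules concrete, the following sketch shows one way to match task names against a wildcard and then collect their upstream dependencies. It uses Python's standard `fnmatch` module, which may not match Ploomber's own pattern handling exactly. The function `select_partial` and the pipeline below are hypothetical.

```python
from fnmatch import fnmatch


def select_partial(upstream, pattern, skip_upstream=False):
    """Return the sorted task names selected by a wildcard pattern.

    upstream: dict mapping each task name to the tasks it depends on
    skip_upstream: when True, return only the tasks matching the pattern
    """
    selected = {name for name in upstream if fnmatch(name, pattern)}
    if skip_upstream:
        return sorted(selected)

    # Walk the dependencies of every selected task, transitively
    pending = list(selected)
    while pending:
        for dep in upstream[pending.pop()]:
            if dep not in selected:
                selected.add(dep)
                pending.append(dep)
    return sorted(selected)


# Hypothetical pipeline with two tasks sharing the fit- prefix
upstream = {
    "load": [],
    "clean": ["load"],
    "fit-linear": ["clean"],
    "fit-forest": ["clean"],
    "report": ["fit-linear", "fit-forest"],
}

with_upstream = select_partial(upstream, "fit-*")
only_matches = select_partial(upstream, "fit-*", skip_upstream=True)
print(with_upstream)
print(only_matches)
```

Here "report" is never selected because it is downstream of the fit-* tasks. The sketch only models selection; it does not model skipping up-to-date tasks.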
Plot¶
ploomber plot
Creates a pipeline plot and stores it.
New in Ploomber 0.18.2: You can plot the pipeline without installing extra dependencies. pygraphviz is still supported but optional. To learn more, see this.
To include the task’s products in the plot (only supported when using the pygraphviz backend):
ploomber plot --include-products
Status¶
ploomber status
Show a table with pipeline status. For each task: name, last execution time, status, product, docstring (first line) and file location.
Report¶
ploomber report
Create an HTML report and save it in a pipeline.html file. The file includes the pipeline plot and a table with a summary for each task.
Build a single task¶
ploomber task C
To force execution regardless of status, use the --force/-f option.
Get task status¶
ploomber task task_name --status
If you also want to build the task, you must explicitly pass --build.
Task source code¶
ploomber task task_name --source
If you also want to build the task, you must explicitly pass --build.
Create new project¶
The scaffold command allows you to start a new project:
ploomber scaffold
The command above generates a project with a sample pipeline. To create an empty project:
ploomber scaffold --empty
New in 0.16: You can pass a positional argument: ploomber scaffold myproject.
Note that if you run this command in a folder that already has a pipeline.yaml in one of the default locations (see Default locations), it will parse your pipeline declaration, look for declared tasks whose source code file does not exist, and create them.
ploomber scaffold
If you’d like to package your project:
ploomber scaffold --package
After creating a project, you can install dependencies with the ploomber install command (to learn more: install).
For a tutorial on the ploomber scaffold command: Scaffolding projects.
install¶
ploomber install installs dependencies:
ploomber install
ploomber install installs dependencies using pip if a requirements.txt file exists, or conda if an environment.yml file exists.
New in 0.16: ploomber install has a few options to customize setup. Run ploomber install --help to learn more.
New in 0.16: ploomber install installs dependencies in the current environment. You can request a new virtual environment with the --create-env option, which uses venv or conda (if installed). Previously, it always created a new environment.
Upon installation, ploomber install generates lock files that contain specific versions for all required packages. Lock files are useful for ensuring the stability of your project since upgrades to your dependencies may break your code. To install from lock files:
ploomber install --use-lock
New in 0.16: ploomber install (without arguments) will use lock files if they exist. Otherwise, it’ll use regular files.
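The "lock file first, regular file otherwise" rule can be sketched as a small lookup. This is only an illustration of the rule as stated, not Ploomber's code. The function `pick_dependency_file` is hypothetical, and the lock file name `requirements.lock.txt` is an assumption, since this document does not name the generated lock files.

```python
import tempfile
from pathlib import Path


def pick_dependency_file(project_dir, regular="requirements.txt",
                         lock="requirements.lock.txt"):
    """Prefer the lock file when it exists, otherwise use the regular file.

    The default lock file name is an assumption made for this sketch.
    Returns None when neither file exists.
    """
    project_dir = Path(project_dir)
    for name in (lock, regular):
        candidate = project_dir / name
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Only the regular file exists at first
    Path(tmp, "requirements.txt").write_text("pandas\n")
    before_lock = pick_dependency_file(tmp).name

    # Once a lock file is present, it takes precedence
    Path(tmp, "requirements.lock.txt").write_text("pandas==2.0.0\n")
    after_lock = pick_dependency_file(tmp).name

print(before_lock)
print(after_lock)
```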
nb¶
nb is short for notebook. This command manages notebooks and scripts in your pipeline.
Inject a cell into the scripts and notebooks in your pipeline:
ploomber nb --inject
Enable opening .py files as notebooks in JupyterLab with one click on the file:
ploomber nb --single-click
Re-format .ipynb notebooks as .py files with the percent format:
ploomber nb -f py:percent
Re-format .py files as .ipynb notebooks:
ploomber nb -f ipynb
The rest of the options are useful when using editors such as VSCode or PyCharm, or when running old JupyterLab versions (<2.x). When using recent JupyterLab versions, script/notebook management is automatically performed by the Jupyter plug-in.
Other commands are available. Run ploomber nb --help to learn more.
Interactive sessions¶
To start an interactive session:
ploomber interact
Your pipeline is available in the dag variable. Refer to the ploomber.DAG documentation for details.
Doing dag['task_name'] returns a Task instance. All task instances have a common API, but there are a few differences. Refer to the tasks documentation for details: Tasks.
The CLI guide describes some of the most common use cases for interactive sessions: Interactive sessions.
Examples¶
Use this command to get a copy of the examples from the GitHub repository.
List examples:
ploomber examples
Get one:
ploomber examples --name {name}
To download in a specific location:
ploomber examples --name {name} --output path/to/dir
For a tutorial on the ploomber examples command: Downloading templates.
Default locations¶
If you don’t pass the --entry-point/-e argument to the command line, Ploomber will try to find one automatically by searching for a pipeline.yaml file in the current directory and its parent directories.
If no such file exists, it looks for a setup.py. If it exists, it searches for a src/{pkg}/pipeline.yaml file where {pkg} is a folder with any name. setup.py is only required for packaged projects.
If your pipeline has a different filename, you can create a setup.cfg file and indicate what file you want to set as default. Note that changing the default affects both the command-line interface and the Jupyter plug-in.
[ploomber]
entry-point = path/to/pipeline.yaml
Note that paths are relative to the parent directory of setup.cfg.
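To illustrate how the entry-point value resolves relative to setup.cfg, here is a small sketch that reads the [ploomber] section with Python's standard `configparser`. It is not Ploomber's own loader, and the helper name `entry_point_from_setup_cfg` is hypothetical.

```python
import configparser
import tempfile
from pathlib import Path


def entry_point_from_setup_cfg(setup_cfg):
    """Return the entry point declared in setup.cfg, or None if absent."""
    parser = configparser.ConfigParser()
    parser.read(setup_cfg)
    # fallback covers both a missing section and a missing option
    value = parser.get("ploomber", "entry-point", fallback=None)
    if value is None:
        return None
    # The value is relative to the directory that contains setup.cfg
    return Path(setup_cfg).resolve().parent / value


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp, "setup.cfg")
    cfg.write_text("[ploomber]\nentry-point = path/to/pipeline.yaml\n")
    resolved = entry_point_from_setup_cfg(cfg)
    # Express the result relative to the project root for display
    relative = resolved.relative_to(Path(tmp).resolve()).as_posix()

    empty = Path(tmp, "empty.cfg")
    empty.write_text("")
    missing = entry_point_from_setup_cfg(empty)

print(relative)
print(missing)
```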
Alternatively, you may set the ENTRY_POINT environment variable to a different filename (e.g., export ENTRY_POINT=pipeline.serve.yaml). Note that this must be a filename, not a path to a file.
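The directory search can be sketched as follows. This is a simplified illustration, not Ploomber's implementation: it assumes ENTRY_POINT simply replaces the filename being searched for, and it omits the setup.py and setup.cfg rules. The function `find_entry_point` and the directory layout are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def find_entry_point(start, environ=None):
    """Search start and its parent directories for the entry point file.

    The filename defaults to pipeline.yaml and can be overridden with
    the ENTRY_POINT variable (a filename, not a path).
    """
    environ = os.environ if environ is None else environ
    filename = environ.get("ENTRY_POINT", "pipeline.yaml")
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical project: two pipeline files at the root, nested folder
    Path(tmp, "pipeline.yaml").write_text("tasks: []\n")
    Path(tmp, "pipeline.serve.yaml").write_text("tasks: []\n")
    nested = Path(tmp, "notebooks", "drafts")
    nested.mkdir(parents=True)

    # Passing an explicit mapping keeps the example independent of the shell
    found_default = find_entry_point(nested, environ={}).name
    found_serve = find_entry_point(
        nested, environ={"ENTRY_POINT": "pipeline.serve.yaml"}
    ).name

print(found_default)
print(found_serve)
```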
If you want to know which file will be used based on your project’s layout:
ploomber status --help
Look at the --entry-point description in the printed output.
changelog
New in version 0.19.6: Support for switching the entry point with a setup.cfg file.