Python API
This section lists the available classes and functions in the Python API. If you’re writing pipelines with the Spec API (e.g., a pipeline.yaml file), you won’t interact with this API directly. However, you may still want to learn about ploomber.spec.DAGSpec if you need to load your pipeline as a Python object.
For code examples using the Python API, click here.
DAG
|
A collection of tasks with dependencies |
|
A subclass of |
Execute partial DAGs in-memory. |
|
|
An object to customize DAG behavior |
|
Converts a DAG to a DAG-like object that performs all operations in memory (products are not serialized). |
Tasks
|
Abstract class for all Tasks |
|
Execute a Python function |
|
Run a Jupyter notebook using papermill. |
|
Similar to NotebookRunner, except it uses Python to run the code instead of papermill, so it doesn’t generate an output notebook. |
|
Execute a script in a SQL database to create a relation or view |
|
Dumps data from a SQL SELECT statement to one or more files |
|
Transfers data from one SQL database to another (note: this relies on pandas, so only use it for small to medium-sized datasets) |
|
Upload data to a SQL database from a parquet or a csv file. |
|
Efficiently copy data to a postgres database using COPY FROM (faster alternative to SQLUpload for postgres). |
|
Execute a shell script. |
|
Download a file from a URL (uses urllib.request.urlretrieve) |
|
A dummy task used to “plug” an external Product into a pipeline; this task is always considered up-to-date |
|
A dummy task used to represent input provided by the user; it is always considered outdated. |
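The download task above is described as relying on urllib.request.urlretrieve. The following is a minimal standard-library sketch of that underlying call, not Ploomber's task class: the `download` helper and the file names are hypothetical, and a local file:// URL stands in for a remote one so the example runs without network access.

```python
import tempfile
import urllib.request
from pathlib import Path


def download(url: str, destination: Path) -> Path:
    """Fetch `url` and store it at `destination` (hypothetical helper)."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    # urlretrieve returns a (local_filename, headers) tuple
    filename, _headers = urllib.request.urlretrieve(url, str(destination))
    return Path(filename)


# A local file:// URL is used here as a stand-in for a remote resource
workdir = Path(tempfile.mkdtemp())
source = workdir / "source.csv"
source.write_text("x,y\n1,2\n")

product = download(source.as_uri(), workdir / "output" / "data.csv")
print(product.read_text())
```

In a pipeline, the destination path would play the role of the task's File product.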
Products
|
Abstract class for all Products |
|
A file (or directory) in the local filesystem |
|
A product that represents a SQL relation (table or view) with no metadata (incremental builds won’t work). |
|
A PostgreSQL relation |
|
A SQLite relation |
|
A GenericProduct whose identifier is a SQL relation, uses SQLite as metadata backend |
|
GenericProduct is used when there is no specific Product implementation. |
Clients
|
Abstract class for all clients |
|
A client for a PEP 249 compliant client library |
|
Client for connecting with any SQLAlchemy supported database |
|
Client to run commands in the local shell |
|
Client for uploading File products to Amazon S3 |
|
Client for uploading File products to Google Cloud Storage |
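To illustrate what "a PEP 249 compliant client library" means in practice, here is a rough sketch of a thin wrapper around any DB-API `connect` callable, using the standard library's sqlite3 module. The `MinimalDBAPIClient` class and its methods are hypothetical names chosen for this example; they are not Ploomber's client classes, whose actual signatures should be checked in their own reference pages.

```python
import sqlite3


class MinimalDBAPIClient:
    """Hypothetical wrapper around a PEP 249 `connect` callable."""

    def __init__(self, connect, **connect_kwargs):
        self._connect = connect
        self._connect_kwargs = connect_kwargs
        self._connection = None

    @property
    def connection(self):
        # Open the connection lazily, on first use
        if self._connection is None:
            self._connection = self._connect(**self._connect_kwargs)
        return self._connection

    def execute(self, sql, params=()):
        cursor = self.connection.cursor()
        try:
            cursor.execute(sql, params)
            # cursor.description is None for statements that return no rows
            rows = cursor.fetchall() if cursor.description else []
        finally:
            cursor.close()
        self.connection.commit()
        return rows

    def close(self):
        if self._connection is not None:
            self._connection.close()
            self._connection = None


client = MinimalDBAPIClient(sqlite3.connect, database=":memory:")
client.execute("CREATE TABLE numbers (value INTEGER)")
# "?" is sqlite3's placeholder style; other drivers may use a different one
client.execute("INSERT INTO numbers VALUES (?)", (42,))
rows = client.execute("SELECT value FROM numbers")
client.close()
print(rows)
```

Because only `connect`, `cursor`, `execute`, `fetchall`, `commit`, and `close` are used, the same wrapper should work with other DB-API drivers, apart from the placeholder style.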
Spec
|
A DAG spec is a dictionary with certain structure that can be converted to a DAG using |
Env
|
A function decorated with @with_env that starts an environment during the execution of a function. |
|
A function decorated with @load_env will be called with the current environment in an env keyword argument |
|
Return the current environment |
Serialization
|
Decorator for serializing functions |
|
A serializer that pickles everything |
|
Decorator for unserializing functions |
|
An unserializer that unpickles everything |
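As a rough illustration of the "pickles everything" and "unpickles everything" pair above, the sketch below writes an object to a product path with pickle and reads it back. The function names `pickle_to` and `unpickle_from` are hypothetical and do not reproduce Ploomber's decorators; the sketch only shows the round trip such a pair has to perform.

```python
import pickle
import tempfile
from pathlib import Path


def pickle_to(obj, product):
    """Serialize `obj` to the file at `product` (hypothetical helper)."""
    path = Path(product)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(pickle.dumps(obj))


def unpickle_from(product):
    """Load the object stored at `product` (hypothetical helper)."""
    return pickle.loads(Path(product).read_bytes())


product = Path(tempfile.mkdtemp()) / "output" / "features.pickle"
original = {"columns": ["x", "y"], "rows": [(1, 2), (3, 4)]}

pickle_to(original, product)
restored = unpickle_from(product)
print(restored == original)
```

Note that unpickling executes arbitrary code embedded in the file, so this approach is only appropriate for files your own pipeline produced.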
Executors
|
Executor that runs one task at a time |
|
Runs a DAG in parallel using multiprocessing |
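The idea behind an executor that runs one task at a time can be sketched with the standard library's graphlib module (Python 3.9+): sort the tasks so every task comes after its dependencies, then call them in that order. The `run_serially` function, the task names, and the dictionary-based interface are assumptions made for this example and are not Ploomber's executor API.

```python
from graphlib import TopologicalSorter


def run_serially(tasks, upstream):
    """Run callables one at a time, dependencies first (hypothetical helper).

    tasks: maps a task name to a callable taking a dict of upstream results.
    upstream: maps a task name to the set of task names it depends on.
    """
    graph = {name: upstream.get(name, set()) for name in tasks}
    results = {}
    # static_order() yields each node only after all of its predecessors
    for name in TopologicalSorter(graph).static_order():
        inputs = {dep: results[dep] for dep in graph[name]}
        results[name] = tasks[name](inputs)
    return results


tasks = {
    "load": lambda up: [1, 2, 3],
    "double": lambda up: [x * 2 for x in up["load"]],
    "total": lambda up: sum(up["double"]),
}
upstream = {"double": {"load"}, "total": {"double"}}

results = run_serially(tasks, upstream)
print(results["total"])
```

A parallel executor would follow the same ordering constraint but dispatch tasks whose dependencies have all finished to separate worker processes.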
SourceLoader
|
Load source files using a jinja2.Environment |