ploomber.clients.GCloudStorageClient

class ploomber.clients.GCloudStorageClient(bucket_name, parent, json_credentials_path=None, path_to_project_root=None, credentials_relative_to_project_root=True, **kwargs)

Client for uploading File products to Google Cloud Storage

Parameters:
  • bucket_name (str) – Bucket to use

  • parent (str) – Parent folder in the bucket to store files. For example, if parent='path/to', and a product in your pipeline is out/data.csv, your file will appear in the bucket at path/to/out/data.csv.

  • json_credentials_path (str, default=None) – JSON file used to authenticate the client (uses Client.from_service_account_json(**kwargs)). If None, the client is initialized using Client(**kwargs)

  • path_to_project_root (str, default=None) – Path to project root. If None, looks it up automatically and assigns it to the parent folder of your pipeline.yaml spec or setup.py (if your project is a package). This determines the path in remote storage. For example, if path_to_project_root is /my-project, you’re storing a product at /my-project/out/data.csv, and parent='some-dir', the file will be stored in the bucket at some-dir/out/data.csv (we first compute the path of your product relative to the project root, then prefix it with parent).

  • credentials_relative_to_project_root (bool, default=True) – If True, relative paths in json_credentials_path are resolved relative to path_to_project_root instead of the current working directory

  • **kwargs – Keyword arguments for the client constructor
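The path rules described for parent, path_to_project_root and credentials_relative_to_project_root can be sketched in plain Python. This is only an illustration of the documented behavior, not ploomber's implementation: the helper names remote_path and resolve_credentials are hypothetical, and leaving absolute credential paths untouched is an assumption.

```python
from pathlib import PurePosixPath


def remote_path(local, project_root, parent):
    """Hypothetical helper: bucket path for a local product."""
    # First compute the product's path relative to the project root
    relative = PurePosixPath(local).relative_to(project_root)
    # Then prefix it with the parent folder
    return str(PurePosixPath(parent) / relative)


def resolve_credentials(path, project_root, relative_to_project_root=True):
    """Hypothetical helper: where a credentials path would be looked up."""
    path = PurePosixPath(path)
    # Assumption: absolute paths are used as-is; with the flag off, a
    # relative path stays relative to the current working directory
    if path.is_absolute() or not relative_to_project_root:
        return str(path)
    return str(PurePosixPath(project_root) / path)


print(remote_path('/my-project/out/data.csv', '/my-project', 'some-dir'))
# some-dir/out/data.csv
print(resolve_credentials('credentials.json', '/my-project'))
# /my-project/credentials.json
```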

Examples

Spec API:

Given the following clients.py:

from ploomber.clients import GCloudStorageClient

def get():
    return GCloudStorageClient(bucket_name='my-bucket',
                               parent='my-pipeline')

Spec API (dag-level client):

clients:
    # all files from all tasks will be uploaded
    File: clients.get

tasks:
    - source: notebook.ipynb
      product: output/report.html

Spec API (dag-level client, custom arguments):

clients:
    # if your get function takes arguments, pass them like this
    File:
        dotted_path: clients.get
        arg: value
        ...

tasks:
    - source: notebook.ipynb
      product: output/report.html
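For the custom-arguments case, a clients.py along these lines might be used. It is a sketch that assumes the extra keys under File are passed to get as keyword arguments; the parameter name parent is a hypothetical stand-in for arg, so the spec would contain parent: my-pipeline in its place.

```python
def get(parent):
    # Imported inside the function so that this module can be imported
    # even where ploomber is not installed; a top-level import works too
    from ploomber.clients import GCloudStorageClient

    # 'my-bucket' is a placeholder bucket name
    return GCloudStorageClient(bucket_name='my-bucket', parent=parent)
```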

Spec API (product-level client):

tasks:
    - source: notebook.ipynb
      product_client: clients.get
      # outputs from this task will be uploaded
      product: output/report.html

Python API (dag-level client):

>>> from pathlib import Path
>>> from ploomber import DAG
>>> from ploomber.products import File
>>> from ploomber.tasks import PythonCallable
>>> from ploomber.clients import GCloudStorageClient
>>> dag = DAG()
>>> client = GCloudStorageClient(bucket_name='my-bucket',
...                              parent='my-pipeline',
...                              path_to_project_root='.')
>>> dag.clients[File] = client # dag-level client
>>> def my_function(product):
...     Path(product).touch()
>>> _ = PythonCallable(my_function, File('file.txt'), dag=dag)
>>> dag.build()

Python API (product-level client):

>>> from pathlib import Path
>>> from ploomber import DAG
>>> from ploomber.products import File
>>> from ploomber.tasks import PythonCallable
>>> from ploomber.clients import GCloudStorageClient
>>> dag = DAG()
>>> client = GCloudStorageClient(bucket_name='my-bucket',
...                              parent='my-pipeline',
...                              path_to_project_root='.')
>>> def my_function(product):
...     Path(product).touch()
>>> product = File('file.txt', client=client) # product-level client
>>> _ = PythonCallable(my_function, product, dag=dag)
>>> dag.build()

See also

ploomber.clients.S3Client

Client for uploading products to Amazon S3

Notes

Complete example using the Spec API

If a notebook (or script) task fails, the partially executed .ipynb file will be uploaded using this client.

Methods

close()

download(local[, destination])

Download remote copy of a given local path.

upload(local)

Upload file or folder from a local path by calling _upload as needed

close()

download(local, destination=None)

Download remote copy of a given local path. Local may be a file or a folder (all contents downloaded).

Parameters:
  • local – Path to local file or folder whose remote copy will be downloaded

  • destination – Download location. If None, overwrites local copy

upload(local)

Upload file or folder from a local path by calling _upload as needed

Parameters:
  • local – Path to local file or folder to upload
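As a rough usage sketch, upload and download can be combined to push a product and then fetch its remote copy to another location. The helper name sync_product is hypothetical, and client is assumed to be an already configured GCloudStorageClient (or any object exposing the same two methods).

```python
def sync_product(client, local, destination=None):
    """Hypothetical helper: upload a product, then download its remote copy."""
    # Upload the local file or folder
    client.upload(local)
    # Download the remote copy; with destination=None the local copy
    # is overwritten
    client.download(local, destination=destination)
```

For example, sync_product(client, 'out/data.csv', destination='backup/data.csv') would leave the original file in place and write the downloaded copy to backup/data.csv.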

Attributes

parent

Parent folder in the bucket where all products are stored