ploomber.clients.S3Client¶
- class ploomber.clients.S3Client(bucket_name, parent, json_credentials_path=None, path_to_project_root=None, credentials_relative_to_project_root=True, **kwargs)¶
Client for uploading File products to Amazon S3
- Parameters:
bucket_name (str) – Bucket to use
parent (str) – Parent folder in the bucket to store files. For example, if parent='path/to' and a product in your pipeline is out/data.csv, your file will appear in the bucket at path/to/out/data.csv.
json_credentials_path (str, default=None) – JSON file used to authenticate the client. Must contain aws_access_key_id and aws_secret_access_key. If None, the client is initialized without arguments (i.e., boto3.client('s3')).
path_to_project_root (str, default=None) – Path to the project root. If None, it is looked up automatically and set to the parent folder of your pipeline.yaml spec, or of setup.py if your project is a package. This determines the path in remote storage. For example, if path_to_project_root is /my-project, you are storing a product at /my-project/out/data.csv, and parent='some-dir', the file will be stored in the bucket at some-dir/out/data.csv (the product's path is first computed relative to the project root, then prefixed with parent).
credentials_relative_to_project_root (bool, default=True) – If True, a relative path in json_credentials_path is resolved relative to path_to_project_root instead of the current working directory
**kwargs – Keyword arguments for the client constructor
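The way parent and path_to_project_root combine to form the object key in the bucket can be sketched with plain pathlib logic. The remote_key helper below is hypothetical and not part of Ploomber's API; it only illustrates the documented behavior:

```python
from pathlib import PurePosixPath

def remote_key(local_path, project_root, parent):
    # Hypothetical helper mirroring the documented behavior: the
    # product's path is made relative to the project root, then
    # prefixed with `parent` to form the key inside the bucket.
    rel = PurePosixPath(local_path).relative_to(project_root)
    return str(PurePosixPath(parent) / rel)

print(remote_key('/my-project/out/data.csv', '/my-project', 'some-dir'))
# some-dir/out/data.csv
```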
Examples
Spec API:
Given the following clients.py:

    from ploomber.clients import S3Client

    def get():
        return S3Client(bucket_name='my-bucket', parent='my-pipeline')
Spec API (dag-level client):
    clients:
        # all files from all tasks will be uploaded
        File: clients.get

    tasks:
        - source: notebook.ipynb
          product: output/report.html
Spec API (dag-level client, custom arguments):
    clients:
        # if your get function takes arguments, pass them like this
        File:
            dotted_path: clients.get
            arg: value
            ...

    tasks:
        - source: notebook.ipynb
          product: output/report.html
Spec API (product-level client):
    tasks:
        - source: notebook.ipynb
          # outputs from this task will be uploaded
          product_client: clients.get
          product: output/report.html
Python API (dag-level client):
>>> from pathlib import Path
>>> from ploomber import DAG
>>> from ploomber.products import File
>>> from ploomber.tasks import PythonCallable
>>> from ploomber.clients import S3Client
>>> dag = DAG()
>>> client = S3Client(bucket_name='my-bucket', parent='my-pipeline',
...                   path_to_project_root='.')
>>> dag.clients[File] = client  # dag-level client
>>> def my_function(product):
...     Path(product).touch()
>>> _ = PythonCallable(my_function, File('file.txt'), dag=dag)
>>> dag.build()
Python API (product-level client):
>>> from pathlib import Path
>>> from ploomber import DAG
>>> from ploomber.products import File
>>> from ploomber.tasks import PythonCallable
>>> from ploomber.clients import S3Client
>>> dag = DAG()
>>> client = S3Client(bucket_name='my-bucket', parent='my-pipeline',
...                   path_to_project_root='.')
>>> def my_function(product):
...     Path(product).touch()
>>> product = File('file.txt', client=client)  # product-level client
>>> _ = PythonCallable(my_function, product, dag=dag)
>>> dag.build()
See also
ploomber.clients.GCloudStorageClient
Client for uploading products to Google Cloud Storage
Notes
If a notebook (or script) task fails, the partially executed .ipynb file will be uploaded using this client.
Methods
close()
download(local[, destination]) – Download the remote copy of a given local path.
upload(local) – Upload a file or folder from a local path by calling _upload as needed.
- close()¶
- download(local, destination=None)¶
Download the remote copy of a given local path. local may be a file or a folder (in which case all contents are downloaded).
- Parameters:
local – Path to local file or folder whose remote copy will be downloaded
destination – Download location. If None, the local copy is overwritten
- upload(local)¶
Upload a file or folder from a local path, calling _upload as needed
- Parameters:
local – Path to local file or folder to upload
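To make the upload/download path mirroring concrete, here is a toy stand-in in which a local directory plays the role of the S3 bucket. FakeClient and everything inside it are illustrative assumptions, not Ploomber's implementation; the sketch only shows how a local path maps to a key under parent and back:

```python
import shutil
import tempfile
from pathlib import Path

class FakeClient:
    # Toy stand-in for illustration only: a local directory plays the
    # role of the S3 bucket. Not Ploomber's implementation.
    def __init__(self, bucket_dir, parent, project_root):
        self.bucket = Path(bucket_dir)
        self.parent = parent
        self.root = Path(project_root)

    def _key(self, local):
        # local path relative to the project root, prefixed with parent
        return Path(self.parent) / Path(local).relative_to(self.root)

    def upload(self, local):
        dest = self.bucket / self._key(local)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(local, dest)

    def download(self, local, destination=None):
        # if destination is None, the local copy is overwritten
        shutil.copy(self.bucket / self._key(local), destination or local)

# demo: upload a file, delete it locally, then download it back
root = Path(tempfile.mkdtemp())
bucket = Path(tempfile.mkdtemp())
(root / 'out').mkdir()
(root / 'out' / 'data.csv').write_text('a,b\n1,2\n')

client = FakeClient(bucket, 'my-pipeline', root)
client.upload(root / 'out' / 'data.csv')
(root / 'out' / 'data.csv').unlink()
client.download(root / 'out' / 'data.csv')
print((root / 'out' / 'data.csv').read_text())
```

The demo stores the file at my-pipeline/out/data.csv inside the stand-in bucket, matching the key computation described in the Parameters section above.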
Attributes
parent – Parent folder where all products are stored