Executing notebooks in the cloud

This tutorial will show you how you can execute notebooks in the cloud.

With Ploomber Cloud, you can run notebooks in parallel and request additional resources (CPU, memory, GPU).

This tutorial covers the following sections: pre-requisites, notebook submission, execution monitoring, and downloading the results.

Pre-requisites

This section will help you set up your local environment to run notebooks in Ploomber Cloud. You only need to install Ploomber and set the API key from your Ploomber Cloud account.

Installing Ploomber

To install the latest version of Ploomber, open a terminal and run the following command:

pip install ploomber --upgrade

Setting up the Ploomber Cloud API key

For this, you’ll need to sign in to Ploomber Cloud. Once you sign in, copy your API key and run the following command in your terminal, replacing {your-key} with your key:

ploomber cloud set-key {your-key}

A detailed tutorial to get and set your API Key can be found here.

Notebook submission to Ploomber Cloud

For this simple example, we will use a notebook that draws random numbers with numpy and plots them with matplotlib.

To make it more interesting, we will use these random numbers to estimate the value of pi using the Monte Carlo method. (The estimate itself is incidental; the important part is seeing how to run the notebook directly in Ploomber Cloud.)
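As a rough illustration of the Monte Carlo idea, here is a minimal sketch. It is not the code in the sample notebook: the notebook uses numpy and matplotlib, while this sketch uses only the standard library `random` module to stay dependency-free, and the function name `estimate_pi` is a hypothetical helper. It draws random points in the unit square and uses the fraction that falls inside the quarter circle of radius 1 to approximate pi.

```python
import random


def estimate_pi(n_samples: int, seed: int = 0) -> float:
    """Estimate pi from the fraction of random points inside a quarter circle.

    Hypothetical helper for illustration; not taken from the sample notebook.
    """
    rng = random.Random(seed)  # seeded generator so runs are reproducible
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()  # uniform point in the unit square
        if x * x + y * y <= 1.0:  # point lies inside the quarter circle
            inside += 1
    # Area of the quarter circle is pi / 4, so scale the fraction by 4.
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

The estimate gets closer to pi as `n_samples` grows, since the sampling error shrinks roughly with the square root of the number of samples.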


First, download the sample notebook we have prepared to your local folder by running the following command in your terminal:

curl https://raw.githubusercontent.com/ploomber/projects/master/guides/cloud-notebook-simple/plot.ipynb -o plot.ipynb

Note that we only need to supply the notebook. Ploomber Cloud will figure out which dependencies to install. To submit the notebook, run:

ploomber cloud nb plot.ipynb
Uploading plot-aebe61a1.ipynb...
Triggering execution of plot-aebe61a1.ipynb...

Note that the notebook plot.ipynb is uploaded to the cloud as plot-{identifier}.ipynb. Ploomber Cloud automatically versions notebooks, so it will generate an identifier for your notebook (for example plot-aebe61a1.ipynb). We can use this identifier to download the source later and to download any associated artifacts.
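If you want to keep track of identifiers in a script, the uploaded file name can be split into its original name and identifier. The sketch below assumes the `name-identifier.ipynb` pattern shown in the output above; `split_versioned_name` and `download_pattern` are hypothetical helpers, not part of Ploomber.

```python
from pathlib import PurePosixPath


def split_versioned_name(filename: str) -> tuple[str, str]:
    """Split 'plot-aebe61a1.ipynb' into ('plot', 'aebe61a1').

    Assumes the identifier is whatever follows the last hyphen in the stem.
    """
    stem = PurePosixPath(filename).stem  # drop the .ipynb extension
    name, _, identifier = stem.rpartition("-")
    return name, identifier


def download_pattern(filename: str) -> str:
    """Build a glob pattern matching the notebooks produced by this upload."""
    stem = PurePosixPath(filename).stem
    return f"{stem}/*.ipynb"


print(split_versioned_name("plot-aebe61a1.ipynb"))
print(download_pattern("plot-aebe61a1.ipynb"))
```

The second helper produces the same kind of pattern that is passed to the download command later in this tutorial.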

Congratulations! 🎉 You have submitted your first notebook run to Ploomber Cloud. We will now explore how we can monitor our submitted notebooks and how we can download our results.

Execution monitoring

Once your notebook is submitted, we can confirm that a runid was generated for the task and check its status by running the following command in your terminal:

ploomber cloud list
created_at      runid                                 status
--------------  ------------------------------------  --------
31 seconds ago  2d4bcadf-5acb-49a5-8806-af2dbe1b32fe  created
6 hours ago     ee78f4c1-ee42-4ba5-ba2f-9e73ae9228d6  finished

This will list our notebook submissions. The top row is the notebook we just submitted.
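If you capture this output in a script, the table can be turned into a list of records. The following is a minimal sketch that assumes the column layout shown above (a header row, a dashed separator, and columns separated by two or more spaces); `parse_runs` is a hypothetical helper, and the layout of the real output may differ between Ploomber versions.

```python
import re

# Sample text copied from the listing above.
SAMPLE_OUTPUT = """\
created_at      runid                                 status
--------------  ------------------------------------  --------
31 seconds ago  2d4bcadf-5acb-49a5-8806-af2dbe1b32fe  created
6 hours ago     ee78f4c1-ee42-4ba5-ba2f-9e73ae9228d6  finished
"""


def parse_runs(table_text: str) -> list[dict[str, str]]:
    """Parse the run table into one dictionary per row."""
    lines = [line for line in table_text.splitlines() if line.strip()]
    # Columns are separated by at least two spaces; values such as
    # "31 seconds ago" only contain single spaces.
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[2:]:  # skip the header and the dashed separator
        values = re.split(r"\s{2,}", line.strip())
        rows.append(dict(zip(header, values)))
    return rows


runs = parse_runs(SAMPLE_OUTPUT)
print(runs[0]["runid"], runs[0]["status"])
```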

Ploomber Cloud builds a Docker image with your notebook, so we can get the image build logs from this run:

ploomber cloud logs @latest --image | tail -n 10
[Container] 2022/10/26 03:07:51 Phase complete: BUILD State: SUCCEEDED

[Container] 2022/10/26 03:07:51 Phase context status code:  Message:

[Container] 2022/10/26 03:07:51 Entering phase POST_BUILD

[Container] 2022/10/26 03:07:51 Phase complete: POST_BUILD State: SUCCEEDED

[Container] 2022/10/26 03:07:51 Phase context status code:  Message:

You should see something like the output above. If not, wait a minute or so for the Docker build process to finish.

Run the list command again. You should see that the status changed to started:

ploomber cloud list
created_at     runid                                 status
-------------  ------------------------------------  --------
6 minutes ago  2d4bcadf-5acb-49a5-8806-af2dbe1b32fe  started
6 hours ago    ee78f4c1-ee42-4ba5-ba2f-9e73ae9228d6  finished

This submission only contains a single notebook, so if we check the status, we’ll only see one task:

ploomber cloud status @latest
Geting latest ID...
Got ID: 2d4bcadf-5acb-49a5-8806-af2dbe1b32fe
Unknown status: started
taskid                      name            runid                       status
--------------------------  --------------  --------------------------  --------
5e830a18-42ed-4f0e-afb6-59  plot-aebe61a1-  2d4bcadf-5acb-49a5-8806-af  created
2159fc5f37                                  2dbe1b32fe

After 1-2 minutes, you’ll see that the notebook finished execution:

ploomber cloud status @latest
Geting latest ID...
Got ID: 2d4bcadf-5acb-49a5-8806-af2dbe1b32fe
Pipeline finished...
taskid                      name            runid                       status
--------------------------  --------------  --------------------------  --------
5e830a18-42ed-4f0e-afb6-59  plot-aebe61a1-  2d4bcadf-5acb-49a5-8806-af  finished
2159fc5f37                                  2dbe1b32fe
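Rather than re-running the status command by hand, you could poll it from a script. The sketch below is one possible approach, not an official Ploomber API: it shells out to the same `ploomber cloud status @latest` command and assumes that the text "Pipeline finished" appears in the output once the run completes, as in the example above. The function names, polling interval, and check limit are all illustrative choices.

```python
import subprocess
import time


def fetch_status_output() -> str:
    """Run the status command shown above and return its printed output."""
    result = subprocess.run(
        ["ploomber", "cloud", "status", "@latest"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def wait_until_finished(fetch=fetch_status_output, interval=30.0, max_checks=20):
    """Poll until the output reports a finished pipeline.

    Returns True if the marker text was seen, False if max_checks ran out.
    Assumption: "Pipeline finished" in the output signals completion.
    """
    for _ in range(max_checks):
        if "Pipeline finished" in fetch():
            return True
        time.sleep(interval)  # wait before asking again
    return False
```

Calling `wait_until_finished()` with the defaults checks every 30 seconds for up to 20 attempts. The `fetch` parameter makes it possible to swap in a different source of status text, for example when testing without a Ploomber Cloud account.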

Congrats, your first notebook submission has been a success and you’ve learned how to monitor your submission! Now let’s see how we can download the results.

Downloading the results

All your notebooks are stored in your “products workspace”. Let’s see what’s in there:

ploomber cloud products
path
-----------------------------------
plot-aebe61a1/output/notebook.ipynb
plot-f7ad8452/output/notebook.ipynb

Let’s download the executed notebook by using the ID we got upon submission:

ploomber cloud download 'plot-aebe61a1/*.ipynb'
Writing file into path plot-aebe61a1/output/.notebook.ipynb.metadata
Writing file into path plot-aebe61a1/output/notebook.ipynb
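Once the file is on disk, a quick way to confirm that it was actually executed is to check that its code cells contain outputs. The following is a minimal sketch that reads the notebook as plain JSON, relying on the standard notebook file layout (a top-level `cells` list whose code cells carry an `outputs` list); `summarize_notebook` is a hypothetical helper.

```python
import json


def summarize_notebook(path: str) -> dict[str, int]:
    """Count code cells and how many of them have stored outputs."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)  # .ipynb files are JSON documents
    code_cells = [
        cell for cell in nb.get("cells", []) if cell.get("cell_type") == "code"
    ]
    with_outputs = sum(1 for cell in code_cells if cell.get("outputs"))
    return {"code_cells": len(code_cells), "cells_with_outputs": with_outputs}
```

For the download above, you would call `summarize_notebook("plot-aebe61a1/output/notebook.ipynb")`. An executed notebook that plots or prints results should report at least one cell with outputs.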

Up next

Running notebooks in parallel lets us execute different tasks or processes simultaneously across multiple computing resources, which allows us to work more efficiently. More details can be found in our blog post.

In the next tutorial, we’ll learn how to parallelize notebooks and request resources, so we can get the most out of Ploomber Cloud!