Getting Started with Cloud Notebooks API¶
This tutorial shows how to execute notebooks in the cloud.
With Ploomber Cloud, you can run notebooks in parallel and request additional resources (CPU, memory, GPU).
The tutorial covers the prerequisites, submitting a notebook, monitoring its execution, and downloading the results.
Pre-requisites¶
This section will help you set up your local environment to run notebooks in Ploomber Cloud. You only need to install Ploomber and set the API key from your Ploomber Cloud account.
Installing Ploomber¶
To install the latest version of Ploomber, open a terminal and run the following command:
pip install ploomber --upgrade
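To confirm that the installation succeeded, one option is to query the installed package metadata from Python. This is a minimal sketch that uses only the standard library; installed_version is a hypothetical helper name, not part of Ploomber.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string when ploomber is installed in this environment.
    print(installed_version("ploomber") or "ploomber is not installed")
```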
Setting up the Ploomber Cloud API key¶
For this, you’ll need to sign in to Ploomber Cloud. Once you sign in, copy your API key and run the following command in your terminal:
ploomber cloud set-key {your-key}
A detailed tutorial to get and set your API Key can be found here.
Notebook submission to Ploomber Cloud¶
For this simple example, we will use a notebook that draws random numbers with numpy and plots them with matplotlib.
To make it more interesting, we will use these random numbers to estimate the value of pi with the Monte Carlo method. (The estimate itself is incidental; the important part is seeing how to run the notebook directly in Ploomber Cloud.)
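For readers unfamiliar with the technique, the Monte Carlo estimate of pi can be sketched as follows. This is not the code from the sample notebook (which uses numpy and matplotlib); it is a dependency-free illustration using the standard library, and estimate_pi is a hypothetical function name. Points are drawn uniformly in the unit square, and the fraction that lands inside the quarter circle of radius 1 approximates pi / 4.

```python
import random


def estimate_pi(n_samples, seed=None):
    """Estimate pi by sampling points uniformly in the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        # The point is inside the quarter circle when x^2 + y^2 <= 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area / square area = pi / 4, so scale the fraction by 4.
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    print(estimate_pi(100_000, seed=0))
```

The estimate gets closer to pi as the number of samples grows, with the error shrinking roughly in proportion to one over the square root of the sample count.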
First, download the sample notebook we have prepared to your local folder by running the following command in your terminal:
curl https://raw.githubusercontent.com/ploomber/projects/master/guides/cloud-notebooks-getting-started/plot.ipynb -o plot.ipynb
Now submit the notebook. Note that we only need to supply the notebook; Ploomber Cloud will figure out which dependencies to install.
ploomber cloud nb plot.ipynb
Uploading plot-aebe61a1.ipynb...
Triggering execution of plot-aebe61a1.ipynb...
Note that the notebook plot.ipynb is uploaded to the cloud as plot-{identifier}.ipynb. Ploomber Cloud automatically versions notebooks, so it will generate an identifier for your notebook (for example plot-aebe61a1.ipynb). We can use this identifier to download the source later and to download any associated artifacts.
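If you script your submissions, the versioned name printed at upload time can be turned into the pattern used to fetch results later in this tutorial. The sketch below is an illustration only: run_prefix and download_pattern are hypothetical helpers, and the pattern shape is taken from the download command shown in the "Downloading the results" section.

```python
from pathlib import PurePosixPath


def run_prefix(uploaded_name):
    """Return the versioned notebook name without the .ipynb suffix."""
    return PurePosixPath(uploaded_name).stem


def download_pattern(uploaded_name):
    """Build a glob matching the notebooks stored under that versioned name."""
    return f"{run_prefix(uploaded_name)}/*.ipynb"


if __name__ == "__main__":
    print(download_pattern("plot-aebe61a1.ipynb"))
```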
Congratulations! 🎉 You have submitted your first notebook run to Ploomber Cloud. We will now explore how we can monitor our submitted notebooks and how we can download our results.
Execution monitoring¶
Once the notebook is submitted, we can confirm that a runid was generated for it and check its status by running the following command in your terminal:
ploomber cloud list
created_at runid status
-------------- ------------------------------------ --------
31 seconds ago 2d4bcadf-5acb-49a5-8806-af2dbe1b32fe created
6 hours ago ee78f4c1-ee42-4ba5-ba2f-9e73ae9228d6 finished
This will list our notebook submissions. The top row is the notebook we just submitted.
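For scripting around this command, the table can be parsed in Python. The sketch below assumes the three-column layout shown above (created_at, runid, status) and that the last two whitespace-separated fields of each row are the run ID and the status; parse_run_list is a hypothetical helper, not part of Ploomber, and would need adjusting if the CLI output format differs.

```python
def parse_run_list(text):
    """Parse the table printed by `ploomber cloud list` into a list of dicts."""
    runs = []
    for line in text.splitlines():
        # created_at may contain spaces ("31 seconds ago"), so split from the right.
        parts = line.rsplit(None, 2)
        # Skip blank lines, the header row, and the dashed separator row.
        if len(parts) != 3 or parts[1] == "runid" or set(parts[1]) == {"-"}:
            continue
        created_at, runid, status = parts
        runs.append({"created_at": created_at, "runid": runid, "status": status})
    return runs


if __name__ == "__main__":
    sample = (
        "created_at runid status\n"
        "-------------- ------------------------------------ --------\n"
        "31 seconds ago 2d4bcadf-5acb-49a5-8806-af2dbe1b32fe created\n"
        "6 hours ago ee78f4c1-ee42-4ba5-ba2f-9e73ae9228d6 finished\n"
    )
    for run in parse_run_list(sample):
        print(run["runid"], run["status"])
```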
Ploomber Cloud builds a Docker image with your notebook, so we can inspect the image build logs from this run:
ploomber cloud logs @latest --image | tail -n 10
[Container] 2022/10/26 03:07:51 Phase complete: BUILD State: SUCCEEDED
[Container] 2022/10/26 03:07:51 Phase context status code: Message:
[Container] 2022/10/26 03:07:51 Entering phase POST_BUILD
[Container] 2022/10/26 03:07:51 Phase complete: POST_BUILD State: SUCCEEDED
[Container] 2022/10/26 03:07:51 Phase context status code: Message:
You should see something like the message above. If not, wait a minute or so for the Docker build process to finish.
Run the list command again. You should see that the status changed to started:
ploomber cloud list
created_at runid status
------------- ------------------------------------ --------
6 minutes ago 2d4bcadf-5acb-49a5-8806-af2dbe1b32fe started
6 hours ago ee78f4c1-ee42-4ba5-ba2f-9e73ae9228d6 finished
This submission only contains a single notebook, so if we check the status, we’ll only see one task:
ploomber cloud status @latest
Geting latest ID...
Got ID: 2d4bcadf-5acb-49a5-8806-af2dbe1b32fe
Unknown status: started
taskid name runid status
-------------------------- -------------- -------------------------- --------
5e830a18-42ed-4f0e-afb6-59 plot-aebe61a1- 2d4bcadf-5acb-49a5-8806-af created
2159fc5f37 2dbe1b32fe
After 1-2 minutes, you’ll see that the notebook finished executing:
ploomber cloud status @latest
Geting latest ID...
Got ID: 2d4bcadf-5acb-49a5-8806-af2dbe1b32fe
Pipeline finished...
taskid name runid status
-------------------------- -------------- -------------------------- --------
5e830a18-42ed-4f0e-afb6-59 plot-aebe61a1- 2d4bcadf-5acb-49a5-8806-af finished
2159fc5f37 2dbe1b32fe
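Instead of re-running the status command by hand, the check can be wrapped in a small polling loop. The sketch below makes several assumptions: the table is written to standard output, a finished task shows the word finished as the last field of its row (as in the output above), and the submission contains a single task. The function names are hypothetical, and the command is injected as a parameter so the loop can be exercised without calling the CLI.

```python
import subprocess
import time


def run_status_command():
    """Run the status command from this tutorial and return its text output."""
    result = subprocess.run(
        ["ploomber", "cloud", "status", "@latest"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def wait_until_finished(get_output=run_status_command, interval=30,
                        max_checks=20, sleep=time.sleep):
    """Poll until some row ends with 'finished'; return False if it never does."""
    for attempt in range(max_checks):
        output = get_output()
        # Assumption: the status is the last whitespace-separated field of a row.
        if any(line.split()[-1:] == ["finished"] for line in output.splitlines()):
            return True
        # Do not sleep after the final check.
        if attempt < max_checks - 1:
            sleep(interval)
    return False
```

Calling wait_until_finished() with the defaults checks every 30 seconds for up to about ten minutes; both values are arbitrary and can be tuned to the expected run time.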
Congrats, your first notebook submission has been a success and you’ve learned how to monitor your submission! Now let’s see how we can download the results.
Downloading the results¶
All your notebooks are stored in your “products workspace”. Let’s see what’s in there:
ploomber cloud products
path
-----------------------------------
plot-aebe61a1/output/notebook.ipynb
plot-f7ad8452/output/notebook.ipynb
Let’s download the executed notebook by using the ID we got upon submission:
ploomber cloud download 'plot-aebe61a1/*.ipynb'
Writing file into path plot-aebe61a1/output/.notebook.ipynb.metadata
Writing file into path plot-aebe61a1/output/notebook.ipynb
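Once the executed notebook is on disk, you can open it in Jupyter or inspect it programmatically. The sketch below assumes the file follows the standard Jupyter notebook JSON layout (version 4: a top-level cells list whose code cells carry an outputs list); load_notebook and summarize_notebook are hypothetical helpers that use only the standard library.

```python
import json


def load_notebook(path):
    """Read a notebook file and return its JSON content as a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_notebook(nb):
    """Count code cells, their outputs, and PNG images in a notebook dict."""
    code_cells = [c for c in nb.get("cells", []) if c.get("cell_type") == "code"]
    outputs = [out for c in code_cells for out in c.get("outputs", [])]
    # Plots rendered by matplotlib are typically stored under the image/png key.
    n_images = sum(1 for out in outputs if "image/png" in out.get("data", {}))
    return {
        "code_cells": len(code_cells),
        "outputs": len(outputs),
        "png_images": n_images,
    }
```

For the run in this tutorial, the call would be summarize_notebook(load_notebook("plot-aebe61a1/output/notebook.ipynb")); a non-zero png_images count indicates that the plot was rendered during execution.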
Up next¶
Running notebooks in parallel lets us execute different tasks or processes simultaneously across multiple computing resources, so a batch of notebooks can finish sooner than it would running one after another. More details about this can be read in our blog post.
In the next tutorial, we’ll learn how to parallelize notebooks and request resources, so we can get the most out of Ploomber Cloud!