Database configuration

To have SQL scripts as tasks, you must configure a database client. There are two available clients: ploomber.clients.SQLAlchemyClient and ploomber.clients.DBAPIClient, we recommend using the sqlalchemy client if your database is supported because it is compatible with more types of SQL tasks (e.g., ploomber.tasks.SQLDump, which dumps data into a local file).

Using SQLAlchemyClient

Ensure that you can connect to the database using sqlalchemy:

from sqlalchemy import create_engine

engine = create_engine('DATABASE_URI')

DATABASE_URI depends on the type of database. sqlalchemy supports a wide range of databases; you can find a list in their documentation, while others come in third-party packages (e.g., Snowflake).

If create_engine is successful, ensure you can query your database:

with engine.connect() as conn:
    results = conn.execute('SELECT * FROM some_table LIMIT 10')

If the query works, you can initialize a SQLAlchemyClient with the same DATABASE_URI:

from ploomber.clients import SQLAlchemyClient

client = SQLAlchemyClient('DATABASE_URI')

Using Snowflake

Here’s some sample code to configure Snowflake:

# install snowflake-sqlalchemy
pip install snowflake-sqlalchemy

Build your URL with the helper function:

from snowflake.sqlalchemy import URL

params = dict(user='user',
              password='pass',
              account='acct',
              warehouse='warehouse',
              database='db',
              schema='schema',
              role='role')

client = SQLAlchemyClient(URL(**params))

If using OAuth instead of user/password authentication, you need to include the token:

import json
import requests # pip install requests
from snowflake.sqlalchemy import URL

def get_snowflake_token(username, password, account):
    headers = {'content-type': 'application/x-www-form-urlencoded'}
    data = {
        'grant_type': 'password',
        'scope': 'SESSION:ROLE-ANY',
        'username: username,
        'password': password,
        'client_id: f'https://{account}.snowflakecomputing.com',
    }
    response = requests.post(oauth_url, data=data, headers=headers,
                             verify=False)

    return str(json.loads(response.text)['access_token']).strip()

token = get_snowflake_token('user', 'password', 'account')

params = dict(user='user',
              account='acct',
              warehouse='warehouse',
              database='db',
              schema='schema',
              role='role',
              authentication='oauth',
              token=token)

client = SQLAlchemyClient(URL(**params))

Using DBAPIClient

DBAPIClient takes a function that returns a DBAPI compatible connection and parameters to initialize such connection.

Here’s an example with SQLite:

from ploomber.clients import DBAPIClient
import sqlite3

client = DBAPIClient(sqlite3.connect, dict(database='my.db'))

Under the hood, Ploomber calls sqlite3.connect(database='my.db').

Another example, this time using Snowflake:

from ploomber.clients import DBAPIClient
import snowflake.connector

params = dict(user='USER', password='PASS', account='ACCOUNT')
client = DBAPIClient(snowflake.connector.connect, params)

Configuring the client in pipeline.yaml

Check out the SQL Pipelines to learn how to configure the database client in your pipeline.yaml file.

Examples

To see some examples using SQL connections, see this:

  1. A short example that dumps data.

  2. A SQL pipeline.

  3. An ETL pipeline.