Snowflake
Install
Install Ibis and dependencies for the Snowflake backend:

Install with the `snowflake` extra:

```bash
pip install 'ibis-framework[snowflake]'
```

And connect:

```python
import ibis

con = ibis.snowflake.connect()
```

Adjust connection parameters as needed.
Install for Snowflake:

```bash
conda install -c conda-forge ibis-snowflake
```

And connect:

```python
import ibis

con = ibis.snowflake.connect()
```

Adjust connection parameters as needed.
Install for Snowflake:

```bash
mamba install -c conda-forge ibis-snowflake
```

And connect:

```python
import ibis

con = ibis.snowflake.connect()
```

Adjust connection parameters as needed.
Connect
ibis.snowflake.connect
```python
con = ibis.snowflake.connect(
    user="user",
    password="password",
    account="safpqpq-sq55555",
    database="IBIS_TESTING/IBIS_TESTING",
)
```
`ibis.snowflake.connect` is a thin wrapper around `ibis.backends.snowflake.Backend.do_connect`.
Connection Parameters
do_connect
do_connect(self, user, account, database, password=None, authenticator=None, connect_args=None, create_object_udfs=True, **kwargs)
Connect to Snowflake.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| user | str | Username | required |
| account | str | A Snowflake organization ID and a Snowflake user ID, separated by a hyphen. Note that a Snowflake user ID is a separate identifier from a username. See https://ibis-project.org/backends/Snowflake/ for details | required |
| database | str | A Snowflake database and a Snowflake schema, separated by a `/`. See https://ibis-project.org/backends/Snowflake/ for details | required |
| password | str \| None | Password. If empty or None then authenticator must be passed. | None |
| authenticator | str \| None | String indicating authentication method. See https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-example#connecting-with-oauth for details. Note that the authentication flow will not take place until a database connection is made. This means that `ibis.snowflake.connect(...)` can succeed, while subsequent API calls fail if the authentication fails for any reason. | None |
| create_object_udfs | bool | Enable object UDF extensions defined by Ibis on the first connection to the database. | True |
| connect_args | Mapping[str, Any] \| None | Additional arguments passed to the SQLAlchemy engine creation call. | None |
| kwargs | Any | Additional arguments passed to the SQLAlchemy URL constructor. See https://docs.snowflake.com/en/developer-guide/python-connector/sqlalchemy#additional-connection-parameters for more details | {} |
`ibis.connect` URL format

In addition to `ibis.snowflake.connect`, you can also connect to Snowflake by passing a properly formatted Snowflake connection URL to `ibis.connect`:

```python
con = ibis.connect(f"snowflake://{user}:{password}@{account}/{database}")
```
Authenticating with SSO
Ibis supports connecting to SSO-enabled Snowflake warehouses using the `authenticator` parameter. You can use it in the explicit-parameters-style or in the URL-style connection APIs. All values of `authenticator` are supported.
Explicit
```python
con = ibis.snowflake.connect(
    user="user",
    account="safpqpq-sq55555",
    database="my_database/my_schema",
    warehouse="my_warehouse",
    authenticator="externalbrowser",
)
```
URL
```python
con = ibis.connect(
    f"snowflake://{user}@{account}/{database}?warehouse={warehouse}",
    authenticator="externalbrowser",
)
```
Looking up your Snowflake organization ID and user ID
A Snowflake account identifier consists of an organization ID and a user ID, separated by a hyphen.
This user ID is not the same as the username you log in with.
To find your organization ID and user ID, log in to the Snowflake web app, then click on the text just to the right of the Snowflake logo (in the lower-left-hand corner of the screen).
The bold text at the top of the little pop-up window is your organization ID. The bold blue text with a checkmark next to it is your user ID.
Choosing a value for database
Snowflake refers to a collection of tables as a schema, and a collection of schemas as a database.
You must choose both a database and a schema to connect to. You can browse the available databases and schemas under the “Data” sidebar item in the Snowflake web app.
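To make the `account` and `database` shapes concrete, here is a small sketch (the credentials and names are hypothetical placeholders) showing how the pieces assemble into the URL form accepted by `ibis.connect`:

```python
# Hypothetical values -- substitute your own credentials.
user = "alice"
password = "hunter2"
account = "myorg-account123"            # organization ID and account ID, hyphen-separated
database = "IBIS_TESTING/IBIS_TESTING"  # database and schema, slash-separated

# The same string you would pass to ibis.connect():
url = f"snowflake://{user}:{password}@{account}/{database}"
print(url)  # snowflake://alice:hunter2@myorg-account123/IBIS_TESTING/IBIS_TESTING
```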
snowflake.Backend
create_database
create_database(self, name, force=False)
Create a new database.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the new database. | required |
| force | bool | If False, an exception is raised if the database already exists. | False |
create_schema
create_schema(self, name, database=None, force=False)
Create a schema named `name` in `database`.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the schema to create. | required |
| database | str \| None | Name of the database in which to create the schema. If None, the current database is used. | None |
| force | bool | If False, an exception is raised if the schema exists. | False |
create_table
create_table(self, name, obj=None, *, schema=None, database=None, temp=False, overwrite=False, comment=None)
Create a table in Snowflake.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the table to create | required |
| obj | pd.DataFrame \| pa.Table \| ir.Table \| None | The data with which to populate the table; optional, but at least one of obj or schema must be specified | None |
| schema | sch.Schema \| None | The schema of the table to create; optional, but at least one of obj or schema must be specified | None |
| database | str \| None | The name of the database in which to create the table; if not passed, the current database is used. | None |
| temp | bool | Create a temporary table | False |
| overwrite | bool | If True, replace the table if it already exists, otherwise fail if the table exists | False |
| comment | str \| None | Add a comment to the table | None |
drop_database
drop_database(self, name, force=False)
Drop a database with name `name`.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Database to drop. | required |
| force | bool | If False, an exception is raised if the database does not exist. | False |
drop_schema
drop_schema(self, name, database=None, force=False)
Drop the schema with `name` in `database`.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the schema to drop. | required |
| database | str \| None | Name of the database to drop the schema from. If None, the current database is used. | None |
| force | bool | If False, an exception is raised if the schema does not exist. | False |
drop_table
drop_table(self, name, database=None, force=False)
fetch_from_cursor
fetch_from_cursor(self, cursor, schema)
list_databases
list_databases(self, like=None)
List existing databases in the current connection.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| like | str \| None | A pattern in Python’s regex format to filter returned database names. | None |

Returns

| Type | Description |
|---|---|
| list[str] | The database names that exist in the current connection, that match the `like` pattern if provided. |
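As a rough, pure-Python sketch of how a `like` regex pattern narrows the returned names (the database names here are hypothetical, and the exact matching semantics belong to the backend, not this helper):

```python
import re

# Hypothetical database names, as list_databases() might return them.
names = ["IBIS_TESTING", "PROD_DB", "DEV_DB", "SNOWFLAKE"]

def filter_like(names, like=None):
    """Approximate the `like` filter: keep names matching the regex pattern."""
    if like is None:
        return names
    pattern = re.compile(like)
    return [name for name in names if pattern.search(name)]

print(filter_like(names, like=r"_DB$"))  # ['PROD_DB', 'DEV_DB']
```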
list_schemas
list_schemas(self, like=None, database=None)
read_csv
read_csv(self, path, table_name=None, **kwargs)
Register a CSV file as a table in the Snowflake backend.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | A string or Path to a CSV file; globs are supported | required |
| table_name | str \| None | Optional name for the table; if not passed, a random name will be generated | None |
| kwargs | Any | Snowflake-specific file format configuration arguments. See the documentation for the full list of options: https://docs.snowflake.com/en/sql-reference/sql/create-file-format#type-csv | {} |

Returns

| Type | Description |
|---|---|
| Table | The table that was read from the CSV file |
read_json
read_json(self, path, table_name=None, **kwargs)
Read newline-delimited JSON into an ibis table, using Snowflake.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | A string or Path to a JSON file; globs are supported | required |
| table_name | str \| None | Optional table name | None |
| kwargs | Any | Additional keyword arguments. See https://docs.snowflake.com/en/sql-reference/sql/create-file-format#type-json for the full list of options. | {} |

Returns

| Type | Description |
|---|---|
| Table | An ibis table expression |
read_parquet
read_parquet(self, path, table_name=None, **kwargs)
Read a Parquet file into an ibis table, using Snowflake.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | A string or Path to a Parquet file; globs are supported | required |
| table_name | str \| None | Optional table name | None |
| kwargs | Any | Additional keyword arguments. See https://docs.snowflake.com/en/sql-reference/sql/create-file-format#type-parquet for the full list of options. | {} |

Returns

| Type | Description |
|---|---|
| Table | An ibis table expression |
to_pandas_batches
to_pandas_batches(self, expr, *, params=None, limit=None, **_)
Execute an Ibis expression and return an iterator of pandas DataFrames.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| expr | ir.Expr | Ibis expression to execute. | required |
| params | Mapping[ir.Scalar, Any] \| None | Mapping of scalar parameter expressions to value. | None |
| limit | int \| str \| None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py. | None |
| chunk_size | int | Maximum number of rows in each returned DataFrame batch. This may have no effect depending on the backend. | 1000000 |
| kwargs | Any | Keyword arguments | {} |

Returns

| Type | Description |
|---|---|
| Iterator[pd.DataFrame] | An iterator of pandas DataFrames. |
to_pyarrow
to_pyarrow(self, expr, *, params=None, limit=None, **_)
Execute expression and return results as a pyarrow table.
This method is eager and will execute the associated expression immediately.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| expr | ir.Expr | Ibis expression to export to pyarrow | required |
| params | Mapping[ir.Scalar, Any] \| None | Mapping of scalar parameter expressions to value. | None |
| limit | int \| str \| None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py. | None |
| kwargs | Any | Keyword arguments | {} |

Returns

| Type | Description |
|---|---|
| Table | A pyarrow table holding the results of the executed expression. |
to_pyarrow_batches
to_pyarrow_batches(self, expr, *, params=None, limit=None, chunk_size=1000000, **_)
Execute expression and return an iterator of pyarrow record batches.
This method is eager and will execute the associated expression immediately.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| expr | ir.Expr | Ibis expression to export to pyarrow | required |
| limit | int \| str \| None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py. | None |
| params | Mapping[ir.Scalar, Any] \| None | Mapping of scalar parameter expressions to value. | None |
| chunk_size | int | Maximum number of rows in each returned record batch. | 1000000 |

Returns

| Type | Description |
|---|---|
| RecordBatchReader | Collection of pyarrow RecordBatches. |
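The `chunk_size` semantics of the batch-returning methods can be illustrated with a small pure-Python sketch (no Snowflake connection; the row data and helper function are hypothetical, not part of the Ibis API):

```python
def iter_batches(rows, chunk_size=1_000_000):
    """Yield successive slices of at most chunk_size rows, mimicking how
    to_pyarrow_batches / to_pandas_batches split a result set into batches."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]

rows = list(range(10))
batches = list(iter_batches(rows, chunk_size=4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Each batch except possibly the last has exactly `chunk_size` rows; the final batch holds the remainder.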