clickhouse.Backend
close
close(self)
Close the ClickHouse connection.
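Examples
A minimal sketch; the connection parameters are illustrative and assume a locally running ClickHouse server. Later examples reuse a con object created the same way.
>>> import ibis
>>> con = ibis.clickhouse.connect(host="localhost", database="default")
>>> con.close()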
compile
compile(self, expr, limit=None, params=None, **kwargs)
Compile an Ibis expression to a ClickHouse SQL string.
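Examples
A short sketch; the events table name is illustrative and con is an open connection as in the close example above.
>>> t = con.table("events")
>>> print(con.compile(t.count()))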
create_database
create_database(self, name, *, force=False, engine='Atomic')
Create a new database.
Parameters
name
str
Name of the new database.
required
force
bool
If False, an exception is raised if the database already exists.
False
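Examples
A minimal sketch; the database name is illustrative.
>>> con.create_database("staging")
>>> con.create_database("staging", force=True)  # no error if it already exists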
create_table
create_table(self, name, obj=None, *, schema=None, database=None, temp=False, overwrite=False, engine='MergeTree', order_by=None, partition_by=None, sample_by=None, settings=None)
Create a table in a ClickHouse database.
Parameters
name
str
Name of the table to create
required
obj
pandas.DataFrame | pyarrow.Table | ibis.Table | None
Optional data to create the table with
None
schema
ibis.Schema | None
Optional names and types of the table
None
database
str | None
Database to create the table in
None
temp
bool
Create a temporary table. This is not yet supported, and exists for API compatibility.
False
overwrite
bool
Whether to overwrite the table
False
engine
str
The table engine to use. See ClickHouse’s CREATE TABLE documentation for specifics. Defaults to MergeTree with ORDER BY tuple() because MergeTree is the most feature-complete engine.
'MergeTree'
order_by
collections.abc.Iterable[str] | None
String column names to order by. Required for some table engines like MergeTree.
None
partition_by
collections.abc.Iterable[str] | None
String column names to partition by
None
sample_by
str | None
String column name to sample by
None
settings
collections.abc.Mapping[str, typing.Any] | None
Key-value pairs of settings for table creation
None
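Examples
A hedged sketch creating an empty MergeTree table from a schema; the table and column names are illustrative.
>>> sch = ibis.schema({"user_id": "int64", "ts": "timestamp", "amount": "float64"})
>>> t = con.create_table(
...     "payments",
...     schema=sch,
...     engine="MergeTree",
...     order_by=["user_id", "ts"],
... )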
create_view
create_view(self, name, obj, *, database=None, overwrite=False)
Create a new view from an expression.
Parameters
name
str
Name of the new view.
required
obj
ibis.Table
An Ibis table expression that will be used to create the view.
required
database
str | None
Name of the database where the view will be created. If not provided, the connection’s default database is used.
None
overwrite
bool
Whether to clobber an existing view with the same name
False
Returns
Table
The view that was created.
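Examples
A minimal sketch building a view over the payments table created above.
>>> payments = con.table("payments")
>>> expr = payments.filter(payments.amount > 0)
>>> v = con.create_view("positive_payments", expr, overwrite=True)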
drop_database
drop_database(self, name, *, force=False)
Drop the database named name.
Parameters
name
str
Database to drop.
required
force
bool
If False, an exception is raised if the database does not exist.
False
drop_table
drop_table(self, name, database=None, force=False)
Drop a table.
Parameters
name
str
Name of the table to drop.
required
database
str | None
Name of the database where the table exists, if not the default.
None
force
bool
If False, an exception is raised if the table does not exist.
False
drop_view
drop_view(self, name, *, database=None, force=False)
Drop a view.
Parameters
name
str
Name of the view to drop.
required
database
str | None
Name of the database where the view exists, if not the default.
None
force
bool
If False, an exception is raised if the view does not exist.
False
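Examples
The three drop methods share the same calling pattern; the object names below are illustrative, and force=True suppresses the error raised when the object does not exist.
>>> con.drop_view("positive_payments", force=True)
>>> con.drop_table("payments", force=True)
>>> con.drop_database("staging", force=True)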
execute
execute(self, expr, limit='default', external_tables=None, **kwargs)
Execute an expression.
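Examples
A minimal sketch; for a table expression the result is typically a pandas DataFrame.
>>> t = con.table("events")
>>> df = con.execute(t.limit(10))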
get_schema
get_schema(self, table_name, database=None)
Return a Schema object for the indicated table and database.
Parameters
table_name
str
May not be fully qualified. Use database if you want to qualify the identifier.
required
database
str | None
Database name
None
Returns
ibis.Schema
Ibis schema
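Examples
A short sketch; the table name is illustrative.
>>> sch = con.get_schema("events", database="default")
>>> sch.names  # column names; sch.types holds the corresponding dtypes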
has_operation
has_operation(cls, operation)
Return whether the backend implements support for operation.
Parameters
operation
type[ibis.Value]
A class corresponding to an operation.
required
Returns
bool
Whether the backend implements the operation.
Examples
>>> import ibis
>>> import ibis.expr.operations as ops
>>> ibis.sqlite.has_operation(ops.ArrayIndex)
False
>>> ibis.postgres.has_operation(ops.ArrayIndex)
True
insert
insert(self, name, obj, settings=None, **kwargs)
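Insert data into a table.
Examples
A hedged sketch inserting a pandas DataFrame; the table and column names are illustrative.
>>> import pandas as pd
>>> df = pd.DataFrame({"user_id": [1, 2], "amount": [9.99, 5.0]})
>>> con.insert("payments", df)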
list_databases
list_databases(self, like=None)
List existing databases in the current connection.
Parameters
like
str | None
A pattern in Python’s regex format to filter returned database names.
None
Returns
list[str]
The database names that exist in the current connection, matching the like pattern if provided.
list_tables
list_tables(self, like=None, database=None)
Return the list of table names in the current database.
For some backends, the tables may be files in a directory, or other equivalent entities in a SQL database.
Parameters
like
str | None
A pattern in Python’s regex format.
None
database
str | None
The database from which to list tables. If not provided, the current database is used.
None
Returns
list[str]
The list of table names that match the pattern like.
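Examples
A short sketch covering both listing methods; the outputs shown are illustrative.
>>> con.list_databases(like="^def")
['default']
>>> con.list_tables(database="default")
['events', 'payments']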
raw_sql
raw_sql(self, query, external_tables=None, **kwargs)
Execute a SQL string query against the database.
Parameters
query
str | sqlglot.exp.Expression
Raw SQL string
required
external_tables
collections.abc.Mapping[str, pandas.DataFrame] | None
Mapping of table names to pandas DataFrames providing external data sources for the query
None
kwargs
Backend specific query arguments
{}
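Examples
A hedged sketch passing a pandas DataFrame as an external table; the table name ext is illustrative, and the return value is whatever the underlying driver produces for the query.
>>> import pandas as pd
>>> df = pd.DataFrame({"user_id": [1, 2], "amount": [9.99, 5.0]})
>>> con.raw_sql(
...     "SELECT user_id, sum(amount) AS total FROM ext GROUP BY user_id",
...     external_tables={"ext": df},
... )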
read_csv
read_csv(self, path, table_name=None, engine='MergeTree', **kwargs)
Register a CSV file as a table in the current backend.
Parameters
path
str | pathlib.Path
The data source. A string or Path to the CSV file.
required
table_name
str | None
An optional name to use for the created table. This defaults to a sequentially generated name.
None
**kwargs
typing.Any
Additional keyword arguments passed to the backend loading function.
{}
Returns
ibis.Table
The just-registered table
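Examples
A minimal sketch; the file path and table name are illustrative.
>>> t = con.read_csv("data/events.csv", table_name="events_raw")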
read_parquet
read_parquet(self, path, table_name=None, engine='MergeTree', **kwargs)
Register a parquet file as a table in the current backend.
Parameters
path
str | pathlib.Path
The data source.
required
table_name
str | None
An optional name to use for the created table. This defaults to a sequentially generated name.
None
**kwargs
typing.Any
Additional keyword arguments passed to the backend loading function.
{}
Returns
ibis.Table
The just-registered table
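Examples
A minimal sketch, mirroring read_csv; the path is illustrative.
>>> t = con.read_parquet("data/events.parquet")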
sql
sql(self, query, schema=None, dialect=None)
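Create an Ibis table expression from a raw SQL query string.
Examples
A hedged sketch; the query and column names are illustrative, and the result can be composed like any other table expression.
>>> t = con.sql("SELECT user_id, amount FROM payments")
>>> expr = t.group_by("user_id").aggregate(total=t.amount.sum())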
table
table(self, name, database=None)
Construct a table expression.
Parameters
name
str
Table name
required
database
str | None
Database name
None
to_pyarrow
to_pyarrow(self, expr, *, params=None, limit=None, external_tables=None, **kwargs)
Execute an expression and return the results as a pyarrow table.
This method is eager and will execute the associated expression immediately.
Parameters
expr
ibis.Expr
Ibis expression to export to pyarrow
required
params
collections.abc.Mapping[ibis.Scalar, typing.Any] | None
Mapping of scalar parameter expressions to values.
None
limit
int | str | None
An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py.
None
kwargs
typing.Any
Keyword arguments
{}
Returns
pyarrow.Table
A pyarrow table holding the results of the executed expression.
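Examples
A minimal sketch; the table name is illustrative.
>>> tbl = con.to_pyarrow(con.table("events").limit(1000))
>>> tbl.num_rows  # a pyarrow.Table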
to_pyarrow_batches
to_pyarrow_batches(self, expr, *, limit=None, params=None, external_tables=None, chunk_size=1000000, **_)
Execute an expression and return an iterator of pyarrow record batches.
This method is eager and will execute the associated expression immediately.
Parameters
expr
ibis.Expr
Ibis expression to export to pyarrow
required
limit
int | str | None
An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py.
None
params
collections.abc.Mapping[ibis.Scalar, typing.Any] | None
Mapping of scalar parameter expressions to values.
None
external_tables
collections.abc.Mapping[str, typing.Any] | None
External data
None
chunk_size
int
Maximum number of rows to return in a single chunk
1000000
Returns
results
RecordBatchReader
Notes
There are a variety of ways to implement clickhouse -> record batches.
1. FORMAT ArrowStream -> record batches via raw_query. This has the same type conversion problem(s) as to_pyarrow. It’s harder to address due to the lack of cast on RecordBatch. However, this is a ClickHouse problem: we should be able to get string data out without a bunch of settings/permissions rigmarole.
2. Native -> Python objects -> pyarrow batches. This is what is implemented, using query_column_block_stream.
3. Native -> Python objects -> DataFrame chunks -> pyarrow batches. This is not implemented because it adds an unnecessary pandas step in between Python object -> arrow. We can go directly to record batches without pandas in the middle.
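Examples
A hedged sketch; process is a hypothetical placeholder for per-batch work, and the table name is illustrative.
>>> reader = con.to_pyarrow_batches(con.table("events"), chunk_size=100_000)
>>> for batch in reader:
...     process(batch)  # hypothetical per-batch handler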
truncate_table
truncate_table(self, name, database=None)
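Delete all rows from a table, keeping its definition intact.
Examples
A minimal sketch; the table name is illustrative.
>>> con.truncate_table("payments")
>>> con.table("payments").count().execute()
0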