ClickHouse
Install
Install ibis and dependencies for the ClickHouse backend:
pip install 'ibis-framework[clickhouse]'
conda install -c conda-forge ibis-clickhouse
mamba install -c conda-forge ibis-clickhouse
Connect
API
Create a client by passing connection parameters to ibis.clickhouse.connect.
See ibis.backends.clickhouse.Backend.do_connect for connection parameter information.
ibis.clickhouse.connect is a thin wrapper around ibis.backends.clickhouse.Backend.do_connect.
Connection Parameters
do_connect(host='localhost', port=9000, database='default', user='default', password='', client_name='ibis', compression=_default_compression, external_tables=None, **kwargs)
Create a ClickHouse client for use with Ibis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
host | str | Host name of the ClickHouse server | 'localhost' |
port | int | Port of the ClickHouse server | 9000 |
database | str | Default database when executing queries | 'default' |
user | str | User to authenticate with | 'default' |
password | str | Password to authenticate with | '' |
client_name | str | Name of the client that will appear in the ClickHouse server logs | 'ibis' |
compression | Literal['lz4', 'lz4hc', 'quicklz', 'zstd'] \| bool | Whether or not to use compression. | _default_compression |
external_tables | | External tables that can be used in a query. | None |
kwargs | Any | Client-specific keyword arguments | {} |
Examples:
>>> import ibis
>>> import os
>>> clickhouse_host = os.environ.get('IBIS_TEST_CLICKHOUSE_HOST', 'localhost')
>>> clickhouse_port = int(os.environ.get('IBIS_TEST_CLICKHOUSE_PORT', 9000))
>>> client = ibis.clickhouse.connect(host=clickhouse_host, port=clickhouse_port)
>>> client
<ibis.clickhouse.client.ClickhouseClient object at 0x...>
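The doctest above reads the host and port from the environment. A minimal sketch of extending that pattern to the other documented parameters follows. The helpers `clickhouse_params` and `connect_from_env` are hypothetical and not part of ibis, and the `IBIS_TEST_CLICKHOUSE_DB`, `IBIS_TEST_CLICKHOUSE_USER`, and `IBIS_TEST_CLICKHOUSE_PASSWORD` variable names are assumptions made for illustration; only the host and port variables appear in the example above.

```python
import os


def clickhouse_params(env=None):
    """Build keyword arguments for ibis.clickhouse.connect.

    Hypothetical helper, not part of ibis. Defaults mirror the
    parameter table above.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("IBIS_TEST_CLICKHOUSE_HOST", "localhost"),
        "port": int(env.get("IBIS_TEST_CLICKHOUSE_PORT", 9000)),
        # The three variable names below are assumptions for illustration.
        "database": env.get("IBIS_TEST_CLICKHOUSE_DB", "default"),
        "user": env.get("IBIS_TEST_CLICKHOUSE_USER", "default"),
        "password": env.get("IBIS_TEST_CLICKHOUSE_PASSWORD", ""),
        "client_name": "ibis",
    }


def connect_from_env(env=None):
    """Open a connection; needs the ClickHouse extra and a reachable server."""
    # Imported lazily so clickhouse_params works without ibis installed.
    import ibis

    return ibis.clickhouse.connect(**clickhouse_params(env))
```

Keeping parameter collection separate from the connection call makes the defaults easy to inspect without a running server.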
Backend API
Backend
Bases: BaseBackend
Classes
Options
Functions
close()
Close the ClickHouse connection and drop any temporary objects.
execute(expr, limit='default', external_tables=None, **kwargs)
Execute an expression.
raw_sql(query, external_tables=None, **_)
Execute a SQL string query against the database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | str | Raw SQL string | required |
external_tables | Mapping[str, pd.DataFrame] \| None | Mapping of table name to pandas DataFrames providing external datasources for the query | None |
Returns:
Type | Description |
---|---|
Cursor | ClickHouse cursor |
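As an illustration of the `external_tables` parameter, here is a minimal sketch of a wrapper around `raw_sql`. The helper name `raw_sql_with_external` is hypothetical, `con` is assumed to be a connected ClickHouse backend, and the query is assumed to refer to each external table by its mapping key.

```python
def raw_sql_with_external(con, query, frames=None):
    """Run `query` through con.raw_sql with optional external tables.

    Hypothetical helper. `frames` maps table names to pandas DataFrames,
    as described in the parameter table above.
    """
    if not query.strip():
        raise ValueError("query must be a non-empty SQL string")
    # Pass None instead of an empty mapping, matching the documented default.
    external = dict(frames) if frames else None
    return con.raw_sql(query, external_tables=external)
```

A call might look like `raw_sql_with_external(con, "SELECT count() FROM lookup", {"lookup": df})`, where `df` is a pandas DataFrame and `lookup` is an illustrative table name.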
table(name, database=None)
to_pyarrow_batches(expr, *, params=None, limit=None, chunk_size=1000000, **_)
Execute expression and return an iterator of pyarrow record batches.
This method is eager and will execute the associated expression immediately.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
expr | ir.Expr | Ibis expression to export to pyarrow | required |
limit | int \| str \| None | An integer to effect a specific row limit. | None |
params | Mapping[ir.Scalar, Any] \| None | Mapping of scalar parameter expressions to values. | None |
chunk_size | int | Maximum number of rows in each returned record batch. | 1000000 |
Returns:
Type | Description |
---|---|
results | RecordBatchReader |
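Because the result is returned as record batches, it can be consumed one chunk at a time instead of being materialized in full. The following is a minimal sketch of that pattern. The helpers `count_rows` and `count_expr_rows` are hypothetical, and `con` and `expr` are assumed to be a connected backend and an Ibis expression.

```python
def count_rows(batches):
    """Sum num_rows over an iterable of pyarrow record batches."""
    total = 0
    for batch in batches:
        # Each pyarrow RecordBatch exposes its row count as num_rows.
        total += batch.num_rows
    return total


def count_expr_rows(con, expr, chunk_size=100_000):
    """Stream `expr` through con.to_pyarrow_batches and count the rows."""
    reader = con.to_pyarrow_batches(expr, chunk_size=chunk_size)
    return count_rows(reader)
```

A smaller `chunk_size` lowers the number of rows held in each batch at the cost of producing more batches.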