Datafusion

Introduced in v2.1

Install

Install ibis and the dependencies for the Datafusion backend with one of the following commands:

pip install 'ibis-framework[datafusion]'
conda install -c conda-forge ibis-datafusion
mamba install -c conda-forge ibis-datafusion

Connect

API

Create a client by passing a dictionary that maps table names to file paths to ibis.datafusion.connect.

See ibis.backends.datafusion.Backend.do_connect for connection parameter information.

ibis.datafusion.connect is a thin wrapper around ibis.backends.datafusion.Backend.do_connect.

Connection Parameters

do_connect(config=None)

Create a Datafusion backend for use with Ibis.

Parameters:

Name Type Description Default
config Mapping[str, str | Path] | SessionContext | None

Mapping of table names to files, or a Datafusion SessionContext.

None

Examples:

>>> import ibis
>>> config = {"t": "path/to/file.parquet", "s": "path/to/file.csv"}
>>> ibis.datafusion.connect(config)
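
The config mapping can also be built from the contents of a directory. The following is a minimal sketch, not part of ibis: config_from_directory and connect_directory are hypothetical helper names, and the sketch assumes every table lives in a .csv or .parquet file whose stem is a usable table name. The only ibis call it relies on is ibis.datafusion.connect, as in the example above.

```python
from pathlib import Path


def config_from_directory(data_dir):
    """Map each CSV or Parquet file's stem to its path (hypothetical helper)."""
    supported = {".csv", ".parquet"}
    return {
        path.stem: str(path)
        for path in sorted(Path(data_dir).iterdir())
        if path.suffix.lower() in supported
    }


def connect_directory(data_dir):
    """Connect to Datafusion with one table per supported file in data_dir."""
    # Imported here so the helper above stays usable without the backend;
    # this needs the datafusion extra from the Install section.
    import ibis

    return ibis.datafusion.connect(config_from_directory(data_dir))
```

For a directory holding t.parquet and s.csv, config_from_directory returns a mapping with the keys "s" and "t", which is the same shape as the config dictionary in the example above.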

Backend API

Backend

Bases: BaseBackend

Functions

list_tables(like=None, database=None)

List the available tables.

read_csv(path, table_name=None, **kwargs)

Register a CSV file as a table in the current database.

Parameters:

Name Type Description Default
path str | Path

The data source. A string or Path to the CSV file.

required
table_name str | None

An optional name to use for the created table. This defaults to a sequentially generated name.

None
**kwargs Any

Additional keyword arguments passed to the Datafusion loading function.

{}

Returns:

Type Description
ir.Table

The just-registered table
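
As an illustration of read_csv, the sketch below writes a small CSV file with the standard library and registers it on an existing backend. The helper names write_sample_csv and register_sample, the file name penguins.csv, and its columns are all made up for this example; con is assumed to be a backend object such as the one returned by ibis.datafusion.connect().

```python
import csv
from pathlib import Path


def write_sample_csv(directory):
    """Write a two-row CSV file with a header row and return its path."""
    path = Path(directory) / "penguins.csv"
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["species", "mass_g"])
        writer.writerows([["Adelie", 3750], ["Gentoo", 5000]])
    return path


def register_sample(con, directory, table_name="penguins"):
    """Register the sample CSV on con under an explicit table name."""
    path = write_sample_csv(directory)
    # Passing table_name avoids the sequentially generated default name.
    return con.read_csv(path, table_name=table_name)
```

Passing table_name explicitly makes the table easy to look up again later; leaving it as None falls back to the generated name described above.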

read_parquet(path, table_name=None, **kwargs)

Register a Parquet file as a table in the current database.

Parameters:

Name Type Description Default
path str | Path

The data source. A string or Path to the Parquet file.

required
table_name str | None

An optional name to use for the created table. This defaults to a sequentially generated name.

None
**kwargs Any

Additional keyword arguments passed to the Datafusion loading function.

{}

Returns:

Type Description
ir.Table

The just-registered table
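
When several Parquet files are loaded, explicit table names are usually easier to work with than generated ones. The sketch below derives a name from each file name and passes it as table_name. Both helpers and the naming convention (lowercase stem, hyphens replaced by underscores) are assumptions made for illustration; con is assumed to be an existing Datafusion backend.

```python
from pathlib import Path


def table_name_for(path):
    """Derive a table name from a file name (hypothetical convention)."""
    return Path(path).stem.replace("-", "_").lower()


def read_parquet_files(con, paths):
    """Register each Parquet file on con and return a name -> table mapping."""
    tables = {}
    for path in paths:
        name = table_name_for(path)
        tables[name] = con.read_parquet(path, table_name=name)
    return tables
```

Under this convention a file such as data/Daily-Sales.parquet would be registered as daily_sales.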

register(source, table_name=None, **kwargs)

Register the CSV or Parquet file located at source as a table named table_name.

Parameters:

Name Type Description Default
source str | Path

The path to the file

required
table_name str | None

The name of the table

None
**kwargs Any

Datafusion-specific keyword arguments

{}
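
A sketch of registering a mixed set of CSV and Parquet files with register follows. The helper register_sources and the SUPPORTED_SUFFIXES set are hypothetical; the suffix check is an assumption based on the description above (CSV or Parquet files only), not a statement about how the backend itself detects file types.

```python
from pathlib import Path

# Assumption: only these suffixes are accepted, per the description above.
SUPPORTED_SUFFIXES = {".csv", ".parquet"}


def register_sources(con, sources):
    """Register every name -> path pair in sources on the backend con.

    All paths are checked before anything is registered, so a bad entry
    does not leave the backend half-populated.
    """
    unsupported = [
        str(source)
        for source in sources.values()
        if Path(source).suffix.lower() not in SUPPORTED_SUFFIXES
    ]
    if unsupported:
        raise ValueError(f"unsupported file types: {unsupported}")
    for name, source in sources.items():
        con.register(source, table_name=name)
    return list(sources)
```

Validating the whole mapping first is a design choice for this sketch: a ValueError is raised before any call to register is made.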

table(name, schema=None)

Get an ibis expression representing a DataFusion table.

Parameters:

Name Type Description Default
name str

The name of the table to retrieve

required
schema sch.Schema | None

An optional schema for the table

None

Returns:

Type Description
Table

A table expression
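
The sketch below combines list_tables and table to look up a table by name and fail with a readable message when it has not been registered. The helper get_table is hypothetical, and con is assumed to be an existing Datafusion backend; what table itself does for an unknown name is not covered here.

```python
def get_table(con, name):
    """Return the table expression for name, or raise a descriptive KeyError."""
    available = con.list_tables()
    if name not in available:
        raise KeyError(f"no table named {name!r}; available: {sorted(available)}")
    return con.table(name)
```

With the config from the connection example, get_table(con, "t") would return the expression for the Parquet-backed table, while an unregistered name raises KeyError listing the names that are available.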


Last update: January 4, 2023