Install

We recommend starting with the default backend (DuckDB).

pip install 'ibis-framework[duckdb,examples]'
1
Install Ibis with the DuckDB backend along with examples.

Backends

Need to use Ibis with a backend that isn’t currently supported? Let us know!

You can install Ibis and a supported backend with pip, conda, mamba, or pixi.

Install with the bigquery extra:

pip install 'ibis-framework[bigquery]'

Connect using ibis.bigquery.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the clickhouse extra:

pip install 'ibis-framework[clickhouse]'

Connect using ibis.clickhouse.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the dask extra:

pip install 'ibis-framework[dask]'

Connect using ibis.dask.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the datafusion extra:

pip install 'ibis-framework[datafusion]'

Connect using ibis.datafusion.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the druid extra:

pip install 'ibis-framework[druid]'

Connect using ibis.druid.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the duckdb extra:

pip install 'ibis-framework[duckdb]'

Connect using ibis.duckdb.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the exasol extra:

pip install 'ibis-framework[exasol]'

Connect using ibis.exasol.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install alongside the apache-flink package:

pip install ibis-framework apache-flink

Connect using ibis.flink.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the impala extra:

pip install 'ibis-framework[impala]'

Connect using ibis.impala.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the mssql extra:

pip install 'ibis-framework[mssql]'

Connect using ibis.mssql.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the mysql extra:

pip install 'ibis-framework[mysql]'

Connect using ibis.mysql.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the oracle extra:

pip install 'ibis-framework[oracle]'

Connect using ibis.oracle.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the pandas extra:

pip install 'ibis-framework[pandas]'

Connect using ibis.pandas.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the polars extra:

pip install 'ibis-framework[polars]'

Connect using ibis.polars.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the postgres extra:

pip install 'ibis-framework[postgres]'

Connect using ibis.postgres.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the pyspark extra:

pip install 'ibis-framework[pyspark]'

Connect using ibis.pyspark.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the risingwave extra:

pip install 'ibis-framework[risingwave]'

Connect using ibis.risingwave.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the snowflake extra:

pip install 'ibis-framework[snowflake]'

Connect using ibis.snowflake.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the sqlite extra:

pip install 'ibis-framework[sqlite]'

Connect using ibis.sqlite.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install with the trino extra:

pip install 'ibis-framework[trino]'

Connect using ibis.trino.connect.

Warning

Note that the ibis-framework package is not the same as the ibis package in PyPI. These two libraries cannot coexist in the same Python environment, as they are both imported with the ibis module name.

Install the ibis-bigquery package:

conda install -c conda-forge ibis-bigquery

Connect using ibis.bigquery.connect.

Install the ibis-clickhouse package:

conda install -c conda-forge ibis-clickhouse

Connect using ibis.clickhouse.connect.

Install the ibis-dask package:

conda install -c conda-forge ibis-dask

Connect using ibis.dask.connect.

Install the ibis-datafusion package:

conda install -c conda-forge ibis-datafusion

Connect using ibis.datafusion.connect.

Install the ibis-druid package:

conda install -c conda-forge ibis-druid

Connect using ibis.druid.connect.

Install the ibis-duckdb package:

conda install -c conda-forge ibis-duckdb

Connect using ibis.duckdb.connect.

Install the ibis-exasol package:

conda install -c conda-forge ibis-exasol

Connect using ibis.exasol.connect.

Install the ibis-flink package:

conda install -c conda-forge ibis-flink

Connect using ibis.flink.connect.

Install the ibis-impala package:

conda install -c conda-forge ibis-impala

Connect using ibis.impala.connect.

Install the ibis-mssql package:

conda install -c conda-forge ibis-mssql

Connect using ibis.mssql.connect.

Install the ibis-mysql package:

conda install -c conda-forge ibis-mysql

Connect using ibis.mysql.connect.

Install the ibis-oracle package:

conda install -c conda-forge ibis-oracle

Connect using ibis.oracle.connect.

Install the ibis-pandas package:

conda install -c conda-forge ibis-pandas

Connect using ibis.pandas.connect.

Install the ibis-polars package:

conda install -c conda-forge ibis-polars

Connect using ibis.polars.connect.

Install the ibis-postgres package:

conda install -c conda-forge ibis-postgres

Connect using ibis.postgres.connect.

Install the ibis-pyspark package:

conda install -c conda-forge ibis-pyspark

Connect using ibis.pyspark.connect.

Install the ibis-risingwave package:

conda install -c conda-forge ibis-risingwave

Connect using ibis.risingwave.connect.

Install the ibis-snowflake package:

conda install -c conda-forge ibis-snowflake

Connect using ibis.snowflake.connect.

Install the ibis-sqlite package:

conda install -c conda-forge ibis-sqlite

Connect using ibis.sqlite.connect.

Install the ibis-trino package:

conda install -c conda-forge ibis-trino

Connect using ibis.trino.connect.

Install the ibis-bigquery package:

mamba install -c conda-forge ibis-bigquery

Connect using ibis.bigquery.connect.

Install the ibis-clickhouse package:

mamba install -c conda-forge ibis-clickhouse

Connect using ibis.clickhouse.connect.

Install the ibis-dask package:

mamba install -c conda-forge ibis-dask

Connect using ibis.dask.connect.

Install the ibis-datafusion package:

mamba install -c conda-forge ibis-datafusion

Connect using ibis.datafusion.connect.

Install the ibis-druid package:

mamba install -c conda-forge ibis-druid

Connect using ibis.druid.connect.

Install the ibis-duckdb package:

mamba install -c conda-forge ibis-duckdb

Connect using ibis.duckdb.connect.

Install the ibis-exasol package:

mamba install -c conda-forge ibis-exasol

Connect using ibis.exasol.connect.

Install the ibis-flink package:

mamba install -c conda-forge ibis-flink

Connect using ibis.flink.connect.

Install the ibis-impala package:

mamba install -c conda-forge ibis-impala

Connect using ibis.impala.connect.

Install the ibis-mssql package:

mamba install -c conda-forge ibis-mssql

Connect using ibis.mssql.connect.

Install the ibis-mysql package:

mamba install -c conda-forge ibis-mysql

Connect using ibis.mysql.connect.

Install the ibis-oracle package:

mamba install -c conda-forge ibis-oracle

Connect using ibis.oracle.connect.

Install the ibis-pandas package:

mamba install -c conda-forge ibis-pandas

Connect using ibis.pandas.connect.

Install the ibis-polars package:

mamba install -c conda-forge ibis-polars

Connect using ibis.polars.connect.

Install the ibis-postgres package:

mamba install -c conda-forge ibis-postgres

Connect using ibis.postgres.connect.

Install the ibis-pyspark package:

mamba install -c conda-forge ibis-pyspark

Connect using ibis.pyspark.connect.

Install the ibis-risingwave package:

mamba install -c conda-forge ibis-risingwave

Connect using ibis.risingwave.connect.

Install the ibis-snowflake package:

mamba install -c conda-forge ibis-snowflake

Connect using ibis.snowflake.connect.

Install the ibis-sqlite package:

mamba install -c conda-forge ibis-sqlite

Connect using ibis.sqlite.connect.

Install the ibis-trino package:

mamba install -c conda-forge ibis-trino

Connect using ibis.trino.connect.

Add the ibis-bigquery package:

pixi add ibis-bigquery

Connect using ibis.bigquery.connect.

Add the ibis-clickhouse package:

pixi add ibis-clickhouse

Connect using ibis.clickhouse.connect.

Add the ibis-dask package:

pixi add ibis-dask

Connect using ibis.dask.connect.

Add the ibis-datafusion package:

pixi add ibis-datafusion

Connect using ibis.datafusion.connect.

Add the ibis-druid package:

pixi add ibis-druid

Connect using ibis.druid.connect.

Add the ibis-duckdb package:

pixi add ibis-duckdb

Connect using ibis.duckdb.connect.

Add the ibis-exasol package:

pixi add ibis-exasol

Connect using ibis.exasol.connect.

Add the ibis-flink package:

pixi add ibis-flink

Connect using ibis.flink.connect.

Add the ibis-impala package:

pixi add ibis-impala

Connect using ibis.impala.connect.

Add the ibis-mssql package:

pixi add ibis-mssql

Connect using ibis.mssql.connect.

Add the ibis-mysql package:

pixi add ibis-mysql

Connect using ibis.mysql.connect.

Add the ibis-oracle package:

pixi add ibis-oracle

Connect using ibis.oracle.connect.

Add the ibis-pandas package:

pixi add ibis-pandas

Connect using ibis.pandas.connect.

Add the ibis-polars package:

pixi add ibis-polars

Connect using ibis.polars.connect.

Add the ibis-postgres package:

pixi add ibis-postgres

Connect using ibis.postgres.connect.

Add the ibis-pyspark package:

pixi add ibis-pyspark

Connect using ibis.pyspark.connect.

Add the ibis-risingwave package:

pixi add ibis-risingwave

Connect using ibis.risingwave.connect.

Add the ibis-snowflake package:

pixi add ibis-snowflake

Connect using ibis.snowflake.connect.

Add the ibis-sqlite package:

pixi add ibis-sqlite

Connect using ibis.sqlite.connect.

Add the ibis-trino package:

pixi add ibis-trino

Connect using ibis.trino.connect.

See the backend support matrix for details on operations supported. Open a feature request if you’d like to see support for an operation in a given backend. If the backend supports it, we’ll do our best to add it quickly!

Quickstart

See the getting started tutorial for a more in-depth introduction to Ibis. Below is a quick overview.

import ibis
import ibis.selectors as s

ibis.options.interactive = True

t = ibis.examples.penguins.fetch()
t.head(3)
1
Ensure you install Ibis first.
2
Use interactive mode for exploratory data analysis (EDA) or demos.
3
Load a dataset from the built-in examples.
4
Display the table.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64int64int64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.71813750male  2007 │
│ Adelie Torgersen39.517.41863800female2007 │
│ Adelie Torgersen40.318.01953250female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

Ibis is a dataframe library with familiar syntax.

t[10:15]
1
Display a slice of the table.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64int64int64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen37.817.11863300NULL2007 │
│ Adelie Torgersen37.817.31803700NULL2007 │
│ Adelie Torgersen41.117.61823200female2007 │
│ Adelie Torgersen38.621.21913800male  2007 │
│ Adelie Torgersen34.621.11984400male  2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

Analytics

Ibis is built for easy analytics at scale in Python.

(
    t.filter(ibis._["body_mass_g"] != None)
    .group_by(["species", "island"])
    .aggregate(count=ibis._.count())
    .order_by(ibis.desc("count"))
)
1
Group by species and island, and compute the number of rows in each group.
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species    island     count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├───────────┼───────────┼───────┤
│ Gentoo   Biscoe   123 │
│ ChinstrapDream    68 │
│ Adelie   Dream    56 │
│ Adelie   Torgersen51 │
│ Adelie   Biscoe   44 │
└───────────┴───────────┴───────┘

Exploratory data analysis (EDA) and visualization

Exploratory data analysis

Ibis has built-in methods for exploration and visualization.

num_species = int(t.select("species").nunique().to_pandas())
t["species"].topk(num_species)
1
Compute the number of species in the dataset.
2
Display the top species by count.
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ species    CountStar(penguins) ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ stringint64               │
├───────────┼─────────────────────┤
│ Adelie   152 │
│ Gentoo   124 │
│ Chinstrap68 │
└───────────┴─────────────────────┘

Visualization

Ibis works with any Python plotting library that supports the dataframe interchange protocol.

grouped = (
    t.group_by("species")
    .aggregate(count=ibis._.count())
    .order_by(ibis.desc("count"))
)
grouped
1
Setup data to plot.
2
Display the table.
┏━━━━━━━━━━━┳━━━━━━━┓
┃ species    count ┃
┡━━━━━━━━━━━╇━━━━━━━┩
│ stringint64 │
├───────────┼───────┤
│ Adelie   152 │
│ Gentoo   124 │
│ Chinstrap68 │
└───────────┴───────┘
pip install altair
import altair as alt

chart = (
    alt.Chart(grouped.to_pandas())
    .mark_bar()
    .encode(
        x="species",
        y="count",
        tooltip=["species", "count"],
    )
    .properties(width=600, height=400)
    .interactive()
)
chart
pip install matplotlib
import matplotlib.pyplot as plt

chart = grouped.to_pandas().plot.bar(
    x="species",
    y="count",
    figsize=(600 / 100, 400 / 100),
)
plt.show()

pip install plotly
import plotly.express as px

chart = px.bar(
    grouped.to_pandas(),
    x="species",
    y="count",
    width=600,
    height=400,
)
chart
pip install plotnine
from plotnine import ggplot, aes, geom_bar, theme

chart = (
    ggplot(
        grouped,
        aes(x="species", y="count"),
    )
    + geom_bar(stat="identity")
    + theme(figure_size=(600 / 100, 400 / 100))
)
chart

pip install seaborn
import seaborn as sns

chart = sns.barplot(
    data=grouped.to_pandas(),
    x="species",
    y="count",
)
chart.figure.set_size_inches(600 / 100, 400 / 100)

Data science

Use Ibis with your favorite data science libraries for concise and efficient workflows.

import ibis.selectors as s


def transform(t):
    t = t.mutate(
        s.across(s.numeric(), {"zscore": lambda x: (x - x.mean()) / x.std()})
    ).dropna()
    return t


f = transform(t.drop("year"))
f.select("species", "island", s.contains("zscore"))
1
Import the selectors module.
2
Define a function to transform the table for code reuse (compute z-scores on numeric columns).
3
Apply the function to the table and assign it to a new variable.
4
Display the transformed table.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ species  island     bill_length_mm_zscore  bill_depth_mm_zscore  flipper_length_mm_zscore  body_mass_g_zscore ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ stringstringfloat64float64float64float64            │
├─────────┼───────────┼───────────────────────┼──────────────────────┼──────────────────────────┼────────────────────┤
│ Adelie Torgersen-0.8832050.784300-1.416272-0.563317 │
│ Adelie Torgersen-0.8099390.126003-1.060696-0.500969 │
│ Adelie Torgersen-0.6634080.429833-0.420660-1.186793 │
│ Adelie Torgersen-1.3227991.088129-0.562890-0.937403 │
│ Adelie Torgersen-0.8465721.746426-0.776236-0.688012 │
│ Adelie Torgersen-0.9198370.328556-1.416272-0.719186 │
│ Adelie Torgersen-0.8648881.240044-0.4206600.590115 │
│ Adelie Torgersen-0.5168760.227280-1.345156-1.249141 │
│ Adelie Torgersen-0.9747872.050255-0.705121-0.500969 │
│ Adelie Torgersen-1.7074431.999617-0.2073150.247203 │
│  │
└─────────┴───────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────┘
pip install scikit-learn
import plotly.express as px
from sklearn.decomposition import PCA

X = f.select(s.contains("zscore"))

n_components = 3
pca = PCA(n_components=n_components).fit(X)

t_pca = ibis.memtable(pca.transform(X)).rename(
    {"pc1": "col0", "pc2": "col1", "pc3": "col2"}
)

f = f.mutate(row_number=ibis.row_number().over()).join(
    t_pca.mutate(row_number=ibis.row_number().over()),
    "row_number",
)

px.scatter_3d(
    f.to_pandas(),
    x="pc1",
    y="pc2",
    z="pc3",
    color="species",
    symbol="island",
)
1
Import data science libraries
2
Select “features” (numeric columns) as X
3
Compute PCA
4
Create a table from the PCA results
5
Join the PCA results to the original table
6
Plot the results

Input and output

Ibis supports a variety of input and output options.

Data platforms

You can connect Ibis to any supported backend to read and write data in backend-native tables.

Code
con = ibis.duckdb.connect("penguins.ddb")
t = con.create_table("penguins", t.to_pyarrow(), overwrite=True)
con = ibis.duckdb.connect("penguins.ddb")
t = con.table("penguins")
t.head(3)
1
Connect to a backend.
2
Load a table.
3
Display the table.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64int64int64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.71813750male  2007 │
│ Adelie Torgersen39.517.41863800female2007 │
│ Adelie Torgersen40.318.01953250female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
grouped = (
    t.group_by(["species", "island"])
    .aggregate(count=ibis._.count())
    .order_by(ibis.desc("count"))
)
con.create_table("penguins_grouped", grouped.to_pyarrow(), overwrite=True)
1
Create a lazily evaluated Ibis expression.
2
Write to a table.
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species    island     count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├───────────┼───────────┼───────┤
│ Gentoo   Biscoe   124 │
│ ChinstrapDream    68 │
│ Adelie   Dream    56 │
│ Adelie   Torgersen52 │
│ Adelie   Biscoe   44 │
└───────────┴───────────┴───────┘

File formats

Depending on the backend, you can read and write data in several file formats.

pip install 'ibis-framework[duckdb]'
t.to_csv("penguins.csv")
ibis.read_csv("penguins.csv").head(3)
1
Write the table to a CSV file. Dependent on backend.
2
Read the CSV file into a table. Dependent on backend.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64int64int64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.71813750male  2007 │
│ Adelie Torgersen39.517.41863800female2007 │
│ Adelie Torgersen40.318.01953250female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
pip install 'ibis-framework[duckdb,deltalake]'
t.to_delta("penguins.delta", mode="overwrite")
ibis.read_delta("penguins.delta").head(3)
1
Write the table to a Delta Lake table. Dependent on backend.
2
Read the Delta Lake table into a table. Dependent on backend.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64int64int64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.71813750male  2007 │
│ Adelie Torgersen39.517.41863800female2007 │
│ Adelie Torgersen40.318.01953250female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
pip install 'ibis-framework[duckdb]'
t.to_parquet("penguins.parquet")
ibis.read_parquet("penguins.parquet").head(3)
1
Write the table to a Parquet file. Dependent on backend.
2
Read the Parquet file into a table. Dependent on backend.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64int64int64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.71813750male  2007 │
│ Adelie Torgersen39.517.41863800female2007 │
│ Adelie Torgersen40.318.01953250female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

With other Python libraries

Ibis uses Apache Arrow for efficient data transfer to and from other libraries. Ibis tables implement the __dataframe__ and __array__ protocols, so you can pass them to any library that supports these protocols.

You can convert Ibis tables to pandas dataframes.

pip install pandas
df = t.to_pandas()
df.head(3)
1
Returns a pandas dataframe.
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
0 Adelie Torgersen 39.1 18.7 181.0 3750.0 male 2007
1 Adelie Torgersen 39.5 17.4 186.0 3800.0 female 2007
2 Adelie Torgersen 40.3 18.0 195.0 3250.0 female 2007

Or you can convert pandas dataframes to Ibis tables.

t = ibis.memtable(df)
t.head(3)
1
Returns an Ibis table.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.7181.03750.0male  2007 │
│ Adelie Torgersen39.517.4186.03800.0female2007 │
│ Adelie Torgersen40.318.0195.03250.0female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

You can convert Ibis tables to Polars dataframes.

pip install polars
import polars as pl

df = pl.from_arrow(t.to_pyarrow())
df.head(3)
shape: (3, 8)
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
str str f64 f64 f64 f64 str i64
"Adelie" "Torgersen" 39.1 18.7 181.0 3750.0 "male" 2007
"Adelie" "Torgersen" 39.5 17.4 186.0 3800.0 "female" 2007
"Adelie" "Torgersen" 40.3 18.0 195.0 3250.0 "female" 2007

Or Polars dataframes to Ibis tables.

t = ibis.memtable(df)
t.head(3)
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.7181.03750.0male  2007 │
│ Adelie Torgersen39.517.4186.03800.0female2007 │
│ Adelie Torgersen40.318.0195.03250.0female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

You can convert Ibis tables to PyArrow tables.

pip install pyarrow
t.to_pyarrow()
pyarrow.Table
species: string
island: string
bill_length_mm: double
bill_depth_mm: double
flipper_length_mm: double
body_mass_g: double
sex: string
year: int64
----
species: [["Adelie","Adelie","Adelie","Adelie","Adelie",...,"Chinstrap","Chinstrap","Chinstrap","Chinstrap","Chinstrap"]]
island: [["Torgersen","Torgersen","Torgersen","Torgersen","Torgersen",...,"Dream","Dream","Dream","Dream","Dream"]]
bill_length_mm: [[39.1,39.5,40.3,null,36.7,...,55.8,43.5,49.6,50.8,50.2]]
bill_depth_mm: [[18.7,17.4,18,null,19.3,...,19.8,18.1,18.2,19,18.7]]
flipper_length_mm: [[181,186,195,null,193,...,207,202,193,210,198]]
body_mass_g: [[3750,3800,3250,null,3450,...,4000,3400,3775,4100,3775]]
sex: [["male","female","female",null,"female",...,"male","female","male","male","female"]]
year: [[2007,2007,2007,2007,2007,...,2009,2009,2009,2009,2009]]

Or PyArrow batches:

t.to_pyarrow_batches()
<pyarrow.lib.RecordBatchReader at 0x7fff6c576ee0>

And you can convert PyArrow tables to Ibis tables.

ibis.memtable(t.to_pyarrow()).head(3)
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.7181.03750.0male  2007 │
│ Adelie Torgersen39.517.4186.03800.0female2007 │
│ Adelie Torgersen40.318.0195.03250.0female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

You can convert Ibis tables to torch tensors.

pip install torch
t.select(s.numeric()).limit(3).to_torch()
{'col2': tensor([39.1000, 39.5000, 40.3000], dtype=torch.float64),
 'col3': tensor([18.7000, 17.4000, 18.0000], dtype=torch.float64),
 'col4': tensor([181., 186., 195.], dtype=torch.float64),
 'col5': tensor([3750., 3800., 3250.], dtype=torch.float64),
 'col7': tensor([2007, 2007, 2007], dtype=torch.int16)}

You can directly call the __dataframe__ protocol on Ibis tables, though this is typically handled by the library you’re using.

t.__dataframe__()
<ibis.expr.types.dataframe_interchange.IbisDataFrame at 0x7fff6fc13cb0>

You can directly call the __array__ protocol on Ibis tables, though this is typically handled by the library you’re using.

t.__array__()
array([['Adelie', 'Torgersen', 39.1, ..., 3750.0, 'male', 2007],
       ['Adelie', 'Torgersen', 39.5, ..., 3800.0, 'female', 2007],
       ['Adelie', 'Torgersen', 40.3, ..., 3250.0, 'female', 2007],
       ...,
       ['Chinstrap', 'Dream', 49.6, ..., 3775.0, 'male', 2009],
       ['Chinstrap', 'Dream', 50.8, ..., 4100.0, 'male', 2009],
       ['Chinstrap', 'Dream', 50.2, ..., 3775.0, 'female', 2009]],
      dtype=object)

SQL + Python

Ibis has the ibis.to_sql to generate SQL strings.

In a Jupyter notebook or IPython shell session, the output of ibis.to_sql will be syntax highlighted.

In a plain Python REPL use print(ibis.to_sql(...)) to pretty print SQL.

Ibis uses SQLGlot under the hood to allow passing a dialect parameter to SQL methods.

dialect = "bigquery"
sql = ibis.to_sql(
    grouped,
    dialect=dialect,
)
sql
1
Set the dialect.
2
Convert the table to a SQL string.
3
Display the SQL string.
SELECT
  `t1`.`species`,
  `t1`.`island`,
  `t1`.`count`
FROM (
  SELECT
    `t0`.`species`,
    `t0`.`island`,
    COUNT(*) AS `count`
  FROM `penguins` AS `t0`
  GROUP BY
    1,
    2
) AS `t1`
ORDER BY
  `t1`.`count` DESC

You can chain Ibis expressions and .sql together.

con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie")
1
Chain .sql calls and Ibis expressions together.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species  island     count ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├─────────┼───────────┼───────┤
│ Adelie Dream    56 │
│ Adelie Torgersen52 │
│ Adelie Biscoe   44 │
└─────────┴───────────┴───────┘
dialect = "snowflake"
sql = ibis.to_sql(
    grouped,
    dialect=dialect,
)
sql
1
Set the dialect.
2
Convert the table to a SQL string.
3
Display the SQL string.
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" DESC NULLS LAST

You can chain Ibis expressions and .sql together.

con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie")
1
Chain .sql calls and Ibis expressions together.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species  island     count ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├─────────┼───────────┼───────┤
│ Adelie Dream    56 │
│ Adelie Torgersen52 │
│ Adelie Biscoe   44 │
└─────────┴───────────┴───────┘
dialect = "oracle"
sql = ibis.to_sql(
    grouped,
    dialect=dialect,
)
sql
1
Set the dialect.
2
Convert the table to a SQL string.
3
Display the SQL string.
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" "t0"
  GROUP BY
    "t0"."species",
    "t0"."island"
) "t1"
ORDER BY
  "t1"."count" DESC

You can chain Ibis expressions and .sql together.

con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie")
1
Chain .sql calls and Ibis expressions together.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species  island     count ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├─────────┼───────────┼───────┤
│ Adelie Dream    56 │
│ Adelie Torgersen52 │
│ Adelie Biscoe   44 │
└─────────┴───────────┴───────┘
dialect = "mysql"
sql = ibis.to_sql(
    grouped,
    dialect=dialect,
)
sql
1
Set the dialect.
2
Convert the table to a SQL string.
3
Display the SQL string.
SELECT
  `t1`.`species`,
  `t1`.`island`,
  `t1`.`count`
FROM (
  SELECT
    `t0`.`species`,
    `t0`.`island`,
    COUNT(*) AS `count`
  FROM `penguins` AS `t0`
  GROUP BY
    1,
    2
) AS `t1`
ORDER BY
  `t1`.`count` DESC

You can chain Ibis expressions and .sql together.

con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie")
1
Chain .sql calls and Ibis expressions together.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species  island     count ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├─────────┼───────────┼───────┤
│ Adelie Dream    56 │
│ Adelie Torgersen52 │
│ Adelie Biscoe   44 │
└─────────┴───────────┴───────┘
dialect = "mssql"
sql = ibis.to_sql(
    grouped,
    dialect=dialect,
)
sql
1
Set the dialect.
2
Convert the table to a SQL string.
3
Display the SQL string.
SELECT
  [t1].[species],
  [t1].[island],
  [t1].[count]
FROM (
  SELECT
    [t0].[species],
    [t0].[island],
    COUNT(*) AS [count]
  FROM [penguins] AS [t0]
  GROUP BY
    [t0].[species],
    [t0].[island]
) AS [t1]
ORDER BY
  [t1].[count] DESC

You can chain Ibis expressions and .sql together.

con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie")
1
Chain .sql calls and Ibis expressions together.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species  island     count ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├─────────┼───────────┼───────┤
│ Adelie Dream    56 │
│ Adelie Torgersen52 │
│ Adelie Biscoe   44 │
└─────────┴───────────┴───────┘
dialect = "postgres"
sql = ibis.to_sql(
    grouped,
    dialect=dialect,
)
sql
1
Set the dialect.
2
Convert the table to a SQL string.
3
Display the SQL string.
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" DESC NULLS LAST

You can chain Ibis expressions and .sql together.

con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie")
1
Chain .sql calls and Ibis expressions together.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species  island     count ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├─────────┼───────────┼───────┤
│ Adelie Dream    56 │
│ Adelie Torgersen52 │
│ Adelie Biscoe   44 │
└─────────┴───────────┴───────┘
dialect = "sqlite"
sql = ibis.to_sql(
    grouped,
    dialect=dialect,
)
sql
1
Set the dialect.
2
Convert the table to a SQL string.
3
Display the SQL string.
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" DESC

You can chain Ibis expressions and .sql together.

con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie")
1
Chain .sql calls and Ibis expressions together.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species  island     count ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├─────────┼───────────┼───────┤
│ Adelie Dream    56 │
│ Adelie Torgersen52 │
│ Adelie Biscoe   44 │
└─────────┴───────────┴───────┘
dialect = "trino"
sql = ibis.to_sql(
    grouped,
    dialect=dialect,
)
sql
1
Set the dialect.
2
Convert the table to a SQL string.
3
Display the SQL string.
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" DESC

You can chain Ibis expressions and .sql together.

con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie")
1
Chain .sql calls and Ibis expressions together.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species  island     count ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ stringstringint64 │
├─────────┼───────────┼───────┤
│ Adelie Dream    56 │
│ Adelie Torgersen52 │
│ Adelie Biscoe   44 │
└─────────┴───────────┴───────┘
Back to top