Ibis v6.1.0

release
blog
Author

Ibis team

Published

August 2, 2023

Overview

Ibis 6.1.0 is a minor release that includes new features, backend improvements, bug fixes, documentation improvements, and refactors. We are excited to see further adoption of the dataframe interchange protocol enabling visualization and other libraries to be used more easily with Ibis.

You can view the full changelog in the release notes.

If you’re new to Ibis, see how to install and the getting started tutorial.

To follow along with this blog, ensure you’re on 'ibis-framework>=6.1,<7'. First, we'll set up Ibis and fetch some sample data to use.

import ibis
import ibis.selectors as s

ibis.__version__
'8.0.0'
# interactive mode for demo purposes
ibis.options.interactive = True
t = ibis.examples.penguins.fetch()
t = t.mutate(year=t["year"].cast("str"))
t.limit(3)
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year   ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ string │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼────────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │ 2007   │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │ 2007   │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴────────┘

Ecosystem integrations

With the introduction of __dataframe__ support in v6.0.0 and efficiency improvements in this release, Ibis now works with Altair, Plotly, plotnine, and any other visualization library that implements the protocol. This lets you pass Ibis tables directly to visualization libraries without a .to_pandas() or .to_pyarrow() call for any of the 15+ supported backends, with data transferred efficiently through Apache Arrow.
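To make the idea of "implementing the protocol" concrete, here is a minimal standard-library sketch of how a consuming library might rely on __dataframe__ rather than on a concrete table type. The classes ToyTable and ToyInterchangeFrame and the function describe are hypothetical names invented for illustration, and only a small subset of the interchange object's methods is shown; this is not how Ibis or any plotting library is actually implemented.

```python
class ToyInterchangeFrame:
    """Hypothetical, partial stand-in for an interchange-protocol dataframe."""

    def __init__(self, columns):
        self._columns = columns  # dict of column name -> list of values

    def num_columns(self):
        return len(self._columns)

    def num_rows(self):
        return len(next(iter(self._columns.values()), []))

    def column_names(self):
        return list(self._columns)


class ToyTable:
    """Hypothetical producer: any object that exposes __dataframe__."""

    def __init__(self, columns):
        self._columns = columns

    def __dataframe__(self, nan_as_null=False, allow_copy=True):
        return ToyInterchangeFrame(self._columns)


def describe(obj):
    """Hypothetical consumer: depends only on the protocol, not the type."""
    if not hasattr(obj, "__dataframe__"):
        raise TypeError("object does not support the dataframe interchange protocol")
    frame = obj.__dataframe__()
    return frame.column_names(), frame.num_rows()


table = ToyTable({"species": ["Adelie", "Gentoo", "Chinstrap"], "count": [152, 124, 68]})
names, n_rows = describe(table)
print(names, n_rows)
```

Because the consumer only asks for __dataframe__, the same function would accept any producer that provides it, which is the property the visualization libraries above rely on.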

width = 640
height = 480
1. Set the width and height of the plots.
grouped = (
    t.group_by("species")
    .aggregate(count=ibis._.count())
    .order_by(ibis.desc("count"))
)
grouped
1. Set up the data to plot.
2. Display the table.
┏━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━┩
│ string    │ int64 │
├───────────┼───────┤
│ Adelie    │   152 │
│ Gentoo    │   124 │
│ Chinstrap │    68 │
└───────────┴───────┘
pip install altair
import altair as alt

chart = (
    alt.Chart(grouped)
    .mark_bar()
    .encode(
        x="species",
        y="count",
    )
    .properties(width=width, height=height)
)
chart
pip install plotly
import plotly.express as px

px.bar(
    grouped.to_pandas(),
    x="species",
    y="count",
    width=width,
    height=height,
)
pip install plotnine
from plotnine import ggplot, aes, geom_bar, theme

(
    ggplot(
        grouped,
        aes(x="species", y="count"),
    )
    + geom_bar(stat="identity")
    + theme(figure_size=(width / 100, height / 100))
)

A more modular, composable, and scalable way of working with data is taking shape with __dataframe__ and __array__ support in Ibis and, increasingly, across the Python data ecosystem. Let's combine the above with PCA, after some preprocessing in Ibis, to visualize all numeric columns in 3D.

import ibis.selectors as s


def transform(t):
    t = t.mutate(
        s.across(s.numeric(), {"zscore": lambda x: (x - x.mean()) / x.std()})
    ).dropna()
    return t


f = transform(t)
f
1. Import the selectors module.
2. Define a function to transform the table for code reuse (compute z-scores on numeric columns).
3. Apply the function to the table and assign it to a new variable.
4. Display the transformed table.
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year   ┃ bill_length_mm_zscore ┃ bill_depth_mm_zscore ┃ flipper_length_mm_zscore ┃ body_mass_g_zscore ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ string │ float64               │ float64              │ float64                  │ float64            │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼────────┼───────────────────────┼──────────────────────┼──────────────────────────┼────────────────────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │             -0.883205 │             0.784300 │                -1.416272 │          -0.563317 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │ 2007   │             -0.809939 │             0.126003 │                -1.060696 │          -0.500969 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │ 2007   │             -0.663408 │             0.429833 │                -0.420660 │          -1.186793 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │ 2007   │             -1.322799 │             1.088129 │                -0.562890 │          -0.937403 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │ 2007   │             -0.846572 │             1.746426 │                -0.776236 │          -0.688012 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │ 2007   │             -0.919837 │             0.328556 │                -1.416272 │          -0.719186 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │ 2007   │             -0.864888 │             1.240044 │                -0.420660 │           0.590115 │
│ Adelie  │ Torgersen │           41.1 │          17.6 │               182 │        3200 │ female │ 2007   │             -0.516876 │             0.227280 │                -1.345156 │          -1.249141 │
│ Adelie  │ Torgersen │           38.6 │          21.2 │               191 │        3800 │ male   │ 2007   │             -0.974787 │             2.050255 │                -0.705121 │          -0.500969 │
│ Adelie  │ Torgersen │           34.6 │          21.1 │               198 │        4400 │ male   │ 2007   │             -1.707443 │             1.999617 │                -0.207315 │           0.247203 │
│ …       │ …         │              … │             … │                 … │           … │ …      │ …      │                     … │                    … │                        … │                  … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────┘
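For readers who want to see the z-score formula from the transform function in isolation, the following standard-library sketch applies (x - mean) / std to a handful of values. It assumes the sample standard deviation (n - 1 denominator), and the five input values are just the first few bill lengths shown above, so the results will not match the table, which is computed over the full column.

```python
from statistics import mean, stdev


def zscores(values):
    """Return (x - mean) / std for each value, using the sample std (assumed)."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation, n - 1 denominator
    return [(v - mu) / sigma for v in values]


# A small illustrative subset, not the full penguins column.
bill_length_mm = [39.1, 39.5, 40.3, 36.7, 39.3]
z = zscores(bill_length_mm)
print([round(v, 3) for v in z])
```

After the transformation the values have mean 0 and unit standard deviation, which puts columns measured in different units on a comparable scale before PCA.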
pip install scikit-learn
import plotly.express as px
from sklearn.decomposition import PCA

X = f.select(s.contains("zscore"))

n_components = 3
pca = PCA(n_components=n_components).fit(X)

t_pca = ibis.memtable(pca.transform(X)).relabel(
    {"col0": "pc1", "col1": "pc2", "col2": "pc3"}
)

f = f.mutate(row_number=ibis.row_number().over()).join(
    t_pca.mutate(row_number=ibis.row_number().over()), "row_number"
)

px.scatter_3d(
    f.to_pandas(),
    x="pc1",
    y="pc2",
    z="pc3",
    color="species",
)
1. Import data science libraries.
2. Select “features” (numeric columns) as X.
3. Compute PCA.
4. Create a table from the PCA results.
5. Join the PCA results to the original table.
6. Plot the results.

Backends

Numerous backends received improvements. See the release notes for more details.

The DataFusion backend (and a few others) received several improvements from community member @mesejo with memtables and many new operations now supported. Some highlights include:

url = ibis.literal("https://ibis-project.org/concepts/why_ibis")
con = ibis.datafusion.connect()

con.execute(url.host())
'ibis-project.org'
con.execute(url.path())
'/concepts/why_ibis'
con.execute(ibis.literal("aaabbbaaa").re_search("bbb"))
True
con.execute(ibis.literal(5.56).ln())
1.715598108262491
con.execute(ibis.literal(5.56).log10())
0.7450747915820574
con.execute(ibis.literal(5.56).radians())
0.09704030641088472
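As a sanity check on the values above, the same quantities can be reproduced with the Python standard library alone. This sketch does not use Ibis or DataFusion; it only shows what the host, path, regex search, and math operations are expected to compute.

```python
import math
import re
from urllib.parse import urlparse

parsed = urlparse("https://ibis-project.org/concepts/why_ibis")
host = parsed.hostname  # counterpart of url.host()
path = parsed.path  # counterpart of url.path()

# Counterpart of re_search: does the pattern occur anywhere in the string?
found = re.search("bbb", "aaabbbaaa") is not None

ln_value = math.log(5.56)  # natural logarithm
log10_value = math.log10(5.56)  # base-10 logarithm
radians_value = math.radians(5.56)  # degrees -> radians

print(host, path, found, ln_value, log10_value, radians_value)
```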

Some remaining gaps in CREATE TABLE DDL options for BigQuery have been filled in, including the ability to pass in overwrite=True for table creation.

The PySpark backend now supports reading/writing Delta Lake tables. Your PySpark session must be configured to use the Delta Lake package and you must have the delta package installed in your environment.

t = ibis.read_delta("/path/to/delta")

...

t.to_delta("/path/to/delta", mode="overwrite")

The .sql API is now supported in Trino, enabling you to chain Ibis and SQL together.

Scalar Python UDFs are now supported in SQLite.
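To illustrate what a scalar Python UDF means at the SQLite level, here is a sketch using only the standard-library sqlite3 module. It is not the Ibis UDF API, and whether Ibis registers functions through exactly this hook is an assumption; the function name add_one is hypothetical.

```python
import sqlite3


def add_one(x):
    """Scalar function: one input value per row, one output value per row."""
    return None if x is None else x + 1


con = sqlite3.connect(":memory:")
# Register the Python callable under a SQL-visible name with one argument.
con.create_function("add_one", 1, add_one)

result = con.execute("SELECT add_one(41)").fetchone()[0]
null_result = con.execute("SELECT add_one(NULL)").fetchone()[0]
print(result, null_result)
```

SQLite calls the Python function once per row, so this style of UDF trades speed for flexibility.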

Additionally, URL parsing has been added:

con = ibis.sqlite.connect()

con.execute(url.host())
'ibis-project.org'
con.execute(url.path())
'/concepts/why_ibis'

URL parsing support was also added to the pandas backend:

con = ibis.pandas.connect()

con.execute(url.host())
'ibis-project.org'
con.execute(url.path())
'/concepts/why_ibis'

Functionality

Various new features were added.

.nunique() supported on tables

You can now call .nunique() on tables to get the number of unique rows.

# how many unique rows are there? equivalent to `.count()` in this case
t.nunique()

344
# how many unique species/island/year combinations are there?
t.select("species", "island", "year").nunique()

15
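Conceptually, counting unique rows amounts to counting distinct tuples of column values. The sketch below shows that idea in plain Python on a few made-up rows (hypothetical data, not the penguins table); it is an analogy for what .nunique() returns, not how any backend computes it.

```python
# Hypothetical rows of (species, island, year); the first two are duplicates.
rows = [
    ("Adelie", "Torgersen", "2007"),
    ("Adelie", "Torgersen", "2007"),
    ("Adelie", "Biscoe", "2008"),
    ("Gentoo", "Biscoe", "2008"),
]

n_rows = len(rows)  # analogous to .count()
n_unique = len(set(rows))  # analogous to .nunique() on the table

print(n_rows, n_unique)
```

When no row is duplicated, as in the full penguins table above, the two numbers coincide.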

to_sql returns a str type

The ibis.expr.sql.SQLString type resulting from to_sql is now a proper str subclass, enabling use without casting to str first.

type(ibis.to_sql(t))
ibis.expr.sql.SQLString
issubclass(type(ibis.to_sql(t)), str)
True
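The practical effect of being a str subclass can be seen with a toy class. ToySQLString below is a hypothetical stand-in, not the Ibis implementation; it only demonstrates that instances of a str subclass can be used anywhere a plain string is expected.

```python
class ToySQLString(str):
    """Hypothetical stand-in for a str subclass that carries SQL text."""


query = ToySQLString("SELECT * FROM penguins")

# All str methods and operators work without an explicit str(...) cast.
starts = query.startswith("SELECT")
combined = "EXPLAIN " + query
is_string = isinstance(query, str)

print(starts, combined, is_string)
```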

Allow mixing literals and columns in ibis.array

Note that arrays must still be of a single type.

ibis.array([t["species"], "hello"])
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ Array()             ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ array<string>       │
├─────────────────────┤
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ ['Adelie', 'hello'] │
│ …                   │
└─────────────────────┘
ibis.array([t["flipper_length_mm"], 42])
┏━━━━━━━━━━━━━━┓
┃ Array()      ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [181, 42]    │
│ [186, 42]    │
│ [195, 42]    │
│ [None, 42]   │
│ [193, 42]    │
│ [190, 42]    │
│ [181, 42]    │
│ [195, 42]    │
│ [193, 42]    │
│ [190, 42]    │
│ …            │
└──────────────┘

Array concat and repeat methods

You can still use + or * in typical Python fashion, with new and more explicit concat and repeat methods added in this release.

a = ibis.array([1, 2, 3])
b = ibis.array([4, 5])

c = a.concat(b)
c

[1, 2, ... +3]
c = a + b
c

[1, 2, ... +3]
b.repeat(2)

[4, 5, ... +2]
b * 2

[4, 5, ... +2]
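Since the interactive output above is truncated, it may help to see the full results. Plain Python lists follow the same + and * semantics, so this standard-library sketch shows the element order one would expect from concat and repeat; it uses lists rather than Ibis arrays.

```python
a = [1, 2, 3]
b = [4, 5]

concatenated = a + b  # same element order as a.concat(b)
repeated = b * 2  # same element order as b.repeat(2)

print(concatenated)
print(repeated)
```

The lengths, 5 and 4, line up with the "+3" and "+2" suffixes in the truncated Ibis output.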

Support boolean literals in the join API

This allows joins with a constant boolean predicate: True matches every pair of rows (a cross join), while False matches none.

t.join(t, True)
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year   ┃ species_right ┃ island_right ┃ bill_length_mm_right ┃ bill_depth_mm_right ┃ flipper_length_mm_right ┃ body_mass_g_right ┃ sex_right ┃ year_right ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ string │ string        │ string       │ float64              │ float64             │ int64                   │ int64             │ string    │ string     │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼────────┼───────────────┼──────────────┼──────────────────────┼─────────────────────┼─────────────────────────┼───────────────────┼───────────┼────────────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 39.1 │                18.7 │                     181 │              3750 │ male      │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 39.5 │                17.4 │                     186 │              3800 │ female    │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 40.3 │                18.0 │                     195 │              3250 │ female    │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 36.7 │                19.3 │                     193 │              3450 │ female    │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 39.3 │                20.6 │                     190 │              3650 │ male      │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 38.9 │                17.8 │                     181 │              3625 │ female    │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 39.2 │                19.6 │                     195 │              4675 │ male      │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 34.1 │                18.1 │                     193 │              3475 │ NULL      │ 2007       │
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ Adelie        │ Torgersen    │                 42.0 │                20.2 │                     190 │              4250 │ NULL      │ 2007       │
│ …       │ …         │              … │             … │                 … │           … │ …      │ …      │ …             │ …            │                    … │                   … │                       … │                 … │ …         │ …          │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴────────┴───────────────┴──────────────┴──────────────────────┴─────────────────────┴─────────────────────────┴───────────────────┴───────────┴────────────┘
t.join(t, False)
┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year   ┃ species_right ┃ island_right ┃ bill_length_mm_right ┃ bill_depth_mm_right ┃ flipper_length_mm_right ┃ body_mass_g_right ┃ sex_right ┃ year_right ┃
┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ string  │ string │ float64        │ float64       │ int64             │ int64       │ string │ string │ string        │ string       │ float64              │ float64             │ int64                   │ int64             │ string    │ string     │
└─────────┴────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴────────┴───────────────┴──────────────┴──────────────────────┴─────────────────────┴─────────────────────────┴───────────────────┴───────────┴────────────┘
t.join(t, False, how="outer")
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year   ┃ species_right ┃ island_right ┃ bill_length_mm_right ┃ bill_depth_mm_right ┃ flipper_length_mm_right ┃ body_mass_g_right ┃ sex_right ┃ year_right ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ string │ string        │ string       │ float64              │ float64             │ int64                   │ int64             │ string    │ string     │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼────────┼───────────────┼──────────────┼──────────────────────┼─────────────────────┼─────────────────────────┼───────────────────┼───────────┼────────────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │ 2007   │ NULL          │ NULL         │                 NULL │                NULL │                    NULL │              NULL │ NULL      │ NULL       │
│ …       │ …         │              … │             … │                 … │           … │ …      │ …      │ …             │ …            │                    … │                   … │                       … │                 … │ …         │ …          │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴────────┴───────────────┴──────────────┴──────────────────────┴─────────────────────┴─────────────────────────┴───────────────────┴───────────┴────────────┘
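The three results above follow from ordinary join semantics with a constant predicate. The toy nested-loop join below, written in plain Python over lists of tuples, is a sketch of those semantics only; the function name join and the sample rows are hypothetical, and no backend executes joins this way.

```python
from itertools import product


def join(left, right, predicate, how="inner"):
    """Toy nested-loop join; `predicate` is a constant True or False."""
    matched = [l + r for l, r in product(left, right) if predicate]
    if how == "inner":
        return matched
    if how == "outer":
        # Unmatched rows from either side are kept and padded with None.
        left_hit = {l for l, r in product(left, right) if predicate}
        right_hit = {r for l, r in product(left, right) if predicate}
        pad_left = (None,) * len(left[0])
        pad_right = (None,) * len(right[0])
        unmatched_left = [l + pad_right for l in left if l not in left_hit]
        unmatched_right = [pad_left + r for r in right if r not in right_hit]
        return matched + unmatched_left + unmatched_right
    raise ValueError(f"unsupported join type: {how}")


rows = [("Adelie", 181), ("Gentoo", 217)]  # hypothetical sample rows

cross = join(rows, rows, True)  # every pair of rows
inner_false = join(rows, rows, False)  # no pairs match
outer_false = join(rows, rows, False, how="outer")  # all rows kept, padded

print(len(cross), len(inner_false), len(outer_false))
```

With a False predicate nothing matches, so the inner join is empty while the outer join keeps every row from both sides with nulls in the opposite columns, matching the NULL-filled output above.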

Refactors

Several internal refactors that shouldn't affect normal usage were made. See the release notes for more details.

Wrapping up

Ibis v6.1.0 brings exciting enhancements to the library that enable broader ecosystem adoption of Python standards.

As always, try Ibis by installing and getting started.

If you run into any issues or find support is lacking for your backend, open an issue or discussion and let us know!
