Chaining expressions
Expressions can easily be chained using the deferred expression API, also known as the Underscore (`_`) API.
In this guide, we use the `_` API to concisely create column expressions and then chain table expressions.
Setup
To get started, import `_` from `ibis`:
import ibis
from ibis import _
import pandas as pd
Let’s create two in-memory tables using `ibis.memtable`, an API introduced in Ibis 3.2:
df1 = pd.DataFrame({'x': range(5), 'y': list('ab')*2 + list('e')})
t1 = ibis.memtable(df1)

df2 = pd.DataFrame({'x': range(10), 'z': list(reversed(list('ab')*2 + list('e')))*2})
t2 = ibis.memtable(df2)
Creating column expressions
We can use `_` to create new column expressions without explicit reference to the previous table expression:
# We can pass a deferred expression into a function:
def modf(t):
    return t.x % 3

xmod = modf(_)

# We can create ColumnExprs like aggregate expressions:
ymax = _.y.max()
zmax = _.z.max()
zct = _.z.count()
Chaining Ibis expressions
We can also use `_` to chain Ibis expressions in one Python expression, where each `_` refers to the table produced by the preceding step:
join = (
    t1
    # _ is t1
    .join(t2, _.x == t2.x)
    # `xmod` is a deferred expression:
    .mutate(xmod=xmod)
    # _ is the TableExpression after mutate:
    .group_by(_.xmod)
    # `ymax` and `zmax` are ColumnExpressions derived from a deferred expression:
    .aggregate(ymax=ymax, zmax=zmax)
    # _ is the aggregation result:
    .filter(_.ymax == _.zmax)
    # _ is the filtered result, and re-create xmod in t2 using modf:
    .join(t2, _.xmod == modf(t2))
    # _ is the second join result:
    .join(t1, _.xmod == modf(t1))
    # _ is the third join result:
    .select(_.x, _.y, _.z)
    # Finally, _ is the selection result:
    .order_by(_.x)
)