An immutable and lazy dataframe.
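You will not create Table objects directly. Instead, you will create one from existing data or from a backend connection, for example with ibis.memtable or ibis.table as in the examples below.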
Methods
aggregate
Aggregate a table with a given set of reductions, grouping by the expressions in by.
alias
Create a table expression with the specific name given by alias.
as_scalar
Inform ibis that the table expression should be treated as a scalar.
as_table
Promote the expression to a table.
asof_join
Perform an “as-of” join between left and right.
bind
Bind column values to a table expression.
cache
Cache the provided expression.
cast
Cast the columns of a table.
compile
Compile to an execution target.
count
Compute the number of rows in the table.
cross_join
Compute the cross join of a sequence of tables.
describe
Return summary information about a table.
difference
Compute the set difference of multiple table expressions.
distinct
Return a Table with duplicate rows removed.
drop
Remove fields from a table.
drop_null
Remove rows with null values from the table.
dropna
Deprecated - use drop_null instead.
equals
Return whether this expression is structurally equivalent to other.
execute
Execute an expression against its backend if one exists.
fill_null
Fill null values in a table expression.
fillna
Deprecated - use fill_null instead.
filter
Select rows from table based on predicates.
get_backend
Get the current Ibis backend of the expression.
get_name
Return the fully qualified name of the table.
group_by
Create a grouped table expression.
head
Select the first n rows of a table.
info
Return summary information about a table.
intersect
Compute the set intersection of multiple table expressions.
join
Perform a join between two tables.
limit
Select n rows from self starting at offset.
mutate
Add columns to a table expression.
nunique
Compute the number of unique rows in the table.
order_by
Sort a table by one or more expressions.
pipe
Compose f with self.
pivot_longer
Transform a table from wider to longer.
pivot_wider
Pivot a table to a wider format.
preview
Return a subset as a Rich Table.
relabel
Deprecated in favor of Table.rename.
relocate
Relocate columns before or after other specified columns.
rename
Rename columns in the table.
rowid
A unique integer per row.
sample
Sample a fraction of rows from a table.
schema
Return the Schema for this table.
select
Compute a new table expression using exprs and named_exprs.
sql
Run a SQL query against a table expression.
to_array
Deprecated - use as_scalar instead.
to_csv
Write the results of executing the given expression to a CSV file.
to_delta
Write the results of executing the given expression to a Delta Lake table.
to_pandas
Convert a table expression to a pandas DataFrame.
to_pandas_batches
Execute expression and return an iterator of pandas DataFrames.
to_parquet
Write the results of executing the given expression to a parquet file.
to_parquet_dir
Write the results of executing the given expression to a parquet file in a directory.
to_polars
Execute expression and return results as a polars dataframe.
to_pyarrow
Execute expression and return results as a pyarrow table.
to_pyarrow_batches
Execute expression and return a RecordBatchReader.
to_torch
Execute an expression and return results as a dictionary of torch tensors.
try_cast
Cast the columns of a table.
unbind
Return an expression built on UnboundTable instead of backend-specific objects.
union
Compute the set union of multiple table expressions.
unnest
Unnest an array column from a table.
unpack
Project the struct fields of each of columns into self.
value_counts
Compute a frequency table of this table's values.
view
Create a new table expression distinct from the current one.
visualize
Visualize an expression as a GraphViz graph in the browser.
aggregate
aggregate(metrics=(), by=(), having=(), **kwargs)
Aggregate a table with a given set of reductions, grouping by the expressions in by.
Parameters
metrics
Sequence[ir.Scalar] | None
Aggregate expressions. These can be any scalar-producing expression, including aggregation functions like sum or literal values like ibis.literal(1).
()
by
Sequence[ir.Value] | None
Grouping expressions.
()
having
Sequence[ir.BooleanValue] | None
Post-aggregation filters. The shape requirements are the same as for metrics, but the output type for having is boolean. Warning: expressions like x is None return bool and will not generate a SQL comparison to NULL.
()
kwargs
ir.Value
Named aggregate expressions
{}
Returns
Table
An aggregate table expression
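The warning about x is None in the having parameter follows from how Python evaluates the is operator. The sketch below uses a toy Expr class (hypothetical, not an ibis class) to show why an identity check yields a plain bool while overloadable operations can build an expression. The isnull method here only mimics the idea of a null-check method; treat its name and behavior as assumptions of this sketch.

```python
# Hypothetical stand-in for illustration only; this is not an ibis class.
class Expr:
    """A toy expression node that records a SQL-like string."""

    def __init__(self, sql):
        self.sql = sql

    def isnull(self):
        # A method call can build a new expression describing a NULL check.
        return Expr(f"{self.sql} IS NULL")

    def __lt__(self, other):
        # Comparison operators can be overloaded to build expressions too.
        return Expr(f"{self.sql} < {other}")


x = Expr("x")

# `is` compares object identity and cannot be overloaded, so Python
# evaluates it immediately and produces a plain bool.
identity_check = x is None

# Overloadable operations return expression objects instead.
null_check = x.isnull()
comparison = x < 0.5

print(type(identity_check).__name__)
print(null_check.sql)
print(comparison.sql)
```

Because identity_check is already a bool by the time it would reach having, no NULL comparison can be generated from it.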
Examples
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
...     {
...         "fruit": ["apple", "apple", "banana", "orange"],
...         "price": [0.5, 0.5, 0.25, 0.33],
...     }
... )
>>> t
┏━━━━━━━━┳━━━━━━━━━┓
┃ fruit  ┃ price   ┃
┡━━━━━━━━╇━━━━━━━━━┩
│ string │ float64 │
├────────┼─────────┤
│ apple  │    0.50 │
│ apple  │    0.50 │
│ banana │    0.25 │
│ orange │    0.33 │
└────────┴─────────┘
>>> t.aggregate(
...     by=["fruit"],
...     total_cost=_.price.sum(),
...     avg_cost=_.price.mean(),
...     having=_.price.sum() < 0.5,
... )
┏━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ fruit  ┃ total_cost ┃ avg_cost ┃
┡━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ string │ float64    │ float64  │
├────────┼────────────┼──────────┤
│ banana │       0.25 │     0.25 │
│ orange │       0.33 │     0.33 │
└────────┴────────────┴──────────┘
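To make the order of operations concrete, here is a minimal plain-Python sketch of what the aggregate call above computes: group by the by key, compute the named metrics per group, then apply the having filter to the aggregated values. It only illustrates the semantics on the same fruit data and is not how ibis or any backend implements aggregation.

```python
from collections import defaultdict

rows = [
    {"fruit": "apple", "price": 0.5},
    {"fruit": "apple", "price": 0.5},
    {"fruit": "banana", "price": 0.25},
    {"fruit": "orange", "price": 0.33},
]

# Step 1: group rows by the `by` key.
groups = defaultdict(list)
for row in rows:
    groups[row["fruit"]].append(row["price"])

# Step 2: compute the named metrics for each group.
aggregated = {
    fruit: {"total_cost": sum(prices), "avg_cost": sum(prices) / len(prices)}
    for fruit, prices in groups.items()
}

# Step 3: apply the `having` filter to the aggregated values.
result = {
    fruit: metrics
    for fruit, metrics in aggregated.items()
    if metrics["total_cost"] < 0.5
}

for fruit, metrics in sorted(result.items()):
    print(fruit, metrics)
```

The apple group has a total of 1.0, so the having condition removes it after aggregation rather than filtering individual rows beforehand.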
alias
Create a table expression with the specific name given by alias.
This method is useful for exposing an ibis expression to the underlying backend for use in the Table.sql method.
.alias creates a temporary view in the database. This side effect will be removed in a future version of ibis and is not part of the public API.
Parameters
alias
str
Name of the child expression
required
Returns
Table
A table expression
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> expr = t.alias("pingüinos").sql('SELECT * FROM "pingüinos" LIMIT 5')
>>> expr
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
as_scalar
Inform ibis that the table expression should be treated as a scalar.
Note that the table must have exactly one column and one row for this to work. If the table has more than one column, an error will be raised at expression construction time. If the table has more than one row, an error will be raised by the backend when the expression is executed.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> heavy_gentoo = t.filter(t.species == "Gentoo", t.body_mass_g > 6200)
>>> from_that_island = t.filter(t.island == heavy_gentoo.select("island").as_scalar())
>>> from_that_island.species.value_counts().order_by("species")
┏━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ species ┃ species_count ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ string  │ int64         │
├─────────┼───────────────┤
│ Adelie  │            44 │
│ Gentoo  │           124 │
└─────────┴───────────────┘
as_table
Promote the expression to a table.
This method is a no-op for table expressions.
Examples
>>> import ibis
>>> t = ibis.table(dict(a="int"), name="t")
>>> s = t.as_table()
>>> t is s
True
asof_join
asof_join(
    left,
    right,
    on,
    predicates=(),
    tolerance=None,
    *,
    lname='',
    rname='{name}_right',
)
Perform an “as-of” join between left and right.
Similar to a left join except that the match is done on nearest key rather than equal keys.
Parameters
left
Table
Table expression
required
right
Table
Table expression
required
on
str | ir.BooleanColumn
Closest match inequality condition
required
predicates
str | ir.Column | Sequence[str | ir.Column]
Additional join predicates
()
tolerance
str | ir.IntervalScalar | None
Amount of time to look behind when joining
None
lname
str
A format string to use to rename overlapping columns in the left table (e.g. "left_{name}").
''
rname
str
A format string to use to rename overlapping columns in the right table (e.g. "right_{name}").
'{name}_right'
Examples
>>> from datetime import datetime, timedelta
>>> import ibis
>>> ibis.options.interactive = True
>>> sensors = ibis.memtable(
...     {
...         "site": ["a", "b", "a", "b", "a"],
...         "humidity": [0.3, 0.4, 0.5, 0.6, 0.7],
...         "event_time": [
...             datetime(2024, 11, 16, 12, 0, 15, 500000),
...             datetime(2024, 11, 16, 12, 0, 15, 700000),
...             datetime(2024, 11, 17, 18, 12, 14, 950000),
...             datetime(2024, 11, 17, 18, 12, 15, 120000),
...             datetime(2024, 11, 18, 18, 12, 15, 100000),
...         ],
...     }
... )
>>> events = ibis.memtable(
...     {
...         "site": ["a", "b", "a"],
...         "event_type": [
...             "cloud coverage",
...             "rain start",
...             "rain stop",
...         ],
...         "event_time": [
...             datetime(2024, 11, 16, 12, 0, 15, 400000),
...             datetime(2024, 11, 17, 18, 12, 15, 100000),
...             datetime(2024, 11, 18, 18, 12, 15, 100000),
...         ],
...     }
... )
This setup simulates time-series data by pairing irregularly collected sensor readings with weather events, enabling analysis of environmental conditions before each event. We will use the asof_join method to match each event with the most recent prior sensor reading from the sensors table at the same site.
>>> sensors
┏━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ site   ┃ humidity ┃ event_time              ┃
┡━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ float64  │ timestamp               │
├────────┼──────────┼─────────────────────────┤
│ a      │      0.3 │ 2024-11-16 12:00:15.500 │
│ b      │      0.4 │ 2024-11-16 12:00:15.700 │
│ a      │      0.5 │ 2024-11-17 18:12:14.950 │
│ b      │      0.6 │ 2024-11-17 18:12:15.120 │
│ a      │      0.7 │ 2024-11-18 18:12:15.100 │
└────────┴──────────┴─────────────────────────┘
>>> events
┏━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ site   ┃ event_type     ┃ event_time              ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ string         │ timestamp               │
├────────┼────────────────┼─────────────────────────┤
│ a      │ cloud coverage │ 2024-11-16 12:00:15.400 │
│ b      │ rain start     │ 2024-11-17 18:12:15.100 │
│ a      │ rain stop      │ 2024-11-18 18:12:15.100 │
└────────┴────────────────┴─────────────────────────┘
We can find the closest event to each sensor reading with a 1 second tolerance. Using the “site” column as a join predicate ensures we only match events that occurred at or near the same site as the sensor reading.
>>> tolerance = timedelta(seconds=1)
>>> sensors.asof_join(events, on="event_time", predicates="site", tolerance=tolerance).drop(
...     "event_time_right"
... ).order_by("event_time")
┏━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ site   ┃ humidity ┃ event_time              ┃ site_right ┃ event_type     ┃
┡━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ string │ float64  │ timestamp               │ string     │ string         │
├────────┼──────────┼─────────────────────────┼────────────┼────────────────┤
│ a      │      0.3 │ 2024-11-16 12:00:15.500 │ a          │ cloud coverage │
│ b      │      0.4 │ 2024-11-16 12:00:15.700 │ NULL       │ NULL           │
│ a      │      0.5 │ 2024-11-17 18:12:14.950 │ NULL       │ NULL           │
│ b      │      0.6 │ 2024-11-17 18:12:15.120 │ b          │ rain start     │
│ a      │      0.7 │ 2024-11-18 18:12:15.100 │ a          │ rain stop      │
└────────┴──────────┴─────────────────────────┴────────────┴────────────────┘
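The matching rule behind the result above can be sketched in plain Python. This sketch assumes that a string on key means "the latest right-hand row whose event_time is at or before the left-hand event_time", restricted to equal site values and to the given tolerance; that reading is inferred from the output shown here rather than taken from the ibis implementation. The helper asof_match is hypothetical.

```python
from datetime import datetime, timedelta

sensors = [
    ("a", 0.3, datetime(2024, 11, 16, 12, 0, 15, 500000)),
    ("b", 0.4, datetime(2024, 11, 16, 12, 0, 15, 700000)),
    ("a", 0.5, datetime(2024, 11, 17, 18, 12, 14, 950000)),
    ("b", 0.6, datetime(2024, 11, 17, 18, 12, 15, 120000)),
    ("a", 0.7, datetime(2024, 11, 18, 18, 12, 15, 100000)),
]
events = [
    ("a", "cloud coverage", datetime(2024, 11, 16, 12, 0, 15, 400000)),
    ("b", "rain start", datetime(2024, 11, 17, 18, 12, 15, 100000)),
    ("a", "rain stop", datetime(2024, 11, 18, 18, 12, 15, 100000)),
]


def asof_match(site, when, tolerance):
    """Return the latest event type at `site` at or before `when`, within `tolerance`."""
    candidates = [
        (event_time, event_type)
        for event_site, event_type, event_time in events
        if event_site == site and timedelta(0) <= when - event_time <= tolerance
    ]
    # max() on (time, type) tuples picks the most recent candidate.
    return max(candidates)[1] if candidates else None


tolerance = timedelta(seconds=1)
matched = [asof_match(site, when, tolerance) for site, _, when in sensors]
print(matched)
```

Rows without a candidate map to None here, which corresponds to the NULL cells in the joined table.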
bind
Bind column values to a table expression.
This method handles the binding of every kind of column-like value that Ibis handles, including strings, integers, deferred expressions and selectors, to a table expression.
Parameters
args
Any
Column-like values to bind.
()
kwargs
Any
Column-like values to bind, with names.
{}
cache
Cache the provided expression.
All subsequent operations on the returned expression will be performed on the cached data. The lifetime of the cached table is tied to its Python references (i.e., it is released once the last reference to it is garbage collected). Alternatively, use the with statement or call the .release() method for more control.
This method is idempotent: calling it multiple times in succession will return the same value as the first call.
Subsequent evaluations will not recompute the expression so method chaining will not incur the overhead of caching more than once.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> heavy_computation = ibis.literal("Heavy Computation")
>>> cached_penguins = t.mutate(computation=heavy_computation).cache()
>>> cached_penguins
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃ computation       ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │ string            │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼───────────────────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │ Heavy Computation │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │ Heavy Computation │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │ …                 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴───────────────────┘
Explicit cache cleanup
>>> with t.mutate(computation=heavy_computation).cache() as cached_penguins:
...     cached_penguins
cast
Cast the columns of a table.
Similar to pandas.DataFrame.astype.
Parameters
schema
SchemaLike
Mapping, schema or iterable of pairs to use for casting
required
Examples
>>> import ibis
>>> import ibis.selectors as s
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.schema()
ibis.Schema {
species string
island string
bill_length_mm float64
bill_depth_mm float64
flipper_length_mm int64
body_mass_g int64
sex string
year int64
}
>>> cols = ["body_mass_g", "bill_length_mm"]
>>> t[cols].head()
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ body_mass_g ┃ bill_length_mm ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ int64       │ float64        │
├─────────────┼────────────────┤
│        3750 │           39.1 │
│        3800 │           39.5 │
│        3250 │           40.3 │
│        NULL │           NULL │
│        3450 │           36.7 │
└─────────────┴────────────────┘
Columns not present in the input schema will be passed through unchanged
>>> t.columns
('species',
 'island',
 'bill_length_mm',
 'bill_depth_mm',
 'flipper_length_mm',
 'body_mass_g',
 'sex',
 'year')
>>> expr = t.cast({"body_mass_g": "float64", "bill_length_mm": "int"})
>>> expr.select(*cols).head()
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ body_mass_g ┃ bill_length_mm ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ float64     │ int64          │
├─────────────┼────────────────┤
│      3750.0 │             39 │
│      3800.0 │             40 │
│      3250.0 │             40 │
│        NULL │           NULL │
│      3450.0 │             37 │
└─────────────┴────────────────┘
Columns that are in the input schema but not in the table raise an error
>>> t.cast({"foo": "string"})
---------------------------------------------------------------------------
IbisError                                 Traceback (most recent call last)
Cell In[47], line 1
----> 1 t.cast({"foo": "string"})
File ~/work/ibis/ibis/ibis/expr/types/relations.py:447, in Table.cast(self, schema)
    372 def cast(self, schema: SchemaLike) -> Table:
    373     """Cast the columns of a table.
    374
    375     Similar to `pandas.DataFrame.astype`.
   (...)
    445     ibis.common.exceptions.IbisError: Cast schema has fields that are not in the table: ['foo']
    446     """
--> 447     return self._cast(schema, cast_method="cast")
File ~/work/ibis/ibis/ibis/expr/types/relations.py:490, in Table._cast(self, schema, cast_method)
    488 columns = self.columns
    489 if missing_fields := frozenset(schema.names).difference(columns):
--> 490     raise com.IbisError(
    491         f"Cast schema has fields that are not in the table: {sorted(missing_fields)}"
    492     )
    494 for col in columns:
    495     if (new_type := schema.get(col)) is not None:
IbisError: Cast schema has fields that are not in the table: ['foo']
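The two rules illustrated above (unlisted columns pass through unchanged, unknown columns raise) can be sketched over plain dictionaries. The function cast_types below is hypothetical and works on name-to-type mappings only; it mirrors the missing-field check visible in the traceback but raises ValueError instead of IbisError so that it runs without ibis installed.

```python
def cast_types(table_schema, cast_schema):
    """Return a new column-to-type mapping after applying `cast_schema`."""
    # Reject cast targets that name columns the table does not have.
    missing = sorted(set(cast_schema) - set(table_schema))
    if missing:
        raise ValueError(
            f"Cast schema has fields that are not in the table: {missing}"
        )
    # Columns absent from `cast_schema` keep their original type.
    return {
        name: cast_schema.get(name, dtype)
        for name, dtype in table_schema.items()
    }


table_schema = {
    "species": "string",
    "body_mass_g": "int64",
    "bill_length_mm": "float64",
}
new_schema = cast_types(
    table_schema, {"body_mass_g": "float64", "bill_length_mm": "int64"}
)
print(new_schema)

try:
    cast_types(table_schema, {"foo": "string"})
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```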
compile
compile(limit=None, params=None, pretty=False)
Compile to an execution target.
Parameters
limit
int | None
An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py.
None
params
Mapping[ir.Value, Any] | None
Mapping of scalar parameter expressions to value
None
pretty
bool
In case of SQL backends, return a pretty formatted SQL query.
False
Returns
str
SQL query representation of the expression
Examples
>>> import ibis
>>> d = {"a": [1, 2, 3], "b": [4, 5, 6]}
>>> con = ibis.duckdb.connect()
>>> t = con.create_table("t", d)
>>> expr = t.mutate(c=t.a + t.b)
>>> expr.compile()
'SELECT "t0"."a", "t0"."b", "t0"."a" + "t0"."b" AS "c" FROM "memory"."main"."t" AS "t0"'
If you want to see the pretty formatted SQL query, set pretty to True.
>>> expr.compile(pretty=True)
'SELECT\n "t0"."a",\n "t0"."b",\n "t0"."a" + "t0"."b" AS "c"\nFROM "memory"."main"."t" AS "t0"'
If the expression does not have a backend, an error will be raised.
>>> t = ibis.memtable(d)
>>> expr = t.mutate(c=t.a + t.b)
>>> expr.compile()  # quartodoc: +EXPECTED_FAILURE
---------------------------------------------------------------------------
IbisError                                 Traceback (most recent call last)
Cell In[56], line 3
      1 t = ibis.memtable(d)
      2 expr = t.mutate(c=t.a + t.b)
----> 3 expr.compile()  # quartodoc: +EXPECTED_FAILURE
File ~/work/ibis/ibis/ibis/expr/types/core.py:506, in Expr.compile(self, limit, params, pretty)
    457 def compile(
    458     self,
    459     limit: int | None = None,
    460     params: Mapping[ir.Value, Any] | None = None,
    461     pretty: bool = False,
    462 ) -> str:
    463     r"""Compile to an execution target.
    464
    465     Parameters
   (...)
    504     [`ibis.to_sql()`](./expression-generic.qmd#ibis.to_sql)
    505     """
--> 506     return self._find_backend().compile(
    507         self, limit=limit, params=params, pretty=pretty
    508     )
File ~/work/ibis/ibis/ibis/expr/types/core.py:368, in Expr._find_backend(self, use_default)
    366     default = _default_backend() if use_default else None
    367     if default is None:
--> 368         raise IbisError(
    369             "Expression depends on no backends, and found no default"
    370         )
    371     return default
    373 if len(backends) > 1:
IbisError: Expression depends on no backends, and found no default
count
Compute the number of rows in the table.
Parameters
where
ir.BooleanValue | None
Optional boolean expression to filter rows when counting.
None
Returns
IntegerScalar
Number of rows in the table
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": ["foo", "bar", "baz"]})
>>> t
┏━━━━━━━━┓
┃ a      ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ foo    │
│ bar    │
│ baz    │
└────────┘
>>> t.count()
┌───┐
│ 3 │
└───┘
>>> t.count(t.a != "foo")
┌───┐
│ 2 │
└───┘
>>> type(t.count())
ibis.expr.types.numeric.IntegerScalar
cross_join
cross_join(left, right, *rest, lname='', rname='{name}_right')
Compute the cross join of a sequence of tables.
Parameters
left
Table
Left table
required
right
Table
Right table
required
rest
Table
Additional tables to cross join
()
lname
str
A format string to use to rename overlapping columns in the left table (e.g. "left_{name}").
''
rname
str
A format string to use to rename overlapping columns in the right table (e.g. "right_{name}").
'{name}_right'
Returns
Table
Cross join of left, right and rest
Examples
>>> import ibis
>>> import ibis.selectors as s
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.count()
┌─────┐
│ 344 │
└─────┘
>>> agg = t.drop("year").agg(s.across(s.numeric(), _.mean()))
>>> expr = t.cross_join(agg)
>>> expr
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃ bill_length_mm_right ┃ bill_depth_mm_right ┃ flipper_length_mm_right ┃ body_mass_g_right ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │ float64              │ float64             │ float64                 │ float64           │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼──────────────────────┼─────────────────────┼─────────────────────────┼───────────────────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │             43.92193 │            17.15117 │              200.915205 │       4201.754386 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │                    … │                   … │                       … │                 … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴──────────────────────┴─────────────────────┴─────────────────────────┴───────────────────┘
>>> expr.columns
('species',
 'island',
 'bill_length_mm',
 'bill_depth_mm',
 'flipper_length_mm',
 'body_mass_g',
 'sex',
 'year',
 'bill_length_mm_right',
 'bill_depth_mm_right',
 'flipper_length_mm_right',
 'body_mass_g_right')
>>> expr.count()
┌─────┐
│ 344 │
└─────┘
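Two things in this example are worth spelling out: the row count stays at 344 because the aggregate on the right has a single row, and the overlapping right-hand column names pick up the rname format string. The plain-Python sketch below illustrates both on toy data. The function cross_join here is a hypothetical stand-in that only handles the rname side of the renaming; it is not the ibis implementation.

```python
from itertools import product


def cross_join(left, right, rname="{name}_right"):
    """Pair every left row with every right row, renaming name clashes."""
    joined = []
    for left_row, right_row in product(left, right):
        row = dict(left_row)
        for name, value in right_row.items():
            # Overlapping right-hand names are rewritten with the format string.
            key = rname.format(name=name) if name in left_row else name
            row[key] = value
        joined.append(row)
    return joined


left = [
    {"species": "Adelie", "body_mass_g": 3750},
    {"species": "Gentoo", "body_mass_g": 5000},
]
# A single-row table, standing in for the aggregate in the example above.
right = [{"body_mass_g": 4375.0}]

rows = cross_join(left, right)
print(len(rows))
print(list(rows[0]))
```

The output has len(left) * len(right) rows, so joining against a one-row table leaves the left row count unchanged.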
describe
describe(quantile=(0.25, 0.5, 0.75))
Return summary information about a table.
Parameters
quantile
Sequence[ir.NumericValue | float]
The quantiles to compute for numerical columns. Defaults to (0.25, 0.5, 0.75).
(0.25, 0.5, 0.75)
Returns
Table
A table containing summary information about the columns of self.
Notes
This function computes summary statistics for each column in the table. For numerical columns, it computes statistics such as minimum, maximum, mean, standard deviation, and quantiles. For string columns, it computes the mode and the number of unique values.
Examples
>>> import ibis
>>> import ibis.selectors as s
>>> ibis.options.interactive = True
>>> p = ibis.examples.penguins.fetch()
>>> p.describe()
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
┃ name              ┃ pos   ┃ type    ┃ count ┃ nulls ┃ unique ┃ mode   ┃ mean        ┃ std        ┃ min     ┃ p25      ┃ p50     ┃ p75     ┃ max     ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
│ string            │ int16 │ string  │ int64 │ int64 │ int64  │ string │ float64     │ float64    │ float64 │ float64  │ float64 │ float64 │ float64 │
├───────────────────┼───────┼─────────┼───────┼───────┼────────┼────────┼─────────────┼────────────┼─────────┼──────────┼─────────┼─────────┼─────────┤
│ species           │     0 │ string  │   344 │     0 │      3 │ Adelie │        NULL │       NULL │    NULL │     NULL │    NULL │    NULL │    NULL │
│ island            │     1 │ string  │   344 │     0 │      3 │ Biscoe │        NULL │       NULL │    NULL │     NULL │    NULL │    NULL │    NULL │
│ bill_length_mm    │     2 │ float64 │   344 │     2 │    164 │ NULL   │   43.921930 │   5.459584 │    32.1 │   39.225 │   44.45 │    48.5 │    59.6 │
│ bill_depth_mm     │     3 │ float64 │   344 │     2 │     80 │ NULL   │   17.151170 │   1.974793 │    13.1 │   15.600 │   17.30 │    18.7 │    21.5 │
│ flipper_length_mm │     4 │ int64   │   344 │     2 │     55 │ NULL   │  200.915205 │  14.061714 │   172.0 │  190.000 │  197.00 │   213.0 │   231.0 │
│ body_mass_g       │     5 │ int64   │   344 │     2 │     94 │ NULL   │ 4201.754386 │ 801.954536 │  2700.0 │ 3550.000 │ 4050.00 │  4750.0 │  6300.0 │
│ sex               │     6 │ string  │   344 │    11 │      2 │ male   │        NULL │       NULL │    NULL │     NULL │    NULL │    NULL │    NULL │
│ year              │     7 │ int64   │   344 │     0 │      3 │ NULL   │ 2008.029070 │   0.818356 │  2007.0 │ 2007.000 │ 2008.00 │  2009.0 │  2009.0 │
└───────────────────┴───────┴─────────┴───────┴───────┴────────┴────────┴─────────────┴────────────┴─────────┴──────────┴─────────┴─────────┴─────────┘
>>> p.select(s.of_type("numeric")).describe()
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
┃ name              ┃ pos   ┃ type    ┃ count ┃ nulls ┃ unique ┃ mean        ┃ std        ┃ min     ┃ p25      ┃ p50     ┃ p75     ┃ max     ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
│ string            │ int16 │ string  │ int64 │ int64 │ int64  │ float64     │ float64    │ float64 │ float64  │ float64 │ float64 │ float64 │
├───────────────────┼───────┼─────────┼───────┼───────┼────────┼─────────────┼────────────┼─────────┼──────────┼─────────┼─────────┼─────────┤
│ flipper_length_mm │     2 │ int64   │   344 │     2 │     55 │  200.915205 │  14.061714 │   172.0 │  190.000 │  197.00 │   213.0 │   231.0 │
│ body_mass_g       │     3 │ int64   │   344 │     2 │     94 │ 4201.754386 │ 801.954536 │  2700.0 │ 3550.000 │ 4050.00 │  4750.0 │  6300.0 │
│ year              │     4 │ int64   │   344 │     0 │      3 │ 2008.029070 │   0.818356 │  2007.0 │ 2007.000 │ 2008.00 │  2009.0 │  2009.0 │
│ bill_length_mm    │     0 │ float64 │   344 │     2 │    164 │   43.921930 │   5.459584 │    32.1 │   39.225 │   44.45 │    48.5 │    59.6 │
│ bill_depth_mm     │     1 │ float64 │   344 │     2 │     80 │   17.151170 │   1.974793 │    13.1 │   15.600 │   17.30 │    18.7 │    21.5 │
└───────────────────┴───────┴─────────┴───────┴───────┴────────┴─────────────┴────────────┴─────────┴──────────┴─────────┴─────────┴─────────┘
>>> p.select(s.of_type("string")).describe()
┏━━━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━┓
┃ name    ┃ pos   ┃ type   ┃ count ┃ nulls ┃ unique ┃ mode   ┃
┡━━━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━┩
│ string  │ int16 │ string │ int64 │ int64 │ int64  │ string │
├─────────┼───────┼────────┼───────┼───────┼────────┼────────┤
│ sex     │     2 │ string │   344 │    11 │      2 │ male   │
│ species │     0 │ string │   344 │     0 │      3 │ Adelie │
│ island  │     1 │ string │   344 │     0 │      3 │ Biscoe │
└─────────┴───────┴────────┴───────┴───────┴────────┴────────┘
difference
difference(table, *rest, distinct=True)
Compute the set difference of multiple table expressions.
The input tables must have identical schemas.
Parameters
table
Table
A table expression
required
*rest
Table
Additional table expressions
()
distinct
bool
Only diff distinct rows not occurring in the calling table
True
Returns
Table
The rows present in self that are not present in tables.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t1 = ibis.memtable({"a": [1, 2]})
>>> t1
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     2 │
└───────┘
>>> t2 = ibis.memtable({"a": [2, 3]})
>>> t2
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
│     3 │
└───────┘
>>> t1.difference(t2)
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
└───────┘
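The effect of the distinct flag is easiest to see on inputs with duplicates. The plain-Python sketch below assumes that distinct=True behaves like a set difference (SQL EXCEPT) and distinct=False like a multiset difference (SQL EXCEPT ALL); that mapping is an assumption of this sketch and actual behavior may vary by backend. The function difference is a hypothetical stand-in operating on lists of values.

```python
from collections import Counter


def difference(left, right, distinct=True):
    """Rows of `left` that do not appear in `right`."""
    if distinct:
        # Set semantics: duplicates collapse before subtracting.
        return sorted(set(left) - set(right))
    # Multiset semantics: each right row cancels at most one left row.
    remaining = Counter(left) - Counter(right)
    return sorted(remaining.elements())


t1 = [1, 2]
t2 = [2, 3]
print(difference(t1, t2))

# With duplicates, the two modes give different answers.
print(difference([1, 1, 2, 2], [2], distinct=True))
print(difference([1, 1, 2, 2], [2], distinct=False))
```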
distinct
distinct(on=None, keep='first')
Return a Table with duplicate rows removed.
Similar to pandas.DataFrame.drop_duplicates().
Parameters
on
str | Iterable[str] | s.Selector | None
Only consider certain columns for identifying duplicates. By default, deduplicate on all of the columns.
None
keep
Literal['first', 'last'] | None
Determines which duplicates to keep.
- "first": Drop duplicates except for the first occurrence.
- "last": Drop duplicates except for the last occurrence.
- None: Drop all duplicates.
'first'
Examples
>>> import ibis
>>> import ibis.examples as ex
>>> import ibis.selectors as s
>>> ibis.options.interactive = True
>>> t = ex.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
Compute the distinct rows of a subset of columns
>>> t[["species", "island"]].distinct().order_by(s.all())
┏━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ species   ┃ island    ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━┩
│ string    │ string    │
├───────────┼───────────┤
│ Adelie    │ Biscoe    │
│ Adelie    │ Dream     │
│ Adelie    │ Torgersen │
│ Chinstrap │ Dream     │
│ Gentoo    │ Biscoe    │
└───────────┴───────────┘
Drop all duplicate rows except the first
>>> t.distinct(on=["species", "island"], keep="first").order_by(s.all())
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├───────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie    │ Biscoe    │           37.8 │          18.3 │               174 │        3400 │ female │  2007 │
│ Adelie    │ Dream     │           39.5 │          16.7 │               178 │        3250 │ female │  2007 │
│ Adelie    │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Chinstrap │ Dream     │           46.5 │          17.9 │               192 │        3500 │ female │  2007 │
│ Gentoo    │ Biscoe    │           46.1 │          13.2 │               211 │        4500 │ female │  2007 │
└───────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
Drop all duplicate rows except the last
>>> t.distinct(on=["species", "island"], keep="last").order_by(s.all())
βββββββββββββ³ββββββββββββ³βββββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββ³ββββββββββββββ³βββββββββ³ββββββββ
β species β island β bill_length_mm β bill_depth_mm β flipper_length_mm β body_mass_g β sex β year β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β float64 β float64 β int64 β int64 β string β int64 β
βββββββββββββΌββββββββββββΌβββββββββββββββββΌββββββββββββββββΌββββββββββββββββββββΌββββββββββββββΌβββββββββΌββββββββ€
β Adelie β Biscoe β 42.7 β 18.3 β 196 β 4075 β male β 2009 β
β Adelie β Dream β 41.5 β 18.5 β 201 β 4000 β male β 2009 β
β Adelie β Torgersen β 43.1 β 19.2 β 197 β 3500 β male β 2009 β
β Chinstrap β Dream β 50.2 β 18.7 β 198 β 3775 β female β 2009 β
β Gentoo β Biscoe β 49.9 β 16.1 β 213 β 5400 β male β 2009 β
βββββββββββββ΄ββββββββββββ΄βββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββ΄βββββββββ΄ββββββββ
Drop all duplicated rows
>>> expr = t.distinct(on=["species", "island", "year", "bill_length_mm"], keep=None)
>>> expr.count()
┌─────┐
│ 273 │
└─────┘
>>> t.count()
┌─────┐
│ 344 │
└─────┘
You can pass selectors to on
>>> t.distinct(on=~s.numeric())
βββββββββββ³ββββββββββββ³βββββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββ³ββββββββββββββ³βββββββββ³ββββββββ
β species β island β bill_length_mm β bill_depth_mm β flipper_length_mm β body_mass_g β sex β year β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β float64 β float64 β int64 β int64 β string β int64 β
βββββββββββΌββββββββββββΌβββββββββββββββββΌββββββββββββββββΌββββββββββββββββββββΌββββββββββββββΌβββββββββΌββββββββ€
β Adelie β Biscoe β 37.8 β 18.3 β 174 β 3400 β female β 2007 β
β Gentoo β Biscoe β 46.1 β 13.2 β 211 β 4500 β female β 2007 β
β Adelie β Biscoe β 37.7 β 18.7 β 180 β 3600 β male β 2007 β
β Gentoo β Biscoe β 50.0 β 16.3 β 230 β 5700 β male β 2007 β
β Gentoo β Biscoe β 44.5 β 14.3 β 216 β 4100 β NULL β 2007 β
β Adelie β Torgersen β 39.1 β 18.7 β 181 β 3750 β male β 2007 β
β Adelie β Torgersen β 34.1 β 18.1 β 193 β 3475 β NULL β 2007 β
β Adelie β Dream β 37.2 β 18.1 β 178 β 3900 β male β 2007 β
β Adelie β Torgersen β 39.5 β 17.4 β 186 β 3800 β female β 2007 β
β Adelie β Dream β 39.5 β 16.7 β 178 β 3250 β female β 2007 β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
βββββββββββ΄ββββββββββββ΄βββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββ΄βββββββββ΄ββββββββ
The only valid values of keep are "first", "last" and None.
>>> t.distinct(on="species", keep="second")
---------------------------------------------------------------------------
IbisError                                 Traceback (most recent call last)
Cell In[122], line 1
----> 1 t.distinct(on="species", keep="second")
File ~/work/ibis/ibis/ibis/expr/types/relations.py:1199, in Table.distinct(self, on, keep)
   1197     method = keep
   1198 else:
-> 1199     raise com.IbisError(
   1200         f"Invalid value for `keep`: {keep!r}, must be 'first', 'last' or None"
   1201     )
   1203 aggs = {col.get_name(): getattr(col, method)() for col in (~on).expand(self)}
   1204 res = self.aggregate(aggs, by=on, having=having)
IbisError: Invalid value for `keep`: 'second', must be 'first', 'last' or None
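The three accepted values of keep can be modelled in plain Python. The sketch below is not the Ibis implementation: the distinct helper and its list-of-dicts input are hypothetical, and it assumes that "first" and "last" follow the order of the input list, whereas a real backend may not guarantee row order.

```python
def distinct(rows, on, keep="first"):
    """Plain-Python model of `keep`; `rows` is a list of dicts (hypothetical helper)."""
    if keep not in ("first", "last", None):
        raise ValueError(f"Invalid value for `keep`: {keep!r}")
    groups = {}  # dicts preserve insertion order, so groups appear in first-seen order
    for row in rows:
        key = tuple(row[col] for col in on)
        groups.setdefault(key, []).append(row)
    if keep == "first":
        return [group[0] for group in groups.values()]
    if keep == "last":
        return [group[-1] for group in groups.values()]
    # keep=None: discard every row whose key occurs more than once
    return [group[0] for group in groups.values() if len(group) == 1]


rows = [
    {"species": "Adelie", "island": "Biscoe", "year": 2007},
    {"species": "Adelie", "island": "Biscoe", "year": 2009},
    {"species": "Gentoo", "island": "Biscoe", "year": 2008},
]
first = distinct(rows, on=["species", "island"], keep="first")
last = distinct(rows, on=["species", "island"], keep="last")
unique_only = distinct(rows, on=["species", "island"], keep=None)
```

With keep=None only the Gentoo row survives, because the Adelie/Biscoe key occurs twice.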
drop
Remove fields from a table.
Parameters
fields
str | Selector
Fields to drop. Strings and selectors are accepted.
()
Returns
Table
A table with all columns matching fields removed.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
βββββββββββ³ββββββββββββ³βββββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββ³ββββββββββββββ³βββββββββ³ββββββββ
β species β island β bill_length_mm β bill_depth_mm β flipper_length_mm β body_mass_g β sex β year β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β float64 β float64 β int64 β int64 β string β int64 β
βββββββββββΌββββββββββββΌβββββββββββββββββΌββββββββββββββββΌββββββββββββββββββββΌββββββββββββββΌβββββββββΌββββββββ€
β Adelie β Torgersen β 39.1 β 18.7 β 181 β 3750 β male β 2007 β
β Adelie β Torgersen β 39.5 β 17.4 β 186 β 3800 β female β 2007 β
β Adelie β Torgersen β 40.3 β 18.0 β 195 β 3250 β female β 2007 β
β Adelie β Torgersen β NULL β NULL β NULL β NULL β NULL β 2007 β
β Adelie β Torgersen β 36.7 β 19.3 β 193 β 3450 β female β 2007 β
β Adelie β Torgersen β 39.3 β 20.6 β 190 β 3650 β male β 2007 β
β Adelie β Torgersen β 38.9 β 17.8 β 181 β 3625 β female β 2007 β
β Adelie β Torgersen β 39.2 β 19.6 β 195 β 4675 β male β 2007 β
β Adelie β Torgersen β 34.1 β 18.1 β 193 β 3475 β NULL β 2007 β
β Adelie β Torgersen β 42.0 β 20.2 β 190 β 4250 β NULL β 2007 β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
βββββββββββ΄ββββββββββββ΄βββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββ΄βββββββββ΄ββββββββ
Drop one or more columns
>>> t.drop("species").head()
βββββββββββββ³βββββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββ³ββββββββββββββ³βββββββββ³ββββββββ
β island β bill_length_mm β bill_depth_mm β flipper_length_mm β body_mass_g β sex β year β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β float64 β float64 β int64 β int64 β string β int64 β
βββββββββββββΌβββββββββββββββββΌββββββββββββββββΌββββββββββββββββββββΌββββββββββββββΌβββββββββΌββββββββ€
β Torgersen β 39.1 β 18.7 β 181 β 3750 β male β 2007 β
β Torgersen β 39.5 β 17.4 β 186 β 3800 β female β 2007 β
β Torgersen β 40.3 β 18.0 β 195 β 3250 β female β 2007 β
β Torgersen β NULL β NULL β NULL β NULL β NULL β 2007 β
β Torgersen β 36.7 β 19.3 β 193 β 3450 β female β 2007 β
βββββββββββββ΄βββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββ΄βββββββββ΄ββββββββ
>>> t.drop("species", "bill_length_mm").head()
βββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββ³ββββββββββββββ³βββββββββ³ββββββββ
β island β bill_depth_mm β flipper_length_mm β body_mass_g β sex β year β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β float64 β int64 β int64 β string β int64 β
βββββββββββββΌββββββββββββββββΌββββββββββββββββββββΌββββββββββββββΌβββββββββΌββββββββ€
β Torgersen β 18.7 β 181 β 3750 β male β 2007 β
β Torgersen β 17.4 β 186 β 3800 β female β 2007 β
β Torgersen β 18.0 β 195 β 3250 β female β 2007 β
β Torgersen β NULL β NULL β NULL β NULL β 2007 β
β Torgersen β 19.3 β 193 β 3450 β female β 2007 β
βββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββ΄βββββββββ΄ββββββββ
Drop with selectors, mix and match
>>> import ibis.selectors as s
>>> t.drop("species", s.startswith("bill_")).head()
βββββββββββββ³ββββββββββββββββββββ³ββββββββββββββ³βββββββββ³ββββββββ
β island β flipper_length_mm β body_mass_g β sex β year β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β int64 β string β int64 β
βββββββββββββΌββββββββββββββββββββΌββββββββββββββΌβββββββββΌββββββββ€
β Torgersen β 181 β 3750 β male β 2007 β
β Torgersen β 186 β 3800 β female β 2007 β
β Torgersen β 195 β 3250 β female β 2007 β
β Torgersen β NULL β NULL β NULL β 2007 β
β Torgersen β 193 β 3450 β female β 2007 β
βββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββ΄βββββββββ΄ββββββββ
drop_null
drop_null(subset=None, how='any')
Remove rows with null values from the table.
Parameters
subset
Sequence[str] | str | None
Column names to consider when dropping nulls. By default all columns are considered.
None
how
Literal['any', 'all']
Whether a row is removed if at least one of its values is null ('any') or only if all of its values are null ('all').
'any'
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
βββββββββββ³ββββββββββββ³βββββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββ³ββββββββββββββ³βββββββββ³ββββββββ
β species β island β bill_length_mm β bill_depth_mm β flipper_length_mm β body_mass_g β sex β year β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β float64 β float64 β int64 β int64 β string β int64 β
βββββββββββΌββββββββββββΌβββββββββββββββββΌββββββββββββββββΌββββββββββββββββββββΌββββββββββββββΌβββββββββΌββββββββ€
β Adelie β Torgersen β 39.1 β 18.7 β 181 β 3750 β male β 2007 β
β Adelie β Torgersen β 39.5 β 17.4 β 186 β 3800 β female β 2007 β
β Adelie β Torgersen β 40.3 β 18.0 β 195 β 3250 β female β 2007 β
β Adelie β Torgersen β NULL β NULL β NULL β NULL β NULL β 2007 β
β Adelie β Torgersen β 36.7 β 19.3 β 193 β 3450 β female β 2007 β
β Adelie β Torgersen β 39.3 β 20.6 β 190 β 3650 β male β 2007 β
β Adelie β Torgersen β 38.9 β 17.8 β 181 β 3625 β female β 2007 β
β Adelie β Torgersen β 39.2 β 19.6 β 195 β 4675 β male β 2007 β
β Adelie β Torgersen β 34.1 β 18.1 β 193 β 3475 β NULL β 2007 β
β Adelie β Torgersen β 42.0 β 20.2 β 190 β 4250 β NULL β 2007 β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
βββββββββββ΄ββββββββββββ΄βββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββ΄βββββββββ΄ββββββββ
>>> t.count()
┌─────┐
│ 344 │
└─────┘
>>> t.drop_null(["bill_length_mm", "body_mass_g"]).count()
┌─────┐
│ 342 │
└─────┘
>>> t.drop_null(how="all").count()  # no rows where all columns are null
┌─────┐
│ 344 │
└─────┘
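The difference between how='any' and how='all' can be illustrated with a plain-Python model. This is only a sketch of the documented semantics: the drop_null function below is a hypothetical stand-in that works on a list of dicts, with None playing the role of NULL.

```python
def drop_null(rows, subset=None, how="any"):
    """Plain-Python model of drop_null over a list of dicts (hypothetical helper)."""
    if how not in ("any", "all"):
        raise ValueError(f"how must be 'any' or 'all', got {how!r}")
    if isinstance(subset, str):
        subset = [subset]  # a single column name is accepted as well
    check = any if how == "any" else all
    kept = []
    for row in rows:
        cols = subset if subset is not None else list(row)
        # Drop the row when any/all of the considered values are null
        if not check(row[col] is None for col in cols):
            kept.append(row)
    return kept


rows = [
    {"bill_length_mm": 39.1, "sex": "male"},
    {"bill_length_mm": None, "sex": None},
    {"bill_length_mm": 34.1, "sex": None},
]
any_null = drop_null(rows)                      # keeps only the first row
all_null = drop_null(rows, how="all")           # drops only the fully null row
by_column = drop_null(rows, "bill_length_mm")   # considers a single column
```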
dropna
dropna(subset=None, how='any')
Deprecated - use drop_null instead.
equals
Return whether this expression is structurally equivalent to other.
If you want to produce an equality expression, use == syntax.
Parameters
other
Another expression
required
Examples
>>> import ibis
>>> t1 = ibis.table(dict(a="int"), name="t")
>>> t2 = ibis.table(dict(a="int"), name="t")
>>> t1.equals(t2)
True
>>> v = ibis.table(dict(a="string"), name="v")
>>> t1.equals(v)
False
execute
execute(limit='default', params=None, **kwargs)
Execute an expression against its backend if one exists.
Parameters
limit
int | str | None
An integer to effect a specific row limit. A value of None means "no limit". The default is in ibis/config.py.
'default'
params
Mapping[ir.Value, Any] | None
Mapping of scalar parameter expressions to values
None
kwargs
Any
Keyword arguments
{}
Examples
>>> import ibis
>>> t = ibis.examples.penguins.fetch()
>>> t.execute()
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
0       Adelie  Torgersen            39.1           18.7              181.0       3750.0    male  2007
1       Adelie  Torgersen            39.5           17.4              186.0       3800.0  female  2007
2       Adelie  Torgersen            40.3           18.0              195.0       3250.0  female  2007
3       Adelie  Torgersen             NaN            NaN                NaN          NaN    None  2007
4       Adelie  Torgersen            36.7           19.3              193.0       3450.0  female  2007
..         ...        ...             ...            ...                ...          ...     ...   ...
339  Chinstrap      Dream            55.8           19.8              207.0       4000.0    male  2009
340  Chinstrap      Dream            43.5           18.1              202.0       3400.0  female  2009
341  Chinstrap      Dream            49.6           18.2              193.0       3775.0    male  2009
342  Chinstrap      Dream            50.8           19.0              210.0       4100.0    male  2009
343  Chinstrap      Dream            50.2           18.7              198.0       3775.0  female  2009
344 rows × 8 columns
Scalar parameters can be supplied dynamically during execution.
>>> species = ibis.param("string")
>>> expr = t.filter(t.species == species).order_by(t.bill_length_mm)
>>> expr.execute(limit=3, params={species: "Gentoo"})
   species  island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
0   Gentoo  Biscoe            40.9           13.7                214         4650  female  2007
1   Gentoo  Biscoe            41.7           14.7                210         4700  female  2009
2   Gentoo  Biscoe            42.0           13.5                210         4150  female  2007
fill_null
Fill null values in a table expression.
The result type is not guaranteed to be stable. For example, different library versions may impact whether a given backend promotes integer replacement values to floats.
Parameters
replacements
ir.Scalar | Mapping[str, ir.Scalar]
Value with which to fill nulls. If replacements is a mapping, the keys are column names that map to their replacement value. If passed as a scalar, all columns are filled with that value.
required
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.sex
┏━━━━━━━━┓
┃ sex    ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ male   │
│ female │
│ female │
│ NULL   │
│ female │
│ male   │
│ female │
│ male   │
│ NULL   │
│ NULL   │
│ …      │
└────────┘
>>> t.fill_null({"sex": "unrecorded"}).sex
┏━━━━━━━━━━━━┓
┃ sex        ┃
┡━━━━━━━━━━━━┩
│ string     │
├────────────┤
│ male       │
│ female     │
│ female     │
│ unrecorded │
│ female     │
│ male       │
│ female     │
│ male       │
│ unrecorded │
│ unrecorded │
│ …          │
└────────────┘
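The two accepted shapes of replacements (a mapping of column names to values, or a single scalar) can be sketched in plain Python. The fill_null function below is a hypothetical model over a list of dicts, not the Ibis implementation, and it ignores the type-compatibility checks a real backend would apply.

```python
from collections.abc import Mapping


def fill_null(rows, replacements):
    """Plain-Python model of fill_null over a list of dicts (hypothetical helper)."""
    filled = []
    for row in rows:
        new_row = {}
        for col, value in row.items():
            if value is None:
                if isinstance(replacements, Mapping):
                    # Columns missing from the mapping keep their nulls
                    value = replacements.get(col)
                else:
                    # A scalar is used for every column
                    value = replacements
            new_row[col] = value
        filled.append(new_row)
    return filled


rows = [
    {"sex": None, "body_mass_g": None},
    {"sex": "male", "body_mass_g": 3750},
]
by_mapping = fill_null(rows, {"sex": "unrecorded"})
by_scalar = fill_null(rows, 0)
```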
fillna
Deprecated - use fill_null instead.
filter
Select rows from table based on predicates.
Parameters
predicates
ir.BooleanValue | Sequence[ir.BooleanValue] | IfAnyAll
Boolean value expressions used to select rows in table.
()
Returns
Table
Filtered table expression
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
βββββββββββ³ββββββββββββ³βββββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββ³ββββββββββββββ³βββββββββ³ββββββββ
β species β island β bill_length_mm β bill_depth_mm β flipper_length_mm β body_mass_g β sex β year β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β float64 β float64 β int64 β int64 β string β int64 β
βββββββββββΌββββββββββββΌβββββββββββββββββΌββββββββββββββββΌββββββββββββββββββββΌββββββββββββββΌβββββββββΌββββββββ€
β Adelie β Torgersen β 39.1 β 18.7 β 181 β 3750 β male β 2007 β
β Adelie β Torgersen β 39.5 β 17.4 β 186 β 3800 β female β 2007 β
β Adelie β Torgersen β 40.3 β 18.0 β 195 β 3250 β female β 2007 β
β Adelie β Torgersen β NULL β NULL β NULL β NULL β NULL β 2007 β
β Adelie β Torgersen β 36.7 β 19.3 β 193 β 3450 β female β 2007 β
β Adelie β Torgersen β 39.3 β 20.6 β 190 β 3650 β male β 2007 β
β Adelie β Torgersen β 38.9 β 17.8 β 181 β 3625 β female β 2007 β
β Adelie β Torgersen β 39.2 β 19.6 β 195 β 4675 β male β 2007 β
β Adelie β Torgersen β 34.1 β 18.1 β 193 β 3475 β NULL β 2007 β
β Adelie β Torgersen β 42.0 β 20.2 β 190 β 4250 β NULL β 2007 β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
βββββββββββ΄ββββββββββββ΄βββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββ΄βββββββββ΄ββββββββ
>>> t.filter([t.species == "Adelie", t.body_mass_g > 3500]).sex.value_counts().drop_null(
...     "sex"
... ).order_by("sex")
ββββββββββ³ββββββββββββ
β sex β sex_count β
β‘βββββββββββββββββββββ©
β string β int64 β
ββββββββββΌββββββββββββ€
β female β 22 β
β male β 68 β
ββββββββββ΄ββββββββββββ
get_backend
Get the current Ibis backend of the expression.
Returns
BaseBackend
The Ibis backend.
Examples
>>> import ibis
>>> con = ibis.duckdb.connect()
>>> t = con.create_table("t", {"id": [1, 2, 3]})
>>> t.get_backend()
<ibis.backends.duckdb.Backend at 0x7fff86d6c690>
get_name
Return the fully qualified name of the table.
Examples
>>> import ibis
>>> con = ibis.duckdb.connect()
>>> t = con.create_table("t", {"id": [1, 2, 3]})
>>> t.get_name()
group_by
group_by(*by, **key_exprs)
Create a grouped table expression.
Similar to SQL's GROUP BY statement, or pandas .groupby() method.
Examples
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
...     {
...         "fruit": ["apple", "apple", "banana", "orange"],
...         "price": [0.5, 0.5, 0.25, 0.33],
...     }
... )
>>> t
┏━━━━━━━━┳━━━━━━━━━┓
┃ fruit  ┃ price   ┃
┡━━━━━━━━╇━━━━━━━━━┩
│ string │ float64 │
├────────┼─────────┤
│ apple  │    0.50 │
│ apple  │    0.50 │
│ banana │    0.25 │
│ orange │    0.33 │
└────────┴─────────┘
>>> t.group_by("fruit").agg(total_cost=_.price.sum(), avg_cost=_.price.mean()).order_by(
...     "fruit"
... )
┏━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ fruit  ┃ total_cost ┃ avg_cost ┃
┡━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ string │ float64    │ float64  │
├────────┼────────────┼──────────┤
│ apple  │       1.00 │     0.50 │
│ banana │       0.25 │     0.25 │
│ orange │       0.33 │     0.33 │
└────────┴────────────┴──────────┘
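For readers who think in terms of plain Python, the grouping and aggregation above corresponds roughly to the following sketch. The group_and_aggregate helper is hypothetical and only models the semantics; it does not call Ibis.

```python
from collections import defaultdict


def group_and_aggregate(rows, by, column):
    """Group (key, value) data by `by` and compute a sum and a mean per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[column])
    # sorted() mirrors the order_by("fruit") step of the example
    return [
        {by: key, "total_cost": sum(values), "avg_cost": sum(values) / len(values)}
        for key, values in sorted(groups.items())
    ]


rows = [
    {"fruit": "apple", "price": 0.5},
    {"fruit": "apple", "price": 0.5},
    {"fruit": "banana", "price": 0.25},
    {"fruit": "orange", "price": 0.33},
]
summary = group_and_aggregate(rows, by="fruit", column="price")
```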
head
Select the first n rows of a table.
Parameters
n
int
Number of rows to include
5
Returns
Table
self limited to n rows
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 1, 2], "b": ["c", "a", "a"]})
>>> t
┏━━━━━━━┳━━━━━━━━┓
┃ a     ┃ b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64 │ string │
├───────┼────────┤
│     1 │ c      │
│     1 │ a      │
│     2 │ a      │
└───────┴────────┘
>>> t.head(2)
┏━━━━━━━┳━━━━━━━━┓
┃ a     ┃ b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64 │ string │
├───────┼────────┤
│     1 │ c      │
│     1 │ a      │
└───────┴────────┘
info
Return summary information about a table.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.info()
βββββββββββββββββββββ³ββββββββββ³βββββββββββ³ββββββββ³ββββββββββββ³ββββββββββββ³ββββββββ
β name β type β nullable β nulls β non_nulls β null_frac β pos β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β boolean β int64 β int64 β float64 β int16 β
βββββββββββββββββββββΌββββββββββΌβββββββββββΌββββββββΌββββββββββββΌββββββββββββΌββββββββ€
β species β string β True β 0 β 344 β 0.000000 β 0 β
β island β string β True β 0 β 344 β 0.000000 β 1 β
β bill_length_mm β float64 β True β 2 β 342 β 0.005814 β 2 β
β bill_depth_mm β float64 β True β 2 β 342 β 0.005814 β 3 β
β flipper_length_mm β int64 β True β 2 β 342 β 0.005814 β 4 β
β body_mass_g β int64 β True β 2 β 342 β 0.005814 β 5 β
β sex β string β True β 11 β 333 β 0.031977 β 6 β
β year β int64 β True β 0 β 344 β 0.000000 β 7 β
βββββββββββββββββββββ΄ββββββββββ΄βββββββββββ΄ββββββββ΄ββββββββββββ΄ββββββββββββ΄ββββββββ
intersect
intersect(table, *rest, distinct=True)
Compute the set intersection of multiple table expressions.
The input tables must have identical schemas.
Parameters
table
Table
A table expression
required
*rest
Table
Additional table expressions
()
distinct
bool
Only return distinct rows
True
Returns
Table
A new table containing the intersection of all input tables.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t1 = ibis.memtable({"a": [1, 2, 2]})
>>> t1
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     2 │
│     2 │
└───────┘
>>> t2 = ibis.memtable({"a": [2, 2, 3]})
>>> t2
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
│     2 │
│     3 │
└───────┘
>>> t1.intersect(t2)
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
└───────┘
>>> t1.intersect(t2, distinct=False)
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
│     2 │
└───────┘
More than two table expressions can be intersected at once.
>>> t3 = ibis.memtable({"a": [2, 3, 3]})
>>> t1.intersect(t2, t3)
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
└───────┘
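The effect of distinct on the examples above can be reproduced with a multiset model built on collections.Counter. This is a sketch under the assumption that distinct=False behaves like a multiset intersection (each row kept as many times as it appears in every input); the intersect function here is a hypothetical stand-in operating on plain lists.

```python
from collections import Counter
from functools import reduce


def intersect(table, *rest, distinct=True):
    """Multiset model of intersect; each table is a list of hashable rows."""
    # Counter & Counter keeps the minimum count of each common element
    counts = reduce(lambda a, b: a & b, (Counter(t) for t in (table, *rest)))
    if distinct:
        return sorted(counts)          # one copy of each common row
    return sorted(counts.elements())   # common rows with their multiplicity


t1 = [1, 2, 2]
t2 = [2, 2, 3]
t3 = [2, 3, 3]
distinct_rows = intersect(t1, t2)
all_rows = intersect(t1, t2, distinct=False)
three_way = intersect(t1, t2, t3)
```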
join
join(left, right, predicates=(), how='inner', *, lname='', rname='{name}_right')
Perform a join between two tables.
Parameters
left
Table
Left table to join
required
right
Table
Right table to join
required
predicates
str | Sequence[str | ir.BooleanColumn | Literal[True] | Literal[False] | tuple[str | ir.Column | ir.Deferred, str | ir.Column | ir.Deferred]]
Condition(s) to join on. See examples for details.
()
how
JoinKind
Join method, e.g. "inner" or "left".
'inner'
lname
str
A format string to use to rename overlapping columns in the left table (e.g. "left_{name}").
''
rname
str
A format string to use to rename overlapping columns in the right table (e.g. "right_{name}").
'{name}_right'
Examples
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> movies = ibis.examples.ml_latest_small_movies.fetch()
>>> movies.head()
βββββββββββ³βββββββββββββββββββββββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββββββββββ
β movieId β title β genres β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β string β string β
βββββββββββΌβββββββββββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββββββββββββ€
β 1 β Toy Story (1995) β Adventure|Animation|Children|Comedy|Fantasy β
β 2 β Jumanji (1995) β Adventure|Children|Fantasy β
β 3 β Grumpier Old Men (1995) β Comedy|Romance β
β 4 β Waiting to Exhale (1995) β Comedy|Drama|Romance β
β 5 β Father of the Bride Part II (1995) β Comedy β
βββββββββββ΄βββββββββββββββββββββββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββββββββββ
>>> ratings = ibis.examples.ml_latest_small_ratings.fetch().drop("timestamp")
>>> ratings.head()
ββββββββββ³ββββββββββ³ββββββββββ
β userId β movieId β rating β
β‘βββββββββββββββββββββββββββββ©
β int64 β int64 β float64 β
ββββββββββΌββββββββββΌββββββββββ€
β 1 β 1 β 4.0 β
β 1 β 3 β 4.0 β
β 1 β 6 β 4.0 β
β 1 β 47 β 5.0 β
β 1 β 50 β 5.0 β
ββββββββββ΄ββββββββββ΄ββββββββββ
Equality left join on the shared movieId column. Note the _right suffix added to all overlapping columns from the right table (in this case only the "movieId" column).
>>> ratings.join(movies, "movieId", how="left").head(5)
ββββββββββ³ββββββββββ³ββββββββββ³ββββββββββββββββ³ββββββββββββββββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββββββββββ
β userId β movieId β rating β movieId_right β title β genres β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β float64 β int64 β string β string β
ββββββββββΌββββββββββΌββββββββββΌββββββββββββββββΌββββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββββββββββββ€
β 1 β 1 β 4.0 β 1 β Toy Story (1995) β Adventure|Animation|Children|Comedy|Fantasy β
β 1 β 3 β 4.0 β 3 β Grumpier Old Men (1995) β Comedy|Romance β
β 1 β 6 β 4.0 β 6 β Heat (1995) β Action|Crime|Thriller β
β 1 β 47 β 5.0 β 47 β Seven (a.k.a. Se7en) (1995) β Mystery|Thriller β
β 1 β 50 β 5.0 β 50 β Usual Suspects, The (1995) β Crime|Mystery|Thriller β
ββββββββββ΄ββββββββββ΄ββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββββββββββ
Explicit equality join using the default how value of "inner". Note how there is no _right suffix added to the movieId column since this is an inner join and the movieId column is part of the join condition.
>>> ratings.join(movies, ratings.movieId == movies.movieId).head(5)
ββββββββββ³ββββββββββ³ββββββββββ³ββββββββββββββββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββββββββββ
β userId β movieId β rating β title β genres β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β float64 β string β string β
ββββββββββΌββββββββββΌββββββββββΌββββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββββββββββββ€
β 1 β 1 β 4.0 β Toy Story (1995) β Adventure|Animation|Children|Comedy|Fantasy β
β 1 β 3 β 4.0 β Grumpier Old Men (1995) β Comedy|Romance β
β 1 β 6 β 4.0 β Heat (1995) β Action|Crime|Thriller β
β 1 β 47 β 5.0 β Seven (a.k.a. Se7en) (1995) β Mystery|Thriller β
β 1 β 50 β 5.0 β Usual Suspects, The (1995) β Crime|Mystery|Thriller β
ββββββββββ΄ββββββββββ΄ββββββββββ΄ββββββββββββββββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββββββββββ
>>> tags = ibis.examples.ml_latest_small_tags.fetch()
>>> tags.head()
ββββββββββ³ββββββββββ³ββββββββββββββββββ³βββββββββββββ
β userId β movieId β tag β timestamp β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β string β int64 β
ββββββββββΌββββββββββΌββββββββββββββββββΌβββββββββββββ€
β 2 β 60756 β funny β 1445714994 β
β 2 β 60756 β Highly quotable β 1445714996 β
β 2 β 60756 β will ferrell β 1445714992 β
β 2 β 89774 β Boxing story β 1445715207 β
β 2 β 89774 β MMA β 1445715200 β
ββββββββββ΄ββββββββββ΄ββββββββββββββββββ΄βββββββββββββ
You can join on multiple columns/conditions by passing in a sequence. Find all instances where a user both tagged and rated a movie:
>>> tags.join(ratings, ["userId", "movieId"]).head(5).order_by("userId")
ββββββββββ³ββββββββββ³βββββββββββββββββ³βββββββββββββ³ββββββββββ
β userId β movieId β tag β timestamp β rating β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β string β int64 β float64 β
ββββββββββΌββββββββββΌβββββββββββββββββΌβββββββββββββΌββββββββββ€
β 62 β 2 β Robin Williams β 1528843907 β 4.0 β
β 62 β 110 β sword fight β 1528152535 β 4.5 β
β 62 β 410 β gothic β 1525636609 β 4.5 β
β 62 β 2023 β mafia β 1525636733 β 5.0 β
β 62 β 2124 β quirky β 1525636846 β 5.0 β
ββββββββββ΄ββββββββββ΄βββββββββββββββββ΄βββββββββββββ΄ββββββββββ
To join a table with itself, call .view() on one of the arguments so that the two tables are distinct from each other.
For more complex join conditions, a condition can also be given as a 2-tuple of the form ({left_key}, {right_key}), where each key can be a Column, a Deferred expression, or a lambda of the form (Table) -> Column.
For example, to find all movie pairings that received the same tags (ignoring case):
>>> movie_tags = tags["movieId", "tag"]
>>> view = movie_tags.view()
>>> movie_tags.join(
...     view,
...     [
...         movie_tags.movieId != view.movieId,
...         (_.tag.lower(), lambda t: t.tag.lower()),
...     ],
... ).head().order_by(("movieId", "movieId_right"))
βββββββββββ³ββββββββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββ
β movieId β tag β movieId_right β tag_right β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β string β int64 β string β
βββββββββββΌββββββββββββββββββββΌββββββββββββββββΌββββββββββββββββββββ€
β 60756 β funny β 1732 β funny β
β 60756 β Highly quotable β 1732 β Highly quotable β
β 89774 β Tom Hardy β 139385 β tom hardy β
β 106782 β drugs β 1732 β drugs β
β 106782 β Leonardo DiCaprio β 5989 β Leonardo DiCaprio β
βββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββ
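The lname and rname format strings can be understood through a small plain-Python sketch. The output_columns helper is hypothetical and models only the renaming of overlapping column names; it assumes that an empty format string leaves a name unchanged, and it does not model the case shown earlier where an inner equality join drops the duplicated key column.

```python
def output_columns(left_cols, right_cols, lname="", rname="{name}_right"):
    """Model how overlapping column names are renamed (hypothetical helper)."""
    overlap = set(left_cols) & set(right_cols)

    def rename(name, fmt):
        # Assumption: an empty format string means "keep the original name"
        return fmt.format(name=name) if (name in overlap and fmt) else name

    return [rename(c, lname) for c in left_cols] + [rename(c, rname) for c in right_cols]


ratings_cols = ["userId", "movieId", "rating"]
movies_cols = ["movieId", "title", "genres"]
default_names = output_columns(ratings_cols, movies_cols)
custom_names = output_columns(
    ratings_cols, movies_cols, lname="left_{name}", rname="right_{name}"
)
```

With the defaults, the result matches the column names of the left-join example: only the right-hand movieId receives the _right suffix.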
limit
Select n rows from self starting at offset.
Parameters
n
int | None
Number of rows to include. If None, the entire table is selected starting from offset.
required
offset
int
Number of rows to skip first
0
Returns
Table
The first n rows of self starting at offset
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 1, 2], "b": ["c", "a", "a"]})
>>> t
┏━━━━━━━┳━━━━━━━━┓
┃ a     ┃ b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64 │ string │
├───────┼────────┤
│     1 │ c      │
│     1 │ a      │
│     2 │ a      │
└───────┴────────┘
>>> t.limit(2)
┏━━━━━━━┳━━━━━━━━┓
┃ a     ┃ b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64 │ string │
├───────┼────────┤
│     1 │ c      │
│     1 │ a      │
└───────┴────────┘
You can use None with offset to slice starting from a particular row
>>> t.limit(None, offset=1)
┏━━━━━━━┳━━━━━━━━┓
┃ a     ┃ b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64 │ string │
├───────┼────────┤
│     1 │ a      │
│     2 │ a      │
└───────┴────────┘
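In terms of Python list slicing, the combination of n and offset behaves roughly as in the sketch below. The limit function is a hypothetical model over a list of rows, assuming the rows are already in the order the backend would return them.

```python
def limit(rows, n, offset=0):
    """Slice-based model of limit (hypothetical helper)."""
    if n is None:
        # No row limit: everything from `offset` onwards
        return rows[offset:]
    return rows[offset : offset + n]


rows = [(1, "c"), (1, "a"), (2, "a")]
first_two = limit(rows, 2)
from_second = limit(rows, None, offset=1)
```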
mutate
mutate(*exprs, **mutations)
Add columns to a table expression.
Parameters
exprs
Sequence[ir.Expr] | None
List of named expressions to add as columns
()
mutations
ir.Value
Named expressions using keyword arguments
{}
Returns
Table
Table expression with additional columns
Examples
>>> import ibis
>>> import ibis.selectors as s
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch().select("species", "year", "bill_length_mm")
>>> t
βββββββββββ³ββββββββ³βββββββββββββββββ
β species β year β bill_length_mm β
β‘βββββββββββββββββββββββββββββββββββ©
β string β int64 β float64 β
βββββββββββΌββββββββΌβββββββββββββββββ€
β Adelie β 2007 β 39.1 β
β Adelie β 2007 β 39.5 β
β Adelie β 2007 β 40.3 β
β Adelie β 2007 β NULL β
β Adelie β 2007 β 36.7 β
β Adelie β 2007 β 39.3 β
β Adelie β 2007 β 38.9 β
β Adelie β 2007 β 39.2 β
β Adelie β 2007 β 34.1 β
β Adelie β 2007 β 42.0 β
β β¦ β β¦ β β¦ β
βββββββββββ΄ββββββββ΄βββββββββββββββββ
Add a new column from a per-element expression
>>> t.mutate(next_year=_.year + 1).head()
βββββββββββ³ββββββββ³βββββββββββββββββ³ββββββββββββ
β species β year β bill_length_mm β next_year β
β‘βββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β float64 β int64 β
βββββββββββΌββββββββΌβββββββββββββββββΌββββββββββββ€
β Adelie β 2007 β 39.1 β 2008 β
β Adelie β 2007 β 39.5 β 2008 β
β Adelie β 2007 β 40.3 β 2008 β
β Adelie β 2007 β NULL β 2008 β
β Adelie β 2007 β 36.7 β 2008 β
βββββββββββ΄ββββββββ΄βββββββββββββββββ΄ββββββββββββ
Add a new column based on an aggregation. Note the automatic broadcasting.
>>> t.select("species", bill_demean=_.bill_length_mm - _.bill_length_mm.mean()).head()
βββββββββββ³ββββββββββββββ
β species β bill_demean β
β‘ββββββββββββββββββββββββ©
β string β float64 β
βββββββββββΌββββββββββββββ€
β Adelie β -4.82193 β
β Adelie β -4.42193 β
β Adelie β -3.62193 β
β Adelie β NULL β
β Adelie β -7.22193 β
βββββββββββ΄ββββββββββββββ
Mutate across multiple columns
>>> t.mutate(s.across(s.numeric() & ~s.cols("year"), _ - _.mean())).head()
βββββββββββ³ββββββββ³βββββββββββββββββ
β species β year β bill_length_mm β
β‘βββββββββββββββββββββββββββββββββββ©
β string β int64 β float64 β
βββββββββββΌββββββββΌβββββββββββββββββ€
β Adelie β 2007 β -4.82193 β
β Adelie β 2007 β -4.42193 β
β Adelie β 2007 β -3.62193 β
β Adelie β 2007 β NULL β
β Adelie β 2007 β -7.22193 β
βββββββββββ΄ββββββββ΄βββββββββββββββββ
nunique
Compute the number of unique rows in the table.
Parameters
where
ir.BooleanValue | None
Optional boolean expression to filter rows when counting.
None
Returns
IntegerScalar
Number of unique rows in the table
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": ["foo", "bar", "bar"]})
>>> t
┏━━━━━━━━┓
┃ a      ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ foo    │
│ bar    │
│ bar    │
└────────┘
>>> t.nunique()
┌───┐
│ 2 │
└───┘
>>> t.nunique(t.a != "foo")
┌───┐
│ 1 │
└───┘
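A plain-Python sketch of the same counts, with where modelled as an ordinary predicate function. The nunique helper below is hypothetical and operates on a list of hashable rows rather than on a table expression.

```python
def nunique(rows, where=None):
    """Count distinct rows, optionally filtering first (hypothetical helper)."""
    if where is not None:
        rows = [row for row in rows if where(row)]
    return len(set(rows))


rows = ["foo", "bar", "bar"]
total_unique = nunique(rows)
unique_not_foo = nunique(rows, where=lambda a: a != "foo")
```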
order_by
Sort a table by one or more expressions.
Similar to pandas.DataFrame.sort_values().
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
...     {
...         "a": [3, 2, 1, 3],
...         "b": ["a", "B", "c", "D"],
...         "c": [4, 6, 5, 7],
...     }
... )
>>> t
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a     ┃ b      ┃ c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64 │ string │ int64 │
├───────┼────────┼───────┤
│     3 │ a      │     4 │
│     2 │ B      │     6 │
│     1 │ c      │     5 │
│     3 │ D      │     7 │
└───────┴────────┴───────┘
Sort by b. Default is ascending. Note how capital letters come before lowercase
>>> t.order_by("b")
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a     ┃ b      ┃ c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64 │ string │ int64 │
├───────┼────────┼───────┤
│     2 │ B      │     6 │
│     3 │ D      │     7 │
│     3 │ a      │     4 │
│     1 │ c      │     5 │
└───────┴────────┴───────┘
Sort in descending order
>>> t.order_by(ibis.desc("b"))
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a     ┃ b      ┃ c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64 │ string │ int64 │
├───────┼────────┼───────┤
│     1 │ c      │     5 │
│     3 │ a      │     4 │
│     3 │ D      │     7 │
│     2 │ B      │     6 │
└───────┴────────┴───────┘
You can also use the deferred API to get the same result
>>> from ibis import _
>>> t.order_by(_.b.desc())
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a     ┃ b      ┃ c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64 │ string │ int64 │
├───────┼────────┼───────┤
│     1 │ c      │     5 │
│     3 │ a      │     4 │
│     3 │ D      │     7 │
│     2 │ B      │     6 │
└───────┴────────┴───────┘
Sort by multiple columns/expressions
>>> t.order_by(["a", _.c.desc()])
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a     ┃ b      ┃ c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64 │ string │ int64 │
├───────┼────────┼───────┤
│     1 │ c      │     5 │
│     2 │ B      │     6 │
│     3 │ D      │     7 │
│     3 │ a      │     4 │
└───────┴────────┴───────┘
You can actually pass arbitrary expressions to use as sort keys. For example, to ignore the case of the strings in column b
>>> t.order_by(_.b.lower())
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a     ┃ b      ┃ c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64 │ string │ int64 │
├───────┼────────┼───────┤
│     3 │ a      │     4 │
│     2 │ B      │     6 │
│     1 │ c      │     5 │
│     3 │ D      │     7 │
└───────┴────────┴───────┘
This means that shuffling a Table is super simple
>>> t.order_by(ibis.random())
βββββββββ³βββββββββ³ββββββββ
β a β b β c β
β‘βββββββββββββββββββββββββ©
β int64 β string β int64 β
βββββββββΌβββββββββΌββββββββ€
β 3 β a β 4 β
β 1 β c β 5 β
β 3 β D β 7 β
β 2 β B β 6 β
βββββββββ΄βββββββββ΄ββββββββ
Selectors are allowed as sort keys and are a concise way to sort by multiple columns matching some criteria
>>> import ibis.selectors as s
>>> penguins = ibis.examples.penguins.fetch()
>>> penguins[["year", "island"]].value_counts().order_by(s.startswith("year"))
┏━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ year  ┃ island    ┃ year_island_count ┃
┡━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ int64 │ string    │ int64             │
├───────┼───────────┼───────────────────┤
│  2007 │ Torgersen │                20 │
│  2007 │ Biscoe    │                44 │
│  2007 │ Dream     │                46 │
│  2008 │ Torgersen │                16 │
│  2008 │ Dream     │                34 │
│  2008 │ Biscoe    │                64 │
│  2009 │ Torgersen │                16 │
│  2009 │ Dream     │                44 │
│  2009 │ Biscoe    │                60 │
└───────┴───────────┴───────────────────┘
Use the across selector to apply a specific order to multiple columns
>>> penguins[["year", "island"]].value_counts().order_by(
...     s.across(s.startswith("year"), _.desc())
... )
┏━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ year  ┃ island    ┃ year_island_count ┃
┡━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ int64 │ string    │ int64             │
├───────┼───────────┼───────────────────┤
│  2009 │ Biscoe    │                60 │
│  2009 │ Dream     │                44 │
│  2009 │ Torgersen │                16 │
│  2008 │ Biscoe    │                64 │
│  2008 │ Dream     │                34 │
│  2008 │ Torgersen │                16 │
│  2007 │ Dream     │                46 │
│  2007 │ Biscoe    │                44 │
│  2007 │ Torgersen │                20 │
└───────┴───────────┴───────────────────┘
pipe
Compose f with self.
Parameters
f
If the expression needs to be passed as anything other than the first argument to the function, pass a tuple with the argument name. For example, (f, 'data') if the function f expects a 'data' keyword.
required
args
Any
Positional arguments to f
()
kwargs
Any
Keyword arguments to f
{}
Examples
>>> import ibis
>>> t = ibis.memtable(
...     {
...         "a": [5, 10, 15],
...         "b": ["a", "b", "c"],
...     }
... )
>>> f = lambda a: (a + 1).name("a")
>>> g = lambda a: (a * 2).name("a")
>>> result1 = t.a.pipe(f).pipe(g)
>>> result1
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│    12 │
│    22 │
│    32 │
└───────┘
>>> result2 = g(f(t.a))  # equivalent to the above
>>> result1.equals(result2)
True
Returns
Expr
Result type of passed function
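The tuple form described under Parameters can be illustrated with a small plain-Python sketch. This is not ibis's implementation: `pipe_sketch`, `add_one`, and `scale` are hypothetical names, and the sketch only mimics the documented calling convention (first positional argument by default, a named keyword when a `(function, name)` tuple is given).

```python
def pipe_sketch(obj, f, *args, **kwargs):
    """Hypothetical stand-in for the documented calling convention of pipe."""
    if isinstance(f, tuple):
        # (function, keyword_name): pass obj under that keyword instead.
        func, keyword = f
        if keyword in kwargs:
            raise ValueError(f"{keyword!r} is both the pipe target and a keyword argument")
        return func(*args, **{**kwargs, keyword: obj})
    # Default: obj becomes the first positional argument.
    return f(obj, *args, **kwargs)


def add_one(data):
    return [x + 1 for x in data]


def scale(factor, data):
    return [x * factor for x in data]


values = [5, 10, 15]
step1 = pipe_sketch(values, add_one)            # add_one(values)
step2 = pipe_sketch(step1, (scale, "data"), 2)  # scale(2, data=step1)
print(step2)
```

The tuple form is useful when the function being piped into does not take the piped object as its first parameter, as with `scale` here.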
pivot_longer
pivot_longer(
    col,
    *,
    names_to='name',
    names_pattern='(.+)',
    names_transform=None,
    values_to='value',
    values_transform=None,
)
Transform a table from wider to longer.
Parameters
col
str | s.Selector
String column name or selector.
required
names_to
str | Iterable[str]
A string or iterable of strings indicating how to name the new pivoted columns.
'name'
names_pattern
str | re.Pattern
Pattern to use to extract column names from the input. By default the entire column name is extracted.
'(.+)'
names_transform
Callable[[str], ir.Value] | Mapping[str, Callable[[str], ir.Value]] | None
Function or mapping of a name in names_to to a function to transform a column name to a value.
None
values_to
str
Name of the pivoted value column.
'value'
values_transform
Callable[[ir.Value], ir.Value] | Deferred | None
Apply a function to the value column. This can be a lambda or deferred expression.
None
Examples
Basic usage
>>> import ibis
>>> import ibis.selectors as s
>>> from ibis import _
>>> ibis.options.interactive = True
>>> relig_income = ibis.examples.relig_income_raw.fetch()
>>> relig_income
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ religion                ┃ <$10k ┃ $10-20k ┃ $20-30k ┃ $30-40k ┃ $40-50k ┃ $50-75k ┃ $75-100k ┃ $100-150k ┃ >150k ┃ Don't know/refused ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ string                  │ int64 │ int64   │ int64   │ int64   │ int64   │ int64   │ int64    │ int64     │ int64 │ int64              │
├─────────────────────────┼───────┼─────────┼─────────┼─────────┼─────────┼─────────┼──────────┼───────────┼───────┼────────────────────┤
│ Agnostic                │    27 │      34 │      60 │      81 │      76 │     137 │      122 │       109 │    84 │                 96 │
│ Atheist                 │    12 │      27 │      37 │      52 │      35 │      70 │       73 │        59 │    74 │                 76 │
│ Buddhist                │    27 │      21 │      30 │      34 │      33 │      58 │       62 │        39 │    53 │                 54 │
│ Catholic                │   418 │     617 │     732 │     670 │     638 │    1116 │      949 │       792 │   633 │               1489 │
│ Don’t know/refused      │    15 │      14 │      15 │      11 │      10 │      35 │       21 │        17 │    18 │                116 │
│ Evangelical Prot        │   575 │     869 │    1064 │     982 │     881 │    1486 │      949 │       723 │   414 │               1529 │
│ Hindu                   │     1 │       9 │       7 │       9 │      11 │      34 │       47 │        48 │    54 │                 37 │
│ Historically Black Prot │   228 │     244 │     236 │     238 │     197 │     223 │      131 │        81 │    78 │                339 │
│ Jehovah's Witness       │    20 │      27 │      24 │      24 │      21 │      30 │       15 │        11 │     6 │                 37 │
│ Jewish                  │    19 │      19 │      25 │      25 │      30 │      95 │       69 │        87 │   151 │                162 │
│ …                       │     … │       … │       … │       … │       … │       … │        … │         … │     … │                  … │
└─────────────────────────┴───────┴─────────┴─────────┴─────────┴─────────┴─────────┴──────────┴───────────┴───────┴────────────────────┘
Here we select every column except religion and convert those column names into values
>>> relig_income.pivot_longer(~s.cols("religion"), names_to="income", values_to="count")
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ religion ┃ income             ┃ count ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ string   │ string             │ int64 │
├──────────┼────────────────────┼───────┤
│ Agnostic │ <$10k              │    27 │
│ Agnostic │ $10-20k            │    34 │
│ Agnostic │ $20-30k            │    60 │
│ Agnostic │ $30-40k            │    81 │
│ Agnostic │ $40-50k            │    76 │
│ Agnostic │ $50-75k            │   137 │
│ Agnostic │ $75-100k           │   122 │
│ Agnostic │ $100-150k          │   109 │
│ Agnostic │ >150k              │    84 │
│ Agnostic │ Don't know/refused │    96 │
│ …        │ …                  │     … │
└──────────┴────────────────────┴───────┘
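To make the reshaping concrete, the wide-to-long transformation can be sketched in plain Python on a list of dicts. This only illustrates the row-level semantics and is not how ibis implements pivot_longer; `pivot_longer_sketch` and its `id_cols` parameter are hypothetical names introduced for this sketch.

```python
def pivot_longer_sketch(rows, id_cols, names_to="name", values_to="value"):
    """Hypothetical helper: emit one output row per (input row, pivoted column)."""
    out = []
    for row in rows:
        ids = {key: row[key] for key in id_cols}
        for column, value in row.items():
            if column not in id_cols:
                # The column name becomes a value, next to the cell it labelled.
                out.append({**ids, names_to: column, values_to: value})
    return out


# Two income columns of the Agnostic row from relig_income.
wide = [{"religion": "Agnostic", "<$10k": 27, "$10-20k": 34}]
longer = pivot_longer_sketch(
    wide, id_cols=["religion"], names_to="income", values_to="count"
)
for record in longer:
    print(record)
```

Each input row yields as many output rows as there are pivoted columns, which is why the result is longer and narrower than the input.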
Similarly, for a different example dataset, we convert names to values, this time using a different selector and the default values_to value.
>>> world_bank_pop = ibis.examples.world_bank_pop_raw.fetch()
>>> world_bank_pop.head()
βββββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ
β country β indicator β 2000 β 2001 β 2002 β 2003 β 2004 β 2005 β 2006 β 2007 β 2008 β 2009 β 2010 β 2011 β 2012 β 2013 β 2014 β 2015 β 2016 β 2017 β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β float64 β
βββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββ€
β ABW β SP.URB.TOTL β 4.162500e+04 β 4.202500e+04 β 4.219400e+04 β 4.227700e+04 β 4.231700e+04 β 4.239900e+04 β 4.255500e+04 β 4.272900e+04 β 4.290600e+04 β 4.307900e+04 β 4.320600e+04 β 4.349300e+04 β 4.386400e+04 β 4.422800e+04 β 4.458800e+04 β 4.494300e+04 β 4.529700e+04 β 4.564800e+04 β
β ABW β SP.URB.GROW β 1.664222e+00 β 9.563731e-01 β 4.013352e-01 β 1.965172e-01 β 9.456936e-02 β 1.935880e-01 β 3.672580e-01 β 4.080490e-01 β 4.133830e-01 β 4.023963e-01 β 2.943735e-01 β 6.620631e-01 β 8.493932e-01 β 8.264135e-01 β 8.106692e-01 β 7.930256e-01 β 7.845785e-01 β 7.718989e-01 β
β ABW β SP.POP.TOTL β 8.910100e+04 β 9.069100e+04 β 9.178100e+04 β 9.270100e+04 β 9.354000e+04 β 9.448300e+04 β 9.560600e+04 β 9.678700e+04 β 9.799600e+04 β 9.921200e+04 β 1.003410e+05 β 1.012880e+05 β 1.021120e+05 β 1.028800e+05 β 1.035940e+05 β 1.042570e+05 β 1.048740e+05 β 1.054390e+05 β
β ABW β SP.POP.GROW β 2.539234e+00 β 1.768757e+00 β 1.194718e+00 β 9.973955e-01 β 9.009892e-01 β 1.003077e+00 β 1.181566e+00 β 1.227711e+00 β 1.241397e+00 β 1.233231e+00 β 1.131541e+00 β 9.393559e-01 β 8.102306e-01 β 7.493010e-01 β 6.916153e-01 β 6.379592e-01 β 5.900625e-01 β 5.372957e-01 β
β AFE β SP.URB.TOTL β 1.155517e+08 β 1.197755e+08 β 1.242275e+08 β 1.288340e+08 β 1.336475e+08 β 1.387456e+08 β 1.440267e+08 β 1.492313e+08 β 1.553838e+08 β 1.617762e+08 β 1.684561e+08 β 1.754157e+08 β 1.825587e+08 β 1.901087e+08 β 1.980733e+08 β 2.065563e+08 β 2.150833e+08 β 2.237321e+08 β
βββββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ
>>> world_bank_pop.pivot_longer(s.matches(r"\d{4}"), names_to="year").head()
┏━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓
┃ country ┃ indicator   ┃ year   ┃ value   ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩
│ string  │ string      │ string │ float64 │
├─────────┼─────────────┼────────┼─────────┤
│ ABW     │ SP.URB.TOTL │ 2000   │ 41625.0 │
│ ABW     │ SP.URB.TOTL │ 2001   │ 42025.0 │
│ ABW     │ SP.URB.TOTL │ 2002   │ 42194.0 │
│ ABW     │ SP.URB.TOTL │ 2003   │ 42277.0 │
│ ABW     │ SP.URB.TOTL │ 2004   │ 42317.0 │
└─────────┴─────────────┴────────┴─────────┘
pivot_longer has some preprocessing capabilities, like stripping a prefix and applying a function to column names
>>> billboard = ibis.examples.billboard.fetch()
>>> billboard
ββββββββββββββββββ³ββββββββββββββββββββββββββ³βββββββββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³βββββββββ³βββββββββ³βββββββββ³βββββββββ³βββββββββ³βββββββββ³βββββββββ³βββββββββ³βββββββββ³βββββββββ³βββββββββ
β artist β track β date_entered β wk1 β wk2 β wk3 β wk4 β wk5 β wk6 β wk7 β wk8 β wk9 β wk10 β wk11 β wk12 β wk13 β wk14 β wk15 β wk16 β wk17 β wk18 β wk19 β wk20 β wk21 β wk22 β wk23 β wk24 β wk25 β wk26 β wk27 β wk28 β wk29 β wk30 β wk31 β wk32 β wk33 β wk34 β wk35 β wk36 β wk37 β wk38 β wk39 β wk40 β wk41 β wk42 β wk43 β wk44 β wk45 β wk46 β wk47 β wk48 β wk49 β wk50 β wk51 β wk52 β wk53 β wk54 β wk55 β wk56 β wk57 β wk58 β wk59 β wk60 β wk61 β wk62 β wk63 β wk64 β wk65 β wk66 β wk67 β wk68 β wk69 β wk70 β wk71 β wk72 β wk73 β wk74 β wk75 β wk76 β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β date β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β string β string β string β string β string β string β string β string β string β string β string β
ββββββββββββββββββΌββββββββββββββββββββββββββΌβββββββββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββΌβββββββββ€
β 2 Pac β Baby Don't Cry (Keep... β 2000-02-26 β 87 β 82 β 72 β 77 β 87 β 94 β 99 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β 2Ge+her β The Hardest Part Of ... β 2000-09-02 β 91 β 87 β 92 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β 3 Doors Down β Kryptonite β 2000-04-08 β 81 β 70 β 68 β 67 β 66 β 57 β 54 β 53 β 51 β 51 β 51 β 51 β 47 β 44 β 38 β 28 β 22 β 18 β 18 β 14 β 12 β 7 β 6 β 6 β 6 β 5 β 5 β 4 β 4 β 4 β 4 β 3 β 3 β 3 β 4 β 5 β 5 β 9 β 9 β 15 β 14 β 13 β 14 β 16 β 17 β 21 β 22 β 24 β 28 β 33 β 42 β 42 β 49 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β 3 Doors Down β Loser β 2000-10-21 β 76 β 76 β 72 β 69 β 67 β 65 β 55 β 59 β 62 β 61 β 61 β 59 β 61 β 66 β 72 β 76 β 75 β 67 β 73 β 70 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β 504 Boyz β Wobble Wobble β 2000-04-15 β 57 β 34 β 25 β 17 β 17 β 31 β 36 β 49 β 53 β 57 β 64 β 70 β 75 β 76 β 78 β 85 β 92 β 96 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β 98^0 β Give Me Just One Nig... β 2000-08-19 β 51 β 39 β 34 β 26 β 26 β 19 β 2 β 2 β 3 β 6 β 7 β 22 β 29 β 36 β 47 β 67 β 66 β 84 β 93 β 94 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β A*Teens β Dancing Queen β 2000-07-08 β 97 β 97 β 96 β 95 β 100 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Aaliyah β I Don't Wanna β 2000-01-29 β 84 β 62 β 51 β 41 β 38 β 35 β 35 β 38 β 38 β 36 β 37 β 37 β 38 β 49 β 61 β 63 β 62 β 67 β 83 β 86 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Aaliyah β Try Again β 2000-03-18 β 59 β 53 β 38 β 28 β 21 β 18 β 16 β 14 β 12 β 10 β 9 β 8 β 6 β 1 β 2 β 2 β 2 β 2 β 3 β 4 β 5 β 5 β 6 β 9 β 13 β 14 β 16 β 23 β 22 β 33 β 36 β 43 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Adams, Yolanda β Open My Heart β 2000-08-26 β 76 β 76 β 74 β 69 β 68 β 67 β 61 β 58 β 57 β 59 β 66 β 68 β 61 β 67 β 59 β 63 β 67 β 71 β 79 β 89 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
ββββββββββββββββββ΄ββββββββββββββββββββββββββ΄βββββββββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄βββββββββ
>>> billboard.pivot_longer(
...     s.startswith("wk"),
...     names_to="week",
...     names_pattern=r"wk(.+)",
...     names_transform=int,
...     values_to="rank",
...     values_transform=_.cast("int"),
... ).drop_null("rank")
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━┓
┃ artist  ┃ track                   ┃ date_entered ┃ week ┃ rank  ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━┩
│ string  │ string                  │ date         │ int8 │ int64 │
├─────────┼─────────────────────────┼──────────────┼──────┼───────┤
│ 2 Pac   │ Baby Don't Cry (Keep... │ 2000-02-26   │    1 │    87 │
│ 2 Pac   │ Baby Don't Cry (Keep... │ 2000-02-26   │    2 │    82 │
│ 2 Pac   │ Baby Don't Cry (Keep... │ 2000-02-26   │    3 │    72 │
│ 2 Pac   │ Baby Don't Cry (Keep... │ 2000-02-26   │    4 │    77 │
│ 2 Pac   │ Baby Don't Cry (Keep... │ 2000-02-26   │    5 │    87 │
│ 2 Pac   │ Baby Don't Cry (Keep... │ 2000-02-26   │    6 │    94 │
│ 2 Pac   │ Baby Don't Cry (Keep... │ 2000-02-26   │    7 │    99 │
│ 2Ge+her │ The Hardest Part Of ... │ 2000-09-02   │    1 │    91 │
│ 2Ge+her │ The Hardest Part Of ... │ 2000-09-02   │    2 │    87 │
│ 2Ge+her │ The Hardest Part Of ... │ 2000-09-02   │    3 │    92 │
│ …       │ …                       │ …            │    … │     … │
└─────────┴─────────────────────────┴──────────────┴──────┴───────┘
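What the combination of names_pattern and names_transform does to a single column name can be sketched with the standard library alone. This is an illustration under assumptions, not ibis's code: `week_number` is a hypothetical helper, `re.match` is assumed as the matching rule, and `wide_row` holds the first seven weeks of the first billboard row plus one assumed NULL week.

```python
import re

names_pattern = re.compile(r"wk(.+)")


def week_number(column_name):
    """Extract the captured suffix of a column name and convert it to int."""
    match = names_pattern.match(column_name)
    if match is None:
        raise ValueError(f"{column_name!r} does not match {names_pattern.pattern!r}")
    return int(match.group(1))


# Weeks 1-7 of the first billboard row, plus one week with no rank.
wide_row = {"wk1": 87, "wk2": 82, "wk3": 72, "wk4": 77,
            "wk5": 87, "wk6": 94, "wk7": 99, "wk8": None}

# Keep (week, rank) pairs whose rank is present, mirroring drop_null("rank").
weeks = [(week_number(name), rank)
         for name, rank in wide_row.items() if rank is not None]
print(weeks)
```

The capture group strips the `wk` prefix and `int` turns the remaining text into a number, so the week column can be sorted numerically rather than as text.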
You can use regular expression capture groups to extract multiple variables stored in column names
>>> who = ibis.examples.who.fetch()
>>> who
βββββββββββββββ³βββββββββ³βββββββββ³ββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββ³ββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββββ³βββββββββββββ
β country β iso2 β iso3 β year β new_sp_m014 β new_sp_m1524 β new_sp_m2534 β new_sp_m3544 β new_sp_m4554 β new_sp_m5564 β new_sp_m65 β new_sp_f014 β new_sp_f1524 β new_sp_f2534 β new_sp_f3544 β new_sp_f4554 β new_sp_f5564 β new_sp_f65 β new_sn_m014 β new_sn_m1524 β new_sn_m2534 β new_sn_m3544 β new_sn_m4554 β new_sn_m5564 β new_sn_m65 β new_sn_f014 β new_sn_f1524 β new_sn_f2534 β new_sn_f3544 β new_sn_f4554 β new_sn_f5564 β new_sn_f65 β new_ep_m014 β new_ep_m1524 β new_ep_m2534 β new_ep_m3544 β new_ep_m4554 β new_ep_m5564 β new_ep_m65 β new_ep_f014 β new_ep_f1524 β new_ep_f2534 β new_ep_f3544 β new_ep_f4554 β new_ep_f5564 β new_ep_f65 β newrel_m014 β newrel_m1524 β newrel_m2534 β newrel_m3544 β newrel_m4554 β newrel_m5564 β newrel_m65 β newrel_f014 β newrel_f1524 β newrel_f2534 β newrel_f3544 β newrel_f4554 β newrel_f5564 β newrel_f65 β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β string β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β
βββββββββββββββΌβββββββββΌβββββββββΌββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββββΌβββββββββββββ€
β Afghanistan β AF β AFG β 1980 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1981 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1982 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1983 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1984 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1985 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1986 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1987 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1988 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β Afghanistan β AF β AFG β 1989 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β NULL β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
βββββββββββββββ΄βββββββββ΄βββββββββ΄ββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββ΄ββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββββ΄βββββββββββββ
>>> who.pivot_longer(
...     s.index["new_sp_m014":"newrel_f65"],
...     names_to=["diagnosis", "gender", "age"],
...     names_pattern="new_?(.*)_(.)(.*)",
...     values_to="count",
... )
┏━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ country     ┃ iso2   ┃ iso3   ┃ year  ┃ diagnosis ┃ gender ┃ age    ┃ count ┃
┡━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string      │ string │ string │ int64 │ string    │ string │ string │ int64 │
├─────────────┼────────┼────────┼───────┼───────────┼────────┼────────┼───────┤
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ m      │ 014    │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ m      │ 1524   │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ m      │ 2534   │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ m      │ 3544   │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ m      │ 4554   │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ m      │ 5564   │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ m      │ 65     │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ f      │ 014    │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ f      │ 1524   │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │ f      │ 2534   │  NULL │
│ …           │ …      │ …      │     … │ …         │ …      │ …      │     … │
└─────────────┴────────┴────────┴───────┴───────────┴────────┴────────┴───────┘
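How the three capture groups split a column name can be checked directly with Python's `re` module. The sketch below only exercises the regular expression on a few of the who column names. Pairing the groups with names_to via `zip`, and using `re.match`, are assumptions about the general idea rather than a description of ibis internals.

```python
import re

names_to = ["diagnosis", "gender", "age"]
names_pattern = re.compile(r"new_?(.*)_(.)(.*)")

for column in ["new_sp_m014", "new_ep_f1524", "newrel_f65"]:
    match = names_pattern.match(column)
    # One captured group per entry in names_to.
    print(column, dict(zip(names_to, match.groups())))
```

The optional underscore in `new_?` is what lets the same pattern handle both the `new_sp_...` and the `newrel_...` naming styles.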
names_transform is flexible, and can be:
1. A mapping of one or more names in `names_to` to a callable
2. A callable that will be applied to every name
Let's recode gender and age to numeric values using a mapping
>>> who.pivot_longer(
...     s.index["new_sp_m014":"newrel_f65"],
...     names_to=["diagnosis", "gender", "age"],
...     names_pattern="new_?(.*)_(.)(.*)",
...     names_transform=dict(
...         gender={"m": 1, "f": 2}.get,
...         age=dict(
...             zip(
...                 ["014", "1524", "2534", "3544", "4554", "5564", "65"],
...                 range(7),
...             )
...         ).get,
...     ),
...     values_to="count",
... )
┏━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┳━━━━━━┳━━━━━━━┓
┃ country     ┃ iso2   ┃ iso3   ┃ year  ┃ diagnosis ┃ gender ┃ age  ┃ count ┃
┡━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━╇━━━━━━╇━━━━━━━┩
│ string      │ string │ string │ int64 │ string    │ int8   │ int8 │ int64 │
├─────────────┼────────┼────────┼───────┼───────────┼────────┼──────┼───────┤
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      1 │    0 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      1 │    1 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      1 │    2 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      1 │    3 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      1 │    4 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      1 │    5 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      1 │    6 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      2 │    0 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      2 │    1 │  NULL │
│ Afghanistan │ AF     │ AFG    │  1980 │ sp        │      2 │    2 │  NULL │
│ …           │ …      │ …      │     … │ …         │      … │    … │     … │
└─────────────┴────────┴────────┴───────┴───────────┴────────┴──────┴───────┘
The number of match groups in names_pattern
must match the length of names_to
>>> who.pivot_longer(
... s.index["new_sp_m014" :"newrel_f65" ],
... names_to= ["diagnosis" , "gender" , "age" ],
... names_pattern= "new_?(.*)_.(.*)" ,
... )
---------------------------------------------------------------------------
IbisInputError                            Traceback (most recent call last)
Cell In[323], line 1
----> 1 who.pivot_longer(
      2     s.index["new_sp_m014":"newrel_f65"],
      3     names_to=["diagnosis", "gender", "age"],
      4     names_pattern="new_?(.*)_.(.*)",
      5 )
File ~/work/ibis/ibis/ibis/expr/types/relations.py:3939, in Table.pivot_longer(self, col, names_to, names_pattern, names_transform, values_to, values_transform)
   3937 names_pattern = re.compile(names_pattern)
   3938 if (ngroups := names_pattern.groups) != (nnames := len(names_to)):
-> 3939     raise com.IbisInputError(
   3940         f"Number of match groups in `names_pattern`"
   3941         f"{names_pattern.pattern!r} ({ngroups:d} groups) doesn't "
   3942         f"match the length of `names_to` {names_to} (length {nnames:d})"
   3943     )
   3945 if names_transform is None:
   3946     names_transform = dict.fromkeys(names_to, toolz.identity)
IbisInputError: Number of match groups in `names_pattern`'new_?(.*)_.(.*)' (2 groups) doesn't match the length of `names_to` ['diagnosis', 'gender', 'age'] (length 3)
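The mismatch reported above can be checked before calling pivot_longer. The following is a minimal sketch that uses only the standard library re module. The helper name check_names_pattern is hypothetical (it is not part of ibis), and matching column names from their start with re.match is an assumption made for illustration.

```python
import re


def check_names_pattern(names_pattern, names_to):
    """Return (ngroups, nnames); a hypothetical pre-flight check, not an ibis API."""
    ngroups = re.compile(names_pattern).groups  # number of capture groups
    return ngroups, len(names_to)


names_to = ["diagnosis", "gender", "age"]

# Three capture groups: one per entry in names_to.
good = check_names_pattern(r"new_?(.*)_(.)(.*)", names_to)
# Only two capture groups: the bare "." is not wrapped in parentheses.
bad = check_names_pattern(r"new_?(.*)_.(.*)", names_to)

# How the three-group pattern splits two of the column names.
parts_sp = re.match(r"new_?(.*)_(.)(.*)", "new_sp_m014").groups()
parts_rel = re.match(r"new_?(.*)_(.)(.*)", "newrel_f65").groups()
```

Here good is a pair of equal counts and bad is not, which mirrors the condition tested on line 3938 of the traceback.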
names_transform
must be a mapping or callable
>>> who.pivot_longer(
... s.index["new_sp_m014" :"newrel_f65" ], names_transform= "upper"
... ) # quartodoc: +EXPECTED_FAILURE
---------------------------------------------------------------------------
IbisTypeError                             Traceback (most recent call last)
Cell In[326], line 1
----> 1 who.pivot_longer(
      2     s.index["new_sp_m014":"newrel_f65"], names_transform="upper"
      3 )  # quartodoc: +EXPECTED_FAILURE
File ~/work/ibis/ibis/ibis/expr/types/relations.py:3951, in Table.pivot_longer(self, col, names_to, names_pattern, names_transform, values_to, values_transform)
   3949     names_transform = dict.fromkeys(names_to, names_transform)
   3950 else:
-> 3951     raise com.IbisTypeError(
   3952         f"`names_transform` must be a mapping or callable. Got {type(names_transform)}"
   3953     )
   3955 for name in names_to:
   3956     names_transform.setdefault(name, toolz.identity)
IbisTypeError: `names_transform` must be a mapping or callable. Got <class 'str'>
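The source lines visible in the two tracebacks suggest how names_transform is normalized: None becomes an identity function for every name, a single callable is applied to every name, and names missing from a mapping default to identity. The sketch below restates that logic in plain Python. It is a reconstruction from the frames shown above, not the ibis source; the function name is hypothetical, TypeError stands in for IbisTypeError, and a local lambda stands in for toolz.identity.

```python
from collections.abc import Mapping


def normalize_names_transform(names_to, names_transform=None):
    """Sketch of the normalization suggested by the traceback; not the ibis source."""
    identity = lambda x: x  # stands in for toolz.identity
    if names_transform is None:
        result = dict.fromkeys(names_to, identity)
    elif isinstance(names_transform, Mapping):
        result = dict(names_transform)
    elif callable(names_transform):
        # One callable is applied to every name.
        result = dict.fromkeys(names_to, names_transform)
    else:
        raise TypeError(
            f"`names_transform` must be a mapping or callable. Got {type(names_transform)}"
        )
    # Names without an explicit transform are left unchanged.
    for name in names_to:
        result.setdefault(name, identity)
    return result


transforms = normalize_names_transform(
    ["diagnosis", "gender", "age"], {"gender": {"m": 1, "f": 2}.get}
)
```

With this mapping, only gender is recoded; diagnosis and age pass through unchanged.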
pivot_wider
pivot_wider(
id_cols= None ,
names_from= 'name' ,
names_prefix= '' ,
names_sep= '_' ,
names_sort= False ,
names= None ,
values_from= 'value' ,
values_fill= None ,
values_agg= 'arbitrary' ,
)
Pivot a table to a wider format.
Parameters
id_cols
s
.Selector
| None
A set of columns that uniquely identify each observation.
None
names_from
str | Iterable [str ] | s
.Selector
An argument describing which column or columns to use to get the name of the output columns.
'name'
names_prefix
str
String added to the start of every column name.
''
names_sep
str
If names_from
or values_from
contains multiple columns, this argument will be used to join their values together into a single string to use as a column name.
'_'
names_sort
bool
If True
columns are sorted. If False
column names are ordered by appearance.
False
names
Iterable [str ] | None
An explicit sequence of values to look for in columns matching names_from.
* When this value is None, the values will be computed from names_from.
* When this value is not None, each element's length must match the length of names_from.
See examples below for more detail.
None
values_from
str | Iterable [str ] | s
.Selector
An argument describing which column or columns to get the cell values from.
'value'
values_fill
int | float | str | ir
.Scalar
| None
A scalar value that specifies what each value should be filled with when missing.
None
values_agg
str | Callable [[ir
.Value
], ir
.Scalar
] | Deferred
A function applied to the value in each cell in the output.
'arbitrary'
Returns
Table
Wider pivoted table
Examples
>>> import ibis
>>> import ibis.selectors as s
>>> from ibis import _
>>> ibis.options.interactive = True
Basic usage
>>> fish_encounters = ibis.examples.fish_encounters.fetch()
>>> fish_encounters
βββββββββ³ββββββββββ³ββββββββ
β fish β station β seen β
β‘ββββββββββββββββββββββββββ©
β int64 β string β int64 β
βββββββββΌββββββββββΌββββββββ€
β 4842 β Release β 1 β
β 4842 β I80_1 β 1 β
β 4842 β Lisbon β 1 β
β 4842 β Rstr β 1 β
β 4842 β Base_TD β 1 β
β 4842 β BCE β 1 β
β 4842 β BCW β 1 β
β 4842 β BCE2 β 1 β
β 4842 β BCW2 β 1 β
β 4842 β MAE β 1 β
β β¦ β β¦ β β¦ β
βββββββββ΄ββββββββββ΄ββββββββ
>>> fish_encounters.pivot_wider(names_from= "station" , values_from= "seen" )
βββββββββ³βββββββββ³ββββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββββ³ββββββββ³ββββββββ³ββββββββ
β fish β Lisbon β Base_TD β MAE β MAW β Rstr β BCW β BCW2 β Release β I80_1 β BCE β BCE2 β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β
βββββββββΌβββββββββΌββββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌββββββββββΌββββββββΌββββββββΌββββββββ€
β 4843 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β
β 4848 β 1 β NULL β NULL β NULL β 1 β NULL β NULL β 1 β 1 β NULL β NULL β
β 4865 β 1 β NULL β NULL β NULL β NULL β NULL β NULL β 1 β 1 β NULL β NULL β
β 4844 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β
β 4845 β 1 β 1 β NULL β NULL β 1 β NULL β NULL β 1 β 1 β NULL β NULL β
β 4849 β NULL β NULL β NULL β NULL β NULL β NULL β NULL β 1 β 1 β NULL β NULL β
β 4859 β 1 β 1 β NULL β NULL β 1 β NULL β NULL β 1 β 1 β NULL β NULL β
β 4861 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β
β 4842 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β
β 4847 β 1 β NULL β NULL β NULL β NULL β NULL β NULL β 1 β 1 β NULL β NULL β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
βββββββββ΄βββββββββ΄ββββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββββ΄ββββββββ΄ββββββββ΄ββββββββ
You can do simple transpose-like operations using pivot_wider
>>> t = ibis.memtable(dict (outcome= ["yes" , "no" ], counted= [3 , 4 ]))
>>> t
βββββββββββ³ββββββββββ
β outcome β counted β
β‘ββββββββββββββββββββ©
β string β int64 β
βββββββββββΌββββββββββ€
β yes β 3 β
β no β 4 β
βββββββββββ΄ββββββββββ
>>> t.pivot_wider(names_from= "outcome" , values_from= "counted" , names_sort= True )
βββββββββ³ββββββββ
β no β yes β
β‘ββββββββββββββββ©
β int64 β int64 β
βββββββββΌββββββββ€
β 4 β 3 β
βββββββββ΄ββββββββ
Fill missing pivoted values using values_fill
>>> fish_encounters.pivot_wider(
... names_from= "station" , values_from= "seen" , values_fill= 0
... )
βββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββββ³ββββββββ³ββββββββ³ββββββββ³ββββββββ³βββββββββ³ββββββββββ³ββββββββ
β fish β Rstr β BCW β BCW2 β Release β I80_1 β BCE β BCE2 β MAW β Lisbon β Base_TD β MAE β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β int64 β
βββββββββΌββββββββΌββββββββΌββββββββΌββββββββββΌββββββββΌββββββββΌββββββββΌββββββββΌβββββββββΌββββββββββΌββββββββ€
β 4851 β 0 β 0 β 0 β 1 β 1 β 0 β 0 β 0 β 0 β 0 β 0 β
β 4857 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 0 β 1 β 1 β 0 β
β 4858 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β
β 4862 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 0 β 1 β 1 β 0 β
β 4863 β 0 β 0 β 0 β 1 β 1 β 0 β 0 β 0 β 0 β 0 β 0 β
β 4843 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β
β 4848 β 1 β 0 β 0 β 1 β 1 β 0 β 0 β 0 β 1 β 0 β 0 β
β 4865 β 0 β 0 β 0 β 1 β 1 β 0 β 0 β 0 β 1 β 0 β 0 β
β 4844 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β 1 β
β 4845 β 1 β 0 β 0 β 1 β 1 β 0 β 0 β 0 β 1 β 1 β 0 β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
βββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄βββββββββ΄ββββββββββ΄ββββββββ
Compute multiple values columns
>>> us_rent_income = ibis.examples.us_rent_income.fetch()
>>> us_rent_income
ββββββββββ³βββββββββββββ³βββββββββββ³βββββββββββ³ββββββββ
β geoid β name β variable β estimate β moe β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β string β int64 β int64 β
ββββββββββΌβββββββββββββΌβββββββββββΌβββββββββββΌββββββββ€
β 01 β Alabama β income β 24476 β 136 β
β 01 β Alabama β rent β 747 β 3 β
β 02 β Alaska β income β 32940 β 508 β
β 02 β Alaska β rent β 1200 β 13 β
β 04 β Arizona β income β 27517 β 148 β
β 04 β Arizona β rent β 972 β 4 β
β 05 β Arkansas β income β 23789 β 165 β
β 05 β Arkansas β rent β 709 β 5 β
β 06 β California β income β 29454 β 109 β
β 06 β California β rent β 1358 β 3 β
β β¦ β β¦ β β¦ β β¦ β β¦ β
ββββββββββ΄βββββββββββββ΄βββββββββββ΄βββββββββββ΄ββββββββ
>>> us_rent_income.pivot_wider(
... names_from= "variable" , values_from= ["estimate" , "moe" ]
... )
ββββββββββ³βββββββββββββββ³ββββββββββββββββββ³βββββββββββββ³ββββββββββββββββ³βββββββββββ
β geoid β name β estimate_income β moe_income β estimate_rent β moe_rent β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β int64 β int64 β int64 β int64 β
ββββββββββΌβββββββββββββββΌββββββββββββββββββΌβββββββββββββΌββββββββββββββββΌβββββββββββ€
β 05 β Arkansas β 23789 β 165 β 709 β 5 β
β 06 β California β 29454 β 109 β 1358 β 3 β
β 13 β Georgia β 27024 β 106 β 927 β 3 β
β 15 β Hawaii β 32453 β 218 β 1507 β 18 β
β 16 β Idaho β 25298 β 208 β 792 β 7 β
β 30 β Montana β 26249 β 206 β 751 β 9 β
β 38 β North Dakota β 32336 β 245 β 775 β 9 β
β 39 β Ohio β 27435 β 94 β 764 β 2 β
β 40 β Oklahoma β 26207 β 101 β 766 β 3 β
β 47 β Tennessee β 25453 β 102 β 808 β 4 β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
ββββββββββ΄βββββββββββββββ΄ββββββββββββββββββ΄βββββββββββββ΄ββββββββββββββββ΄βββββββββββ
The column name separator can be changed using the names_sep
parameter
>>> us_rent_income.pivot_wider(
... names_from= "variable" ,
... names_sep= "." ,
... values_from= ("estimate" , "moe" ),
... )
ββββββββββ³βββββββββββββββ³ββββββββββββββββββ³βββββββββββββ³ββββββββββββββββ³βββββββββββ
β geoid β name β estimate.income β moe.income β estimate.rent β moe.rent β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β int64 β int64 β int64 β int64 β
ββββββββββΌβββββββββββββββΌββββββββββββββββββΌβββββββββββββΌββββββββββββββββΌβββββββββββ€
β 05 β Arkansas β 23789 β 165 β 709 β 5 β
β 06 β California β 29454 β 109 β 1358 β 3 β
β 13 β Georgia β 27024 β 106 β 927 β 3 β
β 15 β Hawaii β 32453 β 218 β 1507 β 18 β
β 16 β Idaho β 25298 β 208 β 792 β 7 β
β 30 β Montana β 26249 β 206 β 751 β 9 β
β 38 β North Dakota β 32336 β 245 β 775 β 9 β
β 39 β Ohio β 27435 β 94 β 764 β 2 β
β 40 β Oklahoma β 26207 β 101 β 766 β 3 β
β 47 β Tennessee β 25453 β 102 β 808 β 4 β
β β¦ β β¦ β β¦ β β¦ β β¦ β β¦ β
ββββββββββ΄βββββββββββββββ΄ββββββββββββββββββ΄βββββββββββββ΄ββββββββββββββββ΄βββββββββββ
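The naming rule visible in the two outputs above can be sketched in plain Python. This is an illustration inferred from those outputs and from the parameter descriptions, not the ibis implementation; wide_column_names is a hypothetical helper, and placing names_prefix in front of the joined parts is an assumption based on the description "String added to the start of every column name".

```python
def wide_column_names(names, values_from, names_prefix="", names_sep="_"):
    """Illustrative naming rule inferred from the outputs above; not the ibis source."""
    columns = []
    for key in names:  # one tuple per distinct combination of names_from values
        for value_col in values_from:
            # The value column name is only included when there are several of them.
            parts = ([value_col] if len(values_from) > 1 else []) + list(key)
            columns.append(names_prefix + names_sep.join(parts))
    return columns


default_sep = wide_column_names([("income",), ("rent",)], ["estimate", "moe"])
dot_sep = wide_column_names(
    [("income",), ("rent",)], ["estimate", "moe"], names_sep="."
)
```

default_sep and dot_sep reproduce the pivoted column names of the two us_rent_income examples, in the same order.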
Supply an alternative function to summarize values
>>> warpbreaks = ibis.examples.warpbreaks.fetch().select("wool" , "tension" , "breaks" )
>>> warpbreaks
ββββββββββ³ββββββββββ³βββββββββ
β wool β tension β breaks β
β‘ββββββββββββββββββββββββββββ©
β string β string β int64 β
ββββββββββΌββββββββββΌβββββββββ€
β A β L β 26 β
β A β L β 30 β
β A β L β 54 β
β A β L β 25 β
β A β L β 70 β
β A β L β 52 β
β A β L β 51 β
β A β L β 26 β
β A β L β 67 β
β A β M β 18 β
β β¦ β β¦ β β¦ β
ββββββββββ΄ββββββββββ΄βββββββββ
>>> warpbreaks.pivot_wider(
... names_from= "wool" , values_from= "breaks" , values_agg= "mean"
... ).select("tension" , "A" , "B" ).order_by("tension" )
βββββββββββ³ββββββββββββ³ββββββββββββ
β tension β A β B β
β‘ββββββββββββββββββββββββββββββββββ©
β string β float64 β float64 β
βββββββββββΌββββββββββββΌββββββββββββ€
β H β 24.555556 β 18.777778 β
β L β 44.555556 β 28.222222 β
β M β 24.000000 β 28.777778 β
βββββββββββ΄ββββββββββββ΄ββββββββββββ
Passing Deferred
objects to values_agg
is supported
>>> warpbreaks.pivot_wider(
... names_from= "tension" ,
... values_from= "breaks" ,
... values_agg= _.sum (),
... ).select("wool" , "H" , "L" , "M" ).order_by(s.all ())
ββββββββββ³ββββββββ³ββββββββ³ββββββββ
β wool β H β L β M β
β‘βββββββββββββββββββββββββββββββββ©
β string β int64 β int64 β int64 β
ββββββββββΌββββββββΌββββββββΌββββββββ€
β A β 221 β 401 β 216 β
β B β 169 β 254 β 259 β
ββββββββββ΄ββββββββ΄ββββββββ΄ββββββββ
Use a custom aggregate function
>>> warpbreaks.pivot_wider(
... names_from= "wool" ,
... values_from= "breaks" ,
... values_agg= lambda col: col.std() / col.mean(),
... ).select("tension" , "A" , "B" ).order_by("tension" )
βββββββββββ³βββββββββββ³βββββββββββ
β tension β A β B β
β‘ββββββββββββββββββββββββββββββββ©
β string β float64 β float64 β
βββββββββββΌβββββββββββΌβββββββββββ€
β H β 0.418344 β 0.260590 β
β L β 0.406183 β 0.349325 β
β M β 0.360844 β 0.327719 β
βββββββββββ΄βββββββββββ΄βββββββββββ
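The custom aggregate above is a coefficient of variation (standard deviation divided by mean). As a sanity check, the value for wool A at tension L can be recomputed with the standard library. This sketch assumes that the nine rows shown in the warpbreaks preview are the complete group (their sum, 401, and mean, 44.555556, match the earlier outputs) and that std() is a sample standard deviation on the backend that produced this output.

```python
import statistics

# The nine `breaks` values shown above for wool "A", tension "L".
breaks_a_l = [26, 30, 54, 25, 70, 52, 51, 26, 67]

mean = statistics.mean(breaks_a_l)
std = statistics.stdev(breaks_a_l)  # sample standard deviation (n - 1 denominator)
cv = std / mean  # coefficient of variation
```

Under those assumptions, cv agrees with the 0.406183 reported in the L row of column A.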
Generate some random data, setting the random seed for reproducibility
>>> import random
>>> random.seed(0 )
>>> raw = ibis.memtable(
... [
... dict (
... product= product,
... country= country,
... year= year,
... production= random.random(),
... )
... for product in "AB"
... for country in ["AI" , "EI" ]
... for year in range (2000 , 2015 )
... ]
... )
>>> production = raw.filter (((_.product == "A" ) & (_.country == "AI" )) | (_.product == "B" ))
>>> production.order_by(s.all ())
βββββββββββ³ββββββββββ³ββββββββ³βββββββββββββ
β product β country β year β production β
β‘βββββββββββββββββββββββββββββββββββββββββ©
β string β string β int64 β float64 β
βββββββββββΌββββββββββΌββββββββΌβββββββββββββ€
β A β AI β 2000 β 0.844422 β
β A β AI β 2001 β 0.757954 β
β A β AI β 2002 β 0.420572 β
β A β AI β 2003 β 0.258917 β
β A β AI β 2004 β 0.511275 β
β A β AI β 2005 β 0.404934 β
β A β AI β 2006 β 0.783799 β
β A β AI β 2007 β 0.303313 β
β A β AI β 2008 β 0.476597 β
β A β AI β 2009 β 0.583382 β
β β¦ β β¦ β β¦ β β¦ β
βββββββββββ΄ββββββββββ΄ββββββββ΄βββββββββββββ
Pivoting with multiple name columns
>>> production.pivot_wider(
... names_from= ["product" , "country" ],
... values_from= "production" ,
... )
βββββββββ³βββββββββββ³βββββββββββ³βββββββββββ
β year β A_AI β B_AI β B_EI β
β‘βββββββββββββββββββββββββββββββββββββββββ©
β int64 β float64 β float64 β float64 β
βββββββββΌβββββββββββΌβββββββββββΌβββββββββββ€
β 2004 β 0.511275 β 0.548699 β 0.967540 β
β 2006 β 0.783799 β 0.719705 β 0.447970 β
β 2007 β 0.303313 β 0.398824 β 0.080446 β
β 2008 β 0.476597 β 0.824845 β 0.320055 β
β 2011 β 0.504687 β 0.493578 β 0.109058 β
β 2002 β 0.420572 β 0.260492 β 0.567511 β
β 2005 β 0.404934 β 0.014042 β 0.803179 β
β 2001 β 0.757954 β 0.865310 β 0.191067 β
β 2003 β 0.258917 β 0.805028 β 0.238616 β
β 2009 β 0.583382 β 0.668153 β 0.507941 β
β β¦ β β¦ β β¦ β β¦ β
βββββββββ΄βββββββββββ΄βββββββββββ΄βββββββββββ
Select a subset of names. This call incurs no computation when constructing the expression.
>>> production.pivot_wider(
... names_from= ["product" , "country" ],
... names= [("A" , "AI" ), ("B" , "AI" )],
... values_from= "production" ,
... )
βββββββββ³βββββββββββ³βββββββββββ
β year β A_AI β B_AI β
β‘ββββββββββββββββββββββββββββββ©
β int64 β float64 β float64 β
βββββββββΌβββββββββββΌβββββββββββ€
β 2002 β 0.420572 β 0.260492 β
β 2005 β 0.404934 β 0.014042 β
β 2013 β 0.755804 β 0.243911 β
β 2001 β 0.757954 β 0.865310 β
β 2003 β 0.258917 β 0.805028 β
β 2009 β 0.583382 β 0.668153 β
β 2012 β 0.281838 β 0.867603 β
β 2004 β 0.511275 β 0.548699 β
β 2006 β 0.783799 β 0.719705 β
β 2007 β 0.303313 β 0.398824 β
β β¦ β β¦ β β¦ β
βββββββββ΄βββββββββββ΄βββββββββββ
Sort the new columns' names
>>> production.pivot_wider(
... names_from= ["product" , "country" ],
... values_from= "production" ,
... names_sort= True ,
... )
βββββββββ³βββββββββββ³βββββββββββ³βββββββββββ
β year β A_AI β B_AI β B_EI β
β‘βββββββββββββββββββββββββββββββββββββββββ©
β int64 β float64 β float64 β float64 β
βββββββββΌβββββββββββΌβββββββββββΌβββββββββββ€
β 2001 β 0.757954 β 0.865310 β 0.191067 β
β 2003 β 0.258917 β 0.805028 β 0.238616 β
β 2009 β 0.583382 β 0.668153 β 0.507941 β
β 2012 β 0.281838 β 0.867603 β 0.551267 β
β 2004 β 0.511275 β 0.548699 β 0.967540 β
β 2006 β 0.783799 β 0.719705 β 0.447970 β
β 2007 β 0.303313 β 0.398824 β 0.080446 β
β 2008 β 0.476597 β 0.824845 β 0.320055 β
β 2011 β 0.504687 β 0.493578 β 0.109058 β
β 2000 β 0.844422 β 0.477010 β 0.870471 β
β β¦ β β¦ β β¦ β β¦ β
βββββββββ΄βββββββββββ΄βββββββββββ΄βββββββββββ
preview
preview(
max_rows= None ,
max_columns= None ,
max_length= None ,
max_string= None ,
max_depth= None ,
console_width= None ,
)
Return a subset as a Rich Table.
This is an explicit version of what you get when you inspect this object in interactive mode, except with this version you can pass formatting options. The options are the same as those exposed in ibis.options.interactive
.
Parameters
max_rows
int | None
Maximum number of rows to display
None
max_columns
int | None
Maximum number of columns to display
None
max_length
int | None
Maximum length for pretty-printed arrays and maps
None
max_string
int | None
Maximum length for pretty-printed strings
None
max_depth
int | None
Maximum depth for nested data types
None
console_width
int | float | None
Width of the console in characters. If not specified, the width will be inferred from the console.
None
Examples
>>> import ibis
>>> t = ibis.examples.penguins.fetch()
Because the console_width is too small, only 2 columns are shown even though we specified up to 3.
>>> t.preview(
... max_rows= 3 ,
... max_columns= 3 ,
... max_string= 8 ,
... console_width= 30 ,
... )
βββββββββββ³βββββββββββ³ββββ
β species β island β β¦ β
β‘βββββββββββββββββββββββββ©
β string β string β β¦ β
βββββββββββΌβββββββββββΌββββ€
β Adelie β Torgersβ¦ β β¦ β
β Adelie β Torgersβ¦ β β¦ β
β Adelie β Torgersβ¦ β β¦ β
β β¦ β β¦ β β¦ β
βββββββββββ΄βββββββββββ΄ββββ
relabel
Deprecated in favor of Table.rename
.
relocate
relocate(* columns, before= None , after= None , ** kwargs)
Relocate columns
before or after other specified columns.
Parameters
columns
str | s
.Selector
Columns to relocate. Selectors are accepted.
()
before
str | s
.Selector
| None
A column name or selector to insert the new columns before.
None
after
str | s
.Selector
| None
A column name or selector. Columns in columns
are relocated after the last column selected in after
.
None
kwargs
str
Additional column names to relocate, renaming argument values to keyword argument names.
{}
Returns
Table
A table with the columns relocated.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> import ibis.selectors as s
>>> t = ibis.memtable(dict (a= [1 ], b= [1 ], c= [1 ], d= ["a" ], e= ["a" ], f= ["a" ]))
>>> t
βββββββββ³ββββββββ³ββββββββ³βββββββββ³βββββββββ³βββββββββ
β a β b β c β d β e β f β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β int64 β string β string β string β
βββββββββΌββββββββΌββββββββΌβββββββββΌβββββββββΌβββββββββ€
β 1 β 1 β 1 β a β a β a β
βββββββββ΄ββββββββ΄ββββββββ΄βββββββββ΄βββββββββ΄βββββββββ
>>> t.relocate("f")
ββββββββββ³ββββββββ³ββββββββ³ββββββββ³βββββββββ³βββββββββ
β f β a β b β c β d β e β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β int64 β int64 β string β string β
ββββββββββΌββββββββΌββββββββΌββββββββΌβββββββββΌβββββββββ€
β a β 1 β 1 β 1 β a β a β
ββββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄βββββββββ΄βββββββββ
>>> t.relocate("a" , after= "c" )
βββββββββ³ββββββββ³ββββββββ³βββββββββ³βββββββββ³βββββββββ
β b β c β a β d β e β f β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β int64 β string β string β string β
βββββββββΌββββββββΌββββββββΌβββββββββΌβββββββββΌβββββββββ€
β 1 β 1 β 1 β a β a β a β
βββββββββ΄ββββββββ΄ββββββββ΄βββββββββ΄βββββββββ΄βββββββββ
>>> t.relocate("f" , before= "b" )
βββββββββ³βββββββββ³ββββββββ³ββββββββ³βββββββββ³βββββββββ
β a β f β b β c β d β e β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β string β int64 β int64 β string β string β
βββββββββΌβββββββββΌββββββββΌββββββββΌβββββββββΌβββββββββ€
β 1 β a β 1 β 1 β a β a β
βββββββββ΄βββββββββ΄ββββββββ΄ββββββββ΄βββββββββ΄βββββββββ
>>> t.relocate("a" , after= s.last())
βββββββββ³ββββββββ³βββββββββ³βββββββββ³βββββββββ³ββββββββ
β b β c β d β e β f β a β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββ©
β int64 β int64 β string β string β string β int64 β
βββββββββΌββββββββΌβββββββββΌβββββββββΌβββββββββΌββββββββ€
β 1 β 1 β a β a β a β 1 β
βββββββββ΄ββββββββ΄βββββββββ΄βββββββββ΄βββββββββ΄ββββββββ
Relocate allows renaming
>>> t.relocate(ff="f")
ββββββββββ³ββββββββ³ββββββββ³ββββββββ³βββββββββ³βββββββββ
β ff β a β b β c β d β e β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β int64 β int64 β string β string β
ββββββββββΌββββββββΌββββββββΌββββββββΌβββββββββΌβββββββββ€
β a β 1 β 1 β 1 β a β a β
ββββββββββ΄ββββββββ΄ββββββββ΄ββββββββ΄βββββββββ΄βββββββββ
You can relocate based on any predicate selector, such as of_type
>>> t.relocate(s.of_type("string" ))
ββββββββββ³βββββββββ³βββββββββ³ββββββββ³ββββββββ³ββββββββ
β d β e β f β a β b β c β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β string β int64 β int64 β int64 β
ββββββββββΌβββββββββΌβββββββββΌββββββββΌββββββββΌββββββββ€
β a β a β a β 1 β 1 β 1 β
ββββββββββ΄βββββββββ΄βββββββββ΄ββββββββ΄ββββββββ΄ββββββββ
>>> t.relocate(s.numeric(), after= s.last())
ββββββββββ³βββββββββ³βββββββββ³ββββββββ³ββββββββ³ββββββββ
β d β e β f β a β b β c β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β string β string β int64 β int64 β int64 β
ββββββββββΌβββββββββΌβββββββββΌββββββββΌββββββββΌββββββββ€
β a β a β a β 1 β 1 β 1 β
ββββββββββ΄βββββββββ΄βββββββββ΄ββββββββ΄ββββββββ΄ββββββββ
When before or after selects multiple columns, the relocated columns are placed before the first column selected by before, or after the last column selected by after
>>> t = ibis.memtable(dict (a= [1 ], b= ["a" ], c= [1 ], d= ["a" ]))
>>> t.relocate(s.numeric(), after= s.of_type("string" ))
ββββββββββ³βββββββββ³ββββββββ³ββββββββ
β b β d β a β c β
β‘ββββββββββββββββββββββββββββββββββ©
β string β string β int64 β int64 β
ββββββββββΌβββββββββΌββββββββΌββββββββ€
β a β a β 1 β 1 β
ββββββββββ΄βββββββββ΄ββββββββ΄ββββββββ
>>> t.relocate(s.numeric(), before= s.of_type("string" ))
βββββββββ³ββββββββ³βββββββββ³βββββββββ
β a β c β b β d β
β‘ββββββββββββββββββββββββββββββββββ©
β int64 β int64 β string β string β
βββββββββΌββββββββΌβββββββββΌβββββββββ€
β 1 β 1 β a β a β
βββββββββ΄ββββββββ΄βββββββββ΄βββββββββ
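The placement rules shown in the relocate examples can be modelled on plain lists of column names. The sketch below is a hypothetical helper, not an ibis API: selectors are replaced by explicit lists of names, it assumes the before/after columns are not themselves being moved, and it accepts only one of before and after.

```python
def relocate_order(columns, moved, before=None, after=None):
    """Model of the resulting column order; a hypothetical helper, not an ibis API.

    Assumes the `before`/`after` columns are not themselves among `moved`.
    """
    if before is not None and after is not None:
        raise ValueError("pass at most one of before/after")
    rest = [c for c in columns if c not in moved]
    if before is not None:
        # Insert ahead of the first column matched by `before`.
        index = min(rest.index(c) for c in before)
    elif after is not None:
        # Insert behind the last column matched by `after`.
        index = max(rest.index(c) for c in after) + 1
    else:
        index = 0  # default: move to the front
    return rest[:index] + list(moved) + rest[index:]


# Numeric columns a and c, string columns b and d, as in the last two examples.
after_strings = relocate_order(["a", "b", "c", "d"], ["a", "c"], after=["b", "d"])
before_strings = relocate_order(["a", "b", "c", "d"], ["a", "c"], before=["b", "d"])
```

after_strings and before_strings give the same column orders as the two outputs above.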
When there are duplicate renames in a call to relocate, the last one is preserved
>>> t.relocate(e= "d" , f= "d" )
ββββββββββ³ββββββββ³βββββββββ³ββββββββ
β f β a β b β c β
β‘ββββββββββββββββββββββββββββββββββ©
β string β int64 β string β int64 β
ββββββββββΌββββββββΌβββββββββΌββββββββ€
β a β 1 β a β 1 β
ββββββββββ΄ββββββββ΄βββββββββ΄ββββββββ
However, if there are duplicates that are not part of a rename, the order specified in the relocate call is preserved
>>> t.relocate(
... "b" ,
... s.of_type("string" ), # "b" is a string column, so the selector matches
... )
ββββββββββ³βββββββββ³ββββββββ³ββββββββ
β b β d β a β c β
β‘ββββββββββββββββββββββββββββββββββ©
β string β string β int64 β int64 β
ββββββββββΌβββββββββΌββββββββΌββββββββ€
β a β a β 1 β 1 β
ββββββββββ΄βββββββββ΄ββββββββ΄ββββββββ
rename
rename(method= None , / , ** substitutions)
Rename columns in the table.
Parameters
method
str | Callable [[str ], str | None] | Literal ['snake_case', 'ALL_CAPS'] | Mapping [str , str ] | None
An optional method for renaming columns. May be one of:
- A format string to use to rename all columns, like "prefix_{name}".
- A function from old name to new name. If the function returns None the old name is used.
- The literal strings "snake_case" or "ALL_CAPS" to rename all columns using a snake_case or ALL_CAPS naming convention respectively.
- A mapping from new name to old name. Existing columns not present in the mapping will pass through with their original name.
None
substitutions
str
Columns to be explicitly renamed, expressed as new_name=old_name keyword arguments.
{}
Returns
Table
A renamed table expression
Examples
>>> import ibis
>>> import ibis.selectors as s
>>> ibis.options.interactive = True
>>> first3 = s.index[:3 ] # first 3 columns
>>> t = ibis.examples.penguins_raw_raw.fetch().select(first3)
>>> t
βββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββ
β studyName β Sample Number β Species β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β string β
βββββββββββββΌββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ€
β PAL0708 β 1 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 2 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 3 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 4 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 5 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 6 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 7 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 8 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 9 β Adelie Penguin (Pygoscelis adeliae) β
β PAL0708 β 10 β Adelie Penguin (Pygoscelis adeliae) β
β β¦ β β¦ β β¦ β
βββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββ
Rename specific columns by passing keyword arguments like new_name="old_name"
>>> t.rename(study_name= "studyName" ).head(1 )
ββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββ
β study_name β Sample Number β Species β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β string β
ββββββββββββββΌββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ€
β PAL0708 β 1 β Adelie Penguin (Pygoscelis adeliae) β
ββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββ
Rename all columns using a format string
>>> t.rename("p_{name}").head(1)
βββββββββββββββ³ββββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββ
β p_studyName β p_Sample Number β p_Species β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β string β
βββββββββββββββΌββββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ€
β PAL0708 β 1 β Adelie Penguin (Pygoscelis adeliae) β
βββββββββββββββ΄ββββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββ
Rename all columns using a snake_case convention
>>> t.rename("snake_case" ).head(1 )
ββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββ
β study_name β sample_number β species β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β string β
ββββββββββββββΌββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ€
β PAL0708 β 1 β Adelie Penguin (Pygoscelis adeliae) β
ββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββ
Rename all columns using an ALL_CAPS convention
>>> t.rename("ALL_CAPS" ).head(1 )
ββββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββ
β STUDY_NAME β SAMPLE_NUMBER β SPECIES β
β‘βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β string β
ββββββββββββββΌββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ€
β PAL0708 β 1 β Adelie Penguin (Pygoscelis adeliae) β
ββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββ
Rename all columns using a callable
>>> t.rename(str .upper).head(1 )
βββββββββββββ³ββββββββββββββββ³ββββββββββββββββββββββββββββββββββββββ
β STUDYNAME β SAMPLE NUMBER β SPECIES β
β‘ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ©
β string β int64 β string β
βββββββββββββΌββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ€
β PAL0708 β 1 β Adelie Penguin (Pygoscelis adeliae) β
βββββββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββ
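The renaming methods can be modelled on a plain list of column names, which makes the new_name=old_name direction easy to see. This is a sketch under stated assumptions, not the ibis implementation: rename_columns is a hypothetical helper, the "snake_case" and "ALL_CAPS" conventions are deliberately left out, and giving keyword substitutions precedence over method is an assumption.

```python
def rename_columns(columns, method=None, **substitutions):
    """Model of how new names are derived; a hypothetical helper, not an ibis API."""
    if isinstance(method, str) and method in ("snake_case", "ALL_CAPS"):
        raise NotImplementedError("naming conventions are not modelled in this sketch")
    if method is None:
        rename = lambda name: None
    elif isinstance(method, str):
        rename = lambda name: method.format(name=name)  # format string
    elif callable(method):
        rename = method  # function from old name to new name
    else:
        old_to_new = {old: new for new, old in method.items()}  # mapping is new -> old
        rename = old_to_new.get
    # Keyword arguments are also written as new_name=old_name.
    explicit = {old: new for new, old in substitutions.items()}
    result = []
    for name in columns:
        # A None result keeps the old name.
        result.append(explicit.get(name) or rename(name) or name)
    return result


cols = ["studyName", "Sample Number", "Species"]
by_keyword = rename_columns(cols, study_name="studyName")
by_format = rename_columns(cols, "p_{name}")
by_callable = rename_columns(cols, str.upper)
```

The three results match the column headers of the keyword, format string, and callable examples above.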
rowid
A unique integer per row.
Any further meaning behind this expression is backend dependent. Generally this corresponds to some index into the database storage (for example, SQLite's and DuckDB's rowid).
For a monotonically increasing row number, see ibis.row_number
.
sample
sample(fraction, * , method= 'row' , seed= None )
Sample a fraction of rows from a table.
Sampling is by definition a random operation. Some backends support specifying a seed
for repeatable results, but not all backends support that option. Some backends (DuckDB, for example) do support specifying a seed but may still not produce repeatable results in all cases.
In all cases, results are backend-specific. An execution against one backend is unlikely to sample the same rows when executed against a different backend, even with the same seed
set.
Parameters
fraction
float
The fraction of rows to include in the sample, expressed as a float between 0 and 1.
required
method
Literal ['row', 'block']
The sampling method to use. The default is "row", which includes each row with a probability of fraction. If method is "block", some backends may instead sample a fraction of blocks of rows (where "block" is a backend-dependent definition), which may be significantly more efficient (at the cost of a less statistically random sample). This is identical to "row" for backends lacking a blockwise sampling implementation. For those coming from SQL, "row" and "block" correspond to "bernoulli" and "system" respectively in a TABLESAMPLE clause.
'row'
seed
int | None
An optional random seed to use, for repeatable sampling. The range of possible seed values is backend specific (most support at least [0, 2**31 - 1]
). Backends that never support specifying a seed for repeatable sampling will error appropriately. Note that some backends (like DuckDB) do support specifying a seed, but may still not have repeatable results in all cases.
None
Returns
Table
The input table, with fraction
of rows selected.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"x" : [1 , 2 , 3 , 4 ], "y" : ["a" , "b" , "c" , "d" ]})
>>> t
βββββββββ³βββββββββ
β x β y β
β‘βββββββββββββββββ©
β int64 β string β
βββββββββΌβββββββββ€
β 1 β a β
β 2 β b β
β 3 β c β
β 4 β d β
βββββββββ΄βββββββββ
Sample approximately half the rows, with a seed specified for reproducibility.
>>> t.sample(0.5 , seed= 1234 )
βββββββββ³βββββββββ
β x β y β
β‘βββββββββββββββββ©
β int64 β string β
βββββββββΌβββββββββ€
β 2 β b β
β 3 β c β
βββββββββ΄βββββββββ
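The "row" method described above is Bernoulli sampling: each row is kept independently with probability fraction, so the number of rows returned varies around fraction times the table size. A minimal pure-Python illustration follows. bernoulli_sample is a hypothetical helper, and its random number generator is unrelated to any backend's, so it will not select the same rows as the ibis example even with the same seed.

```python
import random


def bernoulli_sample(rows, fraction, seed=None):
    """Keep each row independently with probability `fraction` (illustration only)."""
    rng = random.Random(seed)  # a fixed seed makes this sketch repeatable
    return [row for row in rows if rng.random() < fraction]


rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
first = bernoulli_sample(rows, 0.5, seed=1234)
second = bernoulli_sample(rows, 0.5, seed=1234)
```

first and second are identical because the seed is fixed. A fraction of 0.5 does not guarantee that exactly two rows come back.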
schema
Return the Schema for this table.
Returns
Schema
The table's schema.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.schema()
ibis.Schema {
species string
island string
bill_length_mm float64
bill_depth_mm float64
flipper_length_mm int64
body_mass_g int64
sex string
year int64
}
select
select(* exprs, ** named_exprs)
Compute a new table expression using exprs
and named_exprs
.
Passing an aggregate function to this method will broadcast the aggregate's value over the number of rows in the table and automatically construct a window function expression. See the examples section for more details.
For backwards compatibility the keyword argument exprs
is reserved and cannot be used to name an expression. This behavior will be removed in v4.
Parameters
exprs
ir
.Value
| str | Iterable [ir
.Value
| str ]
Column expression, string, or list of column expressions and strings.
()
named_exprs
ir.Value | str
Column expressions
{}
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
Simple projection
>>> t.select("island", "bill_length_mm").head()
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ island    ┃ bill_length_mm ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ string    │ float64        │
├───────────┼────────────────┤
│ Torgersen │           39.1 │
│ Torgersen │           39.5 │
│ Torgersen │           40.3 │
│ Torgersen │           NULL │
│ Torgersen │           36.7 │
└───────────┴────────────────┘
In that simple case, you could also just use Python's indexing syntax
>>> t[["island", "bill_length_mm"]].head()
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ island    ┃ bill_length_mm ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ string    │ float64        │
├───────────┼────────────────┤
│ Torgersen │           39.1 │
│ Torgersen │           39.5 │
│ Torgersen │           40.3 │
│ Torgersen │           NULL │
│ Torgersen │           36.7 │
└───────────┴────────────────┘
Projection by zero-indexed column position
>>> t.select(t[0], t[4]).head()
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ species ┃ flipper_length_mm ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ string  │ int64             │
├─────────┼───────────────────┤
│ Adelie  │               181 │
│ Adelie  │               186 │
│ Adelie  │               195 │
│ Adelie  │              NULL │
│ Adelie  │               193 │
└─────────┴───────────────────┘
Projection with renaming and compute in one call
>>> t.select(next_year=t.year + 1).head()
┏━━━━━━━━━━━┓
┃ next_year ┃
┡━━━━━━━━━━━┩
│ int64     │
├───────────┤
│      2008 │
│      2008 │
│      2008 │
│      2008 │
│      2008 │
└───────────┘
You can do the same thing with a named expression, and using the deferred API
>>> from ibis import _
>>> t.select((_.year + 1).name("next_year")).head()
┏━━━━━━━━━━━┓
┃ next_year ┃
┡━━━━━━━━━━━┩
│ int64     │
├───────────┤
│      2008 │
│      2008 │
│      2008 │
│      2008 │
│      2008 │
└───────────┘
Projection with aggregation expressions
>>> t.select("island", bill_mean=t.bill_length_mm.mean()).head()
┏━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ island    ┃ bill_mean ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━┩
│ string    │ float64   │
├───────────┼───────────┤
│ Torgersen │  43.92193 │
│ Torgersen │  43.92193 │
│ Torgersen │  43.92193 │
│ Torgersen │  43.92193 │
│ Torgersen │  43.92193 │
└───────────┴───────────┘
Projection with a selector
>>> import ibis.selectors as s
>>> t.select(s.numeric() & ~s.cols("year")).head()
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ float64        │ float64       │ int64             │ int64       │
├────────────────┼───────────────┼───────────────────┼─────────────┤
│           39.1 │          18.7 │               181 │        3750 │
│           39.5 │          17.4 │               186 │        3800 │
│           40.3 │          18.0 │               195 │        3250 │
│           NULL │          NULL │              NULL │        NULL │
│           36.7 │          19.3 │               193 │        3450 │
└────────────────┴───────────────┴───────────────────┴─────────────┘
Projection + aggregation across multiple columns
>>> from ibis import _
>>> t.select(s.across(s.numeric() & ~s.cols("year"), _.mean())).head()
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ float64        │ float64       │ float64           │ float64     │
├────────────────┼───────────────┼───────────────────┼─────────────┤
│       43.92193 │      17.15117 │        200.915205 │ 4201.754386 │
│       43.92193 │      17.15117 │        200.915205 │ 4201.754386 │
│       43.92193 │      17.15117 │        200.915205 │ 4201.754386 │
│       43.92193 │      17.15117 │        200.915205 │ 4201.754386 │
│       43.92193 │      17.15117 │        200.915205 │ 4201.754386 │
└────────────────┴───────────────┴───────────────────┴─────────────┘
sql
Run a SQL query against a table expression.
Parameters
query
str
Query string
required
dialect
str | None
Optional string indicating the dialect of query. Defaults to the backend's native dialect.
None
Returns
Table
An opaque table expression
Examples
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch(table_name="penguins")
>>> expr = t.sql(
... """
... SELECT island, mean(bill_length_mm) AS avg_bill_length
... FROM penguins
... GROUP BY 1
... ORDER BY 2 DESC
... """
... )
>>> expr
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ island    ┃ avg_bill_length ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ string    │ float64         │
├───────────┼─────────────────┤
│ Biscoe    │       45.257485 │
│ Dream     │       44.167742 │
│ Torgersen │       38.950980 │
└───────────┴─────────────────┘
Mix and match ibis expressions with SQL queries
>>> t = ibis.examples.penguins.fetch(table_name="penguins")
>>> expr = t.sql(
... """
... SELECT island, mean(bill_length_mm) AS avg_bill_length
... FROM penguins
... GROUP BY 1
... ORDER BY 2 DESC
... """
... )
>>> expr = expr.mutate(
...     island=_.island.lower(),
...     avg_bill_length=_.avg_bill_length.round(1),
... )
>>> expr
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ island    ┃ avg_bill_length ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ string    │ float64         │
├───────────┼─────────────────┤
│ biscoe    │            45.3 │
│ dream     │            44.2 │
│ torgersen │            39.0 │
└───────────┴─────────────────┘
Because ibis expressions aren't named, they aren't visible to subsequent .sql calls. Use the alias method to assign a name to an expression.
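As a loose analogy using only the standard-library sqlite3 module (not ibis itself), an intermediate query result has no name that later SQL can refer to until one is assigned. The sketch below names the intermediate result with a temporary view called "b"; the table contents are made up, and how ibis implements alias on a given backend is not shown here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE penguins (island TEXT, bill_length_mm REAL)")
con.executemany(
    "INSERT INTO penguins VALUES (?, ?)",
    [("Biscoe", 45.0), ("Biscoe", 46.0), ("Dream", 44.0), ("Torgersen", 39.0)],
)

# The aggregated query has no name of its own, so later SQL cannot refer to it.
# Giving it a name (here a temporary view called "b") makes it addressable.
con.execute(
    """
    CREATE TEMP VIEW b AS
    SELECT lower(island) AS island,
           round(avg(bill_length_mm), 1) AS avg_bill_length
    FROM penguins
    GROUP BY island
    """
)

# A follow-up query can now select from the named result.
result = con.execute(
    "SELECT * FROM b WHERE avg_bill_length > 40 ORDER BY island"
).fetchall()
```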
>>> expr.alias("b").sql("SELECT * FROM b WHERE avg_bill_length > 40")
┏━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ island ┃ avg_bill_length ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ string │ float64         │
├────────┼─────────────────┤
│ biscoe │            45.3 │
│ dream  │            44.2 │
└────────┴─────────────────┘
to_array
Deprecated - use as_scalar instead.
to_csv
to_csv(path, *, params=None, **kwargs)
Write the results of executing the given expression to a CSV file.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.csv.CSVWriter.html for details.
Parameters
path
str | Path
The data target. A string or Path to the CSV file.
required
params
Mapping[ir.Scalar, Any] | None
Mapping of scalar parameter expressions to value.
None
**kwargs
Any
Additional keyword arguments passed to pyarrow.csv.CSVWriter
{}
to_delta
to_delta(path, *, params=None, **kwargs)
Write the results of executing the given expression to a Delta Lake table.
This method is eager and will execute the associated expression immediately.
Parameters
path
str | Path
The data target. A string or Path to the Delta Lake table directory.
required
params
Mapping[ir.Scalar, Any] | None
Mapping of scalar parameter expressions to value.
None
**kwargs
Any
Additional keyword arguments passed to deltalake.writer.write_deltalake method
{}
to_pandas
Convert a table expression to a pandas DataFrame.
Parameters
kwargs
Same as keyword arguments to execute
{}
to_pandas_batches
to_pandas_batches(limit=None, params=None, chunk_size=1000000, **kwargs)
Execute expression and return an iterator of pandas DataFrames.
This method is eager and will execute the associated expression immediately.
Parameters
limit
int | str | None
An integer to effect a specific row limit. A value of None means "no limit". The default is in ibis/config.py.
None
params
Mapping[ir.Value, Any] | None
Mapping of scalar parameter expressions to value.
None
chunk_size
int
Maximum number of rows in each returned DataFrame.
1000000
kwargs
Any
Keyword arguments
{}
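The role of chunk_size can be illustrated with a plain-Python generator that yields batches of at most that many rows. This is a sketch of the batching idea only, not the ibis implementation; `iter_batches` is a hypothetical helper name.

```python
from itertools import islice


def iter_batches(rows, chunk_size):
    """Yield successive lists of at most `chunk_size` rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, chunk_size))
        if not batch:
            return
        yield batch


# Ten rows split into batches of at most four rows each.
batches = list(iter_batches(range(10), chunk_size=4))
```

Only the last batch may be shorter than chunk_size, and concatenating the batches in order gives back the original rows.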
to_parquet
to_parquet(path, *, params=None, **kwargs)
Write the results of executing the given expression to a parquet file.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetWriter.html for details.
Parameters
path
str | Path
The data target. A string or Path to the parquet file.
required
params
Mapping[ir.Scalar, Any] | None
Mapping of scalar parameter expressions to value.
None
**kwargs
Any
Additional keyword arguments passed to pyarrow.parquet.ParquetWriter
{}
Examples
Write out an expression to a single parquet file.
>>> import ibis
>>> import tempfile
>>> penguins = ibis.examples.penguins.fetch()
>>> penguins.to_parquet(tempfile.mktemp())
Partition on a single column.
>>> penguins.to_parquet(tempfile.mkdtemp(), partition_by="year")
Partition on multiple columns.
>>> penguins.to_parquet(tempfile.mkdtemp(), partition_by=("year", "island"))
to_parquet_dir
to_parquet_dir(directory, *, params=None, **kwargs)
Write the results of executing the given expression to a parquet file in a directory.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.dataset.write_dataset.html for details.
Parameters
directory
str | Path
The data target. A string or Path to the directory where the parquet file will be written.
required
params
Mapping[ir.Scalar, Any] | None
Mapping of scalar parameter expressions to value.
None
**kwargs
Any
Additional keyword arguments passed to pyarrow.dataset.write_dataset
{}
to_polars
to_polars(params=None, limit=None, **kwargs)
Execute expression and return results as a polars dataframe.
This method is eager and will execute the associated expression immediately.
Parameters
params
Mapping[ir.Scalar, Any] | None
Mapping of scalar parameter expressions to value.
None
limit
int | str | None
An integer to effect a specific row limit. A value of None means "no limit". The default is in ibis/config.py.
None
kwargs
Any
Keyword arguments
{}
Returns
DataFrame
A polars dataframe holding the results of the executed expression.
to_pyarrow
to_pyarrow(params=None, limit=None, **kwargs)
Execute expression and return results as a pyarrow table.
This method is eager and will execute the associated expression immediately.
Parameters
params
Mapping[ir.Scalar, Any] | None
Mapping of scalar parameter expressions to value.
None
limit
int | str | None
An integer to effect a specific row limit. A value of None means "no limit". The default is in ibis/config.py.
None
kwargs
Any
Keyword arguments
{}
Returns
Table
A pyarrow table holding the results of the executed expression.
to_pyarrow_batches
to_pyarrow_batches(limit=None, params=None, chunk_size=1000000, **kwargs)
Execute expression and return a RecordBatchReader.
This method is eager and will execute the associated expression immediately.
Parameters
limit
int | str | None
An integer to effect a specific row limit. A value of None means "no limit". The default is in ibis/config.py.
None
params
Mapping[ir.Value, Any] | None
Mapping of scalar parameter expressions to value.
None
chunk_size
int
Maximum number of rows in each returned record batch.
1000000
kwargs
Any
Keyword arguments
{}
Returns
results
RecordBatchReader
to_torch
to_torch(params=None, limit=None, **kwargs)
Execute an expression and return results as a dictionary of torch tensors.
Parameters
params
Mapping[ir.Scalar, Any] | None
Parameters to substitute into the expression.
None
limit
int | str | None
An integer to effect a specific row limit. A value of None means no limit.
None
kwargs
Any
Keyword arguments passed into the backend's to_torch implementation.
{}
Returns
dict[str, torch.Tensor]
A dictionary of torch tensors, keyed by column name.
try_cast
Cast the columns of a table.
If the cast fails for a row, the value is returned as NULL or NaN depending on backend behavior.
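The "failed casts become null" rule can be modelled in plain Python, with None standing in for NULL. This is a sketch of the semantics only, not the ibis implementation; `try_cast_value` is a hypothetical helper, and real backends may return NaN instead of NULL as noted above.

```python
def try_cast_value(value, target):
    """Convert `value` with `target`, returning None if the conversion fails."""
    try:
        return target(value)
    except (TypeError, ValueError):
        return None


# Column-oriented sample data and the target type for each column.
table = {"a": ["1", "2", "3"], "b": ["2.2", "3.3", "book"]}
schema = {"a": int, "b": float}

# Cast every value; a failure affects only that value, not the whole column.
casted = {
    name: [try_cast_value(v, schema[name]) for v in column]
    for name, column in table.items()
}
```

Here the string "book" cannot be parsed as a float, so only that entry becomes None while the other values in column "b" are converted normally.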
Parameters
schema
SchemaLike
Mapping, schema or iterable of pairs to use for casting
required
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": ["1", "2", "3"], "b": ["2.2", "3.3", "book"]})
>>> t.try_cast({"a": "int", "b": "float"})
┏━━━━━━━┳━━━━━━━━━┓
┃ a     ┃ b       ┃
┡━━━━━━━╇━━━━━━━━━┩
│ int64 │ float64 │
├───────┼─────────┤
│     1 │     2.2 │
│     2 │     3.3 │
│     3 │    NULL │
└───────┴─────────┘
unbind
Return an expression built on UnboundTable instead of backend-specific objects.
Examples
>>> import ibis
>>> import pandas as pd