>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 2, 3, None]})
>>> t.order_by(t.a.asc())
┏━━━━━━━━━┓ ┃ a ┃ ┡━━━━━━━━━┩ │ float64 │ ├─────────┤ │ 1.0 │ │ 2.0 │ │ 3.0 │ │ NULL │ └─────────┘
Scalars and columns of any element type.
Base class for a data generating expression having a known type.
Name | Description |
---|---|
as_scalar | Convert an expression to a scalar. |
as_table | Convert an expression to a table. |
asc | Sort an expression in ascending order. |
between | Check if this expression is between lower and upper , inclusive. |
case | DEPRECATED: use value.cases() or ibis.cases() instead. |
cases | Create a multi-branch if-else expression. |
cast | Cast expression to indicated data type. |
coalesce | Return the first non-null value from args . |
collect | Aggregate this expression’s elements into an array. |
compile | Compile expr to a SQL string (for SQL backends) or a LazyFrame (for the polars backend). |
desc | Sort an expression in descending order. |
equals | Return whether this expression is structurally equivalent to other . |
execute | Execute an expression against its backend if one exists. |
fill_null | Replace NULL s with the given value. Does NOT affect NaN and inf values. |
fillna | DEPRECATED: use fill_null instead, which acts exactly the same. |
get_backend | Get the current Ibis backend of the expression. |
get_name | Return the name of this expression. |
group_concat | Concatenate values using the indicated separator to produce a string. |
hash | Compute an integer hash value. |
identical_to | Return whether this expression is identical to other. |
isin | Check whether this expression is in values . |
isnull | Whether this expression is NULL . Does NOT detect NaN and inf values. |
name | Rename an expression to name . |
notin | Check whether this expression is not in values . |
notnull | Return whether this expression is not NULL. |
nullif | Set values to null if they equal the values null_if_expr . |
over | Construct a window expression. |
pipe | Compose f with self . |
substitute | Replace values given in values with replacement . |
to_csv | Write the results of executing the given expression to a CSV file. |
to_delta | Write the results of executing the given expression to a Delta Lake table. |
to_json | Write the results of expr to a json file of [{column -> value}, …] objects. |
to_pandas | Convert a table expression to a pandas DataFrame. |
to_pandas_batches | Execute expression and return an iterator of pandas DataFrames. |
to_parquet | Write the results of executing the given expression to a parquet file. |
to_parquet_dir | Write the results of executing the given expression to a parquet file in a directory. |
to_polars | Execute expression and return results as a polars dataframe. |
to_pyarrow | Execute expression to a pyarrow object. |
to_pyarrow_batches | Execute expression and return a RecordBatchReader. |
to_sql | Compile to a formatted SQL string. |
to_torch | Execute an expression and return results as a dictionary of torch tensors. |
to_xlsx | Write a table to an Excel file. |
try_cast | Try cast expression to indicated data type. |
type | Return the DataType of self . |
typeof | Return the string name of the datatype of self. |
unbind | Return an expression built on UnboundTable instead of backend-specific objects. |
visualize | Visualize an expression as a GraphViz graph in the browser. |
Convert an expression to a scalar.
Convert an expression to a table.
Sort an expression in ascending order.
Name | Type | Description | Default |
---|---|---|---|
nulls_first | bool | Whether to sort NULL values first |
False |
Name | Type | Description |
---|---|---|
Value | Sorted expression |
Check if this expression is between lower
and upper
, inclusive.
Name | Type | Description | Default |
---|---|---|---|
lower | Value | Lower bound, inclusive | required |
upper | Value | Upper bound, inclusive | required |
Name | Type | Description |
---|---|---|
BooleanValue | Expression indicating membership in the provided range |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch().limit(5)
>>> t.bill_length_mm.between(35, 38)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ Between(bill_length_mm, 35, 38) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ boolean │ ├─────────────────────────────────┤ │ False │ │ False │ │ False │ │ NULL │ │ True │ └─────────────────────────────────┘
DEPRECATED: use value.cases()
or ibis.cases()
instead.
Create a multi-branch if-else expression.
Equivalent to a SQL CASE
statement.
Name | Type | Description | Default |
---|---|---|---|
branch | tuple[Value, Value] | First (condition , result ) pair. Required. |
required |
branches | tuple[Value, Value] | Additional (condition , result ) pairs. We look through the test values in order and return the result corresponding to the first test value that matches self . If none match, we return else_ . |
() |
else_ | Value | None | Value to return if none of the case conditions evaluate to True . Defaults to NULL . |
None |
Name | Type | Description |
---|---|---|
Value | A value expression |
Note that we never hit the None
case, because x = NULL
is always NULL
, which is not truthy. If you want to replace NULL
s, you should use .fill_null(some_value)
prior to cases()
.
>>> t.mutate(
... result=(
... t.symbol.cases(
... ("+", t.left + t.right),
... ("-", t.left - t.right),
... ("*", t.left * t.right),
... ("/", t.left / t.right),
... (None, -999),
... )
... )
... )
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━┓ ┃ left ┃ symbol ┃ right ┃ result ┃ ┡━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━┩ │ int64 │ string │ int64 │ float64 │ ├───────┼────────┼───────┼─────────┤ │ 5 │ + │ 1 │ 6.0 │ │ 6 │ - │ 2 │ 4.0 │ │ 7 │ * │ 3 │ 21.0 │ │ 8 │ / │ 4 │ 2.0 │ │ 9 │ bogus │ 5 │ NULL │ │ 10 │ NULL │ 6 │ NULL │ └───────┴────────┴───────┴─────────┘
Cast expression to indicated data type.
Similar to pandas.Series.astype
.
Name | Type | Description | Default |
---|---|---|---|
target_type | Any | Type to cast to. Anything accepted by ibis.dtype() |
required |
Name | Type | Description |
---|---|---|
Value | Casted expression |
>>> import ibis
>>> ibis.options.interactive = True
>>> x = ibis.examples.penguins.fetch()["bill_depth_mm"]
>>> x
┏━━━━━━━━━━━━━━━┓ ┃ bill_depth_mm ┃ ┡━━━━━━━━━━━━━━━┩ │ float64 │ ├───────────────┤ │ 18.7 │ │ 17.4 │ │ 18.0 │ │ NULL │ │ 19.3 │ │ 20.6 │ │ 17.8 │ │ 19.6 │ │ 18.1 │ │ 20.2 │ │ … │ └───────────────┘
python’s built-in types can be used
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ Cast(bill_depth_mm, int64) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ int64 │ ├────────────────────────────┤ │ 19 │ │ 17 │ │ 18 │ │ NULL │ │ 19 │ │ 21 │ │ 18 │ │ 20 │ │ 18 │ │ 20 │ │ … │ └────────────────────────────┘
or string names
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ Cast(bill_depth_mm, uint16) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ uint16 │ ├─────────────────────────────┤ │ 19 │ │ 17 │ │ 18 │ │ NULL │ │ 19 │ │ 21 │ │ 18 │ │ 20 │ │ 18 │ │ 20 │ │ … │ └─────────────────────────────┘
If you make an illegal cast, you won’t know until the backend actually executes it. Consider .try_cast()
.
--------------------------------------------------------------------------- ConversionException Traceback (most recent call last) File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/IPython/core/formatters.py:770, in PlainTextFormatter.__call__(self, obj) 763 stream = StringIO() 764 printer = pretty.RepresentationPrinter(stream, self.verbose, 765 self.max_width, self.newline, 766 max_seq_length=self.max_seq_length, 767 singleton_pprinters=self.singleton_printers, 768 type_pprinters=self.type_printers, 769 deferred_pprinters=self.deferred_printers) --> 770 printer.pretty(obj) 771 printer.flush() 772 return stream.getvalue() File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/IPython/lib/pretty.py:411, in RepresentationPrinter.pretty(self, obj) 400 return meth(obj, self, cycle) 401 if ( 402 cls is not object 403 # check if cls defines __repr__ (...) 409 and callable(_safe_getattr(cls, "__repr__", None)) 410 ): --> 411 return _repr_pprint(obj, self, cycle) 413 return _default_pprint(obj, self, cycle) 414 finally: File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/IPython/lib/pretty.py:786, in _repr_pprint(obj, p, cycle) 784 """A pprint that just redirects to the normal repr function.""" 785 # Find newlines and replace them with p.break_() --> 786 output = repr(obj) 787 lines = output.splitlines() 788 with p.group(): File ~/work/ibis/ibis/ibis/expr/types/core.py:55, in Expr.__repr__(self) 53 def __repr__(self) -> str: 54 if ibis.options.interactive: ---> 55 return capture_rich_renderable(self) 56 else: 57 return self._noninteractive_repr() File ~/work/ibis/ibis/ibis/expr/types/rich.py:48, in capture_rich_renderable(renderable) 46 console = Console(force_terminal=False) 47 with _with_rich_display_disabled(), console.capture() as capture: ---> 48 console.print(renderable) 49 return capture.get().rstrip() File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/rich/console.py:1719, in Console.print(self, sep, end, style, justify, overflow, no_wrap, emoji, markup, highlight, width, height, crop, soft_wrap, new_line_start, *objects) 1717 if style is None: 1718 for renderable in renderables: -> 1719 extend(render(renderable, render_options)) 1720 else: 1721 for renderable in renderables: File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/rich/console.py:1320, in Console.render(self, renderable, options) 1318 renderable = rich_cast(renderable) 1319 if hasattr(renderable, "__rich_console__") and not isclass(renderable): -> 1320 render_iterable = renderable.__rich_console__(self, _options) 1321 elif isinstance(renderable, str): 1322 text_renderable = self.render_str( 1323 renderable, highlight=_options.highlight, markup=_options.markup 1324 ) File ~/work/ibis/ibis/ibis/expr/types/core.py:76, in Expr.__rich_console__(self, console, options) 74 try: 75 if opts.interactive: ---> 76 rich_object = to_rich(self, console_width=console_width) 77 else: 78 rich_object = Text(self._noninteractive_repr()) File ~/work/ibis/ibis/ibis/expr/types/rich.py:66, in to_rich(expr, max_rows, max_columns, max_length, max_string, max_depth, console_width) 63 from ibis.expr.types._rich import to_rich_scalar, to_rich_table 65 if isinstance(expr, ir.Scalar): ---> 66 return to_rich_scalar( 67 expr, max_length=max_length, max_string=max_string, max_depth=max_depth 68 ) 69 else: 70 return to_rich_table( 71 expr, 72 max_rows=max_rows, (...) 77 console_width=console_width, 78 ) File ~/work/ibis/ibis/ibis/expr/types/_rich.py:270, in to_rich_scalar(expr, max_length, max_string, max_depth) 263 def to_rich_scalar( 264 expr: Scalar, 265 *, (...) 268 max_depth: int | None = None, 269 ) -> Panel: --> 270 value = expr.to_pyarrow().as_py() 272 if value is None: 273 formatted_value = Text.styled("NULL", style="dim") File ~/work/ibis/ibis/ibis/expr/types/generic.py:1424, in Scalar.to_pyarrow(self, params, limit, **kwargs) 1416 @experimental 1417 def to_pyarrow( 1418 self, (...) 1422 **kwargs: Any, 1423 ) -> pa.Scalar: -> 1424 return super().to_pyarrow(params=params, limit=limit, **kwargs) File ~/work/ibis/ibis/ibis/expr/types/core.py:605, in Expr.to_pyarrow(self, params, limit, **kwargs) 575 @experimental 576 def to_pyarrow( 577 self, (...) 581 **kwargs: Any, 582 ) -> pa.Table | pa.Array | pa.Scalar: 583 """Execute expression to a pyarrow object. 584 585 This method is eager and will execute the associated expression (...) 603 If the passed expression is a Scalar, a pyarrow scalar is returned. 604 """ --> 605 return self._find_backend(use_default=True).to_pyarrow( 606 self, params=params, limit=limit, **kwargs 607 ) File ~/work/ibis/ibis/ibis/backends/duckdb/__init__.py:1374, in Backend.to_pyarrow(self, expr, params, limit, **kwargs) 1361 def to_pyarrow( 1362 self, 1363 expr: ir.Expr, (...) 1368 **kwargs: Any, 1369 ) -> pa.Table: 1370 from ibis.backends.duckdb.converter import DuckDBPyArrowData 1372 table = self._to_duckdb_relation( 1373 expr, params=params, limit=limit, **kwargs -> 1374 ).arrow() 1375 return expr.__pyarrow_result__(table, data_mapper=DuckDBPyArrowData) ConversionException: Conversion Error: Could not convert string 'a string' to INT64 LINE 1: SELECT CAST('a string' AS BIGINT) AS "Cast('a string', int64)" ^
Return the first non-null value from args
.
Name | Type | Description | Default |
---|---|---|---|
args | Value | Arguments from which to choose the first non-null value | () |
Name | Type | Description |
---|---|---|
Value | Coalesced expression |
Aggregate this expression’s elements into an array.
This function is called array_agg
, list_agg
, or list
in other systems.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
An optional filter expression. If provided, only rows where where is True will be included in the aggregate. |
None |
order_by | Any | An ordering key (or keys) to use to order the rows before aggregating. If not provided, the order of the items in the result is undefined and backend specific. | None |
include_null | bool | Whether to include null values when performing this aggregation. Set to True to include nulls in the result. |
False |
distinct | bool | Whether to collect only distinct elements. | False |
Name | Type | Description |
---|---|---|
ArrayScalar |
An array of all the collected elements. |
Basic collect usage
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"key": list("aaabb"), "value": [1, 1, 2, 3, 5]})
>>> t
┏━━━━━━━━┳━━━━━━━┓ ┃ key ┃ value ┃ ┡━━━━━━━━╇━━━━━━━┩ │ string │ int64 │ ├────────┼───────┤ │ a │ 1 │ │ a │ 1 │ │ a │ 2 │ │ b │ 3 │ │ b │ 5 │ └────────┴───────┘
Collect all elements into an array scalar:
Collect only unique elements:
Collect elements in a specified order:
Collect elements per group, filtering out values <= 1:
Compile expr
to a SQL string (for SQL backends) or a LazyFrame (for the polars backend).
Name | Type | Description | Default |
---|---|---|---|
limit | int | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
None |
params | Mapping[ir .Value , Any] | None |
Mapping of scalar parameter expressions to value | None |
pretty | bool | In case of SQL backends, return a pretty formatted SQL query. | False |
Name | Type | Description |
---|---|---|
str | pl .LazyFrame |
A SQL string or a LazyFrame object, depending on the backend of self. |
>>> import ibis
>>> d = {"a": [1, 2, 3], "b": [4, 5, 6]}
>>> con = ibis.duckdb.connect()
>>> t = con.create_table("t", d)
>>> expr = t.mutate(c=t.a + t.b)
>>> expr.compile()
'SELECT "t0"."a", "t0"."b", "t0"."a" + "t0"."b" AS "c" FROM "memory"."main"."t" AS "t0"'
If you want to see the pretty formatted SQL query, set pretty
to True
.
'SELECT\n "t0"."a",\n "t0"."b",\n "t0"."a" + "t0"."b" AS "c"\nFROM "memory"."main"."t" AS "t0"'
If the expression does not have a backend, an error will be raised.
>>> t = ibis.memtable(d)
>>> expr = t.mutate(c=t.a + t.b)
>>> expr.compile() # quartodoc: +EXPECTED_FAILURE
--------------------------------------------------------------------------- IbisError Traceback (most recent call last) Cell In[53], line 3 1 t = ibis.memtable(d) 2 expr = t.mutate(c=t.a + t.b) ----> 3 expr.compile() # quartodoc: +EXPECTED_FAILURE File ~/work/ibis/ibis/ibis/expr/types/core.py:532, in Expr.compile(self, limit, params, pretty) 479 def compile( 480 self, 481 *, (...) 484 pretty: bool = False, 485 ) -> str | pl.LazyFrame: 486 r"""Compile `expr` to a SQL string (for SQL backends) or a LazyFrame (for the polars backend). 487 488 Parameters (...) 530 [`Table.to_sql()`](./expression-tables.qmd#ibis.expr.types.relations.Table.to_sql) 531 """ --> 532 return self._find_backend().compile( 533 self, limit=limit, params=params, pretty=pretty 534 ) File ~/work/ibis/ibis/ibis/expr/types/core.py:338, in Expr._find_backend(self, use_default) 336 default = _default_backend() if use_default else None 337 if default is None: --> 338 raise IbisError( 339 "Expression depends on no backends, and found no default" 340 ) 341 return default 343 if len(backends) > 1: IbisError: Expression depends on no backends, and found no default
Value.compile()
Table.compile()
Value.to_sql()
Table.to_sql()
Sort an expression in descending order.
Name | Type | Description | Default |
---|---|---|---|
nulls_first | bool | Whether to sort NULL values first. |
False |
Name | Type | Description |
---|---|---|
Value | Sorted expression |
Return whether this expression is structurally equivalent to other
.
If you want to produce an equality expression, use ==
syntax.
Name | Type | Description | Default |
---|---|---|---|
other | Another expression | required |
Execute an expression against its backend if one exists.
Name | Type | Description | Default |
---|---|---|---|
limit | int | str | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
'default' |
params | Mapping[ir .Value , Any] | None |
Mapping of scalar parameter expressions to value | None |
kwargs | Any | Keyword arguments | {} |
species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year | |
---|---|---|---|---|---|---|---|---|
0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | male | 2007 |
1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | female | 2007 |
2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | female | 2007 |
3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | None | 2007 |
4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | female | 2007 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
339 | Chinstrap | Dream | 55.8 | 19.8 | 207.0 | 4000.0 | male | 2009 |
340 | Chinstrap | Dream | 43.5 | 18.1 | 202.0 | 3400.0 | female | 2009 |
341 | Chinstrap | Dream | 49.6 | 18.2 | 193.0 | 3775.0 | male | 2009 |
342 | Chinstrap | Dream | 50.8 | 19.0 | 210.0 | 4100.0 | male | 2009 |
343 | Chinstrap | Dream | 50.2 | 18.7 | 198.0 | 3775.0 | female | 2009 |
344 rows × 8 columns
Scalar parameters can be supplied dynamically during execution.
>>> species = ibis.param("string")
>>> expr = t.filter(t.species == species).order_by(t.bill_length_mm)
>>> expr.execute(limit=3, params={species: "Gentoo"})
species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year | |
---|---|---|---|---|---|---|---|---|
0 | Gentoo | Biscoe | 40.9 | 13.7 | 214 | 4650 | female | 2007 |
1 | Gentoo | Biscoe | 41.7 | 14.7 | 210 | 4700 | female | 2009 |
2 | Gentoo | Biscoe | 42.0 | 13.5 | 210 | 4150 | female | 2007 |
Replace NULL
s with the given value. Does NOT affect NaN
and inf
values.
This only replaces genuine NULL
values, it does NOT affect NaN
and inf
values for floating point types.
Name | Type | Description | Default |
---|---|---|---|
fill_value | Scalar | Value with which to replace NULL values in self |
required |
Value.coalesce()
ibis.coalesce()
Value.isnull()
FloatingValue.isnan()
FloatingValue.isinf()
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"f": [None, "-inf", "3.0", "inf", "nan"]})
>>> t = t.mutate(f=ibis._.f.cast(float))
>>> t = t.mutate(filled=t.f.fill_null(99))
>>> t
┏━━━━━━━━━┳━━━━━━━━━┓ ┃ f ┃ filled ┃ ┡━━━━━━━━━╇━━━━━━━━━┩ │ float64 │ float64 │ ├─────────┼─────────┤ │ NULL │ 99.0 │ │ -inf │ -inf │ │ 3.0 │ 3.0 │ │ inf │ inf │ │ nan │ nan │ └─────────┴─────────┘
If you want to fill all NaN
and inf
values as well, use something like the following:
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓ ┃ f ┃ filled ┃ filled2 ┃ ┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩ │ float64 │ float64 │ float64 │ ├─────────┼─────────┼─────────┤ │ NULL │ 99.0 │ 99.0 │ │ -inf │ -inf │ 99.0 │ │ 3.0 │ 3.0 │ 3.0 │ │ inf │ inf │ 99.0 │ │ nan │ nan │ 99.0 │ └─────────┴─────────┴─────────┘
Name | Type | Description |
---|---|---|
Value | self filled with fill_value where it is NULL |
DEPRECATED: use fill_null
instead, which acts exactly the same.
Get the current Ibis backend of the expression.
Name | Type | Description |
---|---|---|
BaseBackend |
The Ibis backend. |
Return the name of this expression.
Concatenate values using the indicated separator to produce a string.
Name | Type | Description | Default |
---|---|---|---|
sep | str | The separator to use to join strings. | ',' |
where | ir .BooleanValue | None |
An optional filter expression. If provided, only rows where where is True will be included in the aggregate. |
None |
order_by | Any | An ordering key (or keys) to use to order the rows before aggregating. If not provided, the order of the items in the result is undefined and backend specific. | None |
Name | Type | Description |
---|---|---|
StringScalar |
Concatenated string expression |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch().limit(5)
>>> t[["bill_length_mm", "bill_depth_mm"]]
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ bill_length_mm ┃ bill_depth_mm ┃ ┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ float64 │ float64 │ ├────────────────┼───────────────┤ │ 39.1 │ 18.7 │ │ 39.5 │ 17.4 │ │ 40.3 │ 18.0 │ │ NULL │ NULL │ │ 36.7 │ 19.3 │ └────────────────┴───────────────┘
┌─────────────────────┐
│ 39.1,39.5,40.3,36.7 │
└─────────────────────┘
┌────────────────────────┐
│ 39.1: 39.5: 40.3: 36.7 │
└────────────────────────┘
Compute an integer hash value.
Name | Type | Description |
---|---|---|
IntegerValue | The hash value of self |
Return whether this expression is identical to other.
Corresponds to IS NOT DISTINCT FROM
in SQL.
Name | Type | Description | Default |
---|---|---|---|
other | Value | Expression to compare to | required |
Name | Type | Description |
---|---|---|
BooleanValue | Whether this expression is not distinct from other |
Check whether this expression is in values
.
NULL
values in the input are propagated in the output. If the values
argument contains any NULL
values, then ibis follows the SQL behavior of returning NULL
(not False) when self
is not present. See examples below for details.
Name | Type | Description | Default |
---|---|---|---|
values | ir .ArrayValue | ir .Column | Iterable[Value] |
Values or expression to check for membership. | required |
Name | Type | Description |
---|---|---|
BooleanValue | True if self is contained in values , False otherwise. |
Checking for values in literals:
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_in_12 ┃ a_in_12None ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ ├───────┼───────┼─────────┼─────────────┤ │ 1 │ 1 │ True │ True │ │ 2 │ 2 │ True │ True │ │ 3 │ 9 │ False │ NULL │ │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴─────────┴─────────────┘
Checking for values in columns of the same table:
>>> t.mutate(
... a_in_b=t.a.isin(t.b),
... a_in_b_no_null=t.a.isin(t.b.fill_null(0)),
... a_in_b_plus_1=t.a.isin(t.b + 1),
... )
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_in_b ┃ a_in_b_no_null ┃ a_in_b_plus_1 ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ boolean │ ├───────┼───────┼─────────┼────────────────┼───────────────┤ │ 1 │ 1 │ True │ True │ NULL │ │ 2 │ 2 │ True │ True │ True │ │ 3 │ 9 │ NULL │ False │ True │ │ NULL │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴─────────┴────────────────┴───────────────┘
Checking for values in a column from a different table:
>>> t2 = ibis.memtable({"x": [1, 2, 99], "y": [1, 2, None]})
>>> t.mutate(
... a_in_x=t.a.isin(t2.x),
... a_in_y=t.a.isin(t2.y),
... a_in_y_plus_1=t.a.isin(t2.y + 1),
... )
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_in_x ┃ a_in_y ┃ a_in_y_plus_1 ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ boolean │ ├───────┼───────┼─────────┼─────────┼───────────────┤ │ 1 │ 1 │ True │ True │ NULL │ │ 2 │ 2 │ True │ True │ True │ │ 3 │ 9 │ False │ NULL │ True │ │ NULL │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴─────────┴─────────┴───────────────┘
Whether this expression is NULL
. Does NOT detect NaN
and inf
values.
For FloatingValue types, use FloatingValue.isnan()
and FloatingValue.isinf()
to detect NaN
and inf
values.
Value.fill_null()
FloatingValue.isnan()
FloatingValue.isinf()
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"f": [None, "-inf", "3.0", "inf", "nan"]})
>>> t = t.mutate(f=ibis._.f.cast(float))
>>> t.mutate(
... isnull=t.f.isnull(),
... isnan=t.f.isnan(),
... isinf=t.f.isinf(),
... )
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓ ┃ f ┃ isnull ┃ isnan ┃ isinf ┃ ┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩ │ float64 │ boolean │ boolean │ boolean │ ├─────────┼─────────┼─────────┼─────────┤ │ NULL │ True │ NULL │ NULL │ │ -inf │ False │ False │ True │ │ 3.0 │ False │ False │ False │ │ inf │ False │ False │ True │ │ nan │ False │ True │ False │ └─────────┴─────────┴─────────┴─────────┘
Rename an expression to name
.
Name | Type | Description | Default |
---|---|---|---|
name | str | The new name of the expression | required |
Name | Type | Description |
---|---|---|
Value | self with name name |
Check whether this expression is not in values
.
Opposite of Value.isin()
.
NULL
values in the input are propagated in the output. If the values
argument contains any NULL
values, then ibis follows the SQL behavior of returning NULL
(not False) when self
is present. See examples below for details.
Name | Type | Description | Default |
---|---|---|---|
values | ir .ArrayValue | ir .Column | Iterable[Value] |
Values or expression to check for lack of membership. | required |
Name | Type | Description |
---|---|---|
BooleanValue | True if self is not in values , False otherwise. |
Checking for values in literals:
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_notin_12 ┃ a_notin_12None ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ ├───────┼───────┼────────────┼────────────────┤ │ 1 │ 1 │ False │ False │ │ 2 │ 2 │ False │ False │ │ 3 │ 9 │ True │ NULL │ │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴────────────┴────────────────┘
Checking for values in columns of the same table:
>>> t.mutate(
... a_notin_b=t.a.notin(t.b),
... a_notin_b_no_null=t.a.notin(t.b.fill_null(0)),
... a_notin_b_plus_1=t.a.notin(t.b + 1),
... )
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_notin_b ┃ a_notin_b_no_null ┃ a_notin_b_plus_1 ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ boolean │ ├───────┼───────┼───────────┼───────────────────┼──────────────────┤ │ 1 │ 1 │ False │ False │ NULL │ │ 2 │ 2 │ False │ False │ False │ │ 3 │ 9 │ NULL │ True │ False │ │ NULL │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴───────────┴───────────────────┴──────────────────┘
Checking for values in a column from a different table:
>>> t2 = ibis.memtable({"x": [1, 2, 99], "y": [1, 2, None]})
>>> t.mutate(
... a_notin_x=t.a.notin(t2.x),
... a_notin_y=t.a.notin(t2.y),
... a_notin_y_plus_1=t.a.notin(t2.y + 1),
... )
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_notin_x ┃ a_notin_y ┃ a_notin_y_plus_1 ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ boolean │ ├───────┼───────┼───────────┼───────────┼──────────────────┤ │ 1 │ 1 │ False │ False │ NULL │ │ 2 │ 2 │ False │ False │ False │ │ 3 │ 9 │ True │ NULL │ False │ │ NULL │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴───────────┴───────────┴──────────────────┘
Return whether this expression is not NULL.
Set values to null if they equal the values null_if_expr
.
Commonly used to avoid divide-by-zero problems by replacing zero with NULL
in the divisor.
Equivalent to (self == null_if_expr).ifelse(ibis.null(), self)
.
Name | Type | Description | Default |
---|---|---|---|
null_if_expr | Value | Expression indicating what values should be NULL | required |
Name | Type | Description |
---|---|---|
Value | Value expression |
Construct a window expression.
Name | Type | Description | Default |
---|---|---|---|
window | Window specification | None |
|
rows | Whether to use the ROWS window clause |
None |
|
range | Whether to use the RANGE window clause |
None |
|
group_by | Grouping key | None |
|
order_by | Ordering key | None |
Name | Type | Description |
---|---|---|
Value | A window function expression |
Compose f
with self
.
Name | Type | Description | Default |
---|---|---|---|
f | If the expression needs to be passed as anything other than the first argument to the function, pass a tuple with the argument name. For example, (f, ‘data’) if the function f expects a ‘data’ keyword | required | |
args | Any | Positional arguments to f |
() |
kwargs | Any | Keyword arguments to f |
{} |
Name | Type | Description |
---|---|---|
Expr |
Result type of passed function |
Replace values given in values
with replacement
.
This is similar to the pandas replace
method.
Name | Type | Description | Default |
---|---|---|---|
value | Value | dict | Expression or dict. | required |
replacement | Value | None | If an expression is passed to value, this must be passed. | None |
else_ | Value | None | If an original value does not match value , then else_ is used. The default of None means leave the original value unchanged. |
None |
Name | Type | Description |
---|---|---|
Value | Replaced values |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.island.value_counts().order_by("island")
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━┓ ┃ island ┃ island_count ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━┩ │ string │ int64 │ ├───────────┼──────────────┤ │ Biscoe │ 168 │ │ Dream │ 124 │ │ Torgersen │ 52 │ └───────────┴──────────────┘
>>> t.island.substitute({"Torgersen": "torg", "Biscoe": "bisc"}).name(
... "island"
... ).value_counts().order_by("island")
┏━━━━━━━━┳━━━━━━━━━━━━━━┓ ┃ island ┃ island_count ┃ ┡━━━━━━━━╇━━━━━━━━━━━━━━┩ │ string │ int64 │ ├────────┼──────────────┤ │ Dream │ 124 │ │ bisc │ 168 │ │ torg │ 52 │ └────────┴──────────────┘
Write the results of executing the given expression to a CSV file.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.csv.CSVWriter.html for details.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | The data target. A string or Path where the CSV file will be written. | required |
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
**kwargs | Any | Additional keyword arguments passed to pyarrow.csv.CSVWriter | {} |
Write the results of executing the given expression to a Delta Lake table.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | The data target. A string or Path to the Delta Lake table directory. | required |
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
**kwargs | Any | Additional keyword arguments passed to deltalake.writer.write_deltalake method | {} |
Write the results of expr
to a json file of [{column -> value}, …] objects.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | The data target. A string or Path where the JSON file will be written. | required |
kwargs | Any | Additional, backend-specific keyword arguments. | {} |
Convert a table expression to a pandas DataFrame.
Name | Type | Description | Default |
---|---|---|---|
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
limit | int | str | None | An integer to effect a specific row limit. A value of None means no limit. The default is in ibis/config.py . |
None |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
DataFrame |
The result of executing the expression as a pandas DataFrame |
Execute expression and return an iterator of pandas DataFrames.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
limit | int | str | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
None |
params | Mapping[ir .Value , Any] | None |
Mapping of scalar parameter expressions to value. | None |
chunk_size | int | Maximum number of rows in each returned DataFrame . |
1000000 |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
Iterator[pd.DataFrame] |
Write the results of executing the given expression to a parquet file.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetWriter.html for details.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | A string or Path where the Parquet file will be written. | required |
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
**kwargs | Any | Additional keyword arguments passed to pyarrow.parquet.ParquetWriter | {} |
Write out an expression to a single parquet file.
Partition on a single column.
Partition on multiple columns.
Write the results of executing the given expression to a parquet file in a directory.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.dataset.write_dataset.html for details.
Name | Type | Description | Default |
---|---|---|---|
directory | str | Path | The data target. A string or Path to the directory where the parquet file will be written. | required |
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
**kwargs | Any | Additional keyword arguments passed to pyarrow.dataset.write_dataset | {} |
Execute expression and return results as a polars dataframe.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
limit | int | str | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
None |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
DataFrame |
A polars dataframe holding the results of the executed expression. |
Execute expression to a pyarrow object.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
limit | int | str | None | An integer to effect a specific row limit. A value of None means no limit. The default is in ibis/config.py . |
None |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
result |
If the passed expression is a Table, a pyarrow table is returned. If the passed expression is a Column, a pyarrow array is returned. If the passed expression is a Scalar, a pyarrow scalar is returned. |
Execute expression and return a RecordBatchReader.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
limit | int | str | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
None |
params | Mapping[ir .Value , Any] | None |
Mapping of scalar parameter expressions to value. | None |
chunk_size | int | Maximum number of rows in each returned record batch. | 1000000 |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
results |
RecordBatchReader |
Compile to a formatted SQL string.
Name | Type | Description | Default |
---|---|---|---|
dialect | str | None | SQL dialect to use for compilation. Uses the dialect bound to self if not specified, or the default dialect if no dialect is bound. | None |
pretty | bool | Whether to use pretty formatting. | True |
kwargs | Scalar parameters | {} |
Name | Type | Description |
---|---|---|
str | Formatted SQL string |
You can also specify the SQL dialect to use for compilation:
Value.to_sql()
Table.to_sql()
ibis.to_sql()
Value.compile()
Table.compile()
Execute an expression and return results as a dictionary of torch tensors.
Name | Type | Description | Default |
---|---|---|---|
params | Mapping[ir .Scalar , Any] | None |
Parameters to substitute into the expression. | None |
limit | int | str | None | An integer to effect a specific row limit. A value of None means no limit. |
None |
kwargs | Any | Keyword arguments passed into the backend’s to_torch implementation. |
{} |
Name | Type | Description |
---|---|---|
dict[str, torch .Tensor ] |
A dictionary of torch tensors, keyed by column name. |
Write a table to an Excel file.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | Excel output path. | required |
sheet | str | The name of the sheet to write to, eg ‘Sheet3’. | 'Sheet1' |
header | bool | Whether to include the column names as the first row. | False |
params | Mapping[ir .Scalar , Any] | None |
Additional Ibis expression parameters to pass to the backend’s write function. | None |
kwargs | Any | Additional arguments passed to the backend’s write function. | {} |
Requires DuckDB >= 1.2.0.
Try cast expression to indicated data type.
If the cast fails for a row, the value is returned as null or NaN depending on target_type and backend behavior.
Name | Type | Description | Default |
---|---|---|---|
target_type | Any | Type to try cast to. Anything accepted by ibis.dtype() |
required |
Name | Type | Description |
---|---|---|
Value | Casted expression |
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"numbers": [1, 2, 3, 4], "strings": ["1.0", "2", "hello", "world"]})
>>> t
┏━━━━━━━━━┳━━━━━━━━━┓ ┃ numbers ┃ strings ┃ ┡━━━━━━━━━╇━━━━━━━━━┩ │ int64 │ string │ ├─────────┼─────────┤ │ 1 │ 1.0 │ │ 2 │ 2 │ │ 3 │ hello │ │ 4 │ world │ └─────────┴─────────┘
>>> t = t.mutate(numbers_to_strings=_.numbers.try_cast("string"))
>>> t = t.mutate(strings_to_numbers=_.strings.try_cast("int"))
>>> t
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓ ┃ numbers ┃ strings ┃ numbers_to_strings ┃ strings_to_numbers ┃ ┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩ │ int64 │ string │ string │ int64 │ ├─────────┼─────────┼────────────────────┼────────────────────┤ │ 1 │ 1.0 │ 1 │ 1 │ │ 2 │ 2 │ 2 │ 2 │ │ 3 │ hello │ 3 │ NULL │ │ 4 │ world │ 4 │ NULL │ └─────────┴─────────┴────────────────────┴────────────────────┘
Return the DataType of self
.
Return the string name of the datatype of self.
The values of the returned strings are necessarily backend dependent. e.g. duckdb may say “DOUBLE”, while sqlite may say “real”.
Name | Type | Description |
---|---|---|
StringValue | A string indicating the type of the value |
>>> import ibis
>>> ibis.options.interactive = True
>>> vals = ibis.examples.penguins.fetch().head(5).bill_length_mm
>>> vals
┏━━━━━━━━━━━━━━━━┓ ┃ bill_length_mm ┃ ┡━━━━━━━━━━━━━━━━┩ │ float64 │ ├────────────────┤ │ 39.1 │ │ 39.5 │ │ 40.3 │ │ NULL │ │ 36.7 │ └────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ TypeOf(bill_length_mm) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━┩ │ string │ ├────────────────────────┤ │ DOUBLE │ │ DOUBLE │ │ DOUBLE │ │ DOUBLE │ │ DOUBLE │ └────────────────────────┘
Different backends have different names for their native types
Return an expression built on UnboundTable
instead of backend-specific objects.
>>> import ibis
>>> import pandas as pd
>>> duckdb_con = ibis.duckdb.connect()
>>> polars_con = ibis.polars.connect()
>>> for backend in (duckdb_con, polars_con):
... t = backend.create_table("t", pd.DataFrame({"a": [1, 2, 3]}))
>>> bound_table = duckdb_con.table("t")
>>> bound_table.get_backend().name
'duckdb'
visualize(
format='svg',
*,
label_edges=False,
verbose=False,
node_attr=None,
node_attr_getter=None,
edge_attr=None,
edge_attr_getter=None,
)
Visualize an expression as a GraphViz graph in the browser.
Name | Type | Description | Default |
---|---|---|---|
format | str | Image output format. These are specified by the graphviz Python library. |
'svg' |
label_edges | bool | Show operation input names as edge labels | False |
verbose | bool | Print the graphviz DOT code to stderr if True |
False |
node_attr | Mapping[str, str] | None | Mapping of (attribute, value) pairs set for all nodes. Options are specified by the graphviz Python library. |
None |
node_attr_getter | NodeAttributeGetter | None |
Callback taking a node and returning a mapping of (attribute, value) pairs for that node. Options are specified by the graphviz Python library. |
None |
edge_attr | Mapping[str, str] | None | Mapping of (attribute, value) pairs set for all edges. Options are specified by the graphviz Python library. |
None |
edge_attr_getter | EdgeAttributeGetter | None |
Callback taking two adjacent nodes and returning a mapping of (attribute, value) pairs for the edge between those nodes. Options are specified by the graphviz Python library. |
None |
Open the visualization of an expression in default browser:
>>> import ibis
>>> import ibis.expr.operations as ops
>>> left = ibis.table(dict(a="int64", b="string"), name="left")
>>> right = ibis.table(dict(b="string", c="int64", d="string"), name="right")
>>> expr = left.inner_join(right, "b").select(left.a, b=right.c, c=right.d)
>>> expr.visualize(
... format="svg",
... label_edges=True,
... node_attr={"fontname": "Roboto Mono", "fontsize": "10"},
... node_attr_getter=lambda node: isinstance(node, ops.Field) and {"shape": "oval"},
... edge_attr={"fontsize": "8"},
... edge_attr_getter=lambda u, v: isinstance(u, ops.Field) and {"color": "red"},
... ) # quartodoc: +SKIP
Name | Type | Description |
---|---|---|
ImportError | If graphviz is not installed. |
Name | Description |
---|---|
approx_median | Return an approximate of the median of self . |
approx_nunique | Return the approximate number of distinct elements in self . |
arbitrary | Select an arbitrary value in a column. |
argmax | Return the value of self that maximizes key . |
argmin | Return the value of self that minimizes key . |
as_scalar | Inform ibis that the expression should be treated as a scalar. |
as_table | Promote the expression to a Table. |
asc | Sort an expression in ascending order. |
between | Check if this expression is between lower and upper , inclusive. |
case | DEPRECATED: use value.cases() or ibis.cases() instead. |
cases | Create a multi-branch if-else expression. |
cast | Cast expression to indicated data type. |
coalesce | Return the first non-null value from args . |
collect | Aggregate this expression’s elements into an array. |
compile | Compile expr to a SQL string (for SQL backends) or a LazyFrame (for the polars backend). |
count | Compute the number of rows in an expression. |
cume_dist | Return the cumulative distribution over a window. |
cummax | Return the cumulative max over a window. |
cummin | Return the cumulative min over a window. |
dense_rank | Position of first element within each group of equal values. |
desc | Sort an expression in descending order. |
equals | Return whether this expression is structurally equivalent to other . |
execute | Execute an expression against its backend if one exists. |
fill_null | Replace NULL s with the given value. Does NOT affect NaN and inf values. |
fillna | DEPRECATED: use fill_null instead, which acts exactly the same. |
first | Return the first value of a column. |
get_backend | Get the current Ibis backend of the expression. |
get_name | Return the name of this expression. |
group_concat | Concatenate values using the indicated separator to produce a string. |
hash | Compute an integer hash value. |
identical_to | Return whether this expression is identical to other. |
isin | Check whether this expression is in values . |
isnull | Whether this expression is NULL . Does NOT detect NaN and inf values. |
lag | Return the row located at offset rows before the current row. |
last | Return the last value of a column. |
lead | Return the row located at offset rows after the current row. |
max | Return the maximum of a column. |
median | Return the median of the column. |
min | Return the minimum of a column. |
mode | Return the mode of a column. |
name | Rename an expression to name . |
notin | Check whether this expression is not in values . |
notnull | Return whether this expression is not NULL. |
nth | Return the n th value (0-indexed) over a window. |
ntile | Return the integer number of a partitioning of the column values. |
nullif | Set values to null if they equal the values null_if_expr . |
nunique | Compute the number of distinct rows in an expression. |
over | Construct a window expression. |
percent_rank | Return the relative rank of the values in the column. |
pipe | Compose f with self . |
preview | Print a subset as a single-column Rich Table. |
quantile | Return value at the given quantile. |
rank | Compute position of first element within each equal-value group in sorted order. |
substitute | Replace values given in values with replacement . |
to_csv | Write the results of executing the given expression to a CSV file. |
to_delta | Write the results of executing the given expression to a Delta Lake table. |
to_json | Write the results of expr to a json file of [{column -> value}, …] objects. |
to_list | Convert a column expression to a list. |
to_pandas | Convert a table expression to a pandas DataFrame. |
to_pandas_batches | Execute expression and return an iterator of pandas DataFrames. |
to_parquet | Write the results of executing the given expression to a parquet file. |
to_parquet_dir | Write the results of executing the given expression to a parquet file in a directory. |
to_polars | Execute expression and return results as a polars dataframe. |
to_pyarrow_batches | Execute expression and return a RecordBatchReader. |
to_sql | Compile to a formatted SQL string. |
to_torch | Execute an expression and return results as a dictionary of torch tensors. |
to_xlsx | Write a table to an Excel file. |
topk | Computes a Table of the top values by a metric (defaults to count). |
try_cast | Try cast expression to indicated data type. |
type | Return the DataType of self . |
typeof | Return the string name of the datatype of self. |
unbind | Return an expression built on UnboundTable instead of backend-specific objects. |
value_counts | Compute a frequency table. |
visualize | Visualize an expression as a GraphViz graph in the browser. |
Return an approximate of the median of self
.
Whether the result is an approximation depends on the backend.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
Filter in values when where is True |
None |
Name | Type | Description |
---|---|---|
Scalar | An approximation of the median of self |
Return the approximate number of distinct elements in self
.
Whether the result is an approximation depends on the backend.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
Filter in values when where is True |
None |
Name | Type | Description |
---|---|---|
Scalar | An approximate count of the distinct elements of self |
Select an arbitrary value in a column.
Returns an arbitrary (nondeterministic, backend-specific) value from the column. The value will be non-NULL, except if the column is empty or all values are NULL.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
A filter expression | None |
Name | Type | Description |
---|---|---|
Scalar | An expression |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 2, 2], "b": list("aaa"), "c": [4.0, 4.1, 4.2]})
>>> t
┏━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓ ┃ a ┃ b ┃ c ┃ ┡━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩ │ int64 │ string │ float64 │ ├───────┼────────┼─────────┤ │ 1 │ a │ 4.0 │ │ 2 │ a │ 4.1 │ │ 2 │ a │ 4.2 │ └───────┴────────┴─────────┘
Return the value of self
that maximizes key
.
If more than one value maximizes key
, the returned value is backend specific. The result may be NULL
.
Name | Type | Description | Default |
---|---|---|---|
key | ir .Value |
Key to use for max computation. |
required |
where | ir .BooleanValue | None |
Keep values when where is True |
None |
Name | Type | Description |
---|---|---|
Scalar | The value of self that maximizes key |
Return the value of self
that minimizes key
.
If more than one value minimizes key
, the returned value is backend specific. The result may be NULL
.
Name | Type | Description | Default |
---|---|---|---|
key | ir .Value |
Key to use for min computation. |
required |
where | ir .BooleanValue | None |
Keep values when where is True |
None |
Name | Type | Description |
---|---|---|
Scalar | The value of self that minimizes key |
Inform ibis that the expression should be treated as a scalar.
Creates a scalar subquery from the column expression. Since ibis cannot be sure that the column expression contains only one value, the column expression is wrapped in a scalar subquery and treated as a scalar.
Note that the execution of the scalar subquery will fail if the column expression contains more than one value.
Name | Type | Description |
---|---|---|
Scalar | A scalar subquery |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> heavy_gentoo = t.filter(t.species == "Gentoo", t.body_mass_g > 6200)
>>> from_that_island = t.filter(t.island == heavy_gentoo.island.as_scalar())
>>> from_that_island.species.value_counts().order_by("species")
┏━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ species ┃ species_count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ string │ int64 │ ├─────────┼───────────────┤ │ Adelie │ 44 │ │ Gentoo │ 124 │ └─────────┴───────────────┘
Promote the expression to a Table.
Name | Type | Description |
---|---|---|
Table |
A table expression |
Sort an expression in ascending order.
Name | Type | Description | Default |
---|---|---|---|
nulls_first | bool | Whether to sort NULL values first |
False |
Name | Type | Description |
---|---|---|
Value | Sorted expression |
Check if this expression is between lower
and upper
, inclusive.
Name | Type | Description | Default |
---|---|---|---|
lower | Value | Lower bound, inclusive | required |
upper | Value | Upper bound, inclusive | required |
Name | Type | Description |
---|---|---|
BooleanValue | Expression indicating membership in the provided range |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch().limit(5)
>>> t.bill_length_mm.between(35, 38)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ Between(bill_length_mm, 35, 38) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ boolean │ ├─────────────────────────────────┤ │ False │ │ False │ │ False │ │ NULL │ │ True │ └─────────────────────────────────┘
DEPRECATED: use value.cases()
or ibis.cases()
instead.
Create a multi-branch if-else expression.
Equivalent to a SQL CASE
statement.
Name | Type | Description | Default |
---|---|---|---|
branch | tuple[Value, Value] | First (condition , result ) pair. Required. |
required |
branches | tuple[Value, Value] | Additional (condition , result ) pairs. We look through the test values in order and return the result corresponding to the first test value that matches self . If none match, we return else_ . |
() |
else_ | Value | None | Value to return if none of the case conditions evaluate to True . Defaults to NULL . |
None |
Name | Type | Description |
---|---|---|
Value | A value expression |
Note that we never hit the None
case, because x = NULL
is always NULL
, which is not truthy. If you want to replace NULL
s, you should use .fill_null(some_value)
prior to cases()
.
>>> t.mutate(
... result=(
... t.symbol.cases(
... ("+", t.left + t.right),
... ("-", t.left - t.right),
... ("*", t.left * t.right),
... ("/", t.left / t.right),
... (None, -999),
... )
... )
... )
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━┓ ┃ left ┃ symbol ┃ right ┃ result ┃ ┡━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━┩ │ int64 │ string │ int64 │ float64 │ ├───────┼────────┼───────┼─────────┤ │ 5 │ + │ 1 │ 6.0 │ │ 6 │ - │ 2 │ 4.0 │ │ 7 │ * │ 3 │ 21.0 │ │ 8 │ / │ 4 │ 2.0 │ │ 9 │ bogus │ 5 │ NULL │ │ 10 │ NULL │ 6 │ NULL │ └───────┴────────┴───────┴─────────┘
Cast expression to indicated data type.
Similar to pandas.Series.astype
.
Name | Type | Description | Default |
---|---|---|---|
target_type | Any | Type to cast to. Anything accepted by ibis.dtype() |
required |
Name | Type | Description |
---|---|---|
Value | Casted expression |
>>> import ibis
>>> ibis.options.interactive = True
>>> x = ibis.examples.penguins.fetch()["bill_depth_mm"]
>>> x
┏━━━━━━━━━━━━━━━┓ ┃ bill_depth_mm ┃ ┡━━━━━━━━━━━━━━━┩ │ float64 │ ├───────────────┤ │ 18.7 │ │ 17.4 │ │ 18.0 │ │ NULL │ │ 19.3 │ │ 20.6 │ │ 17.8 │ │ 19.6 │ │ 18.1 │ │ 20.2 │ │ … │ └───────────────┘
python’s built-in types can be used
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ Cast(bill_depth_mm, int64) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ int64 │ ├────────────────────────────┤ │ 19 │ │ 17 │ │ 18 │ │ NULL │ │ 19 │ │ 21 │ │ 18 │ │ 20 │ │ 18 │ │ 20 │ │ … │ └────────────────────────────┘
or string names
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ Cast(bill_depth_mm, uint16) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ uint16 │ ├─────────────────────────────┤ │ 19 │ │ 17 │ │ 18 │ │ NULL │ │ 19 │ │ 21 │ │ 18 │ │ 20 │ │ 18 │ │ 20 │ │ … │ └─────────────────────────────┘
If you make an illegal cast, you won’t know until the backend actually executes it. Consider .try_cast()
.
--------------------------------------------------------------------------- ConversionException Traceback (most recent call last) File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/IPython/core/formatters.py:770, in PlainTextFormatter.__call__(self, obj) 763 stream = StringIO() 764 printer = pretty.RepresentationPrinter(stream, self.verbose, 765 self.max_width, self.newline, 766 max_seq_length=self.max_seq_length, 767 singleton_pprinters=self.singleton_printers, 768 type_pprinters=self.type_printers, 769 deferred_pprinters=self.deferred_printers) --> 770 printer.pretty(obj) 771 printer.flush() 772 return stream.getvalue() File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/IPython/lib/pretty.py:411, in RepresentationPrinter.pretty(self, obj) 400 return meth(obj, self, cycle) 401 if ( 402 cls is not object 403 # check if cls defines __repr__ (...) 409 and callable(_safe_getattr(cls, "__repr__", None)) 410 ): --> 411 return _repr_pprint(obj, self, cycle) 413 return _default_pprint(obj, self, cycle) 414 finally: File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/IPython/lib/pretty.py:786, in _repr_pprint(obj, p, cycle) 784 """A pprint that just redirects to the normal repr function.""" 785 # Find newlines and replace them with p.break_() --> 786 output = repr(obj) 787 lines = output.splitlines() 788 with p.group(): File ~/work/ibis/ibis/ibis/expr/types/core.py:55, in Expr.__repr__(self) 53 def __repr__(self) -> str: 54 if ibis.options.interactive: ---> 55 return capture_rich_renderable(self) 56 else: 57 return self._noninteractive_repr() File ~/work/ibis/ibis/ibis/expr/types/rich.py:48, in capture_rich_renderable(renderable) 46 console = Console(force_terminal=False) 47 with _with_rich_display_disabled(), console.capture() as capture: ---> 48 console.print(renderable) 49 return capture.get().rstrip() File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/rich/console.py:1719, in Console.print(self, sep, end, style, justify, overflow, no_wrap, emoji, markup, highlight, width, height, crop, soft_wrap, new_line_start, *objects) 1717 if style is None: 1718 for renderable in renderables: -> 1719 extend(render(renderable, render_options)) 1720 else: 1721 for renderable in renderables: File /nix/store/6mpfjl7vrnlya1738m8isq56zvnx3xsd-ibis-3.12/lib/python3.12/site-packages/rich/console.py:1320, in Console.render(self, renderable, options) 1318 renderable = rich_cast(renderable) 1319 if hasattr(renderable, "__rich_console__") and not isclass(renderable): -> 1320 render_iterable = renderable.__rich_console__(self, _options) 1321 elif isinstance(renderable, str): 1322 text_renderable = self.render_str( 1323 renderable, highlight=_options.highlight, markup=_options.markup 1324 ) File ~/work/ibis/ibis/ibis/expr/types/core.py:76, in Expr.__rich_console__(self, console, options) 74 try: 75 if opts.interactive: ---> 76 rich_object = to_rich(self, console_width=console_width) 77 else: 78 rich_object = Text(self._noninteractive_repr()) File ~/work/ibis/ibis/ibis/expr/types/rich.py:66, in to_rich(expr, max_rows, max_columns, max_length, max_string, max_depth, console_width) 63 from ibis.expr.types._rich import to_rich_scalar, to_rich_table 65 if isinstance(expr, ir.Scalar): ---> 66 return to_rich_scalar( 67 expr, max_length=max_length, max_string=max_string, max_depth=max_depth 68 ) 69 else: 70 return to_rich_table( 71 expr, 72 max_rows=max_rows, (...) 77 console_width=console_width, 78 ) File ~/work/ibis/ibis/ibis/expr/types/_rich.py:270, in to_rich_scalar(expr, max_length, max_string, max_depth) 263 def to_rich_scalar( 264 expr: Scalar, 265 *, (...) 268 max_depth: int | None = None, 269 ) -> Panel: --> 270 value = expr.to_pyarrow().as_py() 272 if value is None: 273 formatted_value = Text.styled("NULL", style="dim") File ~/work/ibis/ibis/ibis/expr/types/generic.py:1424, in Scalar.to_pyarrow(self, params, limit, **kwargs) 1416 @experimental 1417 def to_pyarrow( 1418 self, (...) 1422 **kwargs: Any, 1423 ) -> pa.Scalar: -> 1424 return super().to_pyarrow(params=params, limit=limit, **kwargs) File ~/work/ibis/ibis/ibis/expr/types/core.py:605, in Expr.to_pyarrow(self, params, limit, **kwargs) 575 @experimental 576 def to_pyarrow( 577 self, (...) 581 **kwargs: Any, 582 ) -> pa.Table | pa.Array | pa.Scalar: 583 """Execute expression to a pyarrow object. 584 585 This method is eager and will execute the associated expression (...) 603 If the passed expression is a Scalar, a pyarrow scalar is returned. 604 """ --> 605 return self._find_backend(use_default=True).to_pyarrow( 606 self, params=params, limit=limit, **kwargs 607 ) File ~/work/ibis/ibis/ibis/backends/duckdb/__init__.py:1374, in Backend.to_pyarrow(self, expr, params, limit, **kwargs) 1361 def to_pyarrow( 1362 self, 1363 expr: ir.Expr, (...) 1368 **kwargs: Any, 1369 ) -> pa.Table: 1370 from ibis.backends.duckdb.converter import DuckDBPyArrowData 1372 table = self._to_duckdb_relation( 1373 expr, params=params, limit=limit, **kwargs -> 1374 ).arrow() 1375 return expr.__pyarrow_result__(table, data_mapper=DuckDBPyArrowData) ConversionException: Conversion Error: Could not convert string 'a string' to INT64 LINE 1: SELECT CAST('a string' AS BIGINT) AS "Cast('a string', int64)" ^
Return the first non-null value from args
.
Name | Type | Description | Default |
---|---|---|---|
args | Value | Arguments from which to choose the first non-null value | () |
Name | Type | Description |
---|---|---|
Value | Coalesced expression |
Aggregate this expression’s elements into an array.
This function is called array_agg
, list_agg
, or list
in other systems.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
An optional filter expression. If provided, only rows where where is True will be included in the aggregate. |
None |
order_by | Any | An ordering key (or keys) to use to order the rows before aggregating. If not provided, the order of the items in the result is undefined and backend specific. | None |
include_null | bool | Whether to include null values when performing this aggregation. Set to True to include nulls in the result. |
False |
distinct | bool | Whether to collect only distinct elements. | False |
Name | Type | Description |
---|---|---|
ArrayScalar |
An array of all the collected elements. |
Basic collect usage
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"key": list("aaabb"), "value": [1, 1, 2, 3, 5]})
>>> t
┏━━━━━━━━┳━━━━━━━┓ ┃ key ┃ value ┃ ┡━━━━━━━━╇━━━━━━━┩ │ string │ int64 │ ├────────┼───────┤ │ a │ 1 │ │ a │ 1 │ │ a │ 2 │ │ b │ 3 │ │ b │ 5 │ └────────┴───────┘
Collect all elements into an array scalar:
Collect only unique elements:
Collect elements in a specified order:
Collect elements per group, filtering out values <= 1:
Compile expr
to a SQL string (for SQL backends) or a LazyFrame (for the polars backend).
Name | Type | Description | Default |
---|---|---|---|
limit | int | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
None |
params | Mapping[ir .Value , Any] | None |
Mapping of scalar parameter expressions to value | None |
pretty | bool | In case of SQL backends, return a pretty formatted SQL query. | False |
Name | Type | Description |
---|---|---|
str | pl .LazyFrame |
A SQL string or a LazyFrame object, depending on the backend of self. |
>>> import ibis
>>> d = {"a": [1, 2, 3], "b": [4, 5, 6]}
>>> con = ibis.duckdb.connect()
>>> t = con.create_table("t", d)
>>> expr = t.mutate(c=t.a + t.b)
>>> expr.compile()
'SELECT "t0"."a", "t0"."b", "t0"."a" + "t0"."b" AS "c" FROM "memory"."main"."t" AS "t0"'
If you want to see the pretty formatted SQL query, set pretty
to True
.
'SELECT\n "t0"."a",\n "t0"."b",\n "t0"."a" + "t0"."b" AS "c"\nFROM "memory"."main"."t" AS "t0"'
If the expression does not have a backend, an error will be raised.
>>> t = ibis.memtable(d)
>>> expr = t.mutate(c=t.a + t.b)
>>> expr.compile() # quartodoc: +EXPECTED_FAILURE
--------------------------------------------------------------------------- IbisError Traceback (most recent call last) Cell In[299], line 3 1 t = ibis.memtable(d) 2 expr = t.mutate(c=t.a + t.b) ----> 3 expr.compile() # quartodoc: +EXPECTED_FAILURE File ~/work/ibis/ibis/ibis/expr/types/core.py:532, in Expr.compile(self, limit, params, pretty) 479 def compile( 480 self, 481 *, (...) 484 pretty: bool = False, 485 ) -> str | pl.LazyFrame: 486 r"""Compile `expr` to a SQL string (for SQL backends) or a LazyFrame (for the polars backend). 487 488 Parameters (...) 530 [`Table.to_sql()`](./expression-tables.qmd#ibis.expr.types.relations.Table.to_sql) 531 """ --> 532 return self._find_backend().compile( 533 self, limit=limit, params=params, pretty=pretty 534 ) File ~/work/ibis/ibis/ibis/expr/types/core.py:338, in Expr._find_backend(self, use_default) 336 default = _default_backend() if use_default else None 337 if default is None: --> 338 raise IbisError( 339 "Expression depends on no backends, and found no default" 340 ) 341 return default 343 if len(backends) > 1: IbisError: Expression depends on no backends, and found no default
Value.compile()
Table.compile()
Value.to_sql()
Table.to_sql()
Compute the number of rows in an expression.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
Filter expression | None |
Name | Type | Description |
---|---|---|
IntegerScalar |
Number of elements in an expression |
Return the cumulative distribution over a window.
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"values": [1, 2, 1, 2, 3, 2]})
>>> t.mutate(cume_dist=t.values.cume_dist())
┏━━━━━━━━┳━━━━━━━━━━━┓ ┃ values ┃ cume_dist ┃ ┡━━━━━━━━╇━━━━━━━━━━━┩ │ int64 │ float64 │ ├────────┼───────────┤ │ 1 │ 0.333333 │ │ 1 │ 0.333333 │ │ 2 │ 0.833333 │ │ 2 │ 0.833333 │ │ 2 │ 0.833333 │ │ 3 │ 1.000000 │ └────────┴───────────┘
Return the cumulative max over a window.
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
... {
... "id": [1, 2, 3, 4, 5, 6],
... "grouper": ["a", "a", "a", "b", "b", "c"],
... "values": [3, 2, 1, 2, 3, 2],
... }
... )
>>> t.mutate(cummax=t.values.cummax())
┏━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┓ ┃ id ┃ grouper ┃ values ┃ cummax ┃ ┡━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━┩ │ int64 │ string │ int64 │ int64 │ ├───────┼─────────┼────────┼────────┤ │ 1 │ a │ 3 │ 3 │ │ 2 │ a │ 2 │ 3 │ │ 3 │ a │ 1 │ 3 │ │ 4 │ b │ 2 │ 3 │ │ 5 │ b │ 3 │ 3 │ │ 6 │ c │ 2 │ 3 │ └───────┴─────────┴────────┴────────┘
>>> t.mutate(cummax=t.values.cummax(where=t.grouper != "c", group_by=t.grouper)).order_by(
... t.id
... )
┏━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┓ ┃ id ┃ grouper ┃ values ┃ cummax ┃ ┡━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━┩ │ int64 │ string │ int64 │ int64 │ ├───────┼─────────┼────────┼────────┤ │ 1 │ a │ 3 │ 3 │ │ 2 │ a │ 2 │ 3 │ │ 3 │ a │ 1 │ 3 │ │ 4 │ b │ 2 │ 2 │ │ 5 │ b │ 3 │ 3 │ │ 6 │ c │ 2 │ NULL │ └───────┴─────────┴────────┴────────┘
Return the cumulative min over a window.
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
... {
... "id": [1, 2, 3, 4, 5, 6],
... "grouper": ["a", "a", "a", "b", "b", "c"],
... "values": [3, 2, 1, 2, 3, 2],
... }
... )
>>> t.mutate(cummin=t.values.cummin())
┏━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┓ ┃ id ┃ grouper ┃ values ┃ cummin ┃ ┡━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━┩ │ int64 │ string │ int64 │ int64 │ ├───────┼─────────┼────────┼────────┤ │ 1 │ a │ 3 │ 3 │ │ 2 │ a │ 2 │ 2 │ │ 3 │ a │ 1 │ 1 │ │ 4 │ b │ 2 │ 1 │ │ 5 │ b │ 3 │ 1 │ │ 6 │ c │ 2 │ 1 │ └───────┴─────────┴────────┴────────┘
>>> t.mutate(cummin=t.values.cummin(where=t.grouper != "c", group_by=t.grouper)).order_by(
... t.id
... )
┏━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┓ ┃ id ┃ grouper ┃ values ┃ cummin ┃ ┡━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━┩ │ int64 │ string │ int64 │ int64 │ ├───────┼─────────┼────────┼────────┤ │ 1 │ a │ 3 │ 3 │ │ 2 │ a │ 2 │ 2 │ │ 3 │ a │ 1 │ 1 │ │ 4 │ b │ 2 │ 2 │ │ 5 │ b │ 3 │ 2 │ │ 6 │ c │ 2 │ NULL │ └───────┴─────────┴────────┴────────┘
Position of first element within each group of equal values.
Values are returned in sorted order and duplicate values are ignored.
Equivalent to SQL’s DENSE_RANK()
.
Name | Type | Description |
---|---|---|
IntegerColumn | The rank |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"values": [1, 2, 1, 2, 3, 2]})
>>> t.mutate(rank=t.values.dense_rank())
┏━━━━━━━━┳━━━━━━━┓ ┃ values ┃ rank ┃ ┡━━━━━━━━╇━━━━━━━┩ │ int64 │ int64 │ ├────────┼───────┤ │ 1 │ 0 │ │ 1 │ 0 │ │ 2 │ 1 │ │ 2 │ 1 │ │ 2 │ 1 │ │ 3 │ 2 │ └────────┴───────┘
Sort an expression in descending order.
Name | Type | Description | Default |
---|---|---|---|
nulls_first | bool | Whether to sort NULL values first. |
False |
Name | Type | Description |
---|---|---|
Value | Sorted expression |
Return whether this expression is structurally equivalent to other
.
If you want to produce an equality expression, use ==
syntax.
Name | Type | Description | Default |
---|---|---|---|
other | Another expression | required |
Execute an expression against its backend if one exists.
Name | Type | Description | Default |
---|---|---|---|
limit | int | str | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
'default' |
params | Mapping[ir .Value , Any] | None |
Mapping of scalar parameter expressions to value | None |
kwargs | Any | Keyword arguments | {} |
species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year | |
---|---|---|---|---|---|---|---|---|
0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | male | 2007 |
1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | female | 2007 |
2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | female | 2007 |
3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | None | 2007 |
4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | female | 2007 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
339 | Chinstrap | Dream | 55.8 | 19.8 | 207.0 | 4000.0 | male | 2009 |
340 | Chinstrap | Dream | 43.5 | 18.1 | 202.0 | 3400.0 | female | 2009 |
341 | Chinstrap | Dream | 49.6 | 18.2 | 193.0 | 3775.0 | male | 2009 |
342 | Chinstrap | Dream | 50.8 | 19.0 | 210.0 | 4100.0 | male | 2009 |
343 | Chinstrap | Dream | 50.2 | 18.7 | 198.0 | 3775.0 | female | 2009 |
344 rows × 8 columns
Scalar parameters can be supplied dynamically during execution.
>>> species = ibis.param("string")
>>> expr = t.filter(t.species == species).order_by(t.bill_length_mm)
>>> expr.execute(limit=3, params={species: "Gentoo"})
species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year | |
---|---|---|---|---|---|---|---|---|
0 | Gentoo | Biscoe | 40.9 | 13.7 | 214 | 4650 | female | 2007 |
1 | Gentoo | Biscoe | 41.7 | 14.7 | 210 | 4700 | female | 2009 |
2 | Gentoo | Biscoe | 42.0 | 13.5 | 210 | 4150 | female | 2007 |
Replace NULL
s with the given value. Does NOT affect NaN
and inf
values.
This only replaces genuine NULL
values, it does NOT affect NaN
and inf
values for floating point types.
Name | Type | Description | Default |
---|---|---|---|
fill_value | Scalar | Value with which to replace NULL values in self |
required |
Value.coalesce()
ibis.coalesce()
Value.isnull()
FloatingValue.isnan()
FloatingValue.isinf()
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"f": [None, "-inf", "3.0", "inf", "nan"]})
>>> t = t.mutate(f=ibis._.f.cast(float))
>>> t = t.mutate(filled=t.f.fill_null(99))
>>> t
┏━━━━━━━━━┳━━━━━━━━━┓ ┃ f ┃ filled ┃ ┡━━━━━━━━━╇━━━━━━━━━┩ │ float64 │ float64 │ ├─────────┼─────────┤ │ NULL │ 99.0 │ │ -inf │ -inf │ │ 3.0 │ 3.0 │ │ inf │ inf │ │ nan │ nan │ └─────────┴─────────┘
If you want to fill all NaN
and inf
values as well, use something like the following:
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓ ┃ f ┃ filled ┃ filled2 ┃ ┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩ │ float64 │ float64 │ float64 │ ├─────────┼─────────┼─────────┤ │ NULL │ 99.0 │ 99.0 │ │ -inf │ -inf │ 99.0 │ │ 3.0 │ 3.0 │ 3.0 │ │ inf │ inf │ 99.0 │ │ nan │ nan │ 99.0 │ └─────────┴─────────┴─────────┘
Name | Type | Description |
---|---|---|
Value | self filled with fill_value where it is NULL |
DEPRECATED: use fill_null
instead, which acts exactly the same.
Return the first value of a column.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
An optional filter expression. If provided, only rows where where is True will be included in the aggregate. |
None |
order_by | Any | An ordering key (or keys) to use to order the rows before aggregating. If not provided, the meaning of first is undefined and will be backend specific. |
None |
include_null | bool | Whether to include null values when performing this aggregation. Set to True to include nulls in the result. |
False |
Get the current Ibis backend of the expression.
Name | Type | Description |
---|---|---|
BaseBackend |
The Ibis backend. |
Return the name of this expression.
Concatenate values using the indicated separator to produce a string.
Name | Type | Description | Default |
---|---|---|---|
sep | str | The separator to use to join strings. | ',' |
where | ir .BooleanValue | None |
An optional filter expression. If provided, only rows where where is True will be included in the aggregate. |
None |
order_by | Any | An ordering key (or keys) to use to order the rows before aggregating. If not provided, the order of the items in the result is undefined and backend specific. | None |
Name | Type | Description |
---|---|---|
StringScalar |
Concatenated string expression |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch().limit(5)
>>> t[["bill_length_mm", "bill_depth_mm"]]
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ bill_length_mm ┃ bill_depth_mm ┃ ┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ float64 │ float64 │ ├────────────────┼───────────────┤ │ 39.1 │ 18.7 │ │ 39.5 │ 17.4 │ │ 40.3 │ 18.0 │ │ NULL │ NULL │ │ 36.7 │ 19.3 │ └────────────────┴───────────────┘
┌─────────────────────┐
│ 39.1,39.5,40.3,36.7 │
└─────────────────────┘
┌────────────────────────┐
│ 39.1: 39.5: 40.3: 36.7 │
└────────────────────────┘
Compute an integer hash value.
Name | Type | Description |
---|---|---|
IntegerValue | The hash value of self |
Return whether this expression is identical to other.
Corresponds to IS NOT DISTINCT FROM
in SQL.
Name | Type | Description | Default |
---|---|---|---|
other | Value | Expression to compare to | required |
Name | Type | Description |
---|---|---|
BooleanValue | Whether this expression is not distinct from other |
Check whether this expression is in values
.
NULL
values in the input are propagated in the output. If the values
argument contains any NULL
values, then ibis follows the SQL behavior of returning NULL
(not False) when self
is not present. See examples below for details.
Name | Type | Description | Default |
---|---|---|---|
values | ir .ArrayValue | ir .Column | Iterable[Value] |
Values or expression to check for membership. | required |
Name | Type | Description |
---|---|---|
BooleanValue | True if self is contained in values , False otherwise. |
Checking for values in literals:
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_in_12 ┃ a_in_12None ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ ├───────┼───────┼─────────┼─────────────┤ │ 1 │ 1 │ True │ True │ │ 2 │ 2 │ True │ True │ │ 3 │ 9 │ False │ NULL │ │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴─────────┴─────────────┘
Checking for values in columns of the same table:
>>> t.mutate(
... a_in_b=t.a.isin(t.b),
... a_in_b_no_null=t.a.isin(t.b.fill_null(0)),
... a_in_b_plus_1=t.a.isin(t.b + 1),
... )
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_in_b ┃ a_in_b_no_null ┃ a_in_b_plus_1 ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ boolean │ ├───────┼───────┼─────────┼────────────────┼───────────────┤ │ 1 │ 1 │ True │ True │ NULL │ │ 2 │ 2 │ True │ True │ True │ │ 3 │ 9 │ NULL │ False │ True │ │ NULL │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴─────────┴────────────────┴───────────────┘
Checking for values in a column from a different table:
>>> t2 = ibis.memtable({"x": [1, 2, 99], "y": [1, 2, None]})
>>> t.mutate(
... a_in_x=t.a.isin(t2.x),
... a_in_y=t.a.isin(t2.y),
... a_in_y_plus_1=t.a.isin(t2.y + 1),
... )
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_in_x ┃ a_in_y ┃ a_in_y_plus_1 ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ boolean │ ├───────┼───────┼─────────┼─────────┼───────────────┤ │ 1 │ 1 │ True │ True │ NULL │ │ 2 │ 2 │ True │ True │ True │ │ 3 │ 9 │ False │ NULL │ True │ │ NULL │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴─────────┴─────────┴───────────────┘
Whether this expression is NULL
. Does NOT detect NaN
and inf
values.
For FloatingValue types, use FloatingValue.isnan()
and FloatingValue.isinf()
to detect NaN
and inf
values.
Value.fill_null()
FloatingValue.isnan()
FloatingValue.isinf()
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"f": [None, "-inf", "3.0", "inf", "nan"]})
>>> t = t.mutate(f=ibis._.f.cast(float))
>>> t.mutate(
... isnull=t.f.isnull(),
... isnan=t.f.isnan(),
... isinf=t.f.isinf(),
... )
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓ ┃ f ┃ isnull ┃ isnan ┃ isinf ┃ ┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩ │ float64 │ boolean │ boolean │ boolean │ ├─────────┼─────────┼─────────┼─────────┤ │ NULL │ True │ NULL │ NULL │ │ -inf │ False │ False │ True │ │ 3.0 │ False │ False │ False │ │ inf │ False │ False │ True │ │ nan │ False │ True │ False │ └─────────┴─────────┴─────────┴─────────┘
Return the row located at offset
rows before the current row.
Name | Type | Description | Default |
---|---|---|---|
offset | int | ir .IntegerValue | None |
Index of row to select | None |
default | Value | None | Value used if no row exists at offset |
None |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
... {"year": [2007, 2008, 2009, 2010], "total": [1899.6, 1928.2, 2037.9, 1955.2]}
... )
>>> t.mutate(total_lead=t.total.lag())
┏━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ year ┃ total ┃ total_lead ┃ ┡━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━┩ │ int64 │ float64 │ float64 │ ├───────┼─────────┼────────────┤ │ 2007 │ 1899.6 │ NULL │ │ 2008 │ 1928.2 │ 1899.6 │ │ 2009 │ 2037.9 │ 1928.2 │ │ 2010 │ 1955.2 │ 2037.9 │ └───────┴─────────┴────────────┘
┏━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ year ┃ total ┃ total_lead ┃ ┡━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━┩ │ int64 │ float64 │ float64 │ ├───────┼─────────┼────────────┤ │ 2007 │ 1899.6 │ 0.0 │ │ 2008 │ 1928.2 │ 0.0 │ │ 2009 │ 2037.9 │ 1899.6 │ │ 2010 │ 1955.2 │ 1928.2 │ └───────┴─────────┴────────────┘
Return the last value of a column.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
An optional filter expression. If provided, only rows where where is True will be included in the aggregate. |
None |
order_by | Any | An ordering key (or keys) to use to order the rows before aggregating. If not provided, the meaning of last is undefined and will be backend specific. |
None |
include_null | bool | Whether to include null values when performing this aggregation. Set to True to include nulls in the result. |
False |
Return the row located at offset
rows after the current row.
Name | Type | Description | Default |
---|---|---|---|
offset | int | ir .IntegerValue | None |
Index of row to select | None |
default | Value | None | Value used if no row exists at offset |
None |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
... {"year": [2007, 2008, 2009, 2010], "total": [1899.6, 1928.2, 2037.9, 1955.2]}
... )
>>> t.mutate(total_lead=t.total.lead())
┏━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ year ┃ total ┃ total_lead ┃ ┡━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━┩ │ int64 │ float64 │ float64 │ ├───────┼─────────┼────────────┤ │ 2007 │ 1899.6 │ 1928.2 │ │ 2008 │ 1928.2 │ 2037.9 │ │ 2009 │ 2037.9 │ 1955.2 │ │ 2010 │ 1955.2 │ NULL │ └───────┴─────────┴────────────┘
┏━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ year ┃ total ┃ total_lead ┃ ┡━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━┩ │ int64 │ float64 │ float64 │ ├───────┼─────────┼────────────┤ │ 2007 │ 1899.6 │ 2037.9 │ │ 2008 │ 1928.2 │ 1955.2 │ │ 2009 │ 2037.9 │ 0.0 │ │ 2010 │ 1955.2 │ 0.0 │ └───────┴─────────┴────────────┘
Return the maximum of a column.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
Filter in values when where is True |
None |
Name | Type | Description |
---|---|---|
Scalar | The maximum value in self |
Return the median of the column.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
Optional boolean expression. If given, only the values where where evaluates to true will be considered for the median. |
None |
Name | Type | Description |
---|---|---|
Scalar | Median of the column |
Compute the median of bill_depth_mm
>>> t.group_by(t.species).agg(median_bill_depth=t.bill_depth_mm.median()).order_by(
... ibis.desc("median_bill_depth")
... )
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓ ┃ species ┃ median_bill_depth ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩ │ string │ float64 │ ├───────────┼───────────────────┤ │ Chinstrap │ 18.45 │ │ Adelie │ 18.40 │ │ Gentoo │ 15.00 │ └───────────┴───────────────────┘
In addition to numeric types, any orderable non-numeric types such as strings and dates work with median
.
>>> t.group_by(t.island).agg(median_species=t.species.median()).order_by(
... ibis.desc("median_species")
... )
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ island ┃ median_species ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ string │ string │ ├───────────┼────────────────┤ │ Biscoe │ Gentoo │ │ Dream │ Chinstrap │ │ Torgersen │ Adelie │ └───────────┴────────────────┘
Return the minimum of a column.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
Filter in values when where is True |
None |
Name | Type | Description |
---|---|---|
Scalar | The minimum value in self |
Return the mode of a column.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
Filter in values when where is True |
None |
Name | Type | Description |
---|---|---|
Scalar | The mode of self |
Rename an expression to name
.
Name | Type | Description | Default |
---|---|---|---|
name | str | The new name of the expression | required |
Name | Type | Description |
---|---|---|
Value | self with name name |
Check whether this expression is not in values
.
Opposite of Value.isin()
.
NULL
values in the input are propagated in the output. If the values
argument contains any NULL
values, then ibis follows the SQL behavior of returning NULL
(not False) when self
is present. See examples below for details.
Name | Type | Description | Default |
---|---|---|---|
values | ir .ArrayValue | ir .Column | Iterable[Value] |
Values or expression to check for lack of membership. | required |
Name | Type | Description |
---|---|---|
BooleanValue | True if self is not in values , False otherwise. |
Checking for values in literals:
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_notin_12 ┃ a_notin_12None ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ ├───────┼───────┼────────────┼────────────────┤ │ 1 │ 1 │ False │ False │ │ 2 │ 2 │ False │ False │ │ 3 │ 9 │ True │ NULL │ │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴────────────┴────────────────┘
Checking for values in columns of the same table:
>>> t.mutate(
... a_notin_b=t.a.notin(t.b),
... a_notin_b_no_null=t.a.notin(t.b.fill_null(0)),
... a_notin_b_plus_1=t.a.notin(t.b + 1),
... )
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_notin_b ┃ a_notin_b_no_null ┃ a_notin_b_plus_1 ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ boolean │ ├───────┼───────┼───────────┼───────────────────┼──────────────────┤ │ 1 │ 1 │ False │ False │ NULL │ │ 2 │ 2 │ False │ False │ False │ │ 3 │ 9 │ NULL │ True │ False │ │ NULL │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴───────────┴───────────────────┴──────────────────┘
Checking for values in a column from a different table:
>>> t2 = ibis.memtable({"x": [1, 2, 99], "y": [1, 2, None]})
>>> t.mutate(
... a_notin_x=t.a.notin(t2.x),
... a_notin_y=t.a.notin(t2.y),
... a_notin_y_plus_1=t.a.notin(t2.y + 1),
... )
┏━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓ ┃ a ┃ b ┃ a_notin_x ┃ a_notin_y ┃ a_notin_y_plus_1 ┃ ┡━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩ │ int64 │ int64 │ boolean │ boolean │ boolean │ ├───────┼───────┼───────────┼───────────┼──────────────────┤ │ 1 │ 1 │ False │ False │ NULL │ │ 2 │ 2 │ False │ False │ False │ │ 3 │ 9 │ True │ NULL │ False │ │ NULL │ NULL │ NULL │ NULL │ NULL │ └───────┴───────┴───────────┴───────────┴──────────────────┘
Return whether this expression is not NULL.
Return the n
th value (0-indexed) over a window.
.nth(0)
is equivalent to .first()
. Negative will result in NULL
. If the value of n
is greater than the number of rows in the window, NULL
will be returned.
Name | Type | Description | Default |
---|---|---|---|
n | int | ir .IntegerValue |
Desired rank value | required |
Name | Type | Description |
---|---|---|
Column | The nth value over a window |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"values": [1, 2, 3, 4, 5, 6]})
>>> t.mutate(nth=t.values.nth(2))
┏━━━━━━━━┳━━━━━━━┓ ┃ values ┃ nth ┃ ┡━━━━━━━━╇━━━━━━━┩ │ int64 │ int64 │ ├────────┼───────┤ │ 1 │ 3 │ │ 2 │ 3 │ │ 3 │ 3 │ │ 4 │ 3 │ │ 5 │ 3 │ │ 6 │ 3 │ └────────┴───────┘
Return the integer number of a partitioning of the column values.
Name | Type | Description | Default |
---|---|---|---|
buckets | int | ir .IntegerValue |
Number of buckets to partition into | required |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"values": [1, 2, 1, 2, 3, 2]})
>>> t.mutate(ntile=t.values.ntile(3))
┏━━━━━━━━┳━━━━━━━┓ ┃ values ┃ ntile ┃ ┡━━━━━━━━╇━━━━━━━┩ │ int64 │ int64 │ ├────────┼───────┤ │ 1 │ 0 │ │ 1 │ 0 │ │ 2 │ 1 │ │ 2 │ 1 │ │ 2 │ 2 │ │ 3 │ 2 │ └────────┴───────┘
Set values to null if they equal the values null_if_expr
.
Commonly used to avoid divide-by-zero problems by replacing zero with NULL
in the divisor.
Equivalent to (self == null_if_expr).ifelse(ibis.null(), self)
.
Name | Type | Description | Default |
---|---|---|---|
null_if_expr | Value | Expression indicating what values should be NULL | required |
Name | Type | Description |
---|---|---|
Value | Value expression |
Compute the number of distinct rows in an expression.
Name | Type | Description | Default |
---|---|---|---|
where | ir .BooleanValue | None |
Filter expression | None |
Name | Type | Description |
---|---|---|
IntegerScalar |
Number of distinct elements in an expression |
Construct a window expression.
Name | Type | Description | Default |
---|---|---|---|
window | Window specification | None |
|
rows | Whether to use the ROWS window clause |
None |
|
range | Whether to use the RANGE window clause |
None |
|
group_by | Grouping key | None |
|
order_by | Ordering key | None |
Name | Type | Description |
---|---|---|
Value | A window function expression |
Return the relative rank of the values in the column.
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"values": [1, 2, 1, 2, 3, 2]})
>>> t.mutate(percent_rank=t.values.percent_rank())
┏━━━━━━━━┳━━━━━━━━━━━━━━┓ ┃ values ┃ percent_rank ┃ ┡━━━━━━━━╇━━━━━━━━━━━━━━┩ │ int64 │ float64 │ ├────────┼──────────────┤ │ 1 │ 0.0 │ │ 1 │ 0.0 │ │ 2 │ 0.4 │ │ 2 │ 0.4 │ │ 2 │ 0.4 │ │ 3 │ 1.0 │ └────────┴──────────────┘
Compose f
with self
.
Name | Type | Description | Default |
---|---|---|---|
f | If the expression needs to be passed as anything other than the first argument to the function, pass a tuple with the argument name. For example, (f, ‘data’) if the function f expects a ‘data’ keyword | required | |
args | Any | Positional arguments to f |
() |
kwargs | Any | Keyword arguments to f |
{} |
Name | Type | Description |
---|---|---|
Expr |
Result type of passed function |
Print a subset as a single-column Rich Table.
This is an explicit version of what you get when you inspect this object in interactive mode, except with this version you can pass formatting options. The options are the same as those exposed in ibis.options.interactive
.
Name | Type | Description | Default |
---|---|---|---|
max_rows | int | None | Maximum number of rows to display | None |
max_length | int | None | Maximum length for pretty-printed arrays and maps. | None |
max_string | int | None | Maximum length for pretty-printed strings. | None |
max_depth | int | None | Maximum depth for nested data types. | None |
console_width | int | float | None | Width of the console in characters. If not specified, the width will be inferred from the console. | None |
Return value at the given quantile.
The output of this method is a continuous quantile if the input is numeric, otherwise the output is a discrete quantile.
Name | Type | Description | Default |
---|---|---|---|
quantile | float | ir .NumericValue | Sequence[ir .NumericValue | float] |
0 <= quantile <= 1 , or an array of such values indicating the quantile or quantiles to compute |
required |
where | ir .BooleanValue | None |
Boolean filter for input values | None |
Name | Type | Description |
---|---|---|
Scalar | Quantile of the input |
Compute the 99th percentile of bill_depth
>>> t.group_by(t.species).agg(p99_bill_depth=t.bill_depth_mm.quantile(0.99)).order_by(
... ibis.desc("p99_bill_depth")
... )
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ species ┃ p99_bill_depth ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ string │ float64 │ ├───────────┼────────────────┤ │ Adelie │ 21.200 │ │ Chinstrap │ 20.733 │ │ Gentoo │ 17.256 │ └───────────┴────────────────┘
In addition to numeric types, any orderable non-numeric types such as strings and dates work with quantile
.
Let’s compute the 99th percentile of the species
column
>>> t.group_by(t.island).agg(p99_species=t.species.quantile(0.99)).order_by(
... ibis.desc("p99_species")
... )
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┓ ┃ island ┃ p99_species ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │ string │ string │ ├───────────┼─────────────┤ │ Biscoe │ Gentoo │ │ Dream │ Chinstrap │ │ Torgersen │ Adelie │ └───────────┴─────────────┘
Compute position of first element within each equal-value group in sorted order.
Equivalent to SQL’s RANK()
window function.
Name | Type | Description |
---|---|---|
Int64Column |
The min rank |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"values": [1, 2, 1, 2, 3, 2]})
>>> t.mutate(rank=t.values.rank())
┏━━━━━━━━┳━━━━━━━┓ ┃ values ┃ rank ┃ ┡━━━━━━━━╇━━━━━━━┩ │ int64 │ int64 │ ├────────┼───────┤ │ 1 │ 0 │ │ 1 │ 0 │ │ 2 │ 2 │ │ 2 │ 2 │ │ 2 │ 2 │ │ 3 │ 5 │ └────────┴───────┘
Replace values given in values
with replacement
.
This is similar to the pandas replace
method.
Name | Type | Description | Default |
---|---|---|---|
value | Value | dict | Expression or dict. | required |
replacement | Value | None | If an expression is passed to value, this must be passed. | None |
else_ | Value | None | If an original value does not match value , then else_ is used. The default of None means leave the original value unchanged. |
None |
Name | Type | Description |
---|---|---|
Value | Replaced values |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.island.value_counts().order_by("island")
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━┓ ┃ island ┃ island_count ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━┩ │ string │ int64 │ ├───────────┼──────────────┤ │ Biscoe │ 168 │ │ Dream │ 124 │ │ Torgersen │ 52 │ └───────────┴──────────────┘
>>> t.island.substitute({"Torgersen": "torg", "Biscoe": "bisc"}).name(
... "island"
... ).value_counts().order_by("island")
┏━━━━━━━━┳━━━━━━━━━━━━━━┓ ┃ island ┃ island_count ┃ ┡━━━━━━━━╇━━━━━━━━━━━━━━┩ │ string │ int64 │ ├────────┼──────────────┤ │ Dream │ 124 │ │ bisc │ 168 │ │ torg │ 52 │ └────────┴──────────────┘
Write the results of executing the given expression to a CSV file.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.csv.CSVWriter.html for details.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | The data target. A string or Path where the CSV file will be written. | required |
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
**kwargs | Any | Additional keyword arguments passed to pyarrow.csv.CSVWriter | {} |
Write the results of executing the given expression to a Delta Lake table.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | The data target. A string or Path to the Delta Lake table directory. | required |
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
**kwargs | Any | Additional keyword arguments passed to deltalake.writer.write_deltalake method | {} |
Write the results of expr
to a json file of [{column -> value}, …] objects.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | The data target. A string or Path where the JSON file will be written. | required |
kwargs | Any | Additional, backend-specific keyword arguments. | {} |
Convert a column expression to a list.
Name | Type | Description | Default |
---|---|---|---|
kwargs | Same as keyword arguments to to_pyarrow |
{} |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.bill_length_mm.to_list(limit=5)
[39.1, 39.5, 40.3, None, 36.7]
Convert a table expression to a pandas DataFrame.
Name | Type | Description | Default |
---|---|---|---|
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
limit | int | str | None | An integer to effect a specific row limit. A value of None means no limit. The default is in ibis/config.py . |
None |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
DataFrame |
The result of executing the expression as a pandas DataFrame |
Execute expression and return an iterator of pandas DataFrames.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
limit | int | str | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
None |
params | Mapping[ir .Value , Any] | None |
Mapping of scalar parameter expressions to value. | None |
chunk_size | int | Maximum number of rows in each returned DataFrame . |
1000000 |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
Iterator[pd.DataFrame] |
Write the results of executing the given expression to a parquet file.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetWriter.html for details.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | A string or Path where the Parquet file will be written. | required |
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
**kwargs | Any | Additional keyword arguments passed to pyarrow.parquet.ParquetWriter | {} |
Write out an expression to a single parquet file.
Partition on a single column.
Partition on multiple columns.
Write the results of executing the given expression to a parquet file in a directory.
This method is eager and will execute the associated expression immediately.
See https://arrow.apache.org/docs/python/generated/pyarrow.dataset.write_dataset.html for details.
Name | Type | Description | Default |
---|---|---|---|
directory | str | Path | The data target. A string or Path to the directory where the parquet file will be written. | required |
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
**kwargs | Any | Additional keyword arguments passed to pyarrow.dataset.write_dataset | {} |
Execute expression and return results as a polars dataframe.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
params | Mapping[ir .Scalar , Any] | None |
Mapping of scalar parameter expressions to value. | None |
limit | int | str | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
None |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
DataFrame |
A polars dataframe holding the results of the executed expression. |
Execute expression and return a RecordBatchReader.
This method is eager and will execute the associated expression immediately.
Name | Type | Description | Default |
---|---|---|---|
limit | int | str | None | An integer to effect a specific row limit. A value of None means “no limit”. The default is in ibis/config.py . |
None |
params | Mapping[ir .Value , Any] | None |
Mapping of scalar parameter expressions to value. | None |
chunk_size | int | Maximum number of rows in each returned record batch. | 1000000 |
kwargs | Any | Keyword arguments | {} |
Name | Type | Description |
---|---|---|
results |
RecordBatchReader |
Compile to a formatted SQL string.
Name | Type | Description | Default |
---|---|---|---|
dialect | str | None | SQL dialect to use for compilation. Uses the dialect bound to self if not specified, or the default dialect if no dialect is bound. | None |
pretty | bool | Whether to use pretty formatting. | True |
kwargs | Scalar parameters | {} |
Name | Type | Description |
---|---|---|
str | Formatted SQL string |
You can also specify the SQL dialect to use for compilation:
Value.to_sql()
Table.to_sql()
ibis.to_sql()
Value.compile()
Table.compile()
Execute an expression and return results as a dictionary of torch tensors.
Name | Type | Description | Default |
---|---|---|---|
params | Mapping[ir .Scalar , Any] | None |
Parameters to substitute into the expression. | None |
limit | int | str | None | An integer to effect a specific row limit. A value of None means no limit. |
None |
kwargs | Any | Keyword arguments passed into the backend’s to_torch implementation. |
{} |
Name | Type | Description |
---|---|---|
dict[str, torch .Tensor ] |
A dictionary of torch tensors, keyed by column name. |
Write a table to an Excel file.
Name | Type | Description | Default |
---|---|---|---|
path | str | Path | Excel output path. | required |
sheet | str | The name of the sheet to write to, eg ‘Sheet3’. | 'Sheet1' |
header | bool | Whether to include the column names as the first row. | False |
params | Mapping[ir .Scalar , Any] | None |
Additional Ibis expression parameters to pass to the backend’s write function. | None |
kwargs | Any | Additional arguments passed to the backend’s write function. | {} |
Requires DuckDB >= 1.2.0.
Computes a Table of the top values by a metric (defaults to count).
Added name
parameter.
Name | Type | Description | Default |
---|---|---|---|
k | int | None | The number of rows to return. If None , all values are returned in descending order. |
None |
by | ir .Value | None |
The metric to compute “top” by. Defaults to count . |
None |
name | str | None | The name to use for the metric column. If not provided, a suitable name will be generated. | None |
Name | Type | Description |
---|---|---|
Table |
The top k values. |
Compute the top 3 diamond colors by frequency:
┏━━━━━━━━┳━━━━━━━━━━━━━┓ ┃ color ┃ color_count ┃ ┡━━━━━━━━╇━━━━━━━━━━━━━┩ │ string │ int64 │ ├────────┼─────────────┤ │ G │ 11292 │ │ E │ 9797 │ │ F │ 9542 │ └────────┴─────────────┘
Compute the top 3 diamond colors by mean price:
┏━━━━━━━━┳━━━━━━━━━━━━━┓ ┃ color ┃ Mean(price) ┃ ┡━━━━━━━━╇━━━━━━━━━━━━━┩ │ string │ float64 │ ├────────┼─────────────┤ │ J │ 5323.818020 │ │ I │ 5091.874954 │ │ H │ 4486.669196 │ └────────┴─────────────┘
Rank all the colors by max carat:
Try cast expression to indicated data type.
If the cast fails for a row, the value is returned as null or NaN depending on target_type and backend behavior.
Name | Type | Description | Default |
---|---|---|---|
target_type | Any | Type to try cast to. Anything accepted by ibis.dtype() |
required |
Name | Type | Description |
---|---|---|
Value | Casted expression |
>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"numbers": [1, 2, 3, 4], "strings": ["1.0", "2", "hello", "world"]})
>>> t
┏━━━━━━━━━┳━━━━━━━━━┓ ┃ numbers ┃ strings ┃ ┡━━━━━━━━━╇━━━━━━━━━┩ │ int64 │ string │ ├─────────┼─────────┤ │ 1 │ 1.0 │ │ 2 │ 2 │ │ 3 │ hello │ │ 4 │ world │ └─────────┴─────────┘
>>> t = t.mutate(numbers_to_strings=_.numbers.try_cast("string"))
>>> t = t.mutate(strings_to_numbers=_.strings.try_cast("int"))
>>> t
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓ ┃ numbers ┃ strings ┃ numbers_to_strings ┃ strings_to_numbers ┃ ┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩ │ int64 │ string │ string │ int64 │ ├─────────┼─────────┼────────────────────┼────────────────────┤ │ 1 │ 1.0 │ 1 │ 1 │ │ 2 │ 2 │ 2 │ 2 │ │ 3 │ hello │ 3 │ NULL │ │ 4 │ world │ 4 │ NULL │ └─────────┴─────────┴────────────────────┴────────────────────┘
Return the DataType of self
.
Return the string name of the datatype of self.
The values of the returned strings are necessarily backend dependent. e.g. duckdb may say “DOUBLE”, while sqlite may say “real”.
Name | Type | Description |
---|---|---|
StringValue | A string indicating the type of the value |
>>> import ibis
>>> ibis.options.interactive = True
>>> vals = ibis.examples.penguins.fetch().head(5).bill_length_mm
>>> vals
┏━━━━━━━━━━━━━━━━┓ ┃ bill_length_mm ┃ ┡━━━━━━━━━━━━━━━━┩ │ float64 │ ├────────────────┤ │ 39.1 │ │ 39.5 │ │ 40.3 │ │ NULL │ │ 36.7 │ └────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ TypeOf(bill_length_mm) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━┩ │ string │ ├────────────────────────┤ │ DOUBLE │ │ DOUBLE │ │ DOUBLE │ │ DOUBLE │ │ DOUBLE │ └────────────────────────┘
Different backends have different names for their native types
Return an expression built on UnboundTable
instead of backend-specific objects.
>>> import ibis
>>> import pandas as pd
>>> duckdb_con = ibis.duckdb.connect()
>>> polars_con = ibis.polars.connect()
>>> for backend in (duckdb_con, polars_con):
... t = backend.create_table("t", pd.DataFrame({"a": [1, 2, 3]}))
>>> bound_table = duckdb_con.table("t")
>>> bound_table.get_backend().name
'duckdb'
Compute a frequency table.
Name | Type | Description | Default |
---|---|---|---|
name | str | None | The name to use for the frequency column. If not provided, a suitable name will be generated. | None |
Name | Type | Description |
---|---|---|
Table |
The frequency table. |
Compute the count of each unique value in “chars”, ordered by “chars”:
┏━━━━━━━━┳━━━━━━━━━━━━━┓ ┃ chars ┃ chars_count ┃ ┡━━━━━━━━╇━━━━━━━━━━━━━┩ │ string │ int64 │ ├────────┼─────────────┤ │ a │ 2 │ │ b │ 1 │ │ c │ 3 │ │ d │ 4 │ └────────┴─────────────┘
Compute the count of each unique value in “chars” as a column named “freq”, ordered by “freq”:
┏━━━━━━━━┳━━━━━━━┓ ┃ chars ┃ freq ┃ ┡━━━━━━━━╇━━━━━━━┩ │ string │ int64 │ ├────────┼───────┤ │ b │ 1 │ │ a │ 2 │ │ c │ 3 │ │ d │ 4 │ └────────┴───────┘
visualize(
format='svg',
*,
label_edges=False,
verbose=False,
node_attr=None,
node_attr_getter=None,
edge_attr=None,
edge_attr_getter=None,
)
Visualize an expression as a GraphViz graph in the browser.
Name | Type | Description | Default |
---|---|---|---|
format | str | Image output format. These are specified by the graphviz Python library. |
'svg' |
label_edges | bool | Show operation input names as edge labels | False |
verbose | bool | Print the graphviz DOT code to stderr if True |
False |
node_attr | Mapping[str, str] | None | Mapping of (attribute, value) pairs set for all nodes. Options are specified by the graphviz Python library. |
None |
node_attr_getter | NodeAttributeGetter | None |
Callback taking a node and returning a mapping of (attribute, value) pairs for that node. Options are specified by the graphviz Python library. |
None |
edge_attr | Mapping[str, str] | None | Mapping of (attribute, value) pairs set for all edges. Options are specified by the graphviz Python library. |
None |
edge_attr_getter | EdgeAttributeGetter | None |
Callback taking two adjacent nodes and returning a mapping of (attribute, value) pairs for the edge between those nodes. Options are specified by the graphviz Python library. |
None |
Open the visualization of an expression in default browser:
>>> import ibis
>>> import ibis.expr.operations as ops
>>> left = ibis.table(dict(a="int64", b="string"), name="left")
>>> right = ibis.table(dict(b="string", c="int64", d="string"), name="right")
>>> expr = left.inner_join(right, "b").select(left.a, b=right.c, c=right.d)
>>> expr.visualize(
... format="svg",
... label_edges=True,
... node_attr={"fontname": "Roboto Mono", "fontsize": "10"},
... node_attr_getter=lambda node: isinstance(node, ops.Field) and {"shape": "oval"},
... edge_attr={"fontsize": "8"},
... edge_attr_getter=lambda u, v: isinstance(u, ops.Field) and {"color": "red"},
... ) # quartodoc: +SKIP
Name | Type | Description |
---|---|---|
ImportError | If graphviz is not installed. |
The user facing wrapper object providing syntactic sugar for deferreds.
Provides a natural-like syntax for constructing deferred expressions by overloading all of the available dunder methods including the equality operator.
Its sole purpose is to provide a nicer syntax for constructing deferred expressions, thus it gets unwrapped to the underlying deferred expression when used by the rest of the library.
Name | Type | Description | Default |
---|---|---|---|
obj | The deferred object to provide syntax sugar for. | required | |
repr | An optional fixed string to use when repr-ing the deferred expression, instead of the default. This is useful for complex deferred expressions where the arguments don’t necessarily make sense to be user facing in the repr. | None |
Name | Description |
---|---|
as_scalar | Inform ibis that the expression should be treated as a scalar. |
as_table | Promote the scalar expression to a table. |
Inform ibis that the expression should be treated as a scalar.
If the expression is a literal, it will be returned as is. If it depends on a table, it will be turned to a scalar subquery.
Name | Type | Description |
---|---|---|
Scalar | A scalar subquery or a literal |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> max_gentoo_weight = t.filter(t.species == "Gentoo").body_mass_g.max()
>>> light_penguins = t.filter(t.body_mass_g < max_gentoo_weight / 2)
>>> light_penguins.species.value_counts().order_by("species")
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ species ┃ species_count ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ string │ int64 │ ├───────────┼───────────────┤ │ Adelie │ 15 │ │ Chinstrap │ 2 │ └───────────┴───────────────┘
Promote the scalar expression to a table.
Name | Type | Description |
---|---|---|
Table |
A table expression |
Promote an aggregation to a table
>>> import ibis
>>> import ibis.expr.types as ir
>>> t = ibis.table(dict(a="str"), name="t")
>>> expr = t.a.length().sum().name("len").as_table()
>>> isinstance(expr, ir.Table)
True
Promote a literal value to a table
deferred
Deferred expression object.
Use this object to refer to a previous table expression in a chain of expressions.
_
may conflict with other idioms in Python
See https://github.com/ibis-project/ibis/issues/4704 for details.
Use from ibis import deferred as <NAME>
to assign a different name to the deferred object builder.
Another option is to use ibis._
directly.
Create a scalar expression from a Python value.
Ibis supports literal construction of arrays using the following functions:
Constructing these types using literal
will be deprecated in a future release.
Name | Type | Description | Default |
---|---|---|---|
value | Any | A Python value | required |
type | dt .DataType | str | None |
An instance of DataType or a string indicating the ibis type of value . This parameter can be used in cases where ibis’s type inference isn’t sufficient for discovering the type of value . |
None |
Name | Type | Description |
---|---|---|
Scalar | An expression representing a literal value |
Construct an integer literal
Construct a float64
literal from an int
Ibis checks for invalid types
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) File ~/work/ibis/ibis/ibis/expr/datatypes/value.py:273, in normalize(typ, value) 272 try: --> 273 value = int(value) 274 except ValueError: ValueError: invalid literal for int() with base 10: 'foobar' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) Cell In[620], line 1 ----> 1 ibis.literal("foobar", type="int64") File ~/work/ibis/ibis/ibis/common/deferred.py:619, in deferrable.<locals>.wrapper.<locals>.inner(*args, **kwargs) 617 builder = Call(func, *args, **kwargs) 618 return Deferred(builder, repr=repr) --> 619 return func(*args, **kwargs) File ~/work/ibis/ibis/ibis/expr/types/generic.py:3083, in literal(value, type) 3078 raise TypeError( 3079 f"Ibis literal {value!r} cannot be safely coerced to datatype {type}" 3080 ) 3082 dtype = dt.infer(value) if type is None else dt.dtype(type) -> 3083 return ops.Literal(value, dtype=dtype).to_expr() File ~/work/ibis/ibis/ibis/common/bases.py:72, in AbstractMeta.__call__(cls, *args, **kwargs) 52 def __call__(cls, *args, **kwargs): 53 """Create a new instance of the class. 54 55 The subclass may override the `__create__` classmethod to change the (...) 70 71 """ ---> 72 return cls.__create__(*args, **kwargs) File ~/work/ibis/ibis/ibis/common/grounds.py:120, in Annotable.__create__(cls, *args, **kwargs) 116 @classmethod 117 def __create__(cls, *args: Any, **kwargs: Any) -> Self: 118 # construct the instance by passing only validated keyword arguments 119 kwargs = cls.__signature__.validate(cls, args, kwargs) --> 120 return super().__create__(**kwargs) File ~/work/ibis/ibis/ibis/expr/operations/generic.py:147, in Literal.__init__(self, value, dtype) 145 def __init__(self, value, dtype): 146 # normalize ensures that the value is a valid value for the given dtype --> 147 value = dt.normalize(dtype, value) 148 super().__init__(value=value, dtype=dtype) File ~/work/ibis/ibis/ibis/expr/datatypes/value.py:275, in normalize(typ, value) 273 value = int(value) 274 except ValueError: --> 275 raise TypeError(f"Unable to normalize {value!r} to {dtype!r}") 276 if value not in dtype.bounds: 277 raise TypeError( 278 f"Value {value} is out of bounds for type {dtype!r} " 279 f"(bounds: {dtype.bounds})" 280 ) TypeError: Unable to normalize 'foobar' to Int64(nullable=True)
Literals can also be used in a deferred context.
Here’s an example of constructing a table of a column’s type repeated for every row:
>>> from ibis import _, selectors as s
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.select(s.across(s.all(), ibis.literal(_.type(), type=str).name(_.get_name()))).head(1)
┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━┩ │ string │ string │ string │ string │ string │ string │ string │ string │ ├─────────┼────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼────────┤ │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ └─────────┴────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴────────┘
Create a deferred parameter of a given type.
Name | Type | Description | Default |
---|---|---|---|
type | dt .DataType |
The type of the unbound parameter, e.g., double, int64, date, etc. | required |
Name | Type | Description |
---|---|---|
Scalar |
A scalar expression backend by a parameter |
>>> from datetime import date
>>> import ibis
>>> start = ibis.param("date")
>>> t = ibis.memtable(
... {
... "date_col": [date(2013, 1, 1), date(2013, 1, 2), date(2013, 1, 3)],
... "value": [1.0, 2.0, 3.0],
... },
... )
>>> expr = t.filter(t.date_col >= start).value.sum()
>>> expr.execute(params={start: date(2013, 1, 1)})
6.0
Create a NULL scalar.
NULL
s with an unspecified type are castable and comparable to values, but lack datatype-specific methods:
>>> import ibis
>>> ibis.options.interactive = True
>>> ibis.null().upper() # quartodoc: +EXPECTED_FAILURE
--------------------------------------------------------------------------- AttributeError Traceback (most recent call last) Cell In[635], line 3 1 import ibis 2 ibis.options.interactive = True ----> 3 ibis.null().upper() # quartodoc: +EXPECTED_FAILURE AttributeError: 'NullScalar' object has no attribute 'upper'
Generate a range of values.
Integer ranges are supported, as well as timestamp ranges.
start
is inclusive and stop
is exclusive, just like Python’s builtin range
.
When step
equals 0, however, this function will return an empty array.
Python’s range
will raise an exception when step
is zero.
Name | Type | Description | Default |
---|---|---|---|
start | Lower bound of the range, inclusive. | required | |
stop | Upper bound of the range, exclusive. | required | |
step | Step value. Optional, defaults to 1. | required |
Name | Type | Description |
---|---|---|
ArrayValue |
An array of values |
Range using only a stop argument
Simple range using start and stop
Generate an empty range
Negative step values are supported
ibis.range
behaves the same as Python’s range …
… except when the step is zero, in which case ibis.range
returns an empty array
Because the resulting expression is array, you can unnest the values
┏━━━━━━━━━┓ ┃ numbers ┃ ┡━━━━━━━━━┩ │ int8 │ ├─────────┤ │ 0 │ │ 1 │ │ 2 │ │ 3 │ │ 4 │ └─────────┘
Timestamp ranges are also supported
┌─────────────────────────────────────────────────────────────────────────────────────┐ │ [datetime.datetime(2002, 1, 1, 0, 0), datetime.datetime(2002, 1, 3, 0, 0), ... +14] │ └─────────────────────────────────────────────────────────────────────────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━┓ ┃ ts ┃ ┡━━━━━━━━━━━━━━━━━━━━━┩ │ timestamp │ ├─────────────────────┤ │ 2002-01-01 00:00:00 │ │ 2002-01-03 00:00:00 │ │ 2002-01-05 00:00:00 │ │ 2002-01-07 00:00:00 │ │ 2002-01-09 00:00:00 │ │ 2002-01-11 00:00:00 │ │ 2002-01-13 00:00:00 │ │ 2002-01-15 00:00:00 │ │ 2002-01-17 00:00:00 │ │ 2002-01-19 00:00:00 │ │ … │ └─────────────────────┘
Return the first non-null value from args
.
Name | Type | Description | Default |
---|---|---|---|
arg | Any | First argument from which to choose the first non-null value | required |
args | Any | Arguments from which to choose the first non-null value | () |
Name | Type | Description |
---|---|---|
Value |
Coalesced expression |
Compute the smallest value among the supplied arguments.
Name | Type | Description | Default |
---|---|---|---|
arg | Any | First argument | required |
args | Any | Remaining arguments | () |
Name | Type | Description |
---|---|---|
Value |
Minimum of the passed arguments |
Compute the largest value among the supplied arguments.
Name | Type | Description | Default |
---|---|---|---|
arg | Any | First argument | required |
args | Any | Remaining arguments | () |
Name | Type | Description |
---|---|---|
Value |
Maximum of the passed arguments |
Construct a ternary conditional expression.
Name | Type | Description | Default |
---|---|---|---|
condition | Any | A boolean expression | required |
true_expr | Any | Expression to return if condition evaluates to True |
required |
false_expr | Any | Expression to return if condition evaluates to False or NULL |
required |
Name | Type | Description |
---|---|---|
Value | ir .Value |
The value of true_expr if condition is True else false_expr |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"condition": [True, False, True, None]})
>>> ibis.ifelse(t.condition, "yes", "no")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ IfElse(condition, 'yes', 'no') ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ string │ ├────────────────────────────────┤ │ yes │ │ no │ │ yes │ │ no │ └────────────────────────────────┘
Create a multi-branch if-else expression.
Equivalent to a SQL CASE
statement.
Name | Type | Description | Default |
---|---|---|---|
branch | tuple[Any, Any] | First (condition , result ) pair. Required. |
required |
branches | tuple[Any, Any] | Additional (condition , result ) pairs. We look through the test values in order and return the result corresponding to the first test value that matches self . If none match, we return else_ . |
() |
else_ | Any | None | Value to return if none of the case conditions evaluate to True . Defaults to NULL . |
None |
Name | Type | Description |
---|---|---|
Value |
A value expression |
>>> import ibis
>>> ibis.options.interactive = True
>>> v = ibis.memtable({"values": [1, 2, 1, 2, 3, 2, 4]}).values
>>> ibis.cases((v == 1, "a"), (v > 2, "b"), else_="unk").name("cases")
┏━━━━━━━━┓ ┃ cases ┃ ┡━━━━━━━━┩ │ string │ ├────────┤ │ a │ │ unk │ │ a │ │ unk │ │ b │ │ unk │ │ b │ └────────┘
>>> ibis.cases(
... (v % 2 == 0, "divisible by 2"),
... (v % 3 == 0, "divisible by 3"),
... (v % 4 == 0, "shadowed by the 2 case"),
... ).name("cases")
┏━━━━━━━━━━━━━━━━┓ ┃ cases ┃ ┡━━━━━━━━━━━━━━━━┩ │ string │ ├────────────────┤ │ NULL │ │ divisible by 2 │ │ NULL │ │ divisible by 2 │ │ divisible by 3 │ │ divisible by 2 │ │ divisible by 2 │ └────────────────┘
Create a ascending sort key from asc
or column name.
Name | Type | Description | Default |
---|---|---|---|
expr | ir .Column | str |
The expression or column name to use for sorting | required |
nulls_first | bool | Bool to indicate whether to put NULL values first or not. | False |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t[["species", "year"]].order_by(ibis.asc("year")).head()
┏━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━┩ │ string │ int64 │ ├─────────┼───────┤ │ Adelie │ 2007 │ │ Adelie │ 2007 │ │ Adelie │ 2007 │ │ Adelie │ 2007 │ │ Adelie │ 2007 │ └─────────┴───────┘
Name | Type | Description |
---|---|---|
ir .ValueExpr |
An expression |
Create a descending sort key from expr
or column name.
Name | Type | Description | Default |
---|---|---|---|
expr | ir .Column | str |
The expression or column name to use for sorting | required |
nulls_first | bool | Bool to indicate whether to put NULL values first or not. | False |
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t[["species", "year"]].order_by(ibis.desc("year")).head()
┏━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━┩ │ string │ int64 │ ├─────────┼───────┤ │ Adelie │ 2009 │ │ Adelie │ 2009 │ │ Adelie │ 2009 │ │ Adelie │ 2009 │ │ Adelie │ 2009 │ └─────────┴───────┘
Name | Type | Description |
---|---|---|
ir .ValueExpr |
An expression |
Return the formatted SQL string for an expression.
Name | Type | Description | Default |
---|---|---|---|
expr | ir .Expr |
Ibis expression. | required |
dialect | str | None | SQL dialect to use for compilation. | None |
pretty | bool | Whether to use pretty formatting. | True |
kwargs | Scalar parameters | {} |
Name | Type | Description |
---|---|---|
str | Formatted SQL string |
You can also specify the SQL dialect to use for compilation: