>>> import ibis
>>> t = ibis.table(dict(a="int", b="string", c="array<int>", abcd="float"))
>>> expr = t.select([t[c] for c in t.columns if t[c].type().is_numeric()])
>>> expr.columns
('a', 'abcd')
selectors
Convenient column selectors.
Column selectors (“selectors”) are convenience functions for selecting columns that satisfy a particular condition, for example all string-typed columns.
A common task is selecting all numeric columns for a subsequent computation.
Without selectors this becomes quite verbose and tedious to write:
('a', 'abcd')
Compare that to the numeric selector:
When there are multiple properties to check it gets worse:
('a', 'b', 'abcd')
Using a composition of selectors this is much less tiresome:
| Name | Description |
|---|---|
| index | Select columns by index. |
| Name | Description |
|---|---|
| across | Apply data transformations across multiple columns. |
| all | Return every column from a table. |
| all_of | Include columns satisfying all of predicates. |
| any_of | Include columns satisfying any of predicates. |
| cols | Select specific column names. |
| contains | Return columns whose name contains needles. |
| endswith | Select columns whose name ends with one of suffixes. |
| first | Return the first column of a table. |
| if_all | Return the conjunction of predicate applied on all selector columns. |
| if_any | Return the disjunction of predicate applied on all selector columns. |
| last | Return the last column of a table. |
| matches | Return columns whose name matches the regular expression regex. |
| none | Return no columns. |
| numeric | Return numeric columns. |
| of_type | Select columns of type dtype. |
| startswith | Select columns whose name starts with one of prefixes. |
| where | Select columns that satisfy predicate. |
Apply data transformations across multiple columns.
| Name | Type | Description | Default |
|---|---|---|---|
| selector | Selector \| Iterable[str] \| str | An expression that selects columns on which the transformation function will be applied, an iterable of str column names or a single str column name. | required |
| func | Deferred \| Callable[[ir.Value], ir.Value] \| Mapping[str \| None, Deferred \| Callable[[ir.Value], ir.Value]] | A function (or dictionary of functions) to use to transform the data. | required |
| names | str \| Callable[[str, str \| None], str] \| None | A lambda function or a format string to name the columns created by the transformation function. | None |
| Name | Type | Description |
|---|---|---|
| | Across | An Across selector object |
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ bill_length_mm ┃ bill_depth_mm ┃ centered_bill_length_mm ┃ centered_bill_depth_mm ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩
│ float64        │ float64       │ float64                 │ float64                │
├────────────────┼───────────────┼─────────────────────────┼────────────────────────┤
│           39.1 │          18.7 │                -4.82193 │                1.54883 │
│           39.5 │          17.4 │                -4.42193 │                0.24883 │
│           40.3 │          18.0 │                -3.62193 │                0.84883 │
│           NULL │          NULL │                    NULL │                   NULL │
│           36.7 │          19.3 │                -7.22193 │                2.14883 │
│           39.3 │          20.6 │                -4.62193 │                3.44883 │
│           38.9 │          17.8 │                -5.02193 │                0.64883 │
│           39.2 │          19.6 │                -4.72193 │                2.44883 │
│           34.1 │          18.1 │                -9.82193 │                0.94883 │
│           42.0 │          20.2 │                -1.92193 │                3.04883 │
│              … │             … │                       … │                      … │
└────────────────┴───────────────┴─────────────────────────┴────────────────────────┘
Return every column from a table.
Include columns satisfying all of predicates.
Include columns satisfying any of predicates.
Select specific column names.
| Name | Type | Description | Default |
|---|---|---|---|
| names | str \| ir.Column | The column names to select | () |
Return columns whose name contains needles.
| Name | Type | Description | Default |
|---|---|---|---|
| needles | str \| tuple[str, …] | One or more strings to search for in column names | required |
| how | Callable[[Iterable[bool]], bool] | A boolean reduction to allow the configuration of how needles are summarized. | builtins.any |
Select columns that contain either "a" or "b"
('a', 'b', 'ab')
Select columns that contain all of "a" and "b", that is, both "a" and "b" must be in each column’s name to match.
Select columns whose name ends with one of suffixes.
| Name | Type | Description | Default |
|---|---|---|---|
| suffixes | str \| tuple[str, …] | Suffixes to compare column names against | required |
Return the first column of a table.
Return the conjunction of predicate applied on all selector columns.
| Name | Type | Description | Default |
|---|---|---|---|
| selector | Selector | A column selector | required |
| predicate | Deferred \| Callable | A callable or deferred object defining a predicate to apply to each column from selector. | required |
>>> import ibis
>>> from ibis import selectors as s, _
>>> ibis.options.interactive = True
>>> penguins = ibis.examples.penguins.fetch()
>>> cols = s.across(s.endswith("_mm"), (_ - _.mean()) / _.std())
>>> expr = penguins.mutate(cols).filter(s.if_all(s.endswith("_mm"), _.abs() > 1))
>>> expr_by_hand = penguins.mutate(cols).filter(
... (_.bill_length_mm.abs() > 1)
... & (_.bill_depth_mm.abs() > 1)
... & (_.flipper_length_mm.abs() > 1)
... )
>>> expr.equals(expr_by_hand)
True
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ float64           │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Dream     │      -1.157951 │      1.088129 │         -1.416272 │        3300 │ female │  2007 │
│ Adelie  │ Torgersen │      -1.231217 │      1.138768 │         -1.202926 │        3900 │ male   │  2008 │
│ Gentoo  │ Biscoe    │       1.149917 │     -1.443781 │          1.214987 │        5700 │ male   │  2007 │
│ Gentoo  │ Biscoe    │       1.040019 │     -1.089314 │          1.072757 │        4750 │ male   │  2008 │
│ Gentoo  │ Biscoe    │       1.131601 │     -1.089314 │          1.712792 │        5000 │ male   │  2008 │
│ Gentoo  │ Biscoe    │       1.241499 │     -1.089314 │          1.570562 │        5550 │ male   │  2008 │
│ Gentoo  │ Biscoe    │       1.351398 │     -1.494420 │          1.214987 │        5300 │ male   │  2009 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
Return the disjunction of predicate applied on all selector columns.
| Name | Type | Description | Default |
|---|---|---|---|
| selector | Selector | A column selector | required |
| predicate | Deferred \| Callable | A callable or deferred object defining a predicate to apply to each column from selector. | required |
>>> import ibis
>>> from ibis import selectors as s, _
>>> ibis.options.interactive = True
>>> penguins = ibis.examples.penguins.fetch().mutate(idx=ibis.row_number().over())
>>> cols = s.across(s.endswith("_mm"), (_ - _.mean()) / _.std())
>>> expr = penguins.mutate(cols).filter(s.if_any(s.endswith("_mm"), _.abs() > 2))
>>> expr_by_hand = penguins.mutate(cols).filter(
... (_.bill_length_mm.abs() > 2)
... | (_.bill_depth_mm.abs() > 2)
... | (_.flipper_length_mm.abs() > 2)
... )
>>> expr.equals(expr_by_hand)
True
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃ idx   ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ float64           │ int64       │ string │ int64 │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼───────┤
│ Adelie  │ Torgersen │      -0.974787 │      2.050255 │         -0.705121 │        3800 │ male   │  2007 │    13 │
│ Adelie  │ Torgersen │       0.380628 │      2.202170 │         -0.491775 │        4200 │ male   │  2007 │    19 │
│ Adelie  │ Biscoe    │      -1.103002 │      0.733662 │         -2.056307 │        3150 │ female │  2007 │    28 │
│ Adelie  │ Dream     │      -0.297079 │      2.050255 │         -0.705121 │        4150 │ male   │  2007 │    49 │
│ Adelie  │ Dream     │      -2.165354 │     -0.836123 │         -0.918466 │        3050 │ female │  2009 │   142 │
│ Gentoo  │ Biscoe    │       0.398944 │     -2.000802 │          0.717181 │        4500 │ female │  2007 │   152 │
│ Gentoo  │ Biscoe    │       1.113285 │     -0.431017 │          2.068368 │        5700 │ male   │  2007 │   153 │
│ Gentoo  │ Biscoe    │      -0.187181 │     -2.051440 │          1.001641 │        5000 │ female │  2007 │   176 │
│ Gentoo  │ Biscoe    │       2.871660 │     -0.076550 │          2.068368 │        6050 │ male   │  2007 │   185 │
│ Gentoo  │ Biscoe    │       1.900890 │     -0.734846 │          2.139483 │        5650 │ male   │  2008 │   215 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴───────┘
Return the last column of a table.
Return columns whose name matches the regular expression regex.
| Name | Type | Description | Default |
|---|---|---|---|
| regex | str \| re.Pattern | A string or re.Pattern object | required |
Return no columns.
s.none() results in an empty expansion.
This can be useful when you want to pivot a table without identifying unique observations.
┏━━━━━━━┳━━━━━━━┳━━━━━━━┓
┃ Blue  ┃ Green ┃ Red   ┃
┡━━━━━━━╇━━━━━━━╇━━━━━━━┩
│ int64 │ int64 │ int64 │
├───────┼───────┼───────┤
│     3 │     1 │     2 │
└───────┴───────┴───────┘
Return numeric columns.
('a', 'b', 'c')
Select columns of type dtype.
| Name | Type | Description | Default |
|---|---|---|---|
| dtype | dt.DataType \| str \| type[dt.DataType] | DataType instance, str or DataType class | required |
Select according to a specific DataType instance
('siblings',)
Strings are also accepted
Abstract/unparametrized types may also be specified by their string name (e.g. “integer” for any integer type), or by passing in a DataType class instead. The following options are equivalent.
True
Select columns whose name starts with one of prefixes.
| Name | Type | Description | Default |
|---|---|---|---|
| prefixes | str \| tuple[str, …] | Prefixes to compare column names against | required |
Select columns that satisfy predicate.
Use this selector when one of the other selectors does not meet your needs.
| Name | Type | Description | Default |
|---|---|---|---|
| predicate | Callable[[ir.Value], bool] | A callable that accepts an ibis value expression and returns a bool | required |