Collection expressions

Arrays, maps and structs.

ArrayValue

ArrayValue(self, arg)

An Array is a variable-length sequence of values of a single type.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> ibis.memtable({"a": [[1, None, 3], [4], [], None]})
┏━━━━━━━━━━━━━━━━━━━┓
┃ a                 ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ array<int64>      │
├───────────────────┤
│ [1, None, ... +1] │
│ [4]               │
│ []                │
│ NULL              │
└───────────────────┘

Methods

Name Description
alls Return whether all elements (ignoring nulls) in the array are true.
anys Return whether any element in the array is true.
concat Concatenate this array with one or more arrays.
contains Return whether the array contains other.
filter Filter array elements using predicate function or Deferred.
flatten Remove one level of nesting from an array expression.
index Return the position of other in an array.
intersect Intersect two arrays.
join Join the elements of this array expression with sep.
length Compute the length of an array.
map Apply a func or Deferred to each element of this array expression.
maxs Return the maximum value in the array.
means Return the mean of the values in the array.
mins Return the minimum value in the array.
remove Remove other from self.
repeat Repeat this array n times.
sort Sort the elements in an array.
sums Return the sum of the values in the array.
union Union two arrays.
unique Return the unique values in an array.
unnest Unnest an array into a column.
zip Zip two or more arrays together.

alls

alls()

Return whether all elements (ignoring nulls) in the array are true.

Returns NULL if the array is empty or contains only NULLs.

See Also

BooleanColumn.all

Returns

Type Description
BooleanValue Whether all elements (ignoring nulls) in the array are true.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
...     {
...         "id": range(8),
...         "arr": [
...             [True, False],
...             [False],
...             [True],
...             [None, False],
...             [None, True],
...             [None],
...             [],
...             None,
...         ],
...     }
... )
>>> t.mutate(x=t.arr.alls()).order_by("id")
┏━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ id    ┃ arr            ┃ x       ┃
┡━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ int64 │ array<boolean> │ boolean │
├───────┼────────────────┼─────────┤
│     0 │ [True, False]  │ False   │
│     1 │ [False]        │ False   │
│     2 │ [True]         │ True    │
│     3 │ [None, False]  │ False   │
│     4 │ [None, True]   │ True    │
│     5 │ [None]         │ NULL    │
│     6 │ []             │ NULL    │
│     7 │ NULL           │ NULL    │
└───────┴────────────────┴─────────┘

anys

anys()

Return whether any element in the array is true.

Returns NULL if the array is empty or contains only NULLs.

See Also

BooleanColumn.any

Returns

Type Description
BooleanValue Whether any element in the array is true.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
...     {
...         "arr": [
...             [True, False],
...             [False],
...             [True],
...             [None, False],
...             [None, True],
...             [None],
...             [],
...             None,
...         ]
...     }
... )
>>> t.mutate(x=t.arr.anys())
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ arr            ┃ x       ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ array<boolean> │ boolean │
├────────────────┼─────────┤
│ [True, False]  │ True    │
│ [False]        │ False   │
│ [True]         │ True    │
│ [None, False]  │ False   │
│ [None, True]   │ True    │
│ [None]         │ NULL    │
│ []             │ NULL    │
│ NULL           │ NULL    │
└────────────────┴─────────┘
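
The NULL rules that anys and alls document can be modeled in plain Python, with None standing in for NULL. This is only a sketch of the semantics shown in the example tables, not how Ibis or any backend implements them; the helper names any_ignoring_nulls and all_ignoring_nulls are made up for illustration.

```python
def _non_null(arr):
    """Return the non-NULL elements of arr, or None if there are none."""
    if arr is None:
        return None
    kept = [x for x in arr if x is not None]
    # An empty list means "empty array" or "only NULLs": both give NULL.
    return kept or None


def any_ignoring_nulls(arr):
    """Model of anys(): NULL elements are skipped before reducing."""
    kept = _non_null(arr)
    return None if kept is None else any(kept)


def all_ignoring_nulls(arr):
    """Model of alls(): NULL elements are skipped before reducing."""
    kept = _non_null(arr)
    return None if kept is None else all(kept)


rows = [[True, False], [False], [True], [None, False], [None, True], [None], [], None]
any_results = [any_ignoring_nulls(r) for r in rows]
all_results = [all_ignoring_nulls(r) for r in rows]
```

The two result lists line up row by row with the x columns in the anys and alls examples.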

concat

concat(other, *args)

Concatenate this array with one or more arrays.

Parameters

Name Type Description Default
other ArrayValue Other array to concat with self required
args ArrayValue Other arrays to concat with self ()

Returns

Type Description
ArrayValue self concatenated with other and args

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [[7], [3], None]})
>>> t
┏━━━━━━━━━━━━━━┓
┃ a            ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [7]          │
│ [3]          │
│ NULL         │
└──────────────┘
>>> t.a.concat(t.a)
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayConcat((a, a)) ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>        │
├─────────────────────┤
│ [7, 7]              │
│ [3, 3]              │
│ NULL                │
└─────────────────────┘
>>> t.a.concat(ibis.literal([4], type="array<int64>"))
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayConcat((a, (4,))) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>           │
├────────────────────────┤
│ [7, 4]                 │
│ [3, 4]                 │
│ [4]                    │
└────────────────────────┘

concat is also available using the + operator:

>>> [1] + t.a
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayConcat(((1,), a)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>           │
├────────────────────────┤
│ [1, 7]                 │
│ [1, 3]                 │
│ [1]                    │
└────────────────────────┘
>>> t.a + [1]
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayConcat((a, (1,))) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>           │
├────────────────────────┤
│ [7, 1]                 │
│ [3, 1]                 │
│ [1]                    │
└────────────────────────┘

contains

contains(other)

Return whether the array contains other.

Parameters

Name Type Description Default
other ir.Value Ibis expression to check for existence of in self required

Returns

Type Description
BooleanValue Whether other is contained in self

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[1], [], [42, 42], None]})
>>> t
┏━━━━━━━━━━━━━━┓
┃ arr          ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [1]          │
│ []           │
│ [42, 42]     │
│ NULL         │
└──────────────┘
>>> t.arr.contains(42)
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayContains(arr, 42) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean                │
├────────────────────────┤
│ False                  │
│ False                  │
│ True                   │
│ NULL                   │
└────────────────────────┘
>>> t.arr.contains(None)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayContains(arr, None) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean                  │
├──────────────────────────┤
│ NULL                     │
│ NULL                     │
│ NULL                     │
│ NULL                     │
└──────────────────────────┘

filter

filter(predicate)

Filter array elements using predicate function or Deferred.

Parameters

Name Type Description Default
predicate Deferred | Callable[[ir.Value], bool | ir.BooleanValue] Function or Deferred to use to filter array elements required

Returns

Type Description
ArrayValue Array elements filtered using predicate

Examples

>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [[1, None, 2], [4], []]})
>>> t
┏━━━━━━━━━━━━━━━━━━━┓
┃ a                 ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ array<int64>      │
├───────────────────┤
│ [1, None, ... +1] │
│ [4]               │
│ []                │
└───────────────────┘

The most succinct way to use filter is with Deferred expressions:

>>> t.a.filter(_ > 1)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayFilter(a, Greater(_, 1)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>                  │
├───────────────────────────────┤
│ [2]                           │
│ [4]                           │
│ []                            │
└───────────────────────────────┘

You can also use filter with a lambda function:

>>> t.a.filter(lambda x: x > 1)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayFilter(a, Greater(x, 1)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>                  │
├───────────────────────────────┤
│ [2]                           │
│ [4]                           │
│ []                            │
└───────────────────────────────┘

.filter() also supports more complex callables like functools.partial and lambdas with closures:

>>> from functools import partial
>>> def gt(x, y):
...     return x > y
>>> gt1 = partial(gt, y=1)
>>> t.a.filter(gt1)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayFilter(a, Greater(x, 1)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>                  │
├───────────────────────────────┤
│ [2]                           │
│ [4]                           │
│ []                            │
└───────────────────────────────┘
>>> y = 1
>>> t.a.filter(lambda x: x > y)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayFilter(a, Greater(x, 1)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>                  │
├───────────────────────────────┤
│ [2]                           │
│ [4]                           │
│ []                            │
└───────────────────────────────┘
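
Why the None element of [1, None, 2] disappears in every result above can be approximated in plain Python. The sketch assumes SQL-style comparison, where comparing against NULL is never true, so it only models comparison predicates; filter_array is a hypothetical helper, not an Ibis function.

```python
from functools import partial


def filter_array(arr, predicate):
    """Keep the elements for which predicate is true; a NULL array stays NULL."""
    if arr is None:
        return None
    # Assumption: a comparison against NULL is not true, so None is dropped.
    return [x for x in arr if x is not None and predicate(x)]


def gt(x, y):
    return x > y


column = [[1, None, 2], [4], []]
with_lambda = [filter_array(a, lambda x: x > 1) for a in column]
with_partial = [filter_array(a, partial(gt, y=1)) for a in column]
```

Both calls give the same rows, mirroring how the lambda and functools.partial forms produce identical ArrayFilter results.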

flatten

flatten()

Remove one level of nesting from an array expression.

Returns

Type Description
ArrayValue Flattened array expression

Examples

>>> import ibis
>>> import ibis.selectors as s
>>> from ibis import _
>>> ibis.options.interactive = True
>>> schema = {
...     "empty": "array<array<int>>",
...     "happy": "array<array<string>>",
...     "nulls_only": "array<array<struct<a: array<string>>>>",
...     "mixed_nulls": "array<array<string>>",
... }
>>> data = {
...     "empty": [[], [], []],
...     "happy": [[["abc"]], [["bcd"]], [["def"]]],
...     "nulls_only": [None, None, None],
...     "mixed_nulls": [[], None, [None]],
... }
>>> import pyarrow as pa
>>> t = ibis.memtable(
...     pa.Table.from_pydict(
...         data,
...         schema=ibis.schema(schema).to_pyarrow(),
...     )
... )
>>> t
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ empty               ┃ happy                ┃ nulls_only ┃ mixed_nulls          ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ array<array<int64>> │ array<array<string>> │ array<arr… │ array<array<string>> │
├─────────────────────┼──────────────────────┼────────────┼──────────────────────┤
│ []                  │ [[...]]              │ NULL       │ []                   │
│ []                  │ [[...]]              │ NULL       │ NULL                 │
│ []                  │ [[...]]              │ NULL       │ [None]               │
└─────────────────────┴──────────────────────┴────────────┴──────────────────────┘
>>> t.empty.flatten()
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayFlatten(empty) ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>        │
├─────────────────────┤
│ []                  │
│ []                  │
│ []                  │
└─────────────────────┘
>>> t.happy.flatten()
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayFlatten(happy) ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ array<string>       │
├─────────────────────┤
│ ['abc']             │
│ ['bcd']             │
│ ['def']             │
└─────────────────────┘
>>> t.nulls_only.flatten()
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayFlatten(nulls_only) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<struct<a: array<s… │
├──────────────────────────┤
│ NULL                     │
│ NULL                     │
│ NULL                     │
└──────────────────────────┘
>>> t.mixed_nulls.flatten()
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayFlatten(mixed_nulls) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<string>             │
├───────────────────────────┤
│ []                        │
│ NULL                      │
│ []                        │
└───────────────────────────┘
>>> t.select(s.across(s.all(), _.flatten()))
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ empty        ┃ happy         ┃ nulls_only ┃ mixed_nulls   ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ array<int64> │ array<string> │ array<str… │ array<string> │
├──────────────┼───────────────┼────────────┼───────────────┤
│ []           │ ['abc']       │ NULL       │ []            │
│ []           │ ['bcd']       │ NULL       │ NULL          │
│ []           │ ['def']       │ NULL       │ []            │
└──────────────┴───────────────┴────────────┴───────────────┘
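
The rows above suggest three rules: a NULL outer array stays NULL, NULL inner arrays contribute nothing, and exactly one level of nesting is removed. A plain-Python model of those rules, with None standing in for NULL, might look like the following; flatten_once is a hypothetical name, and the sketch describes the example output rather than any backend's implementation.

```python
def flatten_once(arr):
    """Remove one level of nesting; None models NULL."""
    if arr is None:
        return None
    # NULL inner arrays are skipped, which is why [None] flattens to [].
    return [x for inner in arr if inner is not None for x in inner]


column = [[], None, [None], [["abc"]], [["a"], ["b", "c"]]]
flattened = [flatten_once(a) for a in column]
```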

index

index(other)

Return the position of other in an array.

Parameters

Name Type Description Default
other ir.Value Ibis expression to check for existence of in self required

Returns

Type Description
IntegerValue The position of other in self

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[1], [], [42, 42], None]})
>>> t
┏━━━━━━━━━━━━━━┓
┃ arr          ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [1]          │
│ []           │
│ [42, 42]     │
│ NULL         │
└──────────────┘
>>> t.arr.index(42)
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayPosition(arr, 42) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ int64                  │
├────────────────────────┤
│                     -1 │
│                     -1 │
│                      0 │
│                   NULL │
└────────────────────────┘
>>> t.arr.index(800)
┏━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayPosition(arr, 800) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ int64                   │
├─────────────────────────┤
│                      -1 │
│                      -1 │
│                      -1 │
│                    NULL │
└─────────────────────────┘
>>> t.arr.index(None)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayPosition(arr, None) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ int64                    │
├──────────────────────────┤
│                     NULL │
│                     NULL │
│                     NULL │
│                     NULL │
└──────────────────────────┘
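
The three examples show the conventions index follows: positions are 0-based, a missing element gives -1, and a NULL array or a NULL search value gives NULL. A minimal plain-Python model of those conventions, assuming None stands in for NULL and using the hypothetical helper name array_index:

```python
def array_index(arr, other):
    """0-based position of other in arr, -1 if absent, None for NULL inputs."""
    if arr is None or other is None:
        return None
    try:
        # list.index returns the position of the first match.
        return arr.index(other)
    except ValueError:
        return -1


column = [[1], [], [42, 42], None]
positions_42 = [array_index(a, 42) for a in column]
positions_800 = [array_index(a, 800) for a in column]
positions_null = [array_index(a, None) for a in column]
```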

intersect

intersect(other)

Intersect two arrays.

Parameters

Name Type Description Default
other ArrayValue Another array to intersect with self required

Returns

Type Description
ArrayValue Intersected arrays

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr1": [[3, 2], [], None], "arr2": [[1, 3], [None], [5]]})
>>> t
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ arr1         ┃ arr2         ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ array<int64> │ array<int64> │
├──────────────┼──────────────┤
│ [3, 2]       │ [1, 3]       │
│ []           │ [None]       │
│ NULL         │ [5]          │
└──────────────┴──────────────┘
>>> t.arr1.intersect(t.arr2)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayIntersect(arr1, arr2) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>               │
├────────────────────────────┤
│ [3]                        │
│ []                         │
│ NULL                       │
└────────────────────────────┘

join

join(sep)

Join the elements of this array expression with sep.

Parameters

Name Type Description Default
sep str | ir.StringValue Separator to use for joining array elements required

Returns

Type Description
StringValue Elements of self joined with sep

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [["a", "b", "c"], None, [], ["b", None]]})
>>> t
┏━━━━━━━━━━━━━━━━━━━━┓
┃ arr                ┃
┡━━━━━━━━━━━━━━━━━━━━┩
│ array<string>      │
├────────────────────┤
│ ['a', 'b', ... +1] │
│ NULL               │
│ []                 │
│ ['b', None]        │
└────────────────────┘
>>> t.arr.join("|")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayStringJoin(arr, '|') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string                    │
├───────────────────────────┤
│ a|b|c                     │
│ NULL                      │
│ NULL                      │
│ b                         │
└───────────────────────────┘

See Also

StringValue.join
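
The join example has two behaviors that differ from Python's str.join: NULL elements are skipped, and both NULL and empty arrays give NULL. The sketch below models only the four rows shown in that example, with None standing in for NULL; join_array is a hypothetical helper, and other inputs (or other backends) may behave differently.

```python
def join_array(arr, sep):
    """Join non-NULL elements with sep; NULL or empty arrays give None."""
    if arr is None or len(arr) == 0:
        return None
    # NULL elements are skipped rather than rendered.
    return sep.join(x for x in arr if x is not None)


column = [["a", "b", "c"], None, [], ["b", None]]
joined = [join_array(a, "|") for a in column]
```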

length

length()

Compute the length of an array.

Returns

Type Description
IntegerValue The length of each array in self

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [[7, 42], [3], None]})
>>> t
┏━━━━━━━━━━━━━━┓
┃ a            ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [7, 42]      │
│ [3]          │
│ NULL         │
└──────────────┘
>>> t.a.length()
┏━━━━━━━━━━━━━━━━┓
┃ ArrayLength(a) ┃
┡━━━━━━━━━━━━━━━━┩
│ int64          │
├────────────────┤
│              2 │
│              1 │
│           NULL │
└────────────────┘

map

map(func)

Apply a func or Deferred to each element of this array expression.

Parameters

Name Type Description Default
func Deferred | Callable[[ir.Value], ir.Value] Function or Deferred to apply to each element of this array. required

Returns

Type Description
ArrayValue func applied to every element of this array expression.

Examples

>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [[1, None, 2], [4], []]})
>>> t
┏━━━━━━━━━━━━━━━━━━━┓
┃ a                 ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ array<int64>      │
├───────────────────┤
│ [1, None, ... +1] │
│ [4]               │
│ []                │
└───────────────────┘

The most succinct way to use map is with Deferred expressions:

>>> t.a.map((_ + 100).cast("float"))
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayMap(a, Cast(Add(_, 100), float64)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<float64>                          │
├─────────────────────────────────────────┤
│ [101.0, None, ... +1]                   │
│ [104.0]                                 │
│ []                                      │
└─────────────────────────────────────────┘

You can also use map with a lambda function:

>>> t.a.map(lambda x: (x + 100).cast("float"))
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayMap(a, Cast(Add(x, 100), float64)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<float64>                          │
├─────────────────────────────────────────┤
│ [101.0, None, ... +1]                   │
│ [104.0]                                 │
│ []                                      │
└─────────────────────────────────────────┘

.map() also supports more complex callables like functools.partial and lambdas with closures:

>>> from functools import partial
>>> def add(x, y):
...     return x + y
>>> add2 = partial(add, y=2)
>>> t.a.map(add2)
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayMap(a, Add(x, 2)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>           │
├────────────────────────┤
│ [3, None, ... +1]      │
│ [6]                    │
│ []                     │
└────────────────────────┘
>>> y = 2
>>> t.a.map(lambda x: x + y)
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayMap(a, Add(x, 2)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>           │
├────────────────────────┤
│ [3, None, ... +1]      │
│ [6]                    │
│ []                     │
└────────────────────────┘

maxs

maxs()

Return the maximum value in the array.

Returns NULL if the array is empty or contains only NULLs.

See Also

Column.max

Returns

Type Description
Value Maximum value in the array

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[1, 2, 3], [None, 6], [None], [], None]})
>>> t.mutate(x=t.arr.maxs())
┏━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ arr            ┃ x     ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ array<int64>   │ int64 │
├────────────────┼───────┤
│ [1, 2, ... +1] │     3 │
│ [None, 6]      │     6 │
│ [None]         │  NULL │
│ []             │  NULL │
│ NULL           │  NULL │
└────────────────┴───────┘

means

means()

Return the mean of the values in the array.

Returns NULL if the array is empty or contains only NULLs.

See Also

NumericColumn.mean

Returns

Type Description
Value Mean of the values in the array

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[1, 2, 3], [None, 6], [None], [], None]})
>>> t.mutate(x=t.arr.means())
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ arr            ┃ x       ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ array<int64>   │ float64 │
├────────────────┼─────────┤
│ [1, 2, ... +1] │     2.0 │
│ [None, 6]      │     6.0 │
│ [None]         │    NULL │
│ []             │    NULL │
│ NULL           │    NULL │
└────────────────┴─────────┘

mins

mins()

Return the minimum value in the array.

Returns NULL if the array is empty or contains only NULLs.

See Also

Column.min

Returns

Type Description
Value Minimum value in the array

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[1, 2, 3], [None, 6], [None], [], None]})
>>> t.mutate(x=t.arr.mins())
┏━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ arr            ┃ x     ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ array<int64>   │ int64 │
├────────────────┼───────┤
│ [1, 2, ... +1] │     1 │
│ [None, 6]      │     6 │
│ [None]         │  NULL │
│ []             │  NULL │
│ NULL           │  NULL │
└────────────────┴───────┘

remove

remove(other)

Remove other from self.

Parameters

Name Type Description Default
other ir.Value Element to remove from self. required

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[3, 2], [], [42, 2], [2, 2], None]})
>>> t
┏━━━━━━━━━━━━━━┓
┃ arr          ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [3, 2]       │
│ []           │
│ [42, 2]      │
│ [2, 2]       │
│ NULL         │
└──────────────┘
>>> t.arr.remove(2)
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayRemove(arr, 2) ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>        │
├─────────────────────┤
│ [3]                 │
│ []                  │
│ [42]                │
│ []                  │
│ NULL                │
└─────────────────────┘

repeat

repeat(n)

Repeat this array n times.

Parameters

Name Type Description Default
n int | ir.IntegerValue Number of times to repeat self. required

Returns

Type Description
ArrayValue self repeated n times

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [[7], [3], None]})
>>> t
┏━━━━━━━━━━━━━━┓
┃ a            ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [7]          │
│ [3]          │
│ NULL         │
└──────────────┘
>>> t.a.repeat(2)
┏━━━━━━━━━━━━━━━━━━━┓
┃ ArrayRepeat(a, 2) ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ array<int64>      │
├───────────────────┤
│ [7, 7]            │
│ [3, 3]            │
│ []                │
└───────────────────┘

repeat is also available using the * operator:

>>> 2 * t.a
┏━━━━━━━━━━━━━━━━━━━┓
┃ ArrayRepeat(a, 2) ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ array<int64>      │
├───────────────────┤
│ [7, 7]            │
│ [3, 3]            │
│ []                │
└───────────────────┘

sort

sort()

Sort the elements in an array.

Returns

Type Description
ArrayValue Sorted values in an array

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[3, 2], [], [42, 42], None]})
>>> t
┏━━━━━━━━━━━━━━┓
┃ arr          ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [3, 2]       │
│ []           │
│ [42, 42]     │
│ NULL         │
└──────────────┘
>>> t.arr.sort()
┏━━━━━━━━━━━━━━━━┓
┃ ArraySort(arr) ┃
┡━━━━━━━━━━━━━━━━┩
│ array<int64>   │
├────────────────┤
│ [2, 3]         │
│ []             │
│ [42, 42]       │
│ NULL           │
└────────────────┘

sums

sums()

Return the sum of the values in the array.

Returns NULL if the array is empty or contains only NULLs.

See Also

NumericColumn.sum

Returns

Type Description
Value Sum of the values in the array

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[1, 2, 3], [None, 6], [None], [], None]})
>>> t.mutate(x=t.arr.sums())
┏━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ arr            ┃ x     ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ array<int64>   │ int64 │
├────────────────┼───────┤
│ [1, 2, ... +1] │     6 │
│ [None, 6]      │     6 │
│ [None]         │  NULL │
│ []             │  NULL │
│ NULL           │  NULL │
└────────────────┴───────┘
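
sums, means, maxs and mins share one rule: NULL elements are ignored, and an array with nothing left to reduce gives NULL. The rule can be written once in plain Python and reused with different reducers. This is a model of the documented behavior, not the Ibis implementation; reduce_ignoring_nulls is a hypothetical helper and None stands in for NULL.

```python
from statistics import fmean


def reduce_ignoring_nulls(arr, reducer):
    """Apply reducer to the non-NULL elements; None if there are none."""
    if arr is None:
        return None
    kept = [x for x in arr if x is not None]
    return reducer(kept) if kept else None


column = [[1, 2, 3], [None, 6], [None], [], None]
sums = [reduce_ignoring_nulls(a, sum) for a in column]
means = [reduce_ignoring_nulls(a, fmean) for a in column]
maxs = [reduce_ignoring_nulls(a, max) for a in column]
mins = [reduce_ignoring_nulls(a, min) for a in column]
```

Each list corresponds to the x column of the matching example table.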

union

union(other)

Union two arrays.

Parameters

Name Type Description Default
other ir.ArrayValue Another array to union with self required

Returns

Type Description
ArrayValue Unioned arrays

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr1": [[3, 2], [], None], "arr2": [[1, 3], [None], [5]]})
>>> t
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ arr1         ┃ arr2         ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ array<int64> │ array<int64> │
├──────────────┼──────────────┤
│ [3, 2]       │ [1, 3]       │
│ []           │ [None]       │
│ NULL         │ [5]          │
└──────────────┴──────────────┘
>>> t.arr1.union(t.arr2)
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayUnion(arr1, arr2) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>           │
├────────────────────────┤
│ [1, 2, ... +1]         │
│ [None]                 │
│ [5]                    │
└────────────────────────┘
>>> t.arr1.union(t.arr2).contains(3)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayContains(ArrayUnion(arr1, arr2), 3) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean                                  │
├──────────────────────────────────────────┤
│ True                                     │
│ False                                    │
│ False                                    │
└──────────────────────────────────────────┘

unique

unique()

Return the unique values in an array.

Element ordering in array may not be retained.

Returns

Type Description
ArrayValue Unique values in an array

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"arr": [[1, 3, 3], [], [42, 42, None], None]})
>>> t.arr.unique()
┏━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayDistinct(arr) ┃
┡━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>       │
├────────────────────┤
│ [3, 1]             │
│ []                 │
│ [42, None]         │
│ NULL               │
└────────────────────┘

unnest

unnest()

Unnest an array into a column.

Empty arrays and NULLs are dropped in the output.

To preserve empty arrays as NULLs as well as existing NULL values, use Table.unnest.

Returns

Type Description
ir.Value Unnested array

See Also

Table.unnest

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [[7, 42], [3, 3], None]})
>>> t
┏━━━━━━━━━━━━━━┓
┃ a            ┃
┡━━━━━━━━━━━━━━┩
│ array<int64> │
├──────────────┤
│ [7, 42]      │
│ [3, 3]       │
│ NULL         │
└──────────────┘
>>> t.a.unnest()
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     7 │
│    42 │
│     3 │
│     3 │
└───────┘
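
The difference between dropping and preserving empty or NULL arrays can be illustrated with a small plain-Python model. The keep_empty flag below belongs to this sketch only and stands in for the Table.unnest behavior described above; unnest_column is a hypothetical helper and None models NULL.

```python
def unnest_column(column, keep_empty=False):
    """Expand each array into one output row per element."""
    out = []
    for arr in column:
        if arr:
            # Non-empty, non-NULL array: one row per element.
            out.extend(arr)
        elif keep_empty:
            # Empty or NULL array: preserved as a single NULL row.
            out.append(None)
    return out


column = [[7, 42], [3, 3], [], None]
dropped = unnest_column(column)
preserved = unnest_column(column, keep_empty=True)
```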

zip

zip(other, *others)

Zip two or more arrays together.

Parameters

Name Type Description Default
other ArrayValue Another array to zip with self required
others ArrayValue Additional arrays to zip with self ()

Returns

Type Description
ArrayValue Array of structs where each struct field is an element of each input array. The fields are named f1, f2, f3, etc.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> ibis.options.repr.interactive.max_depth = 2
>>> t = ibis.memtable(
...     {
...         "numbers": [[3, 2], [6, 7], [], None],
...         "strings": [["a", "c"], ["d"], [], ["x", "y"]],
...     }
... )
>>> t
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ numbers      ┃ strings       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ array<int64> │ array<string> │
├──────────────┼───────────────┤
│ [3, 2]       │ ['a', 'c']    │
│ [6, 7]       │ ['d']         │
│ []           │ []            │
│ NULL         │ ['x', 'y']    │
└──────────────┴───────────────┘
>>> t.numbers.zip(t.strings)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayZip((numbers, strings))                  ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<struct<f1: int64, f2: string>>          │
├───────────────────────────────────────────────┤
│ [{'f1': 3, 'f2': 'a'}, {'f1': 2, 'f2': 'c'}]  │
│ [{'f1': 6, 'f2': 'd'}, {'f1': 7, 'f2': None}] │
│ []                                            │
│ NULL                                          │
└───────────────────────────────────────────────┘
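
Unlike Python's built-in zip, the example pads the shorter array with NULL instead of truncating, and a NULL input array makes the whole result NULL. A plain-Python model of that output, using itertools.zip_longest and the f1, f2, ... field names from the Returns description (zip_arrays is a hypothetical helper; None stands in for NULL):

```python
from itertools import zip_longest


def zip_arrays(*arrays):
    """Zip arrays into a list of dicts with fields f1, f2, ...; pad with None."""
    if any(a is None for a in arrays):
        return None
    # zip_longest fills missing positions with None by default.
    return [
        {f"f{i}": value for i, value in enumerate(values, start=1)}
        for values in zip_longest(*arrays)
    ]


numbers = [[3, 2], [6, 7], [], None]
strings = [["a", "c"], ["d"], [], ["x", "y"]]
zipped = [zip_arrays(n, s) for n, s in zip(numbers, strings)]
```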

MapValue

MapValue(self, arg)

A dict-like collection with fixed-type keys and values.

Maps are similar to a Python dictionary, with the restriction that all keys must have the same type, and all values must have the same type.

The key type and the value type can be different.

For example, the keys can be strings while the values are int64s.

Keys are unique within a given map value.

Maps can be constructed with ibis.map().

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> import pyarrow as pa
>>> tab = pa.table(
...     {
...         "m": pa.array(
...             [[("a", 1), ("b", 2)], [("a", 1)], None],
...             type=pa.map_(pa.utf8(), pa.int64()),
...         )
...     }
... )
>>> t = ibis.memtable(tab)
>>> t
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ m                   ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ map<!string, int64> │
├─────────────────────┤
│ {'a': 1, 'b': 2}    │
│ {'a': 1}            │
│ NULL                │
└─────────────────────┘

Use [] to access values:

>>> t.m["a"]
┏━━━━━━━━━━━━━━━━━━━━━━┓
┃ MapGet(m, 'a', None) ┃
┡━━━━━━━━━━━━━━━━━━━━━━┩
│ int64                │
├──────────────────────┤
│                    1 │
│                    1 │
│                 NULL │
└──────────────────────┘

To provide default values, use get:

>>> t.m.get("b", 0)
┏━━━━━━━━━━━━━━━━━━━┓
┃ MapGet(m, 'b', 0) ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ int64             │
├───────────────────┤
│                 2 │
│                 0 │
│              NULL │
└───────────────────┘

Methods

Name Description
contains Return whether the map contains key.
get Return the value for key from expr.
keys Extract the keys of a map.
length Return the number of key-value pairs in the map.
values Extract the values of a map.

contains

contains(key)

Return whether the map contains key.

Parameters

Name Type Description Default
key int | str | ir.IntegerValue | ir.StringValue Mapping key for which to check required

Returns

Type Description
BooleanValue Boolean indicating the presence of key in the map expression

Examples

>>> import ibis
>>> import pyarrow as pa
>>> ibis.options.interactive = True
>>> tab = pa.table(
...     {
...         "m": pa.array(
...             [[("a", 1), ("b", 2)], [("a", 1)], None],
...             type=pa.map_(pa.utf8(), pa.int64()),
...         )
...     }
... )
>>> t = ibis.memtable(tab)
>>> t
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ m                   ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ map<!string, int64> │
├─────────────────────┤
│ {'a': 1, 'b': 2}    │
│ {'a': 1}            │
│ NULL                │
└─────────────────────┘
>>> t.m.contains("b")
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ MapContains(m, 'b') ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ boolean             │
├─────────────────────┤
│ True                │
│ False               │
│ NULL                │
└─────────────────────┘

get

get(key, default=None)

Return the value for key from expr.

Return default if key is not in the map.

Parameters

Name Type Description Default
key ir.Value Expression to use for key required
default ir.Value | None Expression to return if key is not a key in expr None

Returns

Type Description
Value The value stored under key, or default if key is not in the map

Examples

>>> import ibis
>>> import pyarrow as pa
>>> ibis.options.interactive = True
>>> tab = pa.table(
...     {
...         "m": pa.array(
...             [[("a", 1), ("b", 2)], [("a", 1)], None],
...             type=pa.map_(pa.utf8(), pa.int64()),
...         )
...     }
... )
>>> t = ibis.memtable(tab)
>>> t
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ m                   ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ map<!string, int64> │
├─────────────────────┤
│ {'a': 1, 'b': 2}    │
│ {'a': 1}            │
│ NULL                │
└─────────────────────┘
>>> t.m.get("a")
┏━━━━━━━━━━━━━━━━━━━━━━┓
┃ MapGet(m, 'a', None) ┃
┡━━━━━━━━━━━━━━━━━━━━━━┩
│ int64                │
├──────────────────────┤
│                    1 │
│                    1 │
│                 NULL │
└──────────────────────┘
>>> t.m.get("b")
┏━━━━━━━━━━━━━━━━━━━━━━┓
┃ MapGet(m, 'b', None) ┃
┡━━━━━━━━━━━━━━━━━━━━━━┩
│ int64                │
├──────────────────────┤
│                    2 │
│                 NULL │
│                 NULL │
└──────────────────────┘
>>> t.m.get("b", 0)
┏━━━━━━━━━━━━━━━━━━━┓
┃ MapGet(m, 'b', 0) ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ int64             │
├───────────────────┤
│                 2 │
│                 0 │
│              NULL │
└───────────────────┘
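
The examples show that get behaves much like dict.get, with one addition: a NULL map gives NULL even when a default is supplied. A plain-Python model, with None standing in for NULL and map_get as a hypothetical helper name:

```python
def map_get(m, key, default=None):
    """Look up key in m; a NULL map gives None regardless of default."""
    if m is None:
        return None
    return m.get(key, default)


column = [{"a": 1, "b": 2}, {"a": 1}, None]
get_a = [map_get(m, "a") for m in column]
get_b = [map_get(m, "b") for m in column]
get_b_default = [map_get(m, "b", 0) for m in column]
```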

keys

keys()

Extract the keys of a map.

Returns

Type Description
ArrayValue The keys of self

Examples

>>> import ibis
>>> import pyarrow as pa
>>> ibis.options.interactive = True
>>> tab = pa.table(
...     {
...         "m": pa.array(
...             [[("a", 1), ("b", 2)], [("a", 1)], None],
...             type=pa.map_(pa.utf8(), pa.int64()),
...         )
...     }
... )
>>> t = ibis.memtable(tab)
>>> t
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ m                   ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ map<!string, int64> │
├─────────────────────┤
│ {'a': 1, 'b': 2}    │
│ {'a': 1}            │
│ NULL                │
└─────────────────────┘
>>> t.m.keys()
┏━━━━━━━━━━━━━━━━┓
┃ MapKeys(m)     ┃
┡━━━━━━━━━━━━━━━━┩
│ array<!string> │
├────────────────┤
│ ['a', 'b']     │
│ ['a']          │
│ NULL           │
└────────────────┘

length

length()

Return the number of key-value pairs in the map.

Returns

Type Description
IntegerValue The number of key-value pairs in self

Examples

>>> import ibis
>>> import pyarrow as pa
>>> ibis.options.interactive = True
>>> tab = pa.table(
...     {
...         "m": pa.array(
...             [[("a", 1), ("b", 2)], [("a", 1)], None],
...             type=pa.map_(pa.utf8(), pa.int64()),
...         )
...     }
... )
>>> t = ibis.memtable(tab)
>>> t
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ m                   ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ map<!string, int64> │
├─────────────────────┤
│ {'a': 1, 'b': 2}    │
│ {'a': 1}            │
│ NULL                │
└─────────────────────┘
>>> t.m.length()
┏━━━━━━━━━━━━━━┓
┃ MapLength(m) ┃
┡━━━━━━━━━━━━━━┩
│ int64        │
├──────────────┤
│            2 │
│            1 │
│         NULL │
└──────────────┘

values

values()

Extract the values of a map.

Returns

Type Description
ArrayValue The values of self

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> m = ibis.map({"a": 1, "b": 2})
>>> m.values()

┌────────┐
│ [1, 2] │
└────────┘

StructValue

StructValue(self, arg)

A Struct is a nested type with ordered fields of any type.

For example, a Struct might have a field a of type int64 and a field b of type string.

Structs can be constructed with ibis.struct().

Examples

Construct a Struct column with fields a: int64 and b: string

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": [{"a": 1, "b": "foo"}, {"a": 3, "b": None}, None]})
>>> t
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ s                           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ struct<a: int64, b: string> │
├─────────────────────────────┤
│ {'a': 1, 'b': 'foo'}        │
│ {'a': 3, 'b': None}         │
│ NULL                        │
└─────────────────────────────┘

You can use dot notation (.) or square-bracket syntax ([]) to access struct column fields

>>> t.s.a
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     3 │
│  NULL │
└───────┘
>>> t.s["a"]
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     3 │
│  NULL │
└───────┘

Attributes

Name Description
fields Return a mapping from field name to field type of the struct.
names Return the field names of the struct.
types Return the field types of the struct.

Methods

Name Description
destructure Destructure a StructValue into the corresponding struct fields.
lift Project the fields of self into a table.

destructure

destructure()

Destructure a StructValue into the corresponding struct fields.

When assigned, the destructured value is unpacked into multiple columns, one per struct field.

Returns

Type Description
list[AnyValue] Value expressions corresponding to the struct fields.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": [{"a": 1, "b": "foo"}, {"a": 3, "b": None}, None]})
>>> t
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ s                           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ struct<a: int64, b: string> │
├─────────────────────────────┤
│ {'a': 1, 'b': 'foo'}        │
│ {'a': 3, 'b': None}         │
│ NULL                        │
└─────────────────────────────┘
>>> a, b = t.s.destructure()
>>> a
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     3 │
│  NULL │
└───────┘
>>> b
┏━━━━━━━━┓
┃ b      ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ foo    │
│ NULL   │
│ NULL   │
└────────┘

lift

lift()

Project the fields of self into a table.

This method is useful when analyzing data that has deeply nested structs or arrays of structs. lift can be chained to avoid repeating column names and table references.

Returns

Type Description
Table A projection with this struct expression’s fields.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
...     {
...         "pos": [
...             {"lat": 10.1, "lon": 30.3},
...             {"lat": 10.2, "lon": 30.2},
...             {"lat": 10.3, "lon": 30.1},
...         ]
...     }
... )
>>> t
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ pos                                ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ struct<lat: float64, lon: float64> │
├────────────────────────────────────┤
│ {'lat': 10.1, 'lon': 30.3}         │
│ {'lat': 10.2, 'lon': 30.2}         │
│ {'lat': 10.3, 'lon': 30.1}         │
└────────────────────────────────────┘
>>> t.pos.lift()
┏━━━━━━━━━┳━━━━━━━━━┓
┃ lat     ┃ lon     ┃
┡━━━━━━━━━╇━━━━━━━━━┩
│ float64 │ float64 │
├─────────┼─────────┤
│    10.1 │    30.3 │
│    10.2 │    30.2 │
│    10.3 │    30.1 │
└─────────┴─────────┘

See Also

Table.unpack

array

ibis.array(values)

Create an array expression.

If any values are column expressions the result will be a column. Otherwise the result will be a scalar.

Parameters

Name Type Description Default
values Iterable[V] An iterable of Ibis expressions or Python literals required

Returns

Type Description
ArrayValue An array column if any of the values are column expressions, otherwise an array scalar.

Examples

Create an array scalar from scalar values

>>> import ibis
>>> ibis.options.interactive = True
>>> ibis.array([1.0, None])

┌─────────────┐
│ [1.0, None] │
└─────────────┘

Create an array from column and scalar expressions

>>> t = ibis.memtable({"a": [1, 2, 3], "b": [4, 5, 6]})
>>> ibis.array([t.a, 42, ibis.literal(None)])
┏━━━━━━━━━━━━━━━━━━━━━━┓
┃ Array((a, 42, None)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>         │
├──────────────────────┤
│ [1, 42, ... +1]      │
│ [2, 42, ... +1]      │
│ [3, 42, ... +1]      │
└──────────────────────┘
>>> ibis.array([t.a, 42 + ibis.literal(5)])
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Array((a, Add(5, 42))) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<int64>           │
├────────────────────────┤
│ [1, 47]                │
│ [2, 47]                │
│ [3, 47]                │
└────────────────────────┘

map

ibis.map(keys, values=None)

Create a MapValue.

If any of the keys or values are Columns, then the output will be a MapColumn. Otherwise, the output will be a MapScalar.

Parameters

Name Type Description Default
keys Iterable[Any] | Mapping[Any, Any] | ArrayValue Keys of the map or Mapping. If keys is a Mapping, values must be None. required
values Iterable[Any] | ArrayValue | None Values of the map or None. If None, the keys argument must be a Mapping. None

Returns

Type Description
MapValue Either a MapScalar or MapColumn, depending on the input shapes.

Examples

Create a Map scalar from a dict with the type inferred

>>> import ibis
>>> ibis.options.interactive = True
>>> ibis.map(dict(a=1, b=2))

┌──────────────────┐
│ {'a': 1, 'b': 2} │
└──────────────────┘

Create a Map Column from columns with keys and values

>>> t = ibis.memtable({"keys": [["a", "b"], ["b"]], "values": [[1, 2], [3]]})
>>> t
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ keys          ┃ values       ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ array<string> │ array<int64> │
├───────────────┼──────────────┤
│ ['a', 'b']    │ [1, 2]       │
│ ['b']         │ [3]          │
└───────────────┴──────────────┘
>>> ibis.map(t.keys, t.values)
┏━━━━━━━━━━━━━━━━━━━━┓
┃ Map(keys, values)  ┃
┡━━━━━━━━━━━━━━━━━━━━┩
│ map<string, int64> │
├────────────────────┤
│ {'a': 1, 'b': 2}   │
│ {'b': 3}           │
└────────────────────┘

struct

ibis.struct(value, type=None)

Create a struct expression.

If any of the inputs are Columns, then the output will be a StructColumn. Otherwise, the output will be a StructScalar.

Parameters

Name Type Description Default
value Iterable[tuple[str, V]] | Mapping[str, V] Either a {str: Value} mapping, or an iterable of tuples of the form (str, Value). required
type str | dt.DataType | None An instance of ibis.expr.datatypes.DataType or a string indicating the Ibis type of value, e.g. struct<a: float, b: string>. This is only used if all of the input values are Python literals. None

Returns

Type Description
StructValue A StructScalar or StructColumn expression.

Examples

Create a struct scalar literal from a dict with the type inferred

>>> import ibis
>>> ibis.options.interactive = True
>>> ibis.struct(dict(a=1, b="foo"))

┌──────────────────────┐
│ {'a': 1, 'b': 'foo'} │
└──────────────────────┘

Specify a type (note the 1 is now a float):

>>> ibis.struct(dict(a=1, b="foo"), type="struct<a: float, b: string>")

┌────────────────────────┐
│ {'a': 1.0, 'b': 'foo'} │
└────────────────────────┘

Create a struct column from a column and a scalar literal

>>> t = ibis.memtable({"a": [1, 2, 3]})
>>> ibis.struct([("a", t.a), ("b", "foo")])
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ StructColumn({'a': a, 'b': 'foo'}) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ struct<a: int64, b: string>        │
├────────────────────────────────────┤
│ {'a': 1, 'b': 'foo'}               │
│ {'a': 2, 'b': 'foo'}               │
│ {'a': 3, 'b': 'foo'}               │
└────────────────────────────────────┘