Schemas

This module contains APIs for interacting with table schemas.

Schema

Bases: Concrete, Coercible, MapSet

An object for holding table schema information.

Attributes

fields: FrozenDict[str, dt.DataType] instance-attribute

A mapping from each column name (str) to the DataType of that column.

Functions

apply_to(df)

Apply this schema (self) to a pandas DataFrame, converting each column to the type the schema declares.

This method mutates the input DataFrame.

Parameters:

Name  Type          Description      Default
df    pd.DataFrame  Input DataFrame  required

Returns:

Type       Description
DataFrame  Type-converted DataFrame

Examples:

Import the necessary modules

>>> import numpy as np
>>> import pandas as pd
>>> import ibis
>>> import ibis.expr.datatypes as dt

Construct a DataFrame with string timestamps and an int8 column that we're going to upcast.

>>> data = dict(
...     times=[
...         "2022-01-01 12:00:00",
...         "2022-01-01 13:00:01",
...         "2022-01-01 14:00:02",
...     ],
...     x=np.array([-1, 0, 1], dtype="int8")
... )
>>> df = pd.DataFrame(data)
>>> df
                 times  x
0  2022-01-01 12:00:00 -1
1  2022-01-01 13:00:01  0
2  2022-01-01 14:00:02  1
>>> df.dtypes
times    object
x          int8
dtype: object

Construct an ibis Schema that we want to cast to.

>>> sch = ibis.schema({"times": dt.timestamp, "x": "int16"})
>>> sch
ibis.Schema {
  times  timestamp
  x      int16
}

Apply the schema

>>> sch.apply_to(df)
                times  x
0 2022-01-01 12:00:00 -1
1 2022-01-01 13:00:01  0
2 2022-01-01 14:00:02  1
>>> df.dtypes  # `df` is mutated by the method
times    datetime64[ns]
x                 int16
dtype: object

equals(other)

Return whether other is equal to self.

Parameters:

Name   Type    Description                 Default
other  Schema  Schema to compare self to.  required

Examples:

>>> import ibis
>>> first = ibis.schema({"a": "int"})
>>> second = ibis.schema({"a": "int"})
>>> assert first.equals(second)
>>> third = ibis.schema({"a": "array<int>"})
>>> assert not first.equals(third)

from_dask(dask_schema) classmethod

Return the ibis schema equivalent to dask_schema.

from_numpy(numpy_schema) classmethod

Return the ibis schema equivalent to numpy_schema.

from_pandas(pandas_schema) classmethod

Return the ibis schema equivalent to pandas_schema.

from_pyarrow(pyarrow_schema) classmethod

Return the ibis schema equivalent to pyarrow_schema.

from_tuples(values) classmethod

Construct a Schema from an iterable of pairs.

Parameters:

Name    Type                                     Description                             Default
values  Iterable[tuple[str, str | dt.DataType]]  An iterable of pairs of name and type.  required

Returns:

Type    Description
Schema  A new schema

Examples:

>>> import ibis
>>> ibis.Schema.from_tuples([("a", "int"), ("b", "string")])
ibis.Schema {
  a  int64
  b  string
}

name_at_position(i)

Return the name of a schema column at position i.

Parameters:

Name  Type  Description                 Default
i     int   The position of the column  required

Returns:

Type  Description
str   The name of the column in the schema at position i.

Examples:

>>> import ibis
>>> sch = ibis.Schema({"a": "int", "b": "string"})
>>> sch.name_at_position(0)
'a'
>>> sch.name_at_position(1)
'b'

to_dask()

Return the equivalent dask dtypes.

to_numpy()

Return the equivalent numpy dtypes.

to_pandas()

Return the equivalent pandas dtypes.

to_pyarrow()

Return the equivalent pyarrow schema.


Last update: August 5, 2022