Schemas

This module contains APIs for interacting with table schemas.

Schema (Annotable, Comparable)

An object for holding table schema information.

Attributes

names: Sequence[str]

A sequence of str indicating the name of each column.

types: Sequence[ibis.expr.datatypes.core.DataType]

A sequence of DataType objects representing the type of each column.
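
The two attributes are parallel sequences: the entry at position i in names and the entry at position i in types describe the same column. The following is a minimal plain-Python sketch of that layout. SketchSchema is a hypothetical stand-in, not the ibis class, and plain strings stand in for DataType objects.

```python
from typing import NamedTuple, Sequence


class SketchSchema(NamedTuple):
    """Hypothetical stand-in for a schema: two parallel sequences."""

    names: Sequence[str]
    types: Sequence[str]  # assumption: strings stand in for DataType objects


sch = SketchSchema(names=("a", "b"), types=("int64", "string"))

# Position i in `names` and position i in `types` describe the same column,
# so zipping the two sequences recovers (name, type) pairs in column order.
pairs = list(zip(sch.names, sch.types))

# The same pairs can be turned into a name -> type lookup.
by_name = dict(pairs)
```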

Methods

append(self, schema)

Append schema to self.

Parameters:

Name    Type    Description                         Default
schema  Schema  Schema instance to append to self.  required

Examples:

>>> import ibis
>>> first = ibis.Schema.from_dict({"a": "int", "b": "string"})
>>> second = ibis.Schema.from_dict({"c": "float", "d": "int16"})
>>> first.append(second)
ibis.Schema {
  a  int64
  b  string
  c  float64
  d  int16
}

Returns:

Type    Description
Schema  A new schema containing the columns of self followed by the columns of schema.

apply_to(self, df)

Apply the schema self to a pandas DataFrame.

This method mutates the input DataFrame.

Parameters:

Name  Type          Description      Default
df    pd.DataFrame  Input DataFrame  required

Examples:

Import the necessary modules

>>> import numpy as np
>>> import pandas as pd
>>> import ibis
>>> import ibis.expr.datatypes as dt

Construct a DataFrame with string timestamps and an int8 column that we're going to upcast.

>>> data = dict(
...     times=[
...         "2022-01-01 12:00:00",
...         "2022-01-01 13:00:01",
...         "2022-01-01 14:00:02",
...     ],
...     x=np.array([-1, 0, 1], dtype="int8")
... )
>>> df = pd.DataFrame(data)
>>> df
                 times  x
0  2022-01-01 12:00:00 -1
1  2022-01-01 13:00:01  0
2  2022-01-01 14:00:02  1
>>> df.dtypes
times    object
x          int8
dtype: object

Construct an ibis Schema that we want to cast to.

>>> sch = ibis.schema({"times": dt.timestamp, "x": "int16"})
>>> sch
ibis.Schema {
  times  timestamp
  x      int16
}

Apply the schema

>>> sch.apply_to(df)
                times  x
0 2022-01-01 12:00:00 -1
1 2022-01-01 13:00:01  0
2 2022-01-01 14:00:02  1
>>> df.dtypes  # `df` is mutated by the method
times    datetime64[ns]
x                 int16
dtype: object

Returns:

Type          Description
pd.DataFrame  Type-converted DataFrame
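
The idea of applying a schema in place can be sketched without pandas or ibis, using a dict of column lists as the table. Everything here is an illustrative assumption: apply_schema and CONVERTERS are hypothetical names, the type names are plain strings, and the conversion rules are not the casting logic ibis uses.

```python
from datetime import datetime

# Hypothetical converters keyed by type name; not ibis's actual casting rules.
CONVERTERS = {
    "timestamp": datetime.fromisoformat,
    "int16": int,
}


def apply_schema(schema, table):
    """Convert each column of `table` in place according to `schema`.

    `schema` maps column name -> type name; `table` maps column name -> list
    of values. The same `table` object is returned after being mutated.
    """
    for name, type_name in schema.items():
        convert = CONVERTERS[type_name]
        table[name] = [convert(value) for value in table[name]]
    return table


table = {
    "times": ["2022-01-01 12:00:00", "2022-01-01 13:00:01"],
    "x": [-1, 0],
}
result = apply_schema({"times": "timestamp", "x": "int16"}, table)
```

As with apply_to, the caller's object is modified: result and table refer to the same dict, and the times column now holds datetime values rather than strings.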

delete(self, names_to_delete)

Remove the columns named in names_to_delete, returning the resulting schema.

Parameters:

Name             Type           Description                                 Default
names_to_delete  Iterable[str]  Iterable of str to remove from the schema.  required

Examples:

>>> import ibis
>>> sch = ibis.schema({"a": "int", "b": "string"})
>>> sch.delete({"a"})
ibis.Schema {
  b  string
}
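
The effect of delete on the parallel names and types sequences can be sketched in plain Python. delete_columns is a hypothetical helper, not the ibis implementation, and strings stand in for DataType objects; how ibis handles names that are not present in the schema is not modeled here.

```python
def delete_columns(names, types, names_to_delete):
    """Return new (names, types) tuples without the columns in `names_to_delete`.

    The inputs are left untouched and the remaining columns keep their order.
    """
    to_delete = set(names_to_delete)  # accept any iterable, including a set
    kept = [(n, t) for n, t in zip(names, types) if n not in to_delete]
    new_names = tuple(n for n, _ in kept)
    new_types = tuple(t for _, t in kept)
    return new_names, new_types


names, types = ("a", "b", "c"), ("int64", "string", "float64")
new_names, new_types = delete_columns(names, types, {"a"})
```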

equals(self, other)

Return whether other is equal to self.

Parameters:

Name   Type    Description                 Default
other  Schema  Schema to compare self to.  required

Examples:

>>> import ibis
>>> first = ibis.schema({"a": "int"})
>>> second = ibis.schema({"a": "int"})
>>> first.equals(second)
True
>>> third = ibis.schema({"a": "array<int>"})
>>> first.equals(third)
False

from_dict(dictionary) classmethod

Construct a Schema from a Mapping.

Parameters:

Name        Type                            Description                                         Default
dictionary  Mapping[str, str | dt.DataType]  Mapping from which to construct a Schema instance.  required

Examples:

>>> import ibis
>>> ibis.Schema.from_dict({"a": "int", "b": "string"})
ibis.Schema {
  a  int64
  b  string
}

Returns:

Type    Description
Schema  A new schema

from_tuples(values) classmethod

Construct a Schema from an iterable of pairs.

Parameters:

Name    Type                                     Description                             Default
values  Iterable[tuple[str, str | dt.DataType]]  An iterable of pairs of name and type.  required

Examples:

>>> import ibis
>>> ibis.Schema.from_tuples([("a", "int"), ("b", "string")])
ibis.Schema {
  a  int64
  b  string
}

Returns:

Type    Description
Schema  A new schema
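
The inputs accepted by from_dict and from_tuples carry the same information in two shapes: a mapping's items are already an iterable of (name, type) pairs. A short plain-Python sketch of that correspondence follows, using only built-in types; it does not call ibis, and the variable names are illustrative.

```python
# Input in the shape from_dict expects: a mapping of name -> type.
as_mapping = {"a": "int", "b": "string"}

# Input in the shape from_tuples expects: an iterable of (name, type) pairs.
as_pairs = [("a", "int"), ("b", "string")]

# Converting between the two shapes is lossless as long as names are unique.
pairs_from_mapping = list(as_mapping.items())
mapping_from_pairs = dict(as_pairs)
```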

items(self)

Return an iterator of pairs of names and types.

Examples:

>>> import ibis
>>> sch = ibis.Schema.from_dict({"a": "int", "b": "string"})
>>> list(sch.items())
[('a', Int64(nullable=True)), ('b', String(nullable=True))]

Returns:

Type                               Description
Iterator[tuple[str, dt.DataType]]  Iterator of schema components

name_at_position(self, i)

Return the name of a schema column at position i.

Parameters:

Name  Type  Description                 Default
i     int   The position of the column  required

Examples:

>>> import ibis
>>> sch = ibis.Schema.from_dict({"a": "int", "b": "string"})
>>> sch.name_at_position(0)
'a'
>>> sch.name_at_position(1)
'b'

Returns:

Type  Description
str   The name of the column in the schema at position i.


Last update: August 5, 2022