Schemas
This module contains APIs for interacting with table schemas.
Schema
Bases: Concrete, Coercible, MapSet
An object for holding table schema information.
Attributes
fields: FrozenDict[str, dt.DataType] (instance attribute)
Functions
apply_to(df)
Apply the schema `self` to a pandas `DataFrame`.
This method mutates the input `DataFrame`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | pd.DataFrame | Input DataFrame | required |
Returns:
Type | Description |
---|---|
DataFrame | Type-converted DataFrame |
Examples:
Import the necessary modules
>>> import numpy as np
>>> import pandas as pd
>>> import ibis
>>> import ibis.expr.datatypes as dt
Construct a DataFrame with string timestamps and an `int8` column that we're going to upcast.
>>> data = dict(
... times=[
... "2022-01-01 12:00:00",
... "2022-01-01 13:00:01",
... "2022-01-01 14:00:02",
... ],
... x=np.array([-1, 0, 1], dtype="int8")
... )
>>> df = pd.DataFrame(data)
>>> df
                 times  x
0  2022-01-01 12:00:00 -1
1  2022-01-01 13:00:01  0
2  2022-01-01 14:00:02  1
>>> df.dtypes
times    object
x          int8
dtype: object
Construct an ibis Schema that we want to cast to.
>>> sch = ibis.schema({"times": dt.timestamp, "x": "int16"})
>>> sch
ibis.Schema {
  times  timestamp
  x      int16
}
Apply the schema
>>> sch.apply_to(df)
                 times  x
0  2022-01-01 12:00:00 -1
1  2022-01-01 13:00:01  0
2  2022-01-01 14:00:02  1
>>> df.dtypes  # `df` is mutated by the method
times    datetime64[ns]
x                 int16
dtype: object
equals(other)
Return whether `other` is equal to `self`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | Schema | Schema to compare | required |
Examples:
>>> import ibis
>>> first = ibis.schema({"a": "int"})
>>> second = ibis.schema({"a": "int"})
>>> assert first.equals(second)
>>> third = ibis.schema({"a": "array<int>"})
>>> assert not first.equals(third)
from_dask(dask_schema) classmethod
Return the equivalent ibis schema.
from_numpy(numpy_schema) classmethod
Return the equivalent ibis schema.
from_pandas(pandas_schema) classmethod
Return the equivalent ibis schema.
from_pyarrow(pyarrow_schema) classmethod
Return the equivalent ibis schema.
from_tuples(values) classmethod
Construct a `Schema` from an iterable of pairs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
values | Iterable[tuple[str, str \| dt.DataType]] | An iterable of pairs of name and type. | required |
Returns:
Type | Description |
---|---|
Schema | A new schema |
Examples:
>>> import ibis
>>> ibis.Schema.from_tuples([("a", "int"), ("b", "string")])
ibis.Schema {
  a  int64
  b  string
}
name_at_position(i)
Return the name of a schema column at position `i`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
i | int | The position of the column | required |
Returns:
Type | Description |
---|---|
str | The name of the column in the schema at position `i` |
Examples:
>>> import ibis
>>> sch = ibis.Schema({"a": "int", "b": "string"})
>>> sch.name_at_position(0)
'a'
>>> sch.name_at_position(1)
'b'
to_dask()
Return the equivalent dask dtypes.
to_numpy()
Return the equivalent numpy dtypes.
to_pandas()
Return the equivalent pandas datatypes.
to_pyarrow()
Return the equivalent pyarrow schema.