String Expressions

All string operations are valid for both scalars and columns.

StringValue (Value)

Methods

ascii_str(self)

Return the numeric ASCII code of the first character of a string.

Returns:

Type Description
ir.IntegerValue

ASCII code of the first character of the input

capitalize(self)

Capitalize the input string.

Returns:

Type Description
StringValue

Capitalized string

concat(self, other, *args)

Concatenate strings.

Parameters:

Name Type Description Default
other str | StringValue

String to concatenate

required
args str | StringValue

Additional strings to concatenate

()

Returns:

Type Description
StringValue

All strings concatenated

contains(self, substr)

Return whether the expression contains substr.

Parameters:

Name Type Description Default
substr str | StringValue

Substring for which to check

required

Returns:

Type Description
ir.BooleanValue

Boolean indicating the presence of substr in the expression

convert_base(self, from_base, to_base)

Convert a string representing an integer from one base to another.

Parameters:

Name Type Description Default
from_base int | ir.IntegerValue

Numeric base of the expression

required
to_base int | ir.IntegerValue

New base

required

Returns:

Type Description
ir.IntegerValue

Converted expression
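
As a rough illustration of what converting between bases means, the sketch below re-encodes a string of digits in plain Python. It is not how Ibis or any backend implements convert_base; the helper name, the lowercase digit alphabet, and returning the result as a string of digits are all assumptions made for the example.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def convert_base(value: str, from_base: int, to_base: int) -> str:
    """Re-encode the integer written in `value` from one base to another."""
    number = int(value, from_base)  # parse the digits using the source base
    if number == 0:
        return "0"
    sign = "-" if number < 0 else ""
    number = abs(number)
    digits = []
    while number:
        # Peel off the least significant digit in the target base.
        number, remainder = divmod(number, to_base)
        digits.append(DIGITS[remainder])
    return sign + "".join(reversed(digits))


print(convert_base("ff", 16, 2))    # 11111111
print(convert_base("255", 10, 16))  # ff
```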

endswith(self, end)

Determine if self ends with end.

Parameters:

Name Type Description Default
end str | StringValue

Suffix to check for

required

Examples:

>>> import ibis
>>> text = ibis.literal('Ibis project')
>>> text.endswith('project')
EndsWith('Ibis project', end='project')

Returns:

Type Description
ir.BooleanValue

Boolean indicating whether self ends with end

find(self, substr, start=None, end=None)

Return the position of the first occurrence of substr.

Parameters:

Name Type Description Default
substr str | StringValue

Substring to search for

required
start int | ir.IntegerValue | None

Zero based index of where to start the search

None
end int | ir.IntegerValue | None

Zero based index of where to stop the search. Currently not implemented.

None

Returns:

Type Description
ir.IntegerValue

Position of substr in arg starting from start

find_in_set(self, str_list)

Find the first occurrence of self within str_list, a list of strings.

No string in str_list can have a comma.

Parameters:

Name Type Description Default
str_list Sequence[str]

Sequence of strings

required

Examples:

>>> import ibis
>>> table = ibis.table(dict(strings='string'))
>>> result = table.strings.find_in_set(['a', 'b'])
>>> result
r0 := UnboundTable: unbound_table_0
  strings string
FindInSet(needle=r0.strings, values=[ValueList(values=['a', 'b'])])

Returns:

Type Description
ir.IntegerValue

Position of self in str_list. Returns -1 if self isn't found or if self contains ','.
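
The lookup rules described above can be modeled in a few lines of plain Python. This is only a sketch of the documented behavior, not the Ibis implementation: the function name is reused for readability, and a zero-based position is an assumption inferred from the -1 "not found" value.

```python
from typing import Sequence


def find_in_set(needle: str, str_list: Sequence[str]) -> int:
    """Return the position of `needle` in `str_list`, or -1."""
    # A needle containing a comma can never match (see the rule above).
    if "," in needle:
        return -1
    try:
        # Assumption: positions are zero-based.
        return list(str_list).index(needle)
    except ValueError:
        return -1


print(find_in_set("b", ["a", "b"]))  # 1
print(find_in_set("z", ["a", "b"]))  # -1
```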

hashbytes(self, how='sha256')

Compute the binary hash value of the input.

Parameters:

Name Type Description Default
how Literal['md5', 'sha1', 'sha256', 'sha512']

Hash algorithm to use

'sha256'

Returns:

Type Description
ir.BinaryValue

Binary expression
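
For comparison, the same four algorithms are available in Python's standard hashlib module. The sketch below assumes the string is encoded as UTF-8 before hashing, which a given backend may or may not do; it is an illustration of the idea, not the Ibis implementation.

```python
import hashlib

SUPPORTED = {"md5", "sha1", "sha256", "sha512"}


def hashbytes(value: str, how: str = "sha256") -> bytes:
    """Return the raw digest of `value` using the named algorithm."""
    if how not in SUPPORTED:
        raise ValueError(f"unsupported hash algorithm: {how!r}")
    # Assumption: the input string is hashed as UTF-8 bytes.
    return hashlib.new(how, value.encode("utf-8")).digest()


digest = hashbytes("Ibis project")
print(len(digest))  # 32 bytes for sha256
```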

ilike(self, patterns)

Match patterns against self, case-insensitive.

This function is modeled after SQL's ILIKE directive. Use % as a multiple-character wildcard or _ as a single-character wildcard.

Use re_search or rlike for regular expression-based matching.

Parameters:

Name Type Description Default
patterns str | StringValue | Iterable[str | StringValue]

If patterns is a list, the corresponding row in the output is True when any of the patterns matches the input.

required

Returns:

Type Description
ir.BooleanValue

Column indicating matches
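
To make the wildcard rules concrete, here is a small pure-Python model that translates a pattern into a regular expression and checks it without regard to case. It is a sketch under simplifying assumptions (no escape character, the whole string must match), and the helper like_to_regex is hypothetical; backends implement ILIKE natively.

```python
import re
from typing import Iterable, Union


def like_to_regex(pattern: str) -> str:
    """Translate SQL wildcards: % -> any run of characters, _ -> one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    return "".join(parts)


def ilike(value: str, patterns: Union[str, Iterable[str]]) -> bool:
    """Return True if any pattern matches the whole value, ignoring case."""
    if isinstance(patterns, str):
        patterns = [patterns]
    flags = re.IGNORECASE | re.DOTALL
    return any(
        re.fullmatch(like_to_regex(p), value, flags=flags) is not None
        for p in patterns
    )


print(ilike("Ibis project", "ibis%"))             # True
print(ilike("Ibis project", ["x%", "%PROJ_CT"]))  # True
print(ilike("Ibis project", "ibis"))              # False
```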

initcap(self)

Capitalize the input string.

Returns:

Type Description
StringValue

Capitalized string

join(self, strings)

Join a list of strings using self as the separator.

Parameters:

Name Type Description Default
strings Sequence[str | StringValue]

Strings to join with arg

required

Examples:

>>> import ibis
>>> sep = ibis.literal(',')
>>> result = sep.join(['a', 'b', 'c'])
>>> result
StringJoin(sep=',', [ValueList(values=['a', 'b', 'c'])])

Returns:

Type Description
StringValue

Joined string

left(self, nchars)

Return the nchars left-most characters.

Parameters:

Name Type Description Default
nchars int | ir.IntegerValue

Maximum number of characters to return

required

Returns:

Type Description
StringValue

Characters from the start

length(self)

Compute the length of a string.

Returns:

Type Description
ir.IntegerValue

The length of the input

like(self, patterns)

Match patterns against self, case-sensitive.

This function is modeled after the SQL LIKE directive. Use % as a multiple-character wildcard or _ as a single-character wildcard.

Use re_search or rlike for regular expression-based matching.

Parameters:

Name Type Description Default
patterns str | StringValue | Iterable[str | StringValue]

If patterns is a list, the corresponding row in the output is True when any of the patterns matches the input.

required

Returns:

Type Description
ir.BooleanValue

Column indicating matches

lower(self)

Convert string to all lowercase.

Returns:

Type Description
StringValue

Lowercase string

lpad(self, length, pad=' ')

Pad self to length by padding on the left, or truncate it on the right if it is longer than length.

Parameters:

Name Type Description Default
length int | ir.IntegerValue

Length of output string

required
pad str | StringValue

Pad character

' '

Examples:

>>> import ibis
>>> table = ibis.table(dict(strings='string'))
>>> expr = table.strings.lpad(5, '-')
>>> expr
r0 := UnboundTable: unbound_table_1
  strings string
LPad(r0.strings, length=5, pad='-')
>>> expr = ibis.literal('a').lpad(5, '-')  # 'a' becomes '----a'
>>> expr
LPad('a', length=5, pad='-')
>>> expr = ibis.literal('abcdefg').lpad(5, '-')  # 'abcdefg' becomes 'abcde'
>>> expr
LPad('abcdefg', length=5, pad='-')

Returns:

Type Description
StringValue

Padded string
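
The pad-or-truncate behavior noted in the example comments can be written out in plain Python. This is a sketch of the documented semantics only, assuming a single-character pad; it is not the Ibis implementation.

```python
def lpad(value: str, length: int, pad: str = " ") -> str:
    """Left-pad `value` to `length`, truncating on the right if it is too long."""
    if len(value) >= length:
        return value[:length]  # keep only the first `length` characters
    # Assumption: `pad` is a single character.
    return pad * (length - len(value)) + value


print(lpad("a", 5, "-"))        # ----a
print(lpad("abcdefg", 5, "-"))  # abcde
```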

lstrip(self)

Remove whitespace from the left side of string.

Returns:

Type Description
StringValue

Left-stripped string

parse_url(self, extract, key=None)

Parse a URL and extract its components.

key can be used to extract query values when extract == 'QUERY'

Parameters:

Name Type Description Default
extract Literal['PROTOCOL', 'HOST', 'PATH', 'REF', 'AUTHORITY', 'FILE', 'USERINFO', 'QUERY']

Component of URL to extract

required
key str | None

Query component to extract

None

Examples:

>>> import ibis
>>> url = ibis.literal("https://www.youtube.com/watch?v=kEuEcWfewf8&t=10")
>>> result = url.parse_url('QUERY', 'v')  # extracts 'kEuEcWfewf8'

Returns:

Type Description
StringValue

Extracted string value
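
For readers who want to see what the components look like, the sketch below uses Python's urllib.parse to pull the same kinds of pieces out of a URL. The mapping from the extract names to urlsplit attributes is an assumption made for illustration, and only a subset of the names is covered ('FILE' and 'USERINFO' are left out).

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def parse_url(url: str, extract: str, key: Optional[str] = None) -> Optional[str]:
    """Extract one component of `url`; `key` selects a single query value."""
    parts = urlsplit(url)
    if extract == "QUERY" and key is not None:
        values = parse_qs(parts.query).get(key)
        return values[0] if values else None
    # Assumed mapping from extract names to urlsplit attributes.
    components = {
        "PROTOCOL": parts.scheme,
        "HOST": parts.hostname,
        "PATH": parts.path,
        "REF": parts.fragment,
        "AUTHORITY": parts.netloc,
        "QUERY": parts.query,
    }
    if extract not in components:
        raise ValueError(f"component not covered by this sketch: {extract!r}")
    return components[extract]


url = "https://www.youtube.com/watch?v=kEuEcWfewf8&t=10"
print(parse_url(url, "QUERY", "v"))  # kEuEcWfewf8
print(parse_url(url, "HOST"))        # www.youtube.com
```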

re_extract(self, pattern, index)

Return the specified match at index from a regex pattern.

Parameters:

Name Type Description Default
pattern str | StringValue

Regular expression string

required
index int | ir.IntegerValue

Zero-based index of match to return

required

Returns:

Type Description
StringValue

Extracted match
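
One way to picture index-based extraction is with Python's re module. The sketch assumes that index 0 refers to the entire match and that higher indexes refer to capture groups; the text above does not spell this out, and backends may number matches differently.

```python
import re
from typing import Optional


def re_extract(value: str, pattern: str, index: int) -> Optional[str]:
    """Return the match at `index`, or None if the pattern does not match."""
    match = re.search(pattern, value)
    if match is None:
        return None
    # Assumption: 0 is the whole match, 1..n are the capture groups.
    return match.group(index)


print(re_extract("abc-123", r"([a-z]+)-(\d+)", 0))  # abc-123
print(re_extract("abc-123", r"([a-z]+)-(\d+)", 2))  # 123
```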

re_replace(self, pattern, replacement)

Replace match found by regex pattern with replacement.

Parameters:

Name Type Description Default
pattern str | StringValue

Regular expression string

required
replacement str | StringValue

Replacement string or regular expression

required

Examples:

>>> import ibis
>>> table = ibis.table(dict(strings='string'))
>>> result = table.strings.re_replace('(b+)', r'<\1>')  # 'aaabbbaaa' becomes 'aaa<bbb>aaa'
>>> result
r0 := UnboundTable: unbound_table_1
  strings string
RegexReplace(r0.strings, pattern='(b+)', replacement='<\1>')

Returns:

Type Description
StringValue

Modified string
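
The effect of a regex replacement with a back-reference can be checked locally with Python's re.sub. This only mirrors the semantics on a plain string; regular expression dialects and back-reference syntax vary between backends.

```python
import re


def re_replace(value: str, pattern: str, replacement: str) -> str:
    """Replace every match of `pattern` in `value` with `replacement`."""
    return re.sub(pattern, replacement, value)


# \1 refers to the first capture group, here the run of b characters.
print(re_replace("aaabbbaaa", "(b+)", r"<\1>"))  # aaa<bbb>aaa
```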

re_search(self, pattern)

Return whether the values match pattern.

Returns True if the regex matches a string and False otherwise.

Parameters:

Name Type Description Default
pattern str | StringValue

Regular expression used for searching

required

Returns:

Type Description
ir.BooleanValue

Indicator of matches

repeat(self, n)

Repeat a string n times.

Parameters:

Name Type Description Default
n int | ir.IntegerValue

Number of repetitions

required

Returns:

Type Description
StringValue

Repeated string

replace(self, pattern, replacement)

Replace each exact match of pattern with replacement.

Parameters:

Name Type Description Default
pattern StringValue

String pattern

required
replacement StringValue

String replacement

required

Examples:

>>> import ibis
>>> table = ibis.table(dict(strings='string'))
>>> result = table.strings.replace('aaa', 'foo')  # 'aaabbbaaa' becomes 'foobbbfoo'
>>> result
r0 := UnboundTable: unbound_table_1
  strings string
StringReplace(r0.strings, pattern='aaa', replacement='foo')

Returns:

Type Description
StringValue

Replaced string

reverse(self)

Reverse the characters of a string.

Returns:

Type Description
StringValue

Reversed string

right(self, nchars)

Return up to nchars from the end of each string.

Parameters:

Name Type Description Default
nchars int | ir.IntegerValue

Maximum number of characters to return

required

Returns:

Type Description
StringValue

Characters from the end

rlike(self, pattern)

Return whether the values match pattern.

Returns True if the regex matches a string and False otherwise.

Parameters:

Name Type Description Default
pattern str | StringValue

Regular expression used for searching

required

Returns:

Type Description
ir.BooleanValue

Indicator of matches

rpad(self, length, pad=' ')

Pad self by truncating or padding on the right.

Parameters:

Name Type Description Default
self None

String to pad

required
length int | ir.IntegerValue

Length of output string

required
pad str | StringValue

Pad character

' '

Examples:

>>> import ibis
>>> table = ibis.table(dict(string_col='string'))
>>> expr = table.string_col.rpad(5, '-')
>>> expr
r0 := UnboundTable: unbound_table_2
  string_col string
RPad(r0.string_col, length=5, pad='-')
>>> expr = ibis.literal('a').rpad(5, '-')  # 'a' becomes 'a----'
>>> expr
RPad('a', length=5, pad='-')
>>> expr = ibis.literal('abcdefg').rpad(5, '-')  # 'abcdefg' becomes 'abcde'
>>> expr
RPad('abcdefg', length=5, pad='-')

Returns:

Type Description
StringValue

Padded string

rstrip(self)

Remove whitespace from the right side of string.

Returns:

Type Description
StringValue

Right-stripped string

split(self, delimiter)

Split a string on delimiter.

Parameters:

Name Type Description Default
delimiter str | StringValue

Value to split by

required

Returns:

Type Description
ir.ArrayValue

The string split by delimiter

startswith(self, start)

Determine whether self starts with start.

Parameters:

Name Type Description Default
start str | StringValue

Prefix to check for

required

Examples:

>>> import ibis
>>> text = ibis.literal('Ibis project')
>>> text.startswith('Ibis')
StartsWith('Ibis project', start='Ibis')

Returns:

Type Description
ir.BooleanValue

Boolean indicating whether self starts with start

strip(self)

Remove whitespace from left and right sides of a string.

Returns:

Type Description
StringValue

Stripped string

substr(self, start, length=None)

Extract a substring.

Parameters:

Name Type Description Default
start int | ir.IntegerValue

Index of the first character of the substring; indices start at 0

required
length int | ir.IntegerValue | None

Maximum length of each substring. If not supplied, the substring extends to the end of the string

None

Returns:

Type Description
StringValue

Found substring
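
In Python terms, a zero-based start plus an optional length corresponds to a slice. The sketch below assumes a non-negative start and is meant only to illustrate the indexing, not how a backend evaluates the expression.

```python
from typing import Optional


def substr(value: str, start: int, length: Optional[int] = None) -> str:
    """Return up to `length` characters of `value` beginning at index `start`."""
    if length is None:
        return value[start:]  # no length: take the rest of the string
    return value[start:start + length]


print(substr("Ibis project", 0, 4))  # Ibis
print(substr("Ibis project", 5))     # project
```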

to_timestamp(self, format_str, timezone=None)

Parse a string and return a timestamp.

Parameters:

Name Type Description Default
format_str str

Format string in strptime format

required
timezone str | None

A string indicating the timezone. For example 'America/New_York'

None

Examples:

>>> import ibis
>>> date_as_str = ibis.literal('20170206')
>>> result = date_as_str.to_timestamp('%Y%m%d')
>>> result
StringToTimestamp('20170206', format_str='%Y%m%d')

Returns:

Type Description
ir.TimestampValue

Parsed timestamp value
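
Because format_str uses strptime directives, a format can be tried out locally with Python's datetime module before it is used in an expression. The sketch ignores the timezone argument, and individual backends may support a different set of directives.

```python
from datetime import datetime


def to_timestamp(value: str, format_str: str) -> datetime:
    """Parse `value` with a strptime-style format string."""
    return datetime.strptime(value, format_str)


print(to_timestamp("20170206", "%Y%m%d"))  # 2017-02-06 00:00:00
```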

translate(self, from_str, to_str)

Replace from_str characters in self with characters in to_str.

To avoid unexpected behavior, from_str should be shorter than to_str.

Parameters:

Name Type Description Default
from_str StringValue

Characters in arg to replace

required
to_str StringValue

Characters to use for replacement

required

Examples:

>>> import ibis
>>> table = ibis.table(dict(string_col='string'))
>>> expr = table.string_col.translate('a', 'b')
>>> expr
r0 := UnboundTable: unbound_table_0
  string_col string
Translate(r0.string_col, from_str='a', to_str='b')
>>> expr = table.string_col.translate('a', 'bc')
>>> expr
r0 := UnboundTable: unbound_table_0
  string_col string
Translate(r0.string_col, from_str='a', to_str='bc')

Returns:

Type Description
StringValue

Translated string
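
Character-by-character translation can be sketched in plain Python by pairing up the two strings positionally. This is an illustrative model only: here, extra characters in to_str are ignored and characters of from_str without a counterpart are left unchanged, whereas backends may treat mismatched lengths differently.

```python
def translate(value: str, from_str: str, to_str: str) -> str:
    """Map each character found in `from_str` to the one at the same position in `to_str`."""
    # zip stops at the shorter string, so unpaired characters get no mapping.
    mapping = dict(zip(from_str, to_str))
    return "".join(mapping.get(ch, ch) for ch in value)


print(translate("banana", "a", "b"))    # bbnbnb
print(translate("banana", "an", "xy"))  # bxyxyx
```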

upper(self)

Convert string to all uppercase.

Returns:

Type Description
StringValue

Uppercase string


Last update: August 5, 2022