String Expressions
All string operations are valid for both scalars and columns.
strings
Classes
StringValue (AnyValue)
Methods
ascii_str(self)
Return the numeric ASCII code of the first character of a string.
Returns:
Type | Description |
---|---|
ir.IntegerValue | ASCII code of the first character of the input |
capitalize(self)
Capitalize the input string.
Returns:
Type | Description |
---|---|
StringValue | Capitalized string |
concat(self, other, *args)
Concatenate strings.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other | str \| StringValue | String to concatenate | required |
args | str \| StringValue | Additional strings to concatenate | () |
Returns:
Type | Description |
---|---|
StringValue | All strings concatenated |
contains(self, substr)
Return whether the expression contains `substr`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
substr | str \| StringValue | Substring for which to check | required |
Returns:
Type | Description |
---|---|
ir.BooleanValue | Boolean indicating the presence of `substr` in the expression |
convert_base(self, from_base, to_base)
Convert a string representing an integer from one base to another.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
from_base | int \| ir.IntegerValue | Numeric base of the expression | required |
to_base | int \| ir.IntegerValue | New base | required |
Returns:
Type | Description |
---|---|
ir.IntegerValue | Converted expression |
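To make the base conversion concrete, here is a minimal pure-Python sketch of the idea. It is a model of the semantics only, not how any backend implements `convert_base`; the helper name and the choice to return a `str` (so that digits such as `f` can be shown) are assumptions made for illustration.

```python
import string

# Digits usable for bases 2 through 36 (assumption: lowercase output).
DIGITS = string.digits + string.ascii_lowercase


def convert_base(value: str, from_base: int, to_base: int) -> str:
    """Re-express the integer written in `value` (base `from_base`) in `to_base`."""
    number = int(value, from_base)  # raises ValueError on invalid digits
    if number == 0:
        return "0"
    sign = "-" if number < 0 else ""
    number = abs(number)
    out = []
    while number:
        number, remainder = divmod(number, to_base)
        out.append(DIGITS[remainder])
    return sign + "".join(reversed(out))


print(convert_base("ff", 16, 2))    # 11111111
print(convert_base("255", 10, 16))  # ff
```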
endswith(self, end)
Determine if `self` ends with `end`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
end | str \| StringValue | Suffix to check for | required |
Examples:
>>> import ibis
>>> text = ibis.literal('Ibis project')
>>> text.endswith('project')
EndsWith('Ibis project', end='project')
Returns:
Type | Description |
---|---|
ir.BooleanValue | Boolean indicating whether `self` ends with `end` |
find(self, substr, start=None, end=None)
Return the position of the first occurrence of `substr`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
substr | str \| StringValue | Substring to search for | required |
start | int \| ir.IntegerValue \| None | Zero-based index of where to start the search | None |
end | int \| ir.IntegerValue \| None | Zero-based index of where to stop the search. Currently not implemented. | None |
Returns:
Type | Description |
---|---|
ir.IntegerValue | Position of `substr` in `self`, searching from `start` |
find_in_set(self, str_list)
Find the first occurrence of `self` within `str_list`, a list of strings.
No string in `str_list` can have a comma.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
str_list | Sequence[str] | Sequence of strings | required |
Examples:
>>> import ibis
>>> table = ibis.table([('strings', 'string')])
>>> result = table.strings.find_in_set(['a', 'b'])
Returns:
Type | Description |
---|---|
ir.IntegerValue | Position of `self` in `str_list` |
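The lookup described above can be modeled in plain Python as follows. This is only a sketch: the zero-based result and the `-1` returned for a missing value are assumptions of this model, and the helper name is hypothetical.

```python
from typing import Sequence


def find_in_set(needle: str, str_list: Sequence[str]) -> int:
    """Return the position of `needle` in `str_list`.

    Assumptions of this sketch: positions are zero-based and a missing
    value yields -1.
    """
    if any("," in item for item in str_list):
        raise ValueError("strings in str_list must not contain a comma")
    try:
        return list(str_list).index(needle)
    except ValueError:
        return -1


print(find_in_set("b", ["a", "b"]))  # 1
print(find_in_set("z", ["a", "b"]))  # -1
```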
hashbytes(self, how='sha256')
Compute the binary hash value of the input.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
how | Literal['md5', 'sha1', 'sha256', 'sha512'] | Hash algorithm to use | 'sha256' |
Returns:
Type | Description |
---|---|
ir.BinaryValue | Binary expression |
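For intuition about what a binary hash value looks like, the following sketch computes the same family of digests with Python's standard `hashlib`. It assumes the input is encoded as UTF-8 before hashing, which may differ from what a given backend does; the helper name is hypothetical.

```python
import hashlib

SUPPORTED = ("md5", "sha1", "sha256", "sha512")


def hashbytes(value: str, how: str = "sha256") -> bytes:
    """Return the raw digest of `value` (assumption: UTF-8 encoded)."""
    if how not in SUPPORTED:
        raise ValueError(f"unsupported hash algorithm: {how!r}")
    return hashlib.new(how, value.encode("utf-8")).digest()


digest = hashbytes("Ibis project")
print(len(digest))   # 32 bytes for sha256
print(digest.hex())  # hexadecimal rendering of the binary value
```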
ilike(self, patterns)
Match `patterns` against `self`, case-insensitive.
This function is modeled after SQL's ILIKE directive. Use `%` as a multiple-character wildcard or `_` as a single-character wildcard.
Use `re_search` or `rlike` for regular expression-based matching.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
patterns | str \| StringValue \| Iterable[str \| StringValue] | If `patterns` is a list, a row is True when any pattern matches the input | required |
Returns:
Type | Description |
---|---|
ir.BooleanValue | Column indicating matches |
initcap(self)
Capitalize the input string.
Returns:
Type | Description |
---|---|
StringValue | Capitalized string |
join(self, strings)
Join a list of strings using `self` as the separator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
self | None | String expression | required |
strings | Sequence[str \| StringValue] | Strings to join with | required |
Examples:
>>> import ibis
>>> sep = ibis.literal(',')
>>> result = sep.join(['a', 'b', 'c'])
Returns:
Type | Description |
---|---|
StringValue | Joined string |
left(self, nchars)
Return the `nchars` left-most characters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
nchars | int \| ir.IntegerValue | Maximum number of characters to return | required |
Returns:
Type | Description |
---|---|
StringValue | Characters from the start |
length(self)
Compute the length of a string.
Returns:
Type | Description |
---|---|
ir.IntegerValue | The length of the input |
like(self, patterns)
Match `patterns` against `self`, case-sensitive.
This function is modeled after the SQL LIKE directive. Use `%` as a multiple-character wildcard or `_` as a single-character wildcard.
Use `re_search` or `rlike` for regular expression-based matching.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
patterns | str \| StringValue \| Iterable[str \| StringValue] | If `patterns` is a list, a row is True when any pattern matches the input | required |
Returns:
Type | Description |
---|---|
ir.BooleanValue | Column indicating matches |
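The `%` and `_` wildcards can be illustrated by translating a LIKE pattern into a regular expression. The sketch below is a pure-Python model, not the code any backend runs: the helper names and the `case_sensitive` flag (which switches between `like` and `ilike` behavior) are hypothetical, a list of patterns is assumed to match when any one of them matches, and SQL escape characters are not handled.

```python
import re
from typing import Iterable, Union


def like_to_regex(pattern: str) -> str:
    """Translate SQL LIKE wildcards into an equivalent regular expression."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")  # any run of characters, possibly empty
        elif char == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(char))  # everything else is literal
    return "".join(parts)


def like(value: str, patterns: Union[str, Iterable[str]],
         case_sensitive: bool = True) -> bool:
    """Return True if `value` matches any of `patterns` in full."""
    if isinstance(patterns, str):
        patterns = [patterns]
    flags = re.DOTALL | (0 if case_sensitive else re.IGNORECASE)
    return any(
        re.fullmatch(like_to_regex(p), value, flags) is not None
        for p in patterns
    )


print(like("Ibis project", "Ibis%"))                        # True
print(like("Ibis project", "ibis%"))                        # False
print(like("Ibis project", "ibis%", case_sensitive=False))  # True
```

Because the whole value must match, a pattern without wildcards behaves like an equality test.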
lower(self)
Convert string to all lowercase.
Returns:
Type | Description |
---|---|
StringValue | Lowercase string |
lpad(self, length, pad=' ')
Pad `self` by truncating on the right or padding on the left.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
length | int \| ir.IntegerValue | Length of output string | required |
pad | str \| StringValue | Pad character | ' ' |
Examples:
>>> import ibis
>>> table = ibis.table([('strings', 'string')])
>>> expr = table.strings.lpad(5, '-')
>>> expr = ibis.literal('a').lpad(5, '-')  # 'a' becomes '----a'
>>> expr = ibis.literal('abcdefg').lpad(5, '-')  # 'abcdefg' becomes 'abcde'
Returns:
Type | Description |
---|---|
StringValue | Padded string |
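The two behaviors in the examples above (padding short inputs on the left, truncating long inputs on the right) can be modeled in a few lines of plain Python. This is a sketch of the semantics with a hypothetical helper name; handling of a multi-character `pad` is an assumption.

```python
def lpad(value: str, length: int, pad: str = " ") -> str:
    """Return `value` forced to exactly `length` characters."""
    if len(value) >= length:
        return value[:length]  # too long: truncate on the right
    missing = length - len(value)
    # Repeat the pad string and cut it to size (assumption for multi-char pads).
    return (pad * missing)[:missing] + value


print(lpad("a", 5, "-"))        # ----a
print(lpad("abcdefg", 5, "-"))  # abcde
```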
lstrip(self)
Remove whitespace from the left side of string.
Returns:
Type | Description |
---|---|
StringValue | Left-stripped string |
parse_url(self, extract, key=None)
Parse a URL and extract its components.
`key` can be used to extract query values when `extract == 'QUERY'`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
extract | Literal['PROTOCOL', 'HOST', 'PATH', 'REF', 'AUTHORITY', 'FILE', 'USERINFO', 'QUERY'] | Component of URL to extract | required |
key | str \| None | Query component to extract | None |
Examples:
>>> import ibis
>>> url = ibis.literal("https://www.youtube.com/watch?v=kEuEcWfewf8&t=10")
>>> result = url.parse_url('QUERY', 'v')  # 'kEuEcWfewf8'
Returns:
Type | Description |
---|---|
StringValue | Extracted string value |
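To show which part of a URL each `extract` value refers to, here is a sketch built on Python's `urllib.parse`. It is a model only: the mapping from component names to URL parts is an assumption based on common usage, the helper name is hypothetical, and `FILE` and `USERINFO` are left out to keep the example short.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def parse_url(url: str, extract: str, key: Optional[str] = None) -> Optional[str]:
    """Return one component of `url`, or one query value when `key` is given."""
    parts = urlsplit(url)
    if extract == "QUERY":
        if key is None:
            return parts.query  # the whole query string
        values = parse_qs(parts.query).get(key)
        return values[0] if values else None
    components = {
        "PROTOCOL": parts.scheme,
        "HOST": parts.hostname,
        "PATH": parts.path,
        "REF": parts.fragment,
        "AUTHORITY": parts.netloc,
    }
    if extract not in components:
        raise ValueError(f"component not covered by this sketch: {extract!r}")
    return components[extract]


url = "https://www.youtube.com/watch?v=kEuEcWfewf8&t=10"
print(parse_url(url, "QUERY", "v"))  # kEuEcWfewf8
print(parse_url(url, "HOST"))        # www.youtube.com
```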
re_extract(self, pattern, index)
Return the specified match at `index` from a regex `pattern`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str \| StringValue | Regular expression string | required |
index | int \| ir.IntegerValue | Zero-based index of match to return | required |
Returns:
Type | Description |
---|---|
StringValue | Extracted match |
re_replace(self, pattern, replacement)
Replace match found by regex `pattern` with `replacement`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str \| StringValue | Regular expression string | required |
replacement | str \| StringValue | Replacement string or regular expression | required |
Examples:
>>> import ibis
>>> table = ibis.table([('strings', 'string')])
>>> result = table.strings.re_replace('(b+)', r'<\1>')  # 'aaabbbaa' becomes 'aaa<bbb>aa'
Returns:
Type | Description |
---|---|
StringValue | Modified string |
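The effect of a capture group in the replacement can be checked locally with Python's `re` module. This is a sketch using Python's regex dialect, where `\1` refers to the first group and every match is replaced; backends may use a different backreference syntax or matching rules, and the helper name is hypothetical.

```python
import re


def re_replace(value: str, pattern: str, replacement: str) -> str:
    """Replace every match of `pattern` in `value` (Python `re` semantics)."""
    return re.sub(pattern, replacement, value)


# Wrap each run of "b" characters in angle brackets.
print(re_replace("aaabbbaa", r"(b+)", r"<\1>"))  # aaa<bbb>aa
```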
re_search(self, pattern)
Return whether the values match `pattern`.
Returns True if the regex matches a string and False otherwise.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str \| StringValue | Regular expression to use for searching | required |
Returns:
Type | Description |
---|---|
ir.BooleanValue | Indicator of matches |
repeat(self, n)
Repeat a string `n` times.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int \| ir.IntegerValue | Number of repetitions | required |
Returns:
Type | Description |
---|---|
StringValue | Repeated string |
replace(self, pattern, replacement)
Replace each exact match of `pattern` with `replacement`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | StringValue | String pattern | required |
replacement | StringValue | String replacement | required |
Examples:
>>> import ibis
>>> table = ibis.table([('strings', 'string')])
>>> result = table.strings.replace('aaa', 'foo')  # 'aaabbbaaa' becomes 'foobbbfoo'
Returns:
Type | Description |
---|---|
StringValue | Replaced string |
reverse(self)
Reverse the characters of a string.
Returns:
Type | Description |
---|---|
StringValue | Reversed string |
right(self, nchars)
Return up to `nchars` characters from the end of each string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
nchars | int \| ir.IntegerValue | Maximum number of characters to return | required |
Returns:
Type | Description |
---|---|
StringValue | Characters from the end |
rlike(self, pattern)
Return whether the values match `pattern`.
Returns True if the regex matches a string and False otherwise.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str \| StringValue | Regular expression to use for searching | required |
Returns:
Type | Description |
---|---|
ir.BooleanValue | Indicator of matches |
rpad(self, length, pad=' ')
Pad `self` by truncating or padding on the right.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
self | None | String to pad | required |
length | int \| ir.IntegerValue | Length of output string | required |
pad | str \| StringValue | Pad character | ' ' |
Examples:
>>> import ibis
>>> table = ibis.table([('string_col', 'string')])
>>> expr = table.string_col.rpad(5, '-')
>>> expr = ibis.literal('a').rpad(5, '-')  # 'a' becomes 'a----'
>>> expr = ibis.literal('abcdefg').rpad(5, '-')  # 'abcdefg' becomes 'abcde'
Returns:
Type | Description |
---|---|
StringValue | Padded string |
rstrip(self)
Remove whitespace from the right side of string.
Returns:
Type | Description |
---|---|
StringValue | Right-stripped string |
split(self, delimiter)
Split a string on `delimiter`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
delimiter | str \| StringValue | Value to split by | required |
Returns:
Type | Description |
---|---|
ir.ArrayValue | The string split by `delimiter` |
startswith(self, start)
Determine whether `self` starts with `start`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start | str \| StringValue | Prefix to check for | required |
Examples:
>>> import ibis
>>> text = ibis.literal('Ibis project')
>>> text.startswith('Ibis')
StartsWith('Ibis project', start='Ibis')
Returns:
Type | Description |
---|---|
ir.BooleanValue | Boolean indicating whether `self` starts with `start` |
strip(self)
Remove whitespace from left and right sides of a string.
Returns:
Type | Description |
---|---|
StringValue | Stripped string |
substr(self, start, length=None)
Extract a substring.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start | int \| ir.IntegerValue | Index of the first character of the substring; indices start at 0 | required |
length | int \| ir.IntegerValue \| None | Maximum length of the substring. If not supplied, the substring extends to the end of the string | None |
Returns:
Type | Description |
---|---|
StringValue | Found substring |
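The zero-based `start` and optional `length` map directly onto Python slicing, which makes for a quick mental model. The sketch below uses a hypothetical helper and does not cover negative `start` values.

```python
from typing import Optional


def substr(value: str, start: int, length: Optional[int] = None) -> str:
    """Return the part of `value` beginning at zero-based index `start`."""
    if length is None:
        return value[start:]  # no length: run to the end of the string
    return value[start:start + length]  # at most `length` characters


print(substr("Ibis project", 5))     # project
print(substr("Ibis project", 0, 4))  # Ibis
```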
to_timestamp(self, format_str, timezone=None)
Parse a string and return a timestamp.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
format_str | str | Format string in `strptime` format | required |
timezone | str \| None | A string indicating the timezone | None |
Examples:
>>> import ibis
>>> date_as_str = ibis.literal('20170206')
>>> result = date_as_str.to_timestamp('%Y%m%d')
Returns:
Type | Description |
---|---|
ir.TimestampValue | Parsed timestamp value |
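A format string such as `'%Y%m%d'` can be tried out locally with Python's `datetime.strptime` before it is used in an expression. The sketch below is a local model with a hypothetical helper name; it leaves the `timezone` argument out, and a backend may support a slightly different set of format codes.

```python
from datetime import datetime


def to_timestamp(value: str, format_str: str) -> datetime:
    """Parse `value` according to a strptime-style `format_str`."""
    return datetime.strptime(value, format_str)


parsed = to_timestamp("20170206", "%Y%m%d")
print(parsed.isoformat())  # 2017-02-06T00:00:00
```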
translate(self, from_str, to_str)
Replace the `from_str` characters in `self` with the corresponding characters in `to_str`.
To avoid unexpected behavior, `from_str` should be shorter than `to_str`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
from_str | StringValue | Characters in `self` to replace | required |
to_str | StringValue | Characters to use for replacement | required |
Examples:
>>> import ibis
>>> table = ibis.table([('string_col', 'string')])
>>> expr = table.string_col.translate('a', 'b')
>>> expr = table.string_col.translate('a', 'bc')
Returns:
Type | Description |
---|---|
StringValue | Translated string |
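Character-by-character translation pairs the first character of `from_str` with the first of `to_str`, the second with the second, and so on. The following pure-Python sketch models that pairing with a hypothetical helper. What happens to unpaired characters is an assumption here (extra `to_str` characters are ignored, and unpaired `from_str` characters are left unchanged); backends can differ on this, which is one reason to keep the lengths compatible.

```python
def translate(value: str, from_str: str, to_str: str) -> str:
    """Map each character of `from_str` to the character at the same
    position in `to_str`; unpaired characters are ignored (assumption)."""
    table = {ord(src): dst for src, dst in zip(from_str, to_str)}
    return value.translate(table)


print(translate("banana", "a", "b"))   # bbnbnb
print(translate("banana", "a", "bc"))  # bbnbnb ("c" has no partner)
```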
upper(self)
Convert string to all uppercase.
Returns:
Type | Description |
---|---|
StringValue | Uppercase string |