Introduction
Ibis and BigQuery have worked well together for years .
In Ibis 7.0.0, they work even better together with the addition of array functionality for BigQuery.
Let’s look at some examples using BigQuery’s IMDB sample data .
Basics
First we’ll connect to BigQuery and pluck out a table to work with.
We’ll start with from ibis.interactive import *
for maximum convenience.
1 from ibis.interactive import *
2 con = ibis.connect ("bigquery://ibis-gbq" )
3 con.set_database("bigquery-public-data.imdb" )
1
from ibis.interactive import *
imports Ibis APIs into the global namespace and enables interactive mode .
2
Connect to Google BigQuery. Compute (but not storage) is billed to the project you connect to–ibis-gbq
in this case.
3
Set the database to the project and dataset that we will use for analysis.
Let’s look at the tables in this dataset:
Tables
------
- name_basics
- reviews
- title_akas
- title_basics
- title_crew
- title_episode
- title_principals
- title_ratings
Let’s pull out the name_basics
table, which contains names and metadata about people listed on IMDB. We’ll call this ents
(short for entities
), and remove some columns we won’t need:
ents = con.tables.name_basics.drop("birth_year" , "death_year" )
ents
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ string │ string │ string │ string │
├───────────┼───────────────────┼────────────────────┼─────────────────────┤
│ nm7195872 │ Amanda Goetz │ NULL │ tt4529508 │
│ nm7204100 │ Overload │ NULL │ tt4828308,tt4538296 │
│ nm7206569 │ Carl Winter │ NULL │ NULL │
│ nm7208626 │ Doug Goodin │ NULL │ NULL │
│ nm7222505 │ Rickard Finndahl │ NULL │ tt4519546 │
│ nm7226759 │ Kenneth Bell │ NULL │ tt3545908 │
│ nm7227158 │ Savannah Gardner │ NULL │ tt4028790 │
│ nm7246216 │ Elisabeth Hofmann │ NULL │ tt4586074 │
│ nm7253303 │ Wisda Febriyanti │ NULL │ tt4594232 │
│ nm7255948 │ Charles Myers │ NULL │ tt2396758 │
│ … │ … │ … │ … │
└───────────┴───────────────────┴────────────────────┴─────────────────────┘
Splitting strings into arrays
We can see that known_for_titles
looks sort of like an array, so let’s call the split
method on that column and replace the existing column:
ents = ents.mutate(known_for_titles= _.known_for_titles.split("," ))
ents
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ string │ string │ array<string> │
├───────────┼───────────────────┼────────────────────┼────────────────────────────┤
│ nm7195872 │ Amanda Goetz │ NULL │ [ 'tt4529508' ] │
│ nm7204100 │ Overload │ NULL │ [ 'tt4828308' , 'tt4538296' ] │
│ nm7206569 │ Carl Winter │ NULL │ [] │
│ nm7208626 │ Doug Goodin │ NULL │ [] │
│ nm7222505 │ Rickard Finndahl │ NULL │ [ 'tt4519546' ] │
│ nm7226759 │ Kenneth Bell │ NULL │ [ 'tt3545908' ] │
│ nm7227158 │ Savannah Gardner │ NULL │ [ 'tt4028790' ] │
│ nm7246216 │ Elisabeth Hofmann │ NULL │ [ 'tt4586074' ] │
│ nm7253303 │ Wisda Febriyanti │ NULL │ [ 'tt4594232' ] │
│ nm7255948 │ Charles Myers │ NULL │ [ 'tt2396758' ] │
│ … │ … │ … │ … │
└───────────┴───────────────────┴────────────────────┴────────────────────────────┘
Similarly for primary_profession
, since people involved in show business often have more than one responsibility on a project:
ents = ents.mutate(primary_profession= _.primary_profession.split("," ))
Array length
Let’s see how many titles each entity is known for, and then show the five people with the largest number of titles they’re known for:
This is computed using the length
API on array expressions:
(
ents.select("primary_name" , num_titles= _.known_for_titles.length())
.order_by(_.num_titles.desc())
.limit(5 )
)
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ primary_name ┃ num_titles ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ string │ int64 │
├──────────────────┼────────────┤
│ Sally Sun │ 5 │
│ Matthew Kavuma │ 5 │
│ Henry Townsend │ 5 │
│ Alex Koenigsmark │ 5 │
│ Carrie Schnelker │ 5 │
└──────────────────┴────────────┘
It seems like the length of the known_for_titles
might be capped at five!
Index
We can see the position of "actor"
in primary_profession
s:
ents.primary_profession.index("actor" )
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ArrayPosition(primary_profession, 'actor') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ int64 │
├────────────────────────────────────────────┤
│ -1 │
│ -1 │
│ -1 │
│ -1 │
│ -1 │
│ -1 │
│ -1 │
│ -1 │
│ -1 │
│ -1 │
│ … │
└────────────────────────────────────────────┘
A return value of -1
indicates that "actor"
is not present in the value:
Let’s look for entities that are not primarily actors:
We can do this using the index
method by checking whether the position of the string "actor"
is greater than zero:
actor_index = ents.primary_profession.index("actor" )
not_primarily_actors = actor_index > 0
1 not_primarily_actors.mean()
1
The average of a bool
column gives the percentage of True
values
Who are they?
ents[not_primarily_actors]
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ string │ array<string> │ array<string> │
├────────────┼────────────────────┼─────────────────────┼─────────────────────────────────────┤
│ nm2231782 │ Rene Tovar │ [ 'legal' , 'actor' ] │ [ 'tt21996928' ] │
│ nm2250015 │ Stephen Clark │ [ 'legal' , 'actor' ] │ [ 'tt1452628' , 'tt14372154' , ... +2 ] │
│ nm0352162 │ Brett Haber │ [ 'legal' , 'actor' ] │ [ 'tt1720280' , 'tt10928526' , ... +2 ] │
│ nm12169237 │ Endi Ndini │ [ 'editor' , 'actor' ] │ [ 'tt4557810' , 'tt14137514' , ... +2 ] │
│ nm14475156 │ Colby White │ [ 'editor' , 'actor' ] │ [ 'tt26313337' ] │
│ nm8979480 │ Bartosz Strusewicz │ [ 'editor' , 'actor' ] │ [ 'tt6685946' , 'tt7334964' , ... +2 ] │
│ nm3116354 │ Robert Marquis │ [ 'editor' , 'actor' ] │ [ 'tt8376014' , 'tt1283513' ] │
│ nm9962305 │ Vino Domi │ [ 'editor' , 'actor' ] │ [ 'tt8680966' , 'tt8669176' ] │
│ nm10346617 │ Lucas Oliveira │ [ 'editor' , 'actor' ] │ [ 'tt9483226' , 'tt7216954' , ... +1 ] │
│ nm7206820 │ Prince Sethi │ [ 'editor' , 'actor' ] │ [ 'tt14396686' , 'tt4219300' ] │
│ … │ … │ … │ … │
└────────────┴────────────────────┴─────────────────────┴─────────────────────────────────────┘
It’s not 100% clear whether the order of elements in primary_profession
matters here.
Containment
We can get people who are not actors using contains
:
non_actors = ents[~ ents.primary_profession.contains("actor" )]
non_actors
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ string │ string │ array<string> │ array<string> │
├────────────┼───────────────────────┼────────────────────┼──────────────────┤
│ nm13613518 │ Silvia Vannini │ [ 'legal' ] │ [] │
│ nm11482673 │ Umit Yildirim │ [ 'legal' ] │ [] │
│ nm14796117 │ Kendall Jackson │ [ 'legal' ] │ [] │
│ nm3922637 │ Michael J. Douglas │ [ 'legal' ] │ [] │
│ nm5249145 │ Christopher Addy │ [ 'legal' ] │ [] │
│ nm9235293 │ Baolu Lan │ [ 'legal' ] │ [] │
│ nm14560328 │ Jean Paul S Voilleque │ [ 'legal' ] │ [] │
│ nm11250663 │ Kelly D. Shapiro │ [ 'legal' ] │ [] │
│ nm11355058 │ Sameer Oberoi │ [ 'legal' ] │ [] │
│ nm8655635 │ James Madison │ [ 'legal' ] │ [] │
│ … │ … │ … │ … │
└────────────┴───────────────────────┴────────────────────┴──────────────────┘
Element removal
We can remove elements from arrays too.
Let’s see who only has “actor” in the list of their primary professions:
ents.filter (
[
_.primary_profession.length() > 0 ,
_.primary_profession.remove("actor" ).length() == 0 ,
]
)
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ string │ array<string> │ array<string> │
├───────────┼────────────────────┼────────────────────┼─────────────────────────────────────┤
│ nm7217990 │ Jay Isbell │ [ 'actor' ] │ [ 'tt4450682' ] │
│ nm7218053 │ Eric Crowell │ [ 'actor' ] │ [ 'tt4500196' ] │
│ nm7218081 │ John Wyman │ [ 'actor' ] │ [ 'tt4500196' ] │
│ nm7223556 │ Daniel Hope │ [ 'actor' ] │ [ 'tt4558584' , 'tt9089514' ] │
│ nm7223623 │ Marcus Troy │ [ 'actor' ] │ [ 'tt0120660' , 'tt10914400' , ... +2 ] │
│ nm7241836 │ Havár Csongor │ [ 'actor' ] │ [ 'tt4580414' ] │
│ nm7242608 │ Seigô Uetaki │ [ 'actor' ] │ [ 'tt4581192' ] │
│ nm7245253 │ Mahmoud El Faituri │ [ 'actor' ] │ [ 'tt2849138' ] │
│ nm7254729 │ Tom Keesey │ [ 'actor' ] │ [ 'tt0924155' , 'tt0924156' , ... +2 ] │
│ nm7280985 │ Gabriel Garcia │ [ 'actor' ] │ [ 'tt4629714' ] │
│ … │ … │ … │ … │
└───────────┴────────────────────┴────────────────────┴─────────────────────────────────────┘
Slicing with square-bracket syntax
Let’s remove everyone’s first profession from the list, but only if they have more than one profession listed:
ents[_.primary_profession.length() > 1 ].mutate(
primary_profession= _.primary_profession[1 :],
)
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ string │ array<string> │ array<string> │
├────────────┼────────────────────────┼────────────────────┼────────────────────────────┤
│ nm3146692 │ Keith Sutton │ [ 'legal' ] │ [ 'tt0472984' ] │
│ nm2974992 │ Dávid Farkas │ [ 'legal' ] │ [ 'tt0114301' ] │
│ nm6819046 │ Walter Batt │ [ 'legal' ] │ [ 'tt8592196' ] │
│ nm2231544 │ Don Steele │ [ 'actor' ] │ [ 'tt0818746' ] │
│ nm7267327 │ Navarro Gray │ [ 'actor' ] │ [ 'tt1718437' , 'tt7945012' ] │
│ nm7783929 │ Christopher T. Connell │ [ 'editor' ] │ [ 'tt7510258' , 'tt5262988' ] │
│ nm6894982 │ Jason Robert Moore │ [ 'editor' ] │ [ 'tt4177962' ] │
│ nm10526935 │ Damien Mota │ [ 'editor' ] │ [ 'tt9889740' ] │
│ nm7151380 │ Chris Villa │ [ 'editor' ] │ [ 'tt4479468' ] │
│ nm7641135 │ Curt Champagne │ [ 'editor' ] │ [ 'tt5097098' ] │
│ … │ … │ … │ … │
└────────────┴────────────────────────┴────────────────────┴────────────────────────────┘
Set operations and sorting
Treating arrays as sets is possible with the union
and intersect
APIs.
Let’s take a look at intersect
:
Intersection
Let’s see if we can use array intersection to figure which actors share known-for titles and sort the result:
left = ents.filter (_.known_for_titles.length() > 0 ).limit(10_000 )
right = left.view()
shared_titles = (
left
.join(right, left.nconst != right.nconst)
.select(
s.startswith("known_for_titles" ),
left_name= "primary_name" ,
right_name= "primary_name_right" ,
)
.filter (_.known_for_titles.intersect(_.known_for_titles_right).length() > 0 )
.group_by(name= "left_name" )
.agg(together_with= _.right_name.collect())
.mutate(together_with= _.together_with.unique().sort())
)
shared_titles
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ name ┃ together_with ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ array<string> │
├───────────────────────┼─────────────────────────────────────────────────────┤
│ Chief Willie Sellars │ [ 'Amy Tan' , 'Bruce Williams' , ... +13 ] │
│ Rainer Spanagel │ [ 'Andra Arnicane' , 'Aziz Sheikh' , ... +9 ] │
│ Sam Cooper │ [ 'Amy Tan' , 'Bruce Williams' , ... +10 ] │
│ Michelle Porter │ [ 'Amy Tan' , 'Bruce Williams' , ... +8 ] │
│ Bryan Carter │ [ 'Andrew Bower' , 'Austin Murtha' , ... +11 ] │
│ Timothy Johnson │ [ 'Betsy Blaney' , 'Brendan Halko' , ... +19 ] │
│ Colin McLean │ [ 'Alison Garnham' , 'Ashley Fox' , ... +20 ] │
│ Jessica Williams │ [ 'Austin Williams' ] │
│ Leighann Falcon │ [ 'Aaron Green' , 'Alejandro Garza y Garza' , ... +71 ] │
│ Oldham Tuneless Choir │ [ 'Alex Matvienko' , 'Andy McDonald' , ... +38 ] │
│ … │ … │
└───────────────────────┴─────────────────────────────────────────────────────┘
Advanced operations
Flatten arrays into rows
Thanks to the tireless efforts of the folks working on sqlglot
, as of version 7.0.0 Ibis supports unnest
for BigQuery!
You can use it standalone on a column expression:
ents.primary_profession.unnest()
┏━━━━━━━━━━━━━━━━━━━━┓
┃ primary_profession ┃
┡━━━━━━━━━━━━━━━━━━━━┩
│ string │
├────────────────────┤
│ actor │
│ actor │
│ actor │
│ actor │
│ actor │
│ actor │
│ actor │
│ actor │
│ actor │
│ actor │
│ … │
└────────────────────┘
You can also use it in select
/mutate
calls to expand the table accordingly:
ents.mutate(primary_profession= _.primary_profession.unnest())
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ string │ string │ string │ array<string> │
├───────────┼────────────────────┼────────────────────┼──────────────────┤
│ nm7211030 │ Josh Berry │ actor │ [ 'tt4046896' ] │
│ nm7211205 │ Alan Douglas │ actor │ [ 'tt0038449' ] │
│ nm7213536 │ Wilson Recalde │ actor │ [ 'tt2333598' ] │
│ nm7214355 │ Julian Owen │ actor │ [ 'tt3488298' ] │
│ nm7215983 │ Zach Ladouceur │ actor │ [ 'tt4546288' ] │
│ nm7221941 │ Alain Milani │ actor │ [ 'tt4548654' ] │
│ nm7225536 │ Gary Flanzer │ actor │ [ 'tt4521030' ] │
│ nm7236543 │ Bastiaan Schreuder │ actor │ [ 'tt4506254' ] │
│ nm7241255 │ Jared Young │ actor │ [ 'tt4579992' ] │
│ nm7241835 │ Fagyas Alex │ actor │ [ 'tt4580414' ] │
│ … │ … │ … │ … │
└───────────┴────────────────────┴────────────────────┴──────────────────┘
Unnesting can be useful when joining nested data.
Here we use unnest to find people known for any of the godfather movies:
1 basics = con.tables.title_basics.filter (
[
_.title_type == "movie" ,
_.original_title.lower().startswith("the godfather" ),
_.genres.lower().contains("crime" ),
]
)
known_for_the_godfather = (
2 ents.mutate(tconst= _.known_for_titles.unnest())
3 .join(basics, "tconst" )
4 .select("primary_title" , "primary_name" )
.distinct()
.order_by(["primary_title" , "primary_name" ])
)
known_for_the_godfather
1
Filter the title_basics
data set to only the Godfather movies
2
Unnest the known_for_titles
array column
3
Join with basics
to get movie titles
4
Ensure that each entity is only listed once and sort the results
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ primary_title ┃ primary_name ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ string │ string │
├───────────────┼─────────────────────┤
│ The Godfather │ A. Emmett Adams │
│ The Godfather │ Abe Vigoda │
│ The Godfather │ Al Lettieri │
│ The Godfather │ Al Martino │
│ The Godfather │ Al Pacino │
│ The Godfather │ Albert S. Ruddy │
│ The Godfather │ Alex Rocco │
│ The Godfather │ Andrea Eastman │
│ The Godfather │ Angelo Infanti │
│ The Godfather │ Anna Hill Johnstone │
│ … │ … │
└───────────────┴─────────────────────┘
Let’s summarize by showing how many people are known for each Godfather movie:
known_for_the_godfather.primary_title.value_counts()
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ primary_title ┃ primary_title_count ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ string │ int64 │
├────────────────────────┼─────────────────────┤
│ The Godfather Part III │ 196 │
│ The Godfather Part II │ 117 │
│ The Godfather │ 96 │
└────────────────────────┴─────────────────────┘
Filtering array elements
Filtering array elements can be done with the filter
method, which applies a predicate to each array element and returns an array of elements for which the predicate returns True
.
This method is similar to Python’s filter
function.
Let’s show all people who are neither editors nor actors:
ents.mutate(
1 primary_profession= _.primary_profession.filter (
lambda pp: ~ pp.isin(("actor" , "editor" ))
)
2 ).filter (_.primary_profession.length() > 0 )
1
This filter
call is applied to each array element
2
This filter
call is applied to the table
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ string │ string │ array<string> │ array<string> │
├────────────┼────────────────────┼────────────────────┼──────────────────┤
│ nm14701100 │ Karin Roach │ [ 'legal' ] │ [] │
│ nm3709802 │ Kristin L. Holland │ [ 'legal' ] │ [] │
│ nm13336378 │ Rok Salazar │ [ 'legal' ] │ [] │
│ nm7514782 │ Christopher Spicer │ [ 'legal' ] │ [] │
│ nm11531194 │ J Manuel │ [ 'legal' ] │ [] │
│ nm9114713 │ Huy Nguyen │ [ 'legal' ] │ [] │
│ nm2230248 │ Jeffrey Galen │ [ 'legal' ] │ [] │
│ nm8479496 │ Fatima Amgane │ [ 'legal' ] │ [] │
│ nm2229345 │ Harold Brown │ [ 'legal' ] │ [] │
│ nm7383201 │ Ashley Silver │ [ 'legal' ] │ [] │
│ … │ … │ … │ … │
└────────────┴────────────────────┴────────────────────┴──────────────────┘
Applying a function to array elements
You can apply a function to run an ibis expression on each element of an array using the map
method.
Let’s normalize the case of primary_profession to upper case:
ents.mutate(
primary_profession= _.primary_profession.map (lambda pp: pp.upper())
).filter (_.primary_profession.length() > 0 )
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ nconst ┃ primary_name ┃ primary_profession ┃ known_for_titles ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ string │ array<string> │ array<string> │
├───────────┼───────────────────┼────────────────────┼─────────────────────────────────────┤
│ nm7199328 │ Renzo Castro │ [ 'ACTOR' ] │ [ 'tt4623856' , 'tt4494580' , ... +1 ] │
│ nm7199362 │ Pankaj │ [ 'ACTOR' ] │ [ 'tt4367318' ] │
│ nm7200119 │ Thibault Péan │ [ 'ACTOR' ] │ [ 'tt4534250' ] │
│ nm7203213 │ Tim Goodman │ [ 'ACTOR' ] │ [ 'tt2234701' ] │
│ nm7207130 │ Ruupertti Arponen │ [ 'ACTOR' ] │ [ 'tt0185819' , 'tt10628202' ] │
│ nm7223822 │ Federico Carghini │ [ 'ACTOR' ] │ [ 'tt1556087' ] │
│ nm7232704 │ Ned Jackson │ [ 'ACTOR' ] │ [ 'tt0407361' , 'tt20453990' ] │
│ nm7238980 │ Rasmus Cassanelli │ [ 'ACTOR' ] │ [ 'tt0782510' ] │
│ nm7240979 │ Changyuan Zhou │ [ 'ACTOR' ] │ [ 'tt0311913' ] │
│ nm7242332 │ James Dolbeare │ [ 'ACTOR' ] │ [ 'tt10151048' , 'tt5769738' , ... +2 ] │
│ … │ … │ … │ … │
└───────────┴───────────────────┴────────────────────┴─────────────────────────────────────┘
Conclusion
Ibis has a sizable collection of array APIs that work with many different backends and as of version 7.0.0, Ibis supports a much larger set of those APIs for BigQuery!
Check out the API documentation for the full set of available methods.
Try it out, and let us know what you think.
Back to top