BigQuery

To use the BigQuery client, you will need a Google Cloud Platform account. Use the BigQuery sandbox to try the service for free.

BigQuery Quickstart

Install dependencies for Ibis’s BigQuery dialect:

pip install ibis-framework[bigquery]

Create a client by passing in the project id and dataset id you wish to operate with:

>>> con = ibis.bigquery.connect(project_id='ibis-gbq', dataset_id='testing')

By default ibis assumes that the BigQuery project that’s billed for queries is also the project where the data lives.

However, it’s very easy to query data that does not live in the billing project.

Note

When you run queries against data from other projects the billing project will still be billed for any and all queries.

If you want to query data that lives in a different project than the billing project you can use the ibis.bigquery.client.BigQueryClient.database() method of ibis.bigquery.client.BigQueryClient objects:

>>> db = con.database('other-data-project.other-dataset')
>>> t = db.my_awesome_table
>>> t.sweet_column.sum().execute()  # runs against the billing project

API

The BigQuery client is accessible through the ibis.bigquery namespace. See BigQuery for a tutorial on using this backend.

Use the ibis.bigquery.connect function to create a BigQuery client. If no credentials are provided, the pydata_google_auth.default() function fetches default credentials.

connect([project_id, dataset_id, …])

Create a BigQueryClient for use with Ibis.

BigQueryClient.database([name])

Create a database object.

BigQueryClient.list_databases([like])

BigQueryClient.list_tables([like, database])

BigQueryClient.table(name[, database])

Create a table expression.

The BigQuery client object

To use Ibis with BigQuery, you first must connect to BigQuery using the ibis.bigquery.connect() function, optionally supplying Google API credentials:

import ibis

client = ibis.bigquery.connect(
    project_id=YOUR_PROJECT_ID,
    dataset_id='bigquery-public-data.stackoverflow'
)

User Defined functions (UDF)

Note

BigQuery only supports element-wise UDFs at this time.

BigQuery supports UDFs through JavaScript. Ibis provides support for this by turning Python code into JavaScript.

The interface is very similar to the pandas UDF API:

import ibis.expr.datatypes as dt
from ibis.bigquery import udf

@udf([dt.double], dt.double)
def my_bigquery_add_one(x):
    return x + 1.0

Ibis will parse the source of the function and turn the resulting Python AST into JavaScript source code (technically, ECMAScript 2015). Most of the Python language is supported including classes, functions and generators.

When you want to use this function you call it like any other Python function–only it must be called on an ibis expression:

t = ibis.table([('a', 'double')])
expr = my_bigquery_add_one(t.a)
print(ibis.bigquery.compile(expr))

Privacy

This package is subject to the NumFocus privacy policy. Your use of Google APIs with this module is subject to each API’s respective terms of service.

Google account and user data

Accessing user data

The connect() function provides access to data stored in Google BigQuery and other sources such as Google Sheets or Cloud Storage, via the federated query feature. Your machine communicates directly with the Google APIs.

Storing user data

By default, your credentials are stored to a local file, such as ~/.config/pydata/ibis.json. All user data is stored on your local machine. Use caution when using this library on a shared machine.

Sharing user data

The BigQuery client only communicates with Google APIs. No user data is shared with PyData, NumFocus, or any other servers.

Policies for application authors

Do not use the default client ID when using Ibis from an application, library, or tool. Per the Google User Data Policy, your application must accurately represent itself when authenticating to Google API services.

Extending the BigQuery backend

  • Create a Google Cloud project.

  • Set the GOOGLE_BIGQUERY_PROJECT_ID environment variable.

  • Populate test data: python ci/datamgr.py bigquery

  • Run the test suite: pytest ibis/bigquery/tests