PySpark

Install

Install dependencies for Ibis’s PySpark dialect:

pip install 'ibis-framework[pyspark]'

or

conda install -c conda-forge ibis-pyspark

Note

When using the PySpark backend with PySpark 2.3.x or 2.4.x and pyarrow >= 0.15.0, you need to set the environment variable ARROW_PRE_0_15_IPC_FORMAT=1. See the Spark documentation on PyArrow compatibility settings for details.
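As a rough illustration, the variable can be set from Python before the Spark session is created. This sketch assumes a local session, where Spark's Python workers inherit the driver's environment; on a cluster the variable usually has to be set for the executors as well (for example in conf/spark-env.sh).

```python
import os

# Must run before the SparkSession (and its JVM) is started so that
# the setting is inherited by the Python worker processes.
# Assumption: a local, single-machine Spark session.
os.environ["ARROW_PRE_0_15_IPC_FORMAT"] = "1"
```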

Connect

The PySpark client is accessible through the ibis.pyspark namespace.

Use ibis.pyspark.connect to create a client.
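A minimal sketch of creating a client might look like the following. It assumes that connect accepts an existing SparkSession, which may differ between Ibis versions; the helper name connect_local and the application name are hypothetical and exist only for this example.

```python
def connect_local(app_name="ibis-pyspark-example"):
    """Start a local SparkSession and wrap it in an Ibis PySpark client."""
    # Imported inside the function so that defining the helper does not
    # require Spark to be installed.
    import ibis
    from pyspark.sql import SparkSession

    session = (
        SparkSession.builder
        .master("local[1]")   # single-threaded local mode, for illustration
        .appName(app_name)
        .getOrCreate()
    )
    # Assumption: connect() takes the SparkSession as its argument.
    return ibis.pyspark.connect(session)
```

Calling connect_local() then returns the client object on which the methods listed below are available.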

Backend.connect(*args, **kwargs)

Return a new client object with the saved args/kwargs, after calling .reconnect() on it.

Backend.database([name])

Return a Database object for the database with the given name.

Backend.list_databases([like])

List existing databases in the current connection.

Backend.list_tables([like, database])

Return the list of table names in the current database.

Backend.table(name[, database])

Create a table expression that references a particular table or view in the database.
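The listing methods above can be combined to get an overview of what a connection exposes. The sketch below relies only on the list_databases and list_tables signatures shown in this section; the helper name tables_by_database is hypothetical, and con stands for a client returned by ibis.pyspark.connect.

```python
def tables_by_database(con, like=None):
    """Map each database name to the list of table names it contains.

    `con` is any object exposing list_databases() and
    list_tables(like=..., database=...), such as the PySpark client.
    `like` is passed through to list_tables to filter table names.
    """
    return {
        db: con.list_tables(like=like, database=db)
        for db in con.list_databases()
    }
```

Any table name found this way can then be passed to con.table(name, database=db) to obtain a table expression.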