Backend.parquet_file(hdfs_dir, schema=None, name=None, database=None, external=True, like_file=None, like_table=None, persist=False)

Make the Parquet data in the indicated HDFS directory available as an Ibis table.

The created table can optionally be named and persisted; if no name is given, a unique one is generated. For now, Ibis attempts to drop any non-persistent external table it creates when the underlying object is garbage collected (or when the Python interpreter shuts down normally).

  • hdfs_dir (string) – Path in HDFS to the directory containing the Parquet file(s)

  • schema (ibis Schema) – If no schema is provided and neither of the like_* arguments is passed, one will be inferred from one of the Parquet files in the directory.

  • like_file (string) – Absolute path to a Parquet file in HDFS to use for the schema definition. An alternative to supplying an explicit schema.

  • like_table (string) – Fully scoped and escaped name of an Impala table whose schema will be used for the newly created table.

  • name (string, optional) – Name for the table. If omitted, a random unique name is generated.

  • database (string, optional) – Database to create the (possibly temporary) table in

  • external (boolean, default True) – If a table is external, the referenced data will not be deleted when the table is dropped in Impala. Otherwise (external=False) Impala takes ownership of the Parquet file.

  • persist (boolean, default False) – If True, do not drop the table upon Ibis garbage collection or interpreter shutdown.
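The parameters above can be combined as in the following minimal sketch. The helper name register_parquet and its "at most one schema source" check are assumptions made for illustration and are not part of Ibis; only the parquet_file call and its keyword arguments come from the signature documented here. The helper takes any connected backend object `con`, so it does not assume how the connection was created.

```python
def register_parquet(con, hdfs_dir, *, schema=None, like_file=None,
                     like_table=None, name=None, database=None,
                     persist=False):
    """Hypothetical wrapper around Backend.parquet_file.

    Passes along at most one schema source (schema, like_file or
    like_table). With none given, parquet_file is documented to infer
    the schema from one of the Parquet files in hdfs_dir.
    """
    sources = {"schema": schema, "like_file": like_file,
               "like_table": like_table}
    # Keep only the schema sources the caller actually supplied.
    given = {key: value for key, value in sources.items()
             if value is not None}
    if len(given) > 1:
        # This restriction is a choice of this sketch, not of Ibis.
        raise ValueError(
            "pass only one of: " + ", ".join(sorted(given))
        )
    return con.parquet_file(
        hdfs_dir,
        name=name,          # None: a random unique name is generated
        database=database,
        external=True,      # dropping the table leaves the data in place
        persist=persist,    # False: dropped on GC / interpreter shutdown
        **given,
    )
```

With a live connection, a call such as register_parquet(con, "/data/events", name="events", persist=True) would register the directory under a fixed name and keep the table after the session ends; the path and table name here are placeholders.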



Return type