Backend.connect(host='localhost', port=21050, database='default', timeout=45, use_ssl=False, ca_cert=None, user=None, password=None, auth_mechanism='NOSASL', kerberos_service_name='impala', pool_size=8, hdfs_client=None)

Create an Impala Backend for use with Ibis.

  • host (str, optional) – Host name of the impalad, or of HiveServer2 when connecting to Hive

  • port (int, optional) – Impala’s HiveServer2 port

  • database (str, optional) – Default database when obtaining new cursors

  • timeout (int, optional) – Connection timeout in seconds when communicating with HiveServer2

  • use_ssl (bool, optional) – Use SSL when connecting to HiveServer2

  • ca_cert (str, optional) – Local path to a third-party CA certificate, or to a copy of the server certificate for self-signed certificates. If SSL is enabled but this argument is None, certificate validation is skipped.

  • user (str, optional) – LDAP user to authenticate

  • password (str, optional) – LDAP password to authenticate

  • auth_mechanism (str, optional) – One of ‘NOSASL’ (default), ‘PLAIN’, ‘GSSAPI’, or ‘LDAP’. Use NOSASL for non-secured Impala connections. Use PLAIN for non-secured Hive clusters. Use LDAP for LDAP authenticated connections. Use GSSAPI for Kerberos-secured clusters.

  • kerberos_service_name (str, optional) – Specify a particular impalad service principal.
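
The authentication and SSL parameters above interact, so it can help to assemble and check them before calling connect. The following is a minimal sketch, not part of Ibis: build_connect_kwargs is a hypothetical helper, the host name, credentials, and certificate path are placeholders, and the rule that LDAP requires both user and password is an assumption of this sketch rather than documented behavior. Only the keyword names and the list of mechanisms come from the signature and parameter list above.

```python
import warnings

# Mechanisms listed in the auth_mechanism parameter description.
VALID_AUTH_MECHANISMS = {"NOSASL", "PLAIN", "GSSAPI", "LDAP"}


def build_connect_kwargs(
    host="localhost",
    port=21050,
    *,
    auth_mechanism="NOSASL",
    user=None,
    password=None,
    use_ssl=False,
    ca_cert=None,
    kerberos_service_name="impala",
):
    """Hypothetical helper: validate and collect keyword arguments for connect."""
    if auth_mechanism not in VALID_AUTH_MECHANISMS:
        raise ValueError(f"unknown auth_mechanism: {auth_mechanism!r}")
    # Assumption of this sketch: LDAP connections need both credentials.
    if auth_mechanism == "LDAP" and (user is None or password is None):
        raise ValueError("LDAP authentication needs both user and password")
    # Per the ca_cert description, SSL without a certificate skips validation.
    if use_ssl and ca_cert is None:
        warnings.warn("use_ssl=True with ca_cert=None skips certificate validation")
    return {
        "host": host,
        "port": port,
        "auth_mechanism": auth_mechanism,
        "user": user,
        "password": password,
        "use_ssl": use_ssl,
        "ca_cert": ca_cert,
        "kerberos_service_name": kerberos_service_name,
    }


# Placeholder values for an LDAP-authenticated connection over SSL.
ldap_kwargs = build_connect_kwargs(
    host="impala.example.com",
    auth_mechanism="LDAP",
    user="alice",
    password="secret",
    use_ssl=True,
    ca_cert="/path/to/ca.pem",
)
```

With a reachable cluster, the resulting dictionary could then be passed on as ibis.impala.connect(**ldap_kwargs).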


>>> import ibis
>>> import os
>>> hdfs_host = os.environ.get('IBIS_TEST_NN_HOST', 'localhost')
>>> hdfs_port = int(os.environ.get('IBIS_TEST_NN_PORT', 50070))
>>> impala_host = os.environ.get('IBIS_TEST_IMPALA_HOST', 'localhost')
>>> impala_port = int(os.environ.get('IBIS_TEST_IMPALA_PORT', 21050))
>>> hdfs = ibis.impala.hdfs_connect(host=hdfs_host, port=hdfs_port)
>>> hdfs  
<ibis.filesystems.WebHDFS object at 0x...>
>>> client = ibis.impala.connect(
...     host=impala_host,
...     port=impala_port,
...     hdfs_client=hdfs,
... )
>>> client  
<ibis.backends.impala.Backend object at 0x...>

Return type