This is the implementation of the Impala data handler for MindsDB. Apache Impala is an MPP (Massively Parallel Processing) SQL query engine for processing large volumes of data stored in an Apache Hadoop cluster. It is open-source software written in C++ and Java, and it offers high performance and low latency compared to other SQL engines for Hadoop. In other words, Impala provides a fast, RDBMS-like way to query data stored in the Hadoop Distributed File System (HDFS).
Prerequisites
Before proceeding, ensure the following prerequisites are met:
- Install MindsDB locally via Docker or Docker Desktop.
- To connect Apache Impala to MindsDB, install the required dependencies following these instructions.
- Install or ensure access to Apache Impala.
Implementation
This handler is implemented using impyla, a Python library that allows you to use Python code to run SQL commands on Impala.
The required arguments to establish a connection are:
- user is the username associated with the database.
- password is the password used to authenticate your access.
- host is the server IP address or hostname.
- port is the port through which the TCP/IP connection is made.
- database is the name of the database to connect to.
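As an illustration of how these five arguments could be passed to impyla, here is a minimal sketch. It is not the handler's actual code: the helper names `build_connect_kwargs` and `run_query` and all values in `example` are hypothetical placeholders. Port 21050 is Impala's default HiveServer2 port. Depending on how your cluster authenticates, impyla's `connect` may also need an `auth_mechanism` argument, which is omitted here.

```python
REQUIRED_ARGS = ("user", "password", "host", "port", "database")


def build_connect_kwargs(connection_data):
    """Check that all five arguments are present and return them as kwargs."""
    missing = [
        name for name in REQUIRED_ARGS
        if connection_data.get(name) in (None, "")
    ]
    if missing:
        raise ValueError("Missing connection arguments: " + ", ".join(missing))
    kwargs = {name: connection_data[name] for name in REQUIRED_ARGS}
    kwargs["port"] = int(kwargs["port"])  # accept "21050" as well as 21050
    return kwargs


def run_query(connection_data, sql):
    """Open a connection with impyla, run one statement, return all rows."""
    # Imported lazily so the validation helper works without impyla installed.
    from impala.dbapi import connect

    conn = connect(**build_connect_kwargs(connection_data))
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


# Placeholder values for illustration only.
example = {
    "user": "impala_user",
    "password": "impala_password",
    "host": "127.0.0.1",
    "port": "21050",
    "database": "default",
}
kwargs = build_connect_kwargs(example)
```

With a reachable Impala instance and impyla installed, `run_query(example, "SHOW TABLES")` would return the tables of the chosen database as a list of tuples.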