About Apache Kudu

Apache Kudu is a distributed columnar storage engine optimized for OLAP workloads. Kudu runs on commodity hardware, is horizontally scalable, and supports highly available operation.

Columnar Storage

A columnar database stores the data of each column independently. This allows a query to read from disk only the columns it actually uses. The cost is that operations that affect whole rows become proportionally more expensive. A columnar database is advantageous for:

  • Queries that use only a few columns out of many. 
  • Aggregating queries against large volumes of data. 
  • Column-wise data compression. 

A columnar database is a preferred choice for analytical applications because a table can include many columns "just in case" without paying any cost for the unused columns at query time. Column-oriented databases are designed for big data processing and data warehousing; they often scale natively across distributed clusters of low-cost hardware to increase throughput. 
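As a toy illustration (plain Python, not Kudu's actual storage format), the same table can be laid out row-wise or column-wise; an aggregate over one column only needs to touch that column's array in the columnar layout:

```python
# Toy illustration of row vs. column layout; not Kudu's on-disk format.
rows = [
    {"id": 1, "name": "alice", "city": "NYC", "spend": 120.0},
    {"id": 2, "name": "bob",   "city": "SF",  "spend": 80.5},
    {"id": 3, "name": "carol", "city": "NYC", "spend": 99.9},
]

# Row store: each record is kept (and read) as one unit.
row_store = rows

# Column store: each column is kept independently as its own array.
column_store = {col: [r[col] for r in rows] for col in rows[0]}

# SELECT avg(spend): the column store touches only the "spend" array...
avg_columnar = sum(column_store["spend"]) / len(column_store["spend"])

# ...while the row store must scan every full record for the same answer.
avg_row = sum(r["spend"] for r in row_store) / len(row_store)

assert avg_columnar == avg_row
```

On disk the difference is what gets read: the columnar layout reads one array, the row layout reads every field of every record.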

The main disadvantage of column-oriented databases is online transaction processing (OLTP): they are not nearly as efficient at transactional workloads as they are at online analytical processing. In other words, they are not very good at updating individual transactions; they are designed to analyze them.

Columnar storage is also not ideal when you need to view many fields from each row. Traditional row databases tend to be better for queries that seek the values of a specific record. Columnar databases can also take more time to write new data, because each column has to be written one by one.

Kudu is a columnar database: it stores data in strongly typed columns, which has the following benefits.

  • Efficiently Reading Data: Analytical queries usually read a single column, a few columns, or a portion of a column, and ignore all other columns in the table. To fulfill such a request, the database needs to read only a minimal number of blocks from disk. With a row-based store, the entire row must be read even if only a few columns are used. 
  • Efficiently Compressing Data: A column contains only a single type of data, stored contiguously on disk, so it can be compressed much more efficiently than an entire block of rows with mixed data types. 
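The compression benefit is easy to demonstrate with a sketch, using Python's zlib as a stand-in for the encodings a real column store would use: a homogeneous, low-cardinality column compresses far better than the same values interleaved with the rest of the row data.

```python
import random
import zlib

random.seed(0)
n = 10_000
countries = [random.choice(["US", "DE", "IN"]) for _ in range(n)]
user_ids = range(n)

# Row layout: mixed types interleaved record by record.
row_bytes = "".join(f"{u},{c};" for u, c in zip(user_ids, countries)).encode()
# Column layout: one homogeneous column stored contiguously.
col_bytes = "".join(countries).encode()

row_ratio = len(zlib.compress(row_bytes)) / len(row_bytes)
col_ratio = len(zlib.compress(col_bytes)) / len(col_bytes)

# The homogeneous column compresses much better than the interleaved rows.
assert col_ratio < row_ratio
```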

What is OLAP?

OLAP stands for Online Analytical Processing: source data is processed, often in near real time, to generate analytical reports and insights. OLAP also describes a category of database management system (DBMS). In business terms, OLAP allows companies to continuously plan, analyze, and report on operational activities, maximizing efficiency, reducing expenses, and ultimately growing market share. OLAP is the technology behind many Business Intelligence (BI) applications. 

OLAP v/s OLTP

Database management systems can broadly be classified into two groups: OLAP (Online Analytical Processing) and OLTP (Online Transaction Processing). OLAP systems focus on building reports, each based on large volumes of historical data, but at a relatively low query frequency, while OLTP systems handle a continuous stream of transactions and constantly modify the current state of the data.

  • To build analytical reports efficiently, it is crucial to be able to read columns separately; thus most OLAP databases are columnar. 
  • Storing columns separately increases the cost of row operations, such as appends or in-place modifications, proportionally to the number of columns, which can be huge if the system tries to collect every detail of an event just in case. Hence, most OLTP systems store data arranged by rows. 
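The row-append cost can be made concrete with a toy model (illustrative only, not how any particular engine is implemented): writing one row into a column-per-array layout costs one write per column, while a row log takes a single write regardless of table width.

```python
# Toy model only: one "write" per column touched, not a real storage engine.
columns = {name: [] for name in ("id", "ts", "user", "url", "status")}
row_log = []

def append_columnar(row: dict) -> int:
    """Appending to a column store touches every column file."""
    for name, value in row.items():
        columns[name].append(value)
    return len(row)  # number of writes grows with the column count

def append_row(row: dict) -> int:
    """A row store appends one contiguous record, regardless of width."""
    row_log.append(tuple(row.values()))
    return 1

event = {"id": 1, "ts": 1700000000, "user": "u1", "url": "/a", "status": 200}
assert append_columnar(event) == len(columns)  # 5 writes, one per column
assert append_row(event) == 1                  # a single write
```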

Benefits of Apache Kudu

There are many benefits to using Kudu for Business Intelligence and reporting. A few of them are listed below:

  • It offers fast processing for Online Analytical Processing (OLAP) workloads.
  • It has a strong and flexible consistency model.
  • You can choose consistency requirements on a per-request basis, including the option for strict-serializable consistency. 
  • Kudu has a structured data model.
  • Kudu delivers strong performance when running sequential and random workloads simultaneously. 
  • It has tight integration with Impala. 
  • It is a good alternative to using HDFS with Apache Parquet. 
  • It can integrate with Apache NiFi and Apache Spark.
  • Fine-grained authorization and access control: it can integrate with the Hive Metastore (HMS) and Apache Ranger to provide fine-grained authorization and access control. 
  • Kudu has authenticated and encrypted RPC communication. 
  • High availability: Tablet Servers and Masters use the Raft Consensus Algorithm, which ensures that as long as more than half the total number of tablet replicas is available, the tablet is available for reads and writes. For instance, if 2 out of 3 replicas (or 3 out of 5 replicas, etc.) are available, the tablet is available. Reads can be serviced by read-only follower tablet replicas, even in the event of a leader replica’s failure.
  • Automatic fault detection and self-healing: to keep data highly available, the system detects failed tablet replicas and re-replicates data from available ones, so failed replicas are automatically replaced when enough Tablet Servers are available in the cluster.
  • Location awareness (a.k.a. rack awareness) to keep the system available in case of correlated failures and allowing Kudu clusters to span over multiple availability zones.
  • Logical backup (full and incremental) and restore.
  • Multi-row transactions (only for INSERT/INSERT_IGNORE operations)

Example Applications for Apache Kudu

  • Reporting applications where newly-arrived data needs to be immediately available for end users.
  • Time-series applications that must simultaneously support:
    • Queries across large amounts of historic data.
    • Granular queries about an individual entity that must return very quickly.
  • Applications that use predictive models to make real-time decisions, with periodic refreshes of the predictive model based on all historic data.

Example Use Cases in Detail

1. Streaming Input with Near Real Time Availability

A common challenge in data analysis is one where new data arrives rapidly and constantly, and the same data needs to be available in near real time for reads, scans, and updates. Kudu offers the powerful combination of fast inserts and updates with efficient columnar scans to enable real-time analytics use cases on a single storage layer.

2. Time-series application with widely varying access patterns

A time-series schema is one in which data points are organized and keyed according to the time at which they occurred. This can be useful for investigating the performance of metrics over time or attempting to predict future behavior based on past data. For instance, time-series customer data might be used both to store purchase click-stream history and to predict future purchases, or for use by a customer support representative. While these different types of analysis are occurring, inserts and mutations may also be occurring individually and in bulk, and become available immediately to read workloads. Kudu can handle all of these access patterns simultaneously in a scalable and efficient manner.

Kudu is a good fit for time-series workloads for several reasons. With Kudu’s support for hash-based partitioning, combined with its native support for compound row keys, it is simple to set up a table spread across many servers without the risk of "hotspotting" that is commonly observed when range partitioning is used. Kudu’s columnar storage engine is also beneficial in this context, because many time-series workloads read only a few columns, as opposed to the whole row.
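A rough sketch of the idea in plain Python (Kudu's actual hash partitioning uses its own hash function and partition schema; the bucket count and key names below are illustrative assumptions): hashing on the leading column of a compound key such as (host, timestamp) spreads monotonically increasing time-series writes across buckets, instead of concentrating them in the "latest" range partition.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 4  # illustrative bucket count, not a Kudu default

def bucket_for(host: str) -> int:
    # Stable hash of the compound key's leading column.
    digest = hashlib.md5(host.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS

# Timestamps arrive in increasing order; range-partitioning on time alone
# would route every new write to the newest partition (a hot spot).
rows = [(f"host-{i % 200}", 1_700_000_000 + i) for i in range(1000)]
load = Counter(bucket_for(host) for host, _ in rows)

# Hashing on the host column spreads the write load across all buckets.
assert len(load) == NUM_BUCKETS
assert max(load.values()) < len(rows)
```

Because the timestamp remains part of the compound primary key, granular per-entity scans over a time window are still efficient within each bucket.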

In the past, you might have needed to use multiple data stores to handle different data access patterns. This practice adds complexity to your application and operations, and duplicates your data, doubling (or worse) the amount of storage required. Kudu can handle all of these access patterns natively and efficiently, without the need to off-load work to other data stores.

3. Predictive Modeling

Data scientists often develop predictive learning models from large sets of data. The model and the data may need to be updated or modified often as the learning takes place or as the situation being modeled changes. In addition, the scientist may want to change one or more factors in the model to see what happens over time. Updating a large set of data stored in files in HDFS is resource-intensive, as each file needs to be completely rewritten. In Kudu, updates happen in near real time. The scientist can tweak the value, re-run the query, and refresh the graph in seconds or minutes, rather than hours or days. In addition, batch or incremental algorithms can be run across the data at any time, with near-real-time results.

4. Combining Data In Kudu With Legacy Systems

Companies generate data from multiple sources and store it in a variety of systems and formats. For instance, some of your data may be stored in Kudu, some in a traditional RDBMS, and some in files in HDFS. You can access and query all of these sources and formats using Impala, without the need to change your legacy systems.

Components of Apache Kudu

  • Table: A table is where your data is stored in Kudu. A table has a schema and a totally ordered primary key. A table is split into segments called tablets.
  • Tablet: A tablet is a contiguous segment of a table, similar to a partition in other data storage engines or relational databases. A given tablet is replicated on multiple tablet servers, and at any given point in time, one of these replicas is considered the leader tablet. Any replica can service reads, and writes require consensus among the set of tablet servers serving the tablet.
  • Tablet Server: A tablet server stores and serves tablets to clients. For a given tablet, one tablet server acts as a leader, and the others act as follower replicas of that tablet. Only leaders service write requests, while leaders or followers each service read requests. Leaders are elected using Raft Consensus Algorithm. One tablet server can serve multiple tablets, and one tablet can be served by multiple tablet servers.
  • Master: The master keeps track of all the tablets, tablet servers, the Catalog Table, and other metadata related to the cluster. At a given point in time, there can only be one acting master (the leader). If the current leader disappears, a new master is elected using Raft Consensus Algorithm. The master also coordinates metadata operations for clients. For example, when creating a new table, the client internally sends the request to the master. The master writes the metadata for the new table into the catalog table, and coordinates the process of creating tablets on the tablet servers. All the master’s data is stored in a tablet, which can be replicated to all the other candidate masters. Tablet servers heartbeat to the master at a set interval (the default is once per second).
  • Catalog Table: The catalog table is the central location for metadata of Kudu. It stores information about tables and tablets. The catalog table may not be read or written directly. Instead, it is accessible only via metadata operations exposed in the client API. The catalog table stores two categories of metadata:
    • Tables: Table Schemas, locations and states.
    • Tablets: The list of existing tablets, which tablet servers have replicas of each tablet, the tablet's current state, and start and end keys. 
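The table/tablet relationship can be sketched with a toy model (illustrative only; in a real cluster, tablet locations are tracked by the master via the catalog table, and the server names below are hypothetical): each tablet covers a contiguous primary-key range and is replicated on several tablet servers.

```python
import bisect

# Start keys of four range-partitioned tablets; each covers [start, next_start).
tablet_start_keys = [0, 100, 200, 300]
# Tablet index -> the tablet servers holding its replicas (hypothetical names).
replicas = {
    0: ["ts1", "ts2", "ts3"],
    1: ["ts2", "ts3", "ts4"],
    2: ["ts1", "ts3", "ts4"],
    3: ["ts1", "ts2", "ts4"],
}

def tablet_for_key(key: int) -> int:
    """Route a primary key to the tablet whose range contains it."""
    return bisect.bisect_right(tablet_start_keys, key) - 1

assert tablet_for_key(0) == 0
assert tablet_for_key(150) == 1
assert tablet_for_key(999) == 3
# Every tablet is served by multiple tablet servers, and each server
# serves multiple tablets.
assert all(len(servers) == 3 for servers in replicas.values())
```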

Important Concepts of Apache Kudu

  • Raft Consensus Algorithm: Kudu uses the Raft consensus algorithm as a means to guarantee fault-tolerance and consistency, both for regular tablets and for master data. Through Raft, multiple replicas of a tablet elect a leader, which is responsible for accepting and replicating writes to follower replicas. Once a write is persisted in a majority of replicas it is acknowledged to the client. A given group of N replicas (usually 3 or 5) is able to accept writes with at most (N - 1)/2 faulty replicas.
  • Logical Replication: Kudu replicates operations, not on-disk data. This is referred to as logical replication, as opposed to physical replication. This has several advantages:
    • Although inserts and updates do transmit data over the network, deletes do not need to move any data. The delete operation is sent to each tablet server, which performs the delete locally.
    • Physical operations, such as compaction, do not need to transmit the data over the network in Kudu. This is different from storage systems that use HDFS, where the blocks need to be transmitted over the network to fulfill the required number of replicas.
    • Tablets do not need to perform compactions at the same time or on the same schedule, or otherwise remain in sync on the physical storage layer. This decreases the chances of all tablet servers experiencing high latency at the same time, due to compactions or heavy write loads.
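The majority arithmetic behind Raft's fault tolerance is simple enough to state as code (a direct restatement of the (N - 1)/2 rule above, not Kudu internals):

```python
def majority(n: int) -> int:
    # A write is acknowledged once persisted on a majority of the N replicas.
    return n // 2 + 1

def max_faulty(n: int) -> int:
    # The group keeps accepting writes with at most (N - 1) // 2 failures.
    return (n - 1) // 2

assert max_faulty(3) == 1  # 2 of 3 replicas keep the tablet available
assert max_faulty(5) == 2  # 3 of 5 replicas keep the tablet available

for n in (3, 5, 7):
    # Even with the maximum tolerated failures, a majority is still reachable.
    assert n - max_faulty(n) >= majority(n)
```

This is why replication factors are odd: going from 3 to 4 replicas adds storage cost without tolerating any additional failure.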

Apache Kudu and Its Technical Architecture

The following diagram shows a Kudu cluster with three masters and multiple tablet servers, each serving multiple tablets. It illustrates how Raft consensus is used to allow for both leaders and followers for both the masters and tablet servers. In addition, a tablet server can be a leader for some tablets, and a follower for others. Leaders are shown in gold, while followers are shown in blue.

Kudu Architecture

Kudu and Impala

Kudu has tight integration with Apache Impala, allowing you to use Impala to insert, query, update, and delete data from Kudu tablets using Impala’s SQL syntax, as an alternative to using the Kudu APIs to build a custom Kudu application. In addition, you can use JDBC or ODBC to connect existing or new applications written in any language, framework, or business intelligence tool to your Kudu data, using Impala as the broker.

Kudu and Hive Metastore

Kudu has an optional feature which allows it to integrate its own catalog with the Hive Metastore (HMS). The HMS is the de-facto standard catalog and metadata provider in the Hadoop ecosystem. When the HMS integration is enabled, Kudu tables can be discovered and used by external HMS-aware tools, even if they are not otherwise aware of or integrated with Kudu. Additionally, these components can use the HMS to discover necessary information to connect to the Kudu cluster which owns the table, such as the Kudu master addresses.

 

