Updates July'24 - Datazip is now up to 20% Faster

Our ClickHouse version has been upgraded from 23.3.3.52 LTS to 24.3.5.46 LTS. This update includes several new features and performance enhancements.

The following are the primary improvements that make ClickHouse even more efficient within the Datazip ecosystem:

Compatibility Enhancements:

These changes reduce the effort needed to rework SQL queries that are already in use.

MySQL Compatibility:

1. Enhanced Function Support

Improved support for various MySQL functions enables smoother data migrations and integrations. For example:

makeDate, fromDaysSinceYearZero, concat, toDayOfWeek

Performance improvements to the aggregate functions argMin / argMax / any / anyLast / anyHeavy, as well as to ORDER BY {u8/u16/u32/u64/i8/i16/i32/i64} LIMIT 1 queries.
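To make the link between argMin / argMax and ORDER BY ... LIMIT 1 concrete, here is a minimal Python sketch of their semantics. It is only a model of what the SQL functions return, not ClickHouse code; the sample rows and helper names are hypothetical.

```python
from typing import List, Tuple

Row = Tuple[str, int]  # (arg, val), e.g. (user name, response time)


def arg_min(rows: List[Row]) -> str:
    """Model of argMin(arg, val): the arg of the row with the smallest val."""
    # On ties Python returns the first match; ClickHouse may return any of them.
    return min(rows, key=lambda row: row[1])[0]


def arg_max(rows: List[Row]) -> str:
    """Model of argMax(arg, val): the arg of the row with the largest val."""
    return max(rows, key=lambda row: row[1])[0]


def order_by_limit_1(rows: List[Row]) -> Row:
    """Model of SELECT arg, val ... ORDER BY val LIMIT 1."""
    return sorted(rows, key=lambda row: row[1])[0]


# Hypothetical sample data, for illustration only.
rows = [("alice", 30), ("bob", 12), ("carol", 45)]

print(arg_min(rows))           # arg of the smallest val
print(arg_max(rows))           # arg of the largest val
print(order_by_limit_1(rows))  # the whole row with the smallest val
```

Both forms look for a single extreme row, which is why they are listed together above.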

2. SQL Syntax Compatibility

Enhancements to the SQL parser now allow more MySQL-specific syntax, making it easier to run MySQL queries without modification.

Notable fixes and improvements include:

  • Added the function toUInt128OrZero.

  • The MySQL-compatible locate function now takes its arguments in the order (needle, haystack[, start_pos]) by default, matching MySQL.

  • Added support for the START TRANSACTION syntax commonly used in MySQL.

  • Improvements to the MySQL compatibility protocol.
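The argument order of locate is easy to get wrong, so here is a minimal Python sketch of the MySQL-style behavior. It assumes 1-based positions, a return value of 0 when nothing is found, and ASCII input; it models the semantics and is not ClickHouse's implementation.

```python
def locate(needle: str, haystack: str, start_pos: int = 1) -> int:
    """Model of locate(needle, haystack[, start_pos]) with MySQL-style results.

    Returns the 1-based position of the first match at or after start_pos,
    or 0 when the needle does not occur.
    """
    if start_pos < 1:
        # Assumption of this sketch: only positive start positions are modeled.
        raise ValueError("start_pos must be >= 1 in this sketch")
    # str.find is 0-based and returns -1 on a miss, so adding 1 gives
    # a 1-based position, or 0 when nothing was found.
    return haystack.find(needle, start_pos - 1) + 1


print(locate("bar", "foobarbar"))     # first match
print(locate("bar", "foobarbar", 5))  # search starts at position 5
print(locate("xyz", "foobarbar"))     # no match
```

Queries written for the previous order, with the haystack first, need their first two arguments swapped.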

PostgreSQL Compatibility:

  • Expanded Data Type Support

    Better handling of PostgreSQL-specific data types ensures more seamless data transfers and application compatibility.

  • Function and Operator Enhancements

    Increased support for PostgreSQL functions and operators reduces the need for extensive query rewrites during migration.

  • Other improvements

    Added generate_series as a table function (the PostgreSQL-compatible counterpart of the existing "numbers" function). It returns a single-column table of consecutive integers running from a start value to a stop value, going up by 1 each time unless a different step is given.
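As a rough illustration of what generate_series produces, the Python sketch below models it as a generator. It assumes that both the start and stop bounds are inclusive and that an optional positive step can be passed; it is not ClickHouse code.

```python
from typing import Iterator


def generate_series(start: int, stop: int, step: int = 1) -> Iterator[int]:
    """Model of SELECT * FROM generate_series(start, stop[, step])."""
    if step <= 0:
        # Assumption of this sketch: only positive steps are modeled.
        raise ValueError("step must be positive in this sketch")
    # range() excludes its upper bound, so add 1 to make stop inclusive.
    yield from range(start, stop + 1, step)


print(list(generate_series(1, 5)))      # consecutive integers, stop included
print(list(generate_series(0, 10, 3)))  # every third integer up to 10
```

An empty result comes back when start is greater than stop.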

Interface Improvements

These improvements enable compatibility with third-party tools such as Tableau and Looker Studio.

[Figure: screenshots before and after upgrading]

Interactive UI Enhancements:

The user interface has received significant upgrades. Controls are more responsive and intuitive, which makes database management tasks more efficient, and multi-line graphs use better colors. The Advanced dashboard now keeps its controls visible while scrolling, so a new chart can be added without scrolling back up.

Improved CLI Tools

Command-line interface tools have been refined to offer more powerful and flexible options for database administrators.

Example:

Like clickhouse-local, clickhouse-client now accepts the --output-format option as a synonym for the --format option.
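For scripts that launch the client, the small Python sketch below builds the same invocation with either spelling of the option. The helper name, the sample query, and the chosen output format are hypothetical; the sketch only assembles the command line and does not run it.

```python
import shlex
from typing import List


def client_command(query: str, output_format: str, use_alias: bool = True) -> List[str]:
    """Build a clickhouse-client argument list with either option spelling."""
    # --output-format is accepted as a synonym for --format after the upgrade.
    flag = "--output-format" if use_alias else "--format"
    return ["clickhouse-client", flag, output_format, "--query", query]


# Print both spellings as shell-quoted command lines.
print(shlex.join(client_command("SELECT 1", "JSONEachRow")))
print(shlex.join(client_command("SELECT 1", "JSONEachRow", use_alias=False)))
```

The resulting list can be passed to subprocess.run on a machine where clickhouse-client is installed.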

Performance Improvements

Optimized Storage Engine

  • Enhanced performance for S3 and Azure Blob Storage integrations, particularly benefiting instances such as m6g (Graviton) or the Standard D4as and Standard Ds series with 4 CPUs and 16 GB of memory. For S3, the sharded mode of StorageS3Queue is disabled because it will be rewritten. For Azure Blob Storage, local is now allowed as an object storage type instead of local_blob_storage, and parallel reading is supported.

  • Recursion has been removed when reading from S3, and the usage of session_token in the S3 engine has been fixed.

  • Added an asynchronous WriteBuffer for Azure Blob Storage, similar to the one for S3. This improves the performance of the experimental Azure object storage.

Query Execution Speed

Query execution is significantly faster thanks to better indexing and more efficient use of CPU and memory resources.

Read more about speeding up queries in ClickHouse.

Resource Management

Advanced resource management techniques have been implemented to optimize the use of CPUs and memory, ensuring higher performance and stability under load.

Additional Performance Boosts

Usage of FINAL Keyword

Several fixes and improvements have been made to the FINAL keyword, which can be used in ClickHouse to deduplicate rows at query time. For details, see: How Deduplication in ClickHouse works?
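As a rough mental model of query-time deduplication with FINAL, the Python sketch below keeps one row per key. It assumes a ReplacingMergeTree-style table with a version column, which this post does not specify, so the row layout and function name are hypothetical and this is not how ClickHouse implements FINAL internally.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int, str]  # (sorting key, version, payload)


def select_final(rows: List[Row]) -> List[Row]:
    """Keep one row per key: the highest version, latest insert on ties."""
    latest: Dict[str, Row] = {}
    for row in rows:
        key, version, _payload = row
        kept = latest.get(key)
        # ">=" lets a later insert with the same version replace the earlier one.
        if kept is None or version >= kept[1]:
            latest[key] = row
    return list(latest.values())


# Hypothetical rows: key "u1" was written twice, with versions 1 and 2.
rows = [("u1", 1, "old"), ("u2", 1, "only"), ("u1", 2, "new")]
print(select_final(rows))
```

Without FINAL, a query can return both versions of a row until a background merge removes the older one.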

Fixes

Fixed the parts splitter for queries with the FINAL modifier.

These enhancements collectively provide a significant boost to both compatibility and performance, particularly benefiting users relying on cloud storage solutions like S3 and Azure Blob Storage.

The updates make ClickHouse a more robust and efficient choice for high-performance data analytics and management.

For detailed information, please refer to the ClickHouse Changelog.