[TiDBer Chat Session 97] TiDB v7.5.0 LTS Released, Let's Talk About the New Features You're Most Excited About in v7.5.0✨

translator_bot · June 21, 2024, 11:06am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 【TiDBer 唠嗑茶话会 97】TiDB v7.5.0 LTS 发版啦，谈谈你最期待的 v7.5.0 新特性✨

| username: 社区小助手

As the second Long-Term Support (LTS) version of the TiDB 7 series, TiDB 7.5 focuses on enhancing the stability of critical applications in large-scale scenarios. In the new version, TiDB has seen continuous improvements in scalability and performance, stability and high availability, SQL, and observability. TiDB 7.5 LTS includes new features, enhancements, and bug fixes from the previously released 7.2.0-DMR, 7.3.0-DMR, and 7.4.0-DMR versions, with over 70 optimizations and fixes in total.

For the key features and functionality of 7.5.0 in detail, see: TiDB 7.5.0 Release Notes | PingCAP Documentation Center
Release Date: December 1, 2023

TiDB Version: 7.5.0

Trial Links: Quick Start | Production Deployment | Download Offline Package

TiDB 7.5.0 is a Long-Term Support (LTS) release.

Compared to the previous LTS (version 7.1.0), version 7.5.0 includes new features, enhancements, and bug fixes from 7.2.0-DMR, 7.3.0-DMR, and 7.4.0-DMR. When upgrading from 7.1.x to 7.5.0, you can download the TiDB Release Notes PDF to view all release notes between the two LTS versions. The table below lists some key features from 7.2.0 to 7.5.0:

| Category | Feature | Description |
| --- | --- | --- |
| Scalability and Performance | Support for running multiple `ADD INDEX` statements in parallel | This feature allows jobs that add multiple indexes to the same table to run concurrently. Previously, two `ADD INDEX` statements X and Y submitted at the same time took X's time plus Y's time. Now, adding indexes X and Y in one SQL statement runs them concurrently, which significantly reduces the total time to add indexes. Internal tests show that performance can improve by up to 94% when adding multiple indexes simultaneously, especially in wide table scenarios. |
| Stability and High Availability | Optimize Global Sort (experimental feature introduced in v7.4.0) | TiDB v7.1.0 introduced the distributed parallel execution framework for backend tasks, and v7.4.0 introduced global sorting on top of it. This eliminates unnecessary I/O, CPU, and memory peaks caused by temporarily unordered data during data reorganization tasks. Global sorting uses external object storage (currently Amazon S3) to store intermediate files during system jobs, increasing flexibility and reducing costs. Operations like `ADD INDEX` and `IMPORT INTO` become faster, more flexible, more stable, and more reliable, with lower running costs. |
| Stability and High Availability | Resource control supports automatic management of background tasks (experimental feature introduced in v7.4.0) | Since v7.1.0, resource control has been an official feature, helping to mitigate resource and storage access interference between different workloads. TiDB v7.4.0 applies resource control to the priority of background tasks. Resource control can identify and manage the execution priority of background tasks, such as automatic statistics collection, backup and restore, TiDB Lightning bulk data import, and online DDL. In the future, all background tasks will be included in resource control. |
| Stability and High Availability | Resource control supports managing queries that exceed expected resource consumption (experimental feature introduced in v7.2.0) | Resource control is a framework for isolating workloads through resource groups, but it does not affect how individual queries behave within each resource group. TiDB v7.2.0 introduced resource control for runaway queries, allowing you to control how TiDB identifies and handles such queries for each resource group. Long-running queries can be terminated or throttled as needed, and you can identify queries by exact SQL text, SQL Digest, or Plan Digest. In TiDB v7.3.0, you can proactively watch for known bad queries, similar to a database-level SQL blocklist. |
| SQL | MySQL 8.0 compatibility (introduced in v7.4.0) | MySQL 8.0's default character set is `utf8mb4`, and its default collation is `utf8mb4_0900_ai_ci`. TiDB v7.4.0 enhances compatibility with MySQL 8.0, so you can more easily migrate or replicate databases created with the default collation in MySQL 8.0 to TiDB. |
| Database Management and Observability | `IMPORT INTO` statement integrates TiDB Lightning's physical import mode capabilities (GA) | Before v7.2.0, importing data from the file system required installing TiDB Lightning and using its physical import mode. Now this capability is integrated into the `IMPORT INTO` statement, allowing you to import data quickly without installing any additional tools. The statement also supports the distributed execution framework, so import tasks can run in a distributed manner, improving efficiency for large-scale data imports. |
| Stability and High Availability | Select applicable TiDB nodes for distributed execution of `ADD INDEX` or `IMPORT INTO` SQL statements (GA) | You can choose to execute `ADD INDEX` and `IMPORT INTO` SQL statements on existing TiDB nodes or on newly added TiDB nodes. This isolates their resource usage from other TiDB nodes, ensuring good performance when executing these statements while avoiding performance impacts on existing business. In v7.5.0, this feature is officially GA. |
| Stability and High Availability | DDL tasks support pause and resume operations (GA) | Adding indexes can consume a lot of resources and affect online traffic. Even with resource group limits or isolation on marked nodes, you may still need to pause these tasks in emergencies. Since v7.2.0, TiDB has natively supported pausing any number of background tasks simultaneously, releasing the required resources without canceling or restarting the tasks. |
| Stability and High Availability | TiDB Dashboard performance analysis supports TiKV heap memory analysis | In previous versions, investigating TiKV OOM or high memory usage issues often required manually running `jeprof` in the instance environment to generate a Heap Profile. From v7.5.0, TiKV supports remote processing of Heap Profiles, allowing you to obtain flame graphs and call graphs of Heap Profiles directly through TiDB Dashboard. This feature provides the same ease of use as Go heap memory analysis. |
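The two resource control rows in the table can be illustrated with a short sketch. It assumes the syntax described in the TiDB resource control documentation for v7.4/v7.5, where these capabilities were still experimental. The group name `rg1`, the 500 RU/s quota, the 60-second threshold, and the watched query text are placeholders chosen for illustration, not values from the release notes.

```sql
-- Hypothetical resource group: a query running longer than 60s is treated as
-- a runaway query and killed; its exact SQL text is then watched for 10 minutes.
CREATE RESOURCE GROUP IF NOT EXISTS rg1
  RU_PER_SEC = 500
  QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=KILL, WATCH=EXACT DURATION='10m');

-- v7.3.0 and later: add a known bad query to the watch list ahead of time.
QUERY WATCH ADD RESOURCE GROUP rg1 SQL TEXT EXACT TO 'select * from test.t2';

-- v7.4.0 and later: mark BR and DDL work in the default group as background tasks.
ALTER RESOURCE GROUP `default` BACKGROUND=(TASK_TYPES='br,ddl');
```

Check the documentation for your exact version before relying on these options, since experimental syntax can change between releases.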

Feature Details

Scalability

Support setting the service scope of TiDB nodes to select applicable TiDB nodes for distributed execution of `ADD INDEX` or `IMPORT INTO` tasks (GA) #46258 @ywqzzy

In resource-intensive clusters, parallel execution of `ADD INDEX` or `IMPORT INTO` tasks may occupy a large amount of TiDB node resources, leading to cluster performance degradation. To avoid performance impacts on existing business, v7.4.0 introduced the variable `tidb_service_scope` as an experimental feature to control the service scope of each TiDB node under the TiDB backend task distributed framework. You can select several nodes from existing TiDB nodes or set the service scope for newly added TiDB nodes, and all distributed execution tasks of `ADD INDEX` and `IMPORT INTO` will only run on these nodes. In v7.5.0, this feature is officially GA. For more information, please refer to the user documentation.
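As a minimal sketch of how a node might be marked for these tasks: it assumes the v7.5 documentation's description of `tidb_service_scope`, where the value `background` designates a node for distributed tasks and the setting applies only to the TiDB instance you are connected to.

```sql
-- Run this while connected directly to the TiDB node that should execute
-- distributed ADD INDEX / IMPORT INTO tasks (not through a load balancer).
SET GLOBAL tidb_service_scope = 'background';

-- Confirm the value on this instance.
SELECT @@global.tidb_service_scope;
```

Because the setting is per instance, repeat it on each node you want in the pool.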

Performance

The TiDB backend task distributed parallel execution framework becomes an official feature (GA), improving the performance and stability of parallel execution of `ADD INDEX` or `IMPORT INTO` tasks #45719 @wjhuang2016

Introduced in v7.1.0, the TiDB backend task distributed parallel execution framework becomes an official feature (GA). Before TiDB v7.1.0, only one TiDB node could execute DDL tasks at a time. From v7.1.0, under the distributed parallel execution framework, multiple TiDB nodes can execute the same DDL task in parallel. From v7.2.0, the framework also supports multiple TiDB nodes executing the same `IMPORT INTO` task in parallel, making better use of TiDB cluster resources and significantly improving the performance of DDL and `IMPORT INTO` tasks. Additionally, you can improve the performance of DDL and `IMPORT INTO` tasks linearly by adding TiDB nodes. To use the distributed parallel execution framework, set the value of `tidb_enable_dist_task` to `ON`:

SET GLOBAL tidb_enable_dist_task = ON;

For more information, please refer to the user documentation.

Improved the performance of adding multiple indexes in one SQL statement #41602 @tangenta

Before v7.5.0, adding multiple indexes in one SQL statement (`ADD INDEX`) performed about the same as adding them with multiple independent SQL statements. From v7.5.0, the performance of adding multiple indexes in one SQL statement is significantly improved, especially in wide table scenarios, with internal tests showing an improvement of up to 94%.
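For illustration, "multiple indexes in one SQL statement" means a single `ALTER TABLE` with several `ADD INDEX` clauses. The table and column names below are hypothetical.

```sql
-- One statement, two indexes: from v7.5.0 both are built concurrently.
ALTER TABLE orders
  ADD INDEX idx_customer (customer_id),
  ADD INDEX idx_created (created_at);
```

The alternative being compared against is two separate `ALTER TABLE ... ADD INDEX` statements, one per index.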

Database Management

Pause and resume operations for DDL tasks become an official feature (GA) #18015 @godouxm

The pause and resume functionality for DDL tasks introduced in v7.2.0 becomes an official feature (GA). It lets you temporarily pause resource-intensive DDL operations (such as creating indexes) to save resources and minimize the impact on online traffic. When resources allow, you can resume the DDL tasks without canceling and restarting them. This improves resource utilization, enhances user experience, and simplifies the schema change process. You can pause or resume multiple DDL tasks using the `ADMIN PAUSE DDL JOBS` and `ADMIN RESUME DDL JOBS` statements:

ADMIN PAUSE DDL JOBS 1,2;
ADMIN RESUME DDL JOBS 1,2;

For more information, please refer to the user documentation.
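The job IDs passed to these statements come from the DDL job list. A rough sketch of the full workflow follows; the job ID `101` is a placeholder.

```sql
-- List recent DDL jobs; note the JOB_ID and STATE columns.
ADMIN SHOW DDL JOBS;

-- Pause a running job by its ID, then confirm the state change.
ADMIN PAUSE DDL JOBS 101;
ADMIN SHOW DDL JOBS;

-- Resume it once resources allow.
ADMIN RESUME DDL JOBS 101;
```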

BR supports backup and restore of statistics #48008 @Leavrth

From TiDB v7.5.0, the BR backup tool supports backing up and restoring database statistics, and introduces the parameter `--ignore-stats` in the backup command. When this parameter is set to `false`, BR backs up and restores column, index, and table-level statistics. As a result, after restoring a TiDB database from a backup you no longer need to run statistics collection manually or wait for automatic collection to complete, which simplifies database maintenance and improves query performance. For more information, please refer to the user documentation.

Observability

TiDB Dashboard performance analysis supports TiKV heap memory analysis #15927 @Connor1996

In previous versions, investigating TiKV OOM or high memory usage issues often required manually running `jeprof` in the instance environment to generate a Heap Profile. From v7.5.0, TiKV supports remote processing of Heap Profiles, allowing you to directly obtain flame graphs and call graphs of Heap Profiles through TiDB Dashboard. This feature provides the same ease of use as Go heap memory analysis. For more information, please refer to the user documentation.

Data Migration

`IMPORT INTO` SQL statement becomes an official feature (GA) #46704 @D3Hunter

In v7.5.0, the `IMPORT INTO` SQL statement officially becomes GA. This statement integrates the capabilities of TiDB Lightning's physical import mode, allowing you to quickly import data in CSV, SQL, and PARQUET formats into an empty table in TiDB. This import method eliminates the need to separately deploy and manage TiDB Lightning, reducing the difficulty of data import while significantly improving data import efficiency. For more information, please refer to the user documentation.
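A minimal sketch of the statement, assuming an empty target table `t` and a CSV file with a header row. The table name and file path are placeholders, and a local path must be readable from the TiDB node that handles the session.

```sql
-- Import a CSV file into the empty table t, skipping the header line.
IMPORT INTO t FROM '/path/to/file.csv' WITH skip_rows=1;

-- Inspect the status of import jobs.
SHOW IMPORT JOBS;
```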

Data Migration (DM) supports intercepting incompatible (data consistency-breaking) DDL changes #9692 @GMHDBJD

Before v7.5.0, DM's Binlog Filter feature could only migrate or filter specified Events at a coarse granularity, such as filtering all `ALTER` DDL Events. This was limiting in some business scenarios: for example, you might want to allow `ADD COLUMN` but not `DROP COLUMN`, yet previous DM versions would filter all `ALTER` Events. v7.5.0 therefore refines the granularity of DDL Event handling and supports filtering fine-grained DDL Events such as `MODIFY COLUMN` (modifying column data types) and `DROP COLUMN`, which can cause data loss, truncation, or precision loss. You can configure this as needed. DM also supports intercepting incompatible DDL changes and reporting errors, so that you can intervene manually and avoid impacting downstream business data. For more information, please refer to the user documentation.

Support real-time updating of incremental data validation checkpoints #8463 @lichunzhu Before

translator_bot · June 21, 2024, 11:06am

| username: Jellybean | Original post link

v7.5 LTS has many useful features that can be tested. The most anticipated ones are the optimization of resource control and the support for pausing and resuming DDL.

translator_bot · June 21, 2024, 11:06am

| username: ShawnYan | Original post link

7.5 LTS, worth having~~

Supports running multiple ADD INDEX statements in parallel.
Especially in the case of wide tables, internal test data shows that adding multiple indexes simultaneously can improve performance by up to 94%.

translator_bot · June 21, 2024, 11:07am

| username: xfworld | Original post link

The biggest feature is the support for MySQL 8.x

translator_bot · June 21, 2024, 11:07am

| username: Kongdom | Original post link

MySQL 8.0 Compatibility

translator_bot · June 21, 2024, 11:07am

| username: Jolyne | Original post link

Compatibility with MySQL 8

translator_bot · June 21, 2024, 11:07am

| username: tidb菜鸟一只 | Original post link

- Through TiDB Dashboard, we can directly obtain the flame graph and call graph of TiKV.
  I’m quite interested in this.

translator_bot · June 21, 2024, 11:07am

| username: 啦啦啦啦啦 | Original post link

Parallel ADD INDEX

translator_bot · June 21, 2024, 11:07am

| username: Hacker007 | Original post link

Support running multiple ADD INDEX statements in parallel.

translator_bot · June 21, 2024, 11:07am

| username: LI-ldc | Original post link

Running multiple ADD INDEX statements in parallel can improve performance by up to 94%. That’s amazing!

translator_bot · June 21, 2024, 11:07am

| username: dba远航 | Original post link

Trying it out!! It’s really great.

translator_bot · June 21, 2024, 11:07am

| username: DBRE | Original post link

MySQL 8.0 compatibility

translator_bot · June 21, 2024, 11:07am

| username: 孤君888 | Original post link

Compatible with MySQL 8

translator_bot · June 21, 2024, 11:07am

| username: chenhanneu | Original post link

Compatible, the default character set for MySQL 8.0 is utf8mb4, with the default collation utf8mb4_0900_ai_ci.

translator_bot · June 21, 2024, 11:07am

| username: TIDB-Learner | Original post link

The resource control support is worth looking forward to. Compatibility with 8.0 is the general trend. If resource management becomes more automated and intelligent, users will look forward to it even more.

translator_bot · June 21, 2024, 11:07am

| username: 像风一样的男子 | Original post link

When will stored procedures be supported? It’s urgent!

translator_bot · June 21, 2024, 11:07am

| username: TIDB-Learner | Original post link

The community edition will not support it. The enterprise edition will support it.

translator_bot · June 21, 2024, 11:07am

| username: 半瓶醋仙 | Original post link

Version 7.5 LTS eliminates unnecessary I/O, CPU, and memory spikes caused by temporary unordered data during data reorganization tasks. Global sorting uses external object storage (currently Amazon S3) to store intermediate files during system jobs, increasing flexibility and reducing costs.

translator_bot · June 21, 2024, 11:07am

| username: Aionn | Original post link

Support for MySQL 8.0+. Although some old systems are still on 5.7, 8.0+ is being used more and more.

translator_bot · June 21, 2024, 11:07am

| username: Tank001 | Original post link

MySQL 8.0 compatibility