[TiDBer Chat Session 53] Version 6.5.0 Released, What Features Are You Most Looking Forward To?

translator_bot · June 22, 2024, 8:38pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 【TiDBer 唠嗑茶话会 53 】6.5.0 版本上线，说说你最期待的特性是哪些？

| username: Billmay表妹

TiDB 6.5.0 Release Notes

Release Date: December 29, 2022

TiDB Version: 6.5.0

Trial Links: Quick Start | Production Deployment | Download Offline Package

TiDB 6.5.0 is a Long-Term Support (LTS) release.

Compared to the previous LTS (version 6.1.0), version 6.5.0 includes new features, improvements, and bug fixes from 6.2.0-DMR, 6.3.0-DMR, and 6.4.0-DMR, and introduces the following key features:

The Add Index Acceleration feature is GA, improving index creation performance by approximately 10 times compared to v6.1.0.
The TiDB global memory control feature is GA; the global memory threshold can be managed via tidb_server_memory_limit.
The high-performance, globally monotonically increasing AUTO_INCREMENT column attribute is GA and compatible with MySQL.
FLASHBACK CLUSTER TO TIMESTAMP adds compatibility with TiCDC and PITR and is now GA.
The optimizer introduces a more accurate cost model, Cost Model Version 2 (GA), and index merge (INDEX MERGE) now supports expressions connected by AND.
Pushing down the JSON_EXTRACT() function to TiFlash is supported.
Password management policies are supported to meet password compliance audit requirements.
TiDB Lightning and Dumpling support importing and exporting compressed files.
The TiDB Data Migration (DM) incremental data validation feature is GA.
TiDB snapshot backup supports resuming after an interruption, PITR recovery performance is improved by 50%, and RPO is reduced to 5 minutes in common scenarios.
For TiCDC replication to Kafka, throughput increases from 4000 rows per second to 35000 rows per second, and replication latency is reduced to 2 seconds.
Row-level Time to Live (TTL) data lifecycle management is provided (experimental feature).
TiCDC supports replicating change data to storage sinks such as Amazon S3, Azure Blob Storage, and NFS (experimental feature).
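
As a rough sketch of the global memory control item in the list above: the threshold is an ordinary global system variable, so it can be set with SET GLOBAL. The 32GB value is an arbitrary placeholder for illustration, not a recommendation.

```sql
-- Placeholder value; pick a limit that fits the memory of the TiDB host.
SET GLOBAL tidb_server_memory_limit = '32GB';

-- Read back the current global setting.
SELECT @@global.tidb_server_memory_limit;
```
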

Topic of This Session:

Version 6.5.0 has been released. Which features are you most looking forward to?

Activity Rewards:

Participation Award

Participate in the topic discussion to earn 50 points~

Activity Time:

2022.12.30 to 2023.1.6. Come and set your flag~

More details on version 6.5.0 follow:

New Features

SQL

TiDB’s index addition performance is improved by approximately 10 times (GA) #35983 @benjamin2037 @tangenta TiDB v6.3.0 introduced Add Index Acceleration as an experimental feature to speed up the backfill phase of index creation. The feature is GA in v6.5.0 and enabled by default, with an expected improvement of approximately 10 times when adding indexes to large tables. Acceleration applies when a single SQL statement adds indexes at a time; when multiple SQL statements add indexes in parallel, only one of them is accelerated.
Provides lightweight metadata locks to improve the success rate of DML during DDL changes (GA) #37275 @wjhuang2016 TiDB v6.3.0 introduced metadata locks as an experimental feature, coordinating the priority of DML and DDL statements during table metadata changes to avoid Information schema is changed errors for DML statements. This feature is officially GA in v6.5.0 and enabled by default, applicable to various DDL change scenarios. When upgrading from versions before v6.5.0 to v6.5.0 and later, TiDB automatically enables this feature by default. If you need to disable this feature, you can set the system variable tidb_enable_metadata_lock to OFF. For more information, please refer to the user documentation.
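
A minimal sketch of the switch mentioned above, assuming you have the privileges to change global system variables:

```sql
-- Metadata locks are enabled by default in v6.5.0; this turns them off cluster-wide.
SET GLOBAL tidb_enable_metadata_lock = OFF;

-- Confirm the current value.
SELECT @@global.tidb_enable_metadata_lock;
```
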
Supports quickly rolling back the cluster to a specific point in time using the FLASHBACK CLUSTER TO TIMESTAMP command (GA) #37197 #13303 @Defined2014 @bb7133 @JmPotato @Connor1996 @HuSharp @CalvinNeo TiDB v6.4.0 introduced the FLASHBACK CLUSTER TO TIMESTAMP statement as an experimental feature, supporting quick rollback of the entire cluster to a specified point in time within the Garbage Collection (GC) life time. This feature adds compatibility support for TiCDC and PITR in v6.5.0 and is officially GA, suitable for quickly undoing DML misoperations, supporting minute-level quick rollback of the cluster, and supporting multiple rollbacks on the timeline to determine the time of specific data changes. For more information, please refer to the user documentation.
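
For illustration, a rollback could look like the following. The timestamp is a made-up placeholder; as described above, it has to fall within the GC life time, which you can check first.

```sql
-- See how far back the cluster keeps history.
SELECT @@global.tidb_gc_life_time;

-- Placeholder timestamp: roll the whole cluster back to this point in time.
FLASHBACK CLUSTER TO TIMESTAMP '2022-12-29 10:00:00';
```

Because this rewinds the entire cluster, it is worth trying on a test cluster before relying on it in production.
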
Fully supports non-transactional DML statements including INSERT, REPLACE, UPDATE, and DELETE #33485 @ekexium In large-scale data processing scenarios, a single SQL statement that runs as one large transaction may affect cluster stability and performance. A non-transactional DML statement is split into multiple SQL statements that are executed internally one after another. The split statements sacrifice transactional atomicity and isolation but greatly improve cluster stability. TiDB has supported non-transactional DELETE statements since v6.1.0, and v6.5.0 adds support for non-transactional INSERT, REPLACE, and UPDATE statements. For more information, please refer to Non-transactional DML Statements and BATCH Statements.
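
To make the splitting concrete, here is a small sketch using the BATCH syntax. The table t(id INT PRIMARY KEY, v INT) and the batch size of 1000 are hypothetical.

```sql
-- Preview how the statement would be split, without changing any data.
BATCH ON id LIMIT 1000 DRY RUN UPDATE t SET v = v + 1 WHERE v < 6;

-- Execute it: rows are processed in batches of up to 1000, split on the id column,
-- and each batch commits on its own (so the statement as a whole is not atomic).
BATCH ON id LIMIT 1000 UPDATE t SET v = v + 1 WHERE v < 6;
```

If one batch fails, the earlier batches stay committed, so the statement should be safe to re-run.
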
Supports Time to live (TTL) (experimental feature) #39262 @lcwangchao TTL provides row-level lifecycle control policies. In TiDB, tables with TTL attributes will automatically check and delete expired row data based on the configuration. The goal of TTL design is to help users periodically and timely clean up unnecessary data without affecting online read and write loads. For more information, please refer to the user documentation.
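
As a sketch of what a TTL table looks like (the table name, columns, and the 3-month interval are hypothetical):

```sql
-- Rows become eligible for automatic deletion 3 months after created_at.
CREATE TABLE t1 (
    id INT PRIMARY KEY,
    created_at TIMESTAMP
) TTL = `created_at` + INTERVAL 3 MONTH;

-- Pause the automatic cleanup for this table while keeping the TTL definition.
ALTER TABLE t1 TTL_ENABLE = 'OFF';
```
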
Supports saving TiFlash query results using INSERT INTO SELECT statements (experimental feature) #37515 @gengliqi Starting from v6.5.0, TiDB supports pushing down the SELECT clause (the analytical query) of an INSERT INTO SELECT statement to TiFlash, so you can conveniently save TiFlash query results into the TiDB table specified by INSERT INTO for subsequent analysis, which works as result caching (that is, result materialization). For example:

INSERT INTO t2 SELECT Mod(x,y) FROM t1;

In the experimental feature stage, this feature is disabled by default. To enable this feature, set the system variable tidb_enable_tiflash_read_for_write_stmt to ON. When using this feature, there are no special restrictions on the result table specified by INSERT INTO, and you can freely choose whether to add TiFlash replicas to the table. Typical use cases for this feature include:

Using TiFlash for complex analysis
Repeatedly using TiFlash query results or responding to high-concurrency online requests
The result set is relatively small compared to the input data of the query (recommended: within 100 MiB)

For more information, please refer to the user documentation.
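
Putting the pieces above together, a session might look roughly like this. The tables t1 (source) and t2 (result) are the hypothetical ones from the example statement; whether the SELECT is actually pushed down still depends on the optimizer and on the TiFlash replica being available, which can be checked with EXPLAIN.

```sql
-- Give the source table a TiFlash replica (it takes some time to become available).
ALTER TABLE t1 SET TIFLASH REPLICA 1;

-- The feature is off by default while experimental; enable it for this session.
SET tidb_enable_tiflash_read_for_write_stmt = ON;

-- Save the analytical query result into a regular TiDB table.
INSERT INTO t2 SELECT Mod(x, y) FROM t1;
```
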
Supports binding historical execution plans (experimental feature) #39199 @fzzf678 Because many factors affect how a SQL statement is executed, a previously optimal execution plan may occasionally be replaced by a new one, hurting SQL performance. In such cases, the optimal execution plan may still be in the SQL execution history and not yet cleared. In v6.5.0, TiDB extends the binding objects in the CREATE [GLOBAL | SESSION] BINDING statement to support creating bindings based on historical execution plans. When the execution plan of a SQL statement changes, as long as the original execution plan is still in the SQL execution history memory table (e.g., statements_summary), you can bind the original execution plan by specifying plan_digest in the CREATE [GLOBAL | SESSION] BINDING statement to quickly restore SQL performance. This method simplifies the handling of sudden execution plan changes and improves operational efficiency. For more information, please refer to the user documentation.
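
A sketch of that workflow, assuming the earlier plan is still recorded in statements_summary. The LIKE pattern and the digest are placeholders to fill in for your own statement.

```sql
-- Look up the digest of the earlier, better plan.
SELECT query_sample_text, plan_digest
FROM information_schema.statements_summary
WHERE query_sample_text LIKE '%FROM t1%';

-- Bind that plan; replace the placeholder with a plan_digest returned above.
CREATE GLOBAL BINDING FROM HISTORY USING PLAN DIGEST '<plan_digest>';
```
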

Security

Supports password complexity policies #38928 @CbcWestwolf When TiDB enables the password complexity policy feature, TiDB will check the password length, the number of uppercase and lowercase characters, the number of numeric characters, the number of special characters, password dictionary matching, and whether it is the same as the username when users set passwords, ensuring that users set secure passwords. TiDB supports the password strength check function VALIDATE_PASSWORD_STRENGTH(), which is used to determine the strength of a given password. For more information, please refer to the user documentation.
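
For example, the policy could be switched on and probed like this. The minimum length of 10 and the sample password are arbitrary illustrations.

```sql
-- Turn on password complexity checks.
SET GLOBAL validate_password.enable = ON;

-- Require at least 10 characters (illustrative value).
SET GLOBAL validate_password.length = 10;

-- Score a candidate password; the function returns an integer strength value.
SELECT VALIDATE_PASSWORD_STRENGTH('Abc!23456789');
```
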
Supports password expiration policies #38936 @CbcWestwolf TiDB supports password expiration policies, including manual password expiration, global-level automatic password expiration, and account-level automatic password expiration. When the password expiration policy feature is enabled, users must regularly change their passwords to prevent the risk of password leakage due to long-term use, improving password security. For more information, please refer to the user documentation.
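
The three levels described above might be used as follows. The account 'app'@'%' and the day counts are hypothetical.

```sql
-- Manual: force this account to change its password at next login.
ALTER USER 'app'@'%' PASSWORD EXPIRE;

-- Global: passwords expire after 180 days unless an account overrides it.
SET GLOBAL default_password_lifetime = 180;

-- Account-level: this account's password expires every 90 days.
ALTER USER 'app'@'%' PASSWORD EXPIRE INTERVAL 90 DAY;
```
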
Supports password reuse policies #38937 @keeplearning20221 TiDB supports password reuse policies, including global-level password reuse policies and account-level password reuse policies. When the password reuse policy feature is enabled, users cannot use passwords that have been used recently or in the last few times, reducing the risk of password leakage due to repeated use and improving password security. For more information, please refer to the user documentation.
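
A sketch of both levels, again with a hypothetical account and arbitrary numbers:

```sql
-- Global: disallow reusing any of the last 5 passwords, or any password used in the last 365 days.
SET GLOBAL password_history = 5;
SET GLOBAL password_reuse_interval = 365;

-- Account-level override for one user.
ALTER USER 'app'@'%' PASSWORD HISTORY 3 PASSWORD REUSE INTERVAL 90 DAY;
```
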
Supports failed-login tracking and temporary account locking policies #38938 @lastincisor When this policy is enabled in TiDB, an account is temporarily locked if the user enters the wrong password several times in a row when logging in, and it is automatically unlocked once the lock time has passed. For more information, please refer to the user documentation.
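
As an illustration, the lock policy is attached per account. The user name, password, and thresholds below are placeholders.

```sql
-- Lock the account for 1 day after 3 consecutive failed logins.
CREATE USER 'app2'@'%' IDENTIFIED BY 'Placeholder#Pwd1'
    FAILED_LOGIN_ATTEMPTS 3 PASSWORD_LOCK_TIME 1;

-- An administrator can unlock the account before the lock time runs out.
ALTER USER 'app2'@'%' ACCOUNT UNLOCK;
```
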

Observability

TiDB Dashboard supports independent Pod deployment in Kubernetes environments #1447 @SabaPing Starting from TiDB v6.5.0 and TiDB Operator v1.4.0, TiDB Dashboard can be deployed as an independent Pod on Kubernetes. In the TiDB Operator environment, you can directly access the IP of this Pod to open TiDB Dashboard. Independent deployment of TiDB Dashboard offers the following benefits:
- TiDB Dashboard’s computation will no longer put pressure on PD nodes, better ensuring cluster operation.
- If PD nodes are inaccessible due to exceptions, you can still use TiDB Dashboard for cluster diagnostics.
- When exposing TiDB Dashboard to the public network, you don’t have to worry about the privilege port permissions of PD, reducing cluster

| username: tidb菜鸟一只 | Original post link

It has to be the optimizer. The SQL optimizer is the lifelong enemy of DBAs.

| username: 裤衩儿飞上天 | Original post link

The optimizer introduces a more accurate cost model, Cost Model Version 2 (GA), and index merge (INDEX MERGE) now supports expressions connected by AND.
The Add Index Acceleration feature is GA, improving the performance of adding indexes by approximately 10 times compared to version v6.1.0.
TiDB snapshot backup supports resumable uploads, and the recovery performance of PITR has improved by 50%, reducing RPO to 5 minutes in general scenarios.

| username: 数据小黑 | Original post link

Several features are eye-catching:

TiCDC supports outputting change data to storage sink.
Full support for non-transactional DML statements including INSERT, REPLACE, UPDATE, and DELETE.
Support for Time to Live (TTL) (experimental feature).

| username: 啦啦啦啦啦 | Original post link

TiDB’s global memory control should be able to reduce OOM situations.

| username: TiDBer_杨龟干外公 | Original post link

Looking forward to TiCDC replicating data to Kafka.

| username: TiDBer_jYQINSnf | Original post link

For this feature, shout out loud, “Awesome!”
Since version 4.0, I’ve been hoping that TiDB could manage its own memory well, preferring to reject SQL rather than OOM. Now it’s finally achieved, thumbs up

Additionally, one thing that is not mentioned: System Variables | PingCAP Docs
This triggers GC based on memory usage, which is also very important. However, it can only be triggered once per minute, which is a bit too restrictive. If it could be triggered without limit, triggering GC when memory reaches 70%, then in some cases, we could turn off Go’s automatic GC and rely on this to reclaim memory, as Go’s GC significantly impacts performance.

| username: 边城元元 | Original post link

TiDB Global Memory Control

| username: xingzhenxiang | Original post link

The TiDB global memory control feature has reached GA, and you can manage the global memory threshold through tidb_server_memory_limit. My memories of OOM (Out of Memory) run too deep.

| username: luoscorn | Original post link

TiCDC. Our business data pipelines are becoming increasingly complex.

| username: LI-ldc | Original post link

TiDB Global Memory Control;
Support for [Password Management], which meets password compliance audit requirements. We can finally schedule our compliance reviews.

| username: waeng | Original post link

TiDB Global Memory Control

| username: ShawnYan | Original post link

Finally, JSON format output is supported. I’ve been looking forward to this for quite a while.

| username: Billmay表妹 | Original post link

Yes, the suggestions everyone provided last time have been implemented quickly~

| username: 天蓝色的小九 | Original post link

TiDB Global Memory Control

| username: danghuagood | Original post link

TiCDC synchronizes data to Kafka, increasing throughput from 4000 rows per second to 35000 rows per second, and reducing replication latency to 2 seconds.

| username: TiDBer_CQ | Original post link

TiCDC synchronizes data to Kafka, increasing throughput from 4,000 rows per second to 35,000 rows per second, and reducing replication latency to 2 seconds.

| username: ti-tiger | Original post link

This feature is quite interesting, looking forward to it.

| username: dapan3927 | Original post link

TiDB Lightning and Dumpling support importing and exporting compressed format files.