Is Local Index Supported and What is the Current Progress?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: local索引是否已经支持以及目前进度

| username: 青木科技-熊猛

Currently, it is found that the SQL performance degrades significantly during high-concurrency OLAP aggregation queries on a single table. Relevant data:
Base table: 60 million records
TiKV nodes: 2
TiFlash: 2 replicas
Slowest single SQL query time: within 0.8 seconds
Number of concurrent SQL queries: 200 within 1 second, all are aggregation queries.
Slowest query time after concurrency: 10 seconds

We are currently using physical sharding to reduce the single table data volume to within 4 million, which makes the performance degradation less severe, generally within 3 seconds.

However, maintaining physical sharding is very troublesome, so we are considering partitioning. But after researching, we found that partitioning does not support local indexes, so we want to understand the progress of local index support.

Related question: 分区表的索引是如何存储的? - TiDB 的问答社区

| username: zhanggame1 | Original post link

Currently, there are no local indexes in the future plans.

| username: redgame | Original post link

It shouldn’t be doing this.

| username: tidb菜鸟一只 | Original post link

Do you only have 2 TiKV nodes now? If your slow SQL is using TiKV, you should add more nodes. If it’s using TiFlash, then partitioning won’t be very useful.

| username: h5n1 | Original post link

After partitioning, it is actually local. TiDB does not have global indexes yet.

| username: Defined2014 | Original post link

Currently, in TiDB, the indexes of each partition in a partitioned table do not exist together. Each partition in a partitioned table can be considered as a separate physical table.

| username: 青木科技-熊猛 | Original post link

In fact, partitioning is still useful. The concern with not partitioning is that even with an index, TiFlash still performs a table full scan, which results in reading a lot of data (for example, the entire table data is 60 million), while the actual useful data is very little (single store single month data might be just 1 million).

| username: tidb菜鸟一只 | Original post link

Partitioning does not affect TiFlash. If you use TiKV, you will definitely use indexes. The difference between scanning global indexes and partition indexes will not be significant…

| username: 青木科技-熊猛 | Original post link

The current test results show that partitioning affects TiFlash. Specify the partition in the SQL instead of relying on partition pruning. This way, after partitioning, TiFlash will directly read the partitioned table. Even during a TableFullScan, the amount of data read is very small.

At the same time, after partitioning, the performance degradation of high concurrency (200 QPS) OLAP read issues is not as severe.

| username: tidb菜鸟一只 | Original post link

Directly specifying PARTITION actually works, learned something new.

SELECT /*+ READ_FROM_STORAGE(TIFLASH(dys.`t_click_log`)) */ * FROM dys.`t_click_log` PARTITION(p1);

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.