Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: shard index 特性疑问
TiDB v6.1 supports hotspot secondary index tidb_shard, and there are a few questions:
- In the PR list (planner,expression,util: scatter hotspot index in certain scenes by expression index by TonsnakeLin · Pull Request #30659 · pingcap/tidb · GitHub), I only see how select is supported, but how is insert supported? That is, in the index (tidb_shard(a), a), does the prefix tidb_shard(a) need to be concatenated in front of a and stored in TiKV? If not, how is the write hotspot dispersed?
- Currently, there are many limitations on shard index (tidb/docs/design/2022-01-04-integer-shard-index.md at master · pingcap/tidb · GitHub). CockroachDB has also implemented shard-index (Index Sequential Keys with Hash-sharded Indexes), but with almost no limitations. Has TiDB referred to CockroachDB? Are there any plans for further improvements?
Support hash-sharded INDEX for sequential column like time
Currently, shard index has too many limitations, only supporting equality queries and not range queries, etc. This feature is somewhat impractical in business applications.
The limitations listed here:
- Non-equality queries cannot use indexes.
- When
AND
and OR
are mixed in the query condition and the outermost operator is AND
, SHARD INDEX cannot be used.
GROUP BY
cannot use SHARD INDEX.
ORDER BY
cannot use SHARD INDEX.
- The
ON
clause cannot use SHARD INDEX.
- Subqueries in
WHERE
cannot use SHARD INDEX.
- SHARD INDEX can only scatter unique indexes of integer fields.
- SHARD INDEX joint indexes may become invalid.
- SHARD INDEX cannot follow the FastPlan process, affecting optimizer performance.
- SHARD INDEX cannot use execution plan caching.
There are also some limitations: only supports unique indexes and supports int type input.
I think using hash with bucket = N is pretty good. TiDB’s approach of using uk((tidb_shard(a)), a) is quite strange and doesn’t support specifying the number of buckets.
There are currently no plans to remove these limitations. Please keep an eye on the roadmap and release notes.