Some TiDB Maintenance Notes

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB 一些运维笔记

| username: Billmay表妹

Thanks to @凌云Cloud for the contribution.

Storage Limitations

  • Single column up to 6 MB and single row up to 6 MB by default (raising the default value causes performance problems)
  • A single transaction supports at most 300,000 rows, with each row under 6 MB and the whole transaction no larger than 100 MB
  • A single TiKV storage node should not grow too large: keep it under roughly 2 TB, i.e., about 20,000 Regions per node (at ~100 MB per Region); beyond that, add nodes to relieve the oversized node (see the query sketch after this list)
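
To check how close each TiKV node is to these guidelines, the per-store capacity and Region count can be queried from TiDB's information schema. A minimal sketch, assuming a TiDB version that exposes INFORMATION_SCHEMA.TIKV_STORE_STATUS:

```sql
-- Per-store capacity and Region count, to spot TiKV nodes drifting
-- past the ~2 TB / ~20,000-Region guideline above.
SELECT
    STORE_ID,
    ADDRESS,
    CAPACITY,
    AVAILABLE,
    LEADER_COUNT,
    REGION_COUNT
FROM INFORMATION_SCHEMA.TIKV_STORE_STATUS
ORDER BY REGION_COUNT DESC;
```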

Split Data

TiDB stores its data in Regions, each of which can be described as a key range [StartKey, EndKey). The default Region size is 96 MB; it can be customized, but the official recommendation is not to change it. Within a Region, data is stored in RowID order by default. Users do not need to care how databases and tables are laid out at the storage level; TiDB automatically splits Regions and creates new ones as they grow. Initially, this works very well…

Problems:

  • Hotspot issues: following MySQL table-creation habits and using an auto-increment ID means new data is always written to the newest Region, producing a write hotspot and higher latency (see the sketch after this list).
  • High-frequency Region splits can affect cluster performance.
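
One common mitigation is to avoid monotonically increasing row keys and to pre-split Regions before heavy writes arrive. A hedged sketch, assuming a TiDB version that supports AUTO_RANDOM, SHARD_ROW_ID_BITS, and the SPLIT TABLE statement; table and column names are illustrative:

```sql
-- Option 1: scatter inserts with a non-monotonic clustered primary key.
CREATE TABLE orders_rand (
    id BIGINT AUTO_RANDOM,
    payload VARCHAR(255),
    PRIMARY KEY (id)
);

-- Option 2: no integer primary key, so the hidden row ID is used;
-- shard it and pre-split Regions at table creation time.
CREATE TABLE orders_shard (
    id BIGINT AUTO_INCREMENT,
    payload VARCHAR(255),
    KEY idx_id (id)
) SHARD_ROW_ID_BITS = 4 PRE_SPLIT_REGIONS = 3;

-- Pre-split an integer-keyed table's range into 16 Regions ahead of a
-- bulk load, to avoid high-frequency splits while data pours in.
SPLIT TABLE orders_hist BETWEEN (0) AND (1000000) REGIONS 16;
```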

Partition Table Issues

TiDB partitioned tables come with many usage restrictions. For applications with large data volumes, especially those that need archiving, and for high-stability financial scenarios, TiDB partitioned tables are not a good fit.

  • Every unique key of a partitioned table must include all columns used in the partitioning expression (see the example after this list).
  • Partition pruning is currently unstable and is still an experimental feature.
  • Plan Cache cannot be used.
  • IndexJoin execution method cannot be used.
  • Global indexes are not supported.
  • SQL mode cannot be modified.
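
To illustrate the unique-key restriction: every unique key (including the primary key) has to contain all of the partitioning columns. A minimal example with made-up table and column names:

```sql
-- Valid: the unique key includes created_at, which the partitioning
-- expression uses.
CREATE TABLE orders_part (
    id BIGINT NOT NULL,
    created_at DATE NOT NULL,
    amount DECIMAL(10, 2),
    UNIQUE KEY uk_id_created (id, created_at)
)
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);

-- Invalid: UNIQUE KEY (id) alone would be rejected, because it does not
-- include created_at from the partitioning expression.
```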

Backup Issues

  • Before version 4.x, TiDB backups were done with mydumper/mysqldump, which are purely logical backups. Backing up large tables is difficult and requires extending the GC life time long enough for the dump to finish; for large tables of 2-3 billion rows (over 700 GB), that time has to be set very long (over 12 hours). See the sketch after this list.
PS: the GC life time is tied to MVCC, and setting it too long hurts cluster performance.
  • After version 4.x, Dumpling and BR are used. Dumpling can adjust the GC time dynamically as needed, but that is outside the DBA's control. For large clusters, TiDB introduced BR, which is strictly speaking a physical backup; during a backup, performance drops by 15% to 30%.
  • Incremental backup issues: in production the GC time is normally not set very long, yet an incremental backup has to start from a timestamp that has not yet passed the GC safepoint, which is generally hard to satisfy in production environments.
  • TiDB’s log components are independent, so incremental logs cannot be backed up in the same pass.
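
For the GC life time adjustment mentioned above, on 3.x and early 4.x clusters the value lives in the mysql.tidb system table (newer versions also expose a tidb_gc_life_time system variable). A hedged sketch; the 72h figure is only an example:

```sql
-- Check the current GC life time and safe point.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM mysql.tidb
WHERE VARIABLE_NAME IN ('tikv_gc_life_time', 'tikv_gc_safe_point');

-- Extend the GC life time before a long logical dump of a large table.
UPDATE mysql.tidb
SET VARIABLE_VALUE = '72h'
WHERE VARIABLE_NAME = 'tikv_gc_life_time';

-- Put it back once the dump finishes: a long GC life time keeps old
-- MVCC versions around and hurts cluster performance.
UPDATE mysql.tidb
SET VARIABLE_VALUE = '10m'
WHERE VARIABLE_NAME = 'tikv_gc_life_time';
```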

Binlog Management Issues

  • Pump: the log management component Pump is independent; in production, multiple Pumps must be deployed by default to avoid transaction write failures. The logs of all Pumps together form the complete transaction log, so if the log on any single Pump node is not consumed, data will be lost.
  • The maintenance and management tooling for Pump and Drainer is lacking; for example, it is hard to remove offline nodes from the cluster, which easily confuses operators (see the sketch after this list for inspecting node state).
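
To at least see what state each Pump and Drainer node is in (online, pausing, paused, closing, offline), TiDB provides SHOW statements; actually removing a stale offline node from the registry generally still requires the separate binlogctl tool. A sketch, assuming TiDB Binlog is deployed:

```sql
-- List registered Pump nodes and their states.
SHOW PUMP STATUS;

-- List registered Drainer nodes and their states.
SHOW DRAINER STATUS;
```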

Memory OOM

  • A TiDB server executing full table scans and slow queries will see its memory grow uncontrollably, and there are currently no parameters that reliably rein this in (see the hedged sketch after this list).
  • When two TiKV instances are deployed on a single machine, the memory of each instance must be capped, otherwise OOM is likely.
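
Depending on the TiDB version, a per-query memory quota does exist and can at least cut off a single runaway statement, though how completely it is enforced varies. A hedged sketch using the tidb_mem_quota_query session variable (in bytes); the 1 GB value is only an example:

```sql
-- Cap the memory a single SQL statement may use in this session (bytes).
-- Treat this as mitigation, not a hard guarantee: enforcement coverage
-- depends on the TiDB version.
SET tidb_mem_quota_query = 1073741824;

-- Inspect the current value.
SELECT @@tidb_mem_quota_query;
```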

Monitoring Issues

Monitoring for each cluster is independent; when there are multiple clusters, they cannot be managed centrally. This shows up mainly in the following areas:

  • Data: Each cluster corresponds to one Prometheus. When there are multiple clusters, the corresponding monitoring data cannot be centrally managed.
  • Dashboard display: showing everything on one interface requires extra, indirect work; the data sources and dashboards of the different clusters have to be added to Grafana by hand.

There is no automated backup and recovery: the TiDB monitoring interface cannot manage backup and recovery automatically, so DBAs still have to handle it separately during maintenance.