After TiDB is connected to real-time data, the performance is very slow

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB接入实时数据后,性能非常慢

| username: xiaour

[TiDB Usage Environment] Production Environment
[TiDB Version] v4.0.8
[Encountered Problem] TiDB performance is very slow after integrating real-time data
[Reproduction Path] Previously, it was all offline data. Later, a real-time data write was added, and after that, the database performance dropped sharply.
[Problem Phenomenon and Impact]
Currently, we can only stop the real-time write task, but user orders cannot be queried. Some RegionMerge parameter optimizations were made, but they had no effect. Pessimistic transactions were also disabled.
[Attachment]
tidb-test-Overview_2022-10-21T09_03_15.727Z.json (2.0 MB)

| username: xfworld | Original post link

Aren’t you going to introduce the cluster configuration, hardware situation, and distribution structure?
Also, what level of real-time write support is required? How large is the data? What is the structure? Shouldn’t you provide more details?

| username: h5n1 | Original post link

  1. What is the cluster configuration? Check the output of tiup cluster display.
  2. The PD wait time is relatively high. Check the TiDB server CPU, PD CPU, and PD disk performance.
  3. It looks like the region heartbeat is a bit high. You can try enabling hibernate-regions to see if it helps (it’s an experimental feature in 4.x).
    TiKV 配置文件描述 | PingCAP 归档文档站
  4. Consider upgrading to the latest version of 4.x.
| username: xiaour | Original post link


| username: xiaour | Original post link

The cluster configuration was posted in another reply. The real-time write is around 10-50 entries per second, with approximately 5 dependent tasks writing simultaneously, and the data increases daily. However, sometimes tasks fail or there are data issues, which involves rerunning offline data. This can lead to a sudden spike in pressure, reaching up to 5k entries per second. When this real-time task was first launched, it did not significantly impact TiDB’s read and write performance. However, after about ten days, the performance drastically declined, with read services becoming almost unusable and writes becoming very slow.

| username: xfworld | Original post link

You can check the monitoring. When the pressure is high, it’s easy to determine which part is slow.

| username: 近墨者zyl | Original post link

The machine configuration is too low. Check the network, it’s also gigabit, and the network card as well, but the efficiency of real-time SQL writes is still low.

| username: h5n1 | Original post link

The configuration is too low. Check if the CPU of TiDB, PD, and TiKV is fully utilized during real-time writing. Add more resources.

| username: xiaour | Original post link

Well, after checking, it is basically determined that the slow SQL has many subqueries, causing the entire real-time write and read to be slow.

| username: xfworld | Original post link

Congratulations, that was easy to solve.

| username: xiaour | Original post link

After the configuration was upgraded, most of them were above 16C. Later, it was found to be due to slow SQL.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.