TiDB select * from information_schema.tables limit 1 query has no response

translator_bot · June 22, 2024, 9:53pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb select * from information_schema.tables limit 1 查询无响应

| username: dbms-ops

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] V5.1.2
[Reproduction Path] What operations were performed that caused the issue
Referenced 统计信息简介 | PingCAP 文档中心 to modify configuration parameters, but the issue was not resolved
[Encountered Issue: Problem Phenomenon and Impact]
The query on this table is very slow, and when the user’s client connects, it often leads to no response and timeout
[Resource Configuration]
3TIDB + 3TiKV + 3PD
This database is a test database with a relatively large number of tables
[Attachments: Screenshots/Logs/Monitoring]
Execution plan of the statement

May I ask if there are any troubleshooting ideas that can be provided?

translator_bot · June 22, 2024, 9:53pm

| username: h5n1 | Original post link

Try using trace select to see what results you get.

translator_bot · June 22, 2024, 9:53pm

| username: dbms-ops | Original post link

This will get stuck and cannot display any information.

translator_bot · June 22, 2024, 9:53pm

| username: dbms-ops | Original post link

Do we have a way to locate the cause of TiKV CPU anomalies through logs or to identify specific tables being read?

translator_bot · June 22, 2024, 9:54pm

| username: h5n1 | Original post link

Is this exception only present when querying tables, or is it always present? You can first check the thread CPU utilization in the tikv-detail monitoring to see which type of thread has high CPU usage.

translator_bot · June 22, 2024, 9:54pm

| username: dbms-ops | Original post link

The high CPU anomaly has always been there, and I suspect it is related to the high CPU usage of TiKV. Checking the thread CPU did not reveal any particularly abnormal nodes.

translator_bot · June 22, 2024, 9:54pm

| username: h5n1 | Original post link

There is an obvious hotspot on this machine. Check the heatmap on the dashboard to see which object it is, and then look at the related slow SQL.

translator_bot · June 22, 2024, 9:54pm

| username: dbms-ops | Original post link

Confirmed that some large tables in the business are performing delete operations.

translator_bot · June 22, 2024, 9:54pm

| username: 我是咖啡哥 | Original post link

I encountered slow queries on system tables in Oracle before. It can be optimized by collecting system table statistics.
It seems that TiDB can’t do this. Manually collecting statistics on such mem tables will result in an error.

ANALYZE table tables;
ERROR 1142 (42000): INSERT command denied to user 'root'@'%' for table 'tables'

translator_bot · June 22, 2024, 9:54pm

| username: dbms-ops | Original post link

May I ask: When I was troubleshooting the slowness of information_schema.tables, I noticed that querying this table triggers a region scan to determine the number of rows and index size. Where can I find detailed information about this part of the knowledge?

translator_bot · June 22, 2024, 9:54pm

| username: h5n1 | Original post link

I also couldn’t find the relevant description. Some metadata of the table is stored in TiKV, so it will scan some regions.

translator_bot · June 22, 2024, 9:54pm

| username: dbms-ops | Original post link

This table cannot be analyzed, I have also tried.

translator_bot · June 22, 2024, 9:54pm

| username: tidb菜鸟一只 | Original post link

The information_schema.tables is actually not a table, but a system view. I think you should check if there are a lot of DDL operations being executed. When DDL operations are executed, they can actually affect the queries on this system view.

translator_bot · June 22, 2024, 9:54pm

| username: dbms-ops | Original post link

The business delete transaction has already stopped, but the CPU usage still hasn’t decreased.

The heatmap shows that the table reads are not that large.

translator_bot · June 22, 2024, 9:54pm

| username: dbms-ops | Original post link

Some truncate tasks are being executed in DDL.

translator_bot · June 22, 2024, 9:54pm

| username: h5n1 | Original post link

Take a look at the slow SQL analysis on the dashboard, excluding delete operations.

translator_bot · June 22, 2024, 9:54pm

| username: h5n1 | Original post link

Check the leader distribution in TiKV on the overview page.

translator_bot · June 22, 2024, 9:54pm

| username: dbms-ops | Original post link

Apart from delete operations, some tables in the business are read very frequently, and the table data volume is quite large. The auto-increment field used auto_increment, causing hotspot regions to concentrate on a single machine. The table has now been rectified, and TiDB performance has stabilized.

Thank you all for your suggestions.

translator_bot · June 22, 2024, 9:54pm

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.