[Issue] Unable to Drop or Create Tables and Databases After Stress Testing a TiDB Cluster with Sysbench

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 【故障】使用sysbench对TiDB集群继续压测以后无法执行 Drop 和 Create 表和库

| username: TiDBer_X3DgmgrB

[Test Environment for TiDB]

  • PD: 3 nodes, 16C32G

  • TiDB: 3 nodes, 16C32G

  • TiFlash: 5 nodes, 16C32G

  • TiKV: 20 nodes, 16C32G

  • TiUP is damaged and its metadata is lost; the TiUP node cannot be repaired for the time being

[TiDB Version] v6.1.0

[Reproduction Path] Perform a stress test on the TiDB database using sysbench

  • sysbench --db-driver=mysql --time=300 --threads=100 --report-interval=1 --mysql-host= --mysql-port=4000 --mysql-user= --mysql-password= --mysql-db=test --tables=10 --table_size=1000000 oltp_read_write prepare

  • sysbench --db-driver=mysql --time=300 --threads=100 --report-interval=1 --mysql-host= --mysql-port=4000 --mysql-user= --mysql-password= --mysql-db=test --tables=10 --table_size=1000000 oltp_read_write run

[Encountered Issues: Symptoms and Impact]

To evaluate cluster performance, I ran a sysbench stress test against the TiDB cluster, generating 10 tables with 1 million rows each (--table_size=1000000). The prepare phase completed normally, but the run phase failed with the error "FATAL: sbtest1 does not exist" and the program hung. I canceled the run command and tried to drop all the test tables in order to retest, but the drop statement ran for more than 3000s without completing or reporting a failure. I also found that Select, Insert, and Delete statements worked normally, while create table, create database, drop table, and drop database statements could not be executed. Restarting the cluster and the servers did not resolve the issue.

| username: h5n1

Run admin show ddl jobs to see whether any DDL jobs are still pending on the tables being dropped. For a new test environment, use the latest 6.1.7 version directly.

| username: Fly-bird

For stress testing, you can directly restart the cluster.

| username: TiDBer_X3DgmgrB

I checked with admin show ddl jobs and found that DDL jobs had indeed piled up: some tables were still adding indexes, and those jobs never completed for some reason. After I canceled them with ADMIN CANCEL DDL JOBS, everything returned to normal. Thank you very much!
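
For anyone hitting the same thing, here is a minimal sketch of the fix described above: take the rows returned by admin show ddl jobs, pick the jobs that have not finished, and build one ADMIN CANCEL DDL JOBS statement for them. The helper name build_cancel_statement, the sample rows, and the choice of "queueing" and "running" as the unfinished states are my assumptions for illustration, not something from the original post. Check the actual STATE values on your cluster before canceling anything.

```python
# Hypothetical helper for illustration; not part of TiDB or any client library.
# Assumption: jobs in these states are the ones still blocking the DDL queue.
PENDING_STATES = {"queueing", "running"}


def build_cancel_statement(ddl_jobs):
    """Build an ADMIN CANCEL DDL JOBS statement for unfinished jobs.

    ddl_jobs is a list of dicts shaped like rows of ADMIN SHOW DDL JOBS,
    each with at least JOB_ID and STATE. Returns None if nothing is pending.
    """
    job_ids = sorted(
        int(job["JOB_ID"])
        for job in ddl_jobs
        if str(job.get("STATE", "")).lower() in PENDING_STATES
    )
    if not job_ids:
        return None
    return "ADMIN CANCEL DDL JOBS " + ", ".join(str(i) for i in job_ids)


# Made-up sample rows; real rows come from running ADMIN SHOW DDL JOBS.
sample_jobs = [
    {"JOB_ID": 105, "TABLE_NAME": "sbtest3", "JOB_TYPE": "add index", "STATE": "running"},
    {"JOB_ID": 106, "TABLE_NAME": "sbtest4", "JOB_TYPE": "add index", "STATE": "queueing"},
    {"JOB_ID": 101, "TABLE_NAME": "sbtest1", "JOB_TYPE": "create table", "STATE": "synced"},
]

print(build_cancel_statement(sample_jobs))
```

The generated statement can then be run from any MySQL client connected to TiDB. Jobs that have already finished are skipped on purpose, so only the stuck ones are listed.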

| username: yulei7633

The key to the problem is admin show ddl jobs.

| username: TiDBer_小阿飞

The magic of rebooting