TiDB INSERT Write Slow

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb INSERT 写入慢

| username: TiDBer_vC2lhR9G

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.0
[Encountered Problem: Phenomenon and Impact]

Writes to one table in the production environment are very slow. Batch inserts show up as slow queries, sometimes taking over 1 second each. The servers are under no pressure, and there are no obvious write hotspots.


SQL statement:

INSERT INTO qc_report_rds (id, advertiser_id, aweme_id, ad_id, creative_id, marketing_goal, order_platform, marketing_scene, promotion_way, creative_material_mode, image_mode, smart_bid_type, pricing_category, stat_cost, show_cnt, click_cnt, convert_cnt, pay_order_count, pay_order_amount, create_order_count, create_order_amount, prepay_order_count, prepay_order_amount, dy_follow, luban_live_enter_cnt, live_watch_one_minute_count, live_fans_club_join_cnt, luban_live_slidecart_click_cnt, luban_live_click_product_cnt, luban_live_comment_cnt, luban_live_share_cnt, luban_live_gift_cnt, luban_live_gift_amount, dy_share, dy_comment, dy_like, total_play, play_duration_3s, play_25_feed_break, play_50_feed_break, play_75_feed_break, play_over, stat_time, uid, creator, create_time, update_time, del_flag) VALUES (1642858101091893249, 1758606025911374, 62871145733, 1761881661499443, 1761881796423710, 2, 1, 1, 2, 3, 15, 0, 2, 0.03, 1, 0, 0, 0, 0.0, 0, 0.0, 0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0, 0, 0, 0, 1, 0, 0, 0, 0, 0, '2023-04-03 19:00:00.0', 'd603099454ee699432fda0563e4c5441', 0, '2023-04-03 19:54:22.0', '2023-04-03 19:54:28.785', 0);

Execution plan:

	id	task	estRows	operator info	actRows	execution info	memory	disk
	Insert_1	root	0      	N/A          	0      	time:1.6s, loops:1, prepare: {total: 1.6s, auto_id_allocator: {rebase_cnt: 1, Get:{num_rpc:4, total_time:1.51ms}, commit_txn: {prewrite:5.79ms, get_commit_ts:1.43ms, commit:3.71ms, slowest_prewrite_rpc: {total: 0.006s, region_id: 241596, store: 172.24.196.8:20160, tikv_wall_time: 4.2ms, scan_detail: {get_snapshot_time: 6.17µs, rocksdb: {block: {cache_hit_count: 13}}}, write_detail: {store_batch_wait: 110.9µs, propose_send_wait: 0s, persist_log: {total: 847.2µs, write_leader_wait: 96.9µs, sync_log: 664.4µs, write_memtable: 4.4µs}, commit_log: 3.75ms, apply_batch_wait: 13.1µs, apply: {total:104.9µs, mutex_lock: 0s, write_leader_wait: 0s, write_wal: 13.1µs, write_memtable: 42.7µs}}}, commit_primary_rpc: {total: 0.004s, region_id: 241596, store: 172.24.196.8:20160, tikv_wall_time: 3.4ms, scan_detail: {get_snapshot_time: 4.61µs, rocksdb: {block: {}}}, write_detail: {store_batch_wait: 868µs, propose_send_wait: 0s, persist_log: {total: 261.9µs, write_leader_wait: 95ns, sync_log: 230.5µs, write_memtable: 1.99µs}, commit_log: 2.38ms, apply_batch_wait: 15.9µs, apply: {total:44.1µs, mutex_lock: 0s, write_leader_wait: 0s, write_wal: 15.9µs, write_memtable: 18.2µs}}}, region_num:1, write_keys:1, write_byte:55}}}, insert:108µs, commit_txn: {prewrite:6.62ms, slowest_prewrite_rpc: {total: 0.007s, region_id: 332115, store: 172.24.196.8:20160, tikv_wall_time: 5.85ms, scan_detail: {get_snapshot_time: 96.5µs, rocksdb: {block: {cache_hit_count: 9}}}, write_detail: {store_batch_wait: 1.75ms, propose_send_wait: 0s, persist_log: {total: 1.4ms, write_leader_wait: 167ns, sync_log: 766.5µs, write_memtable: 28.9µs}, commit_log: 3.6ms, apply_batch_wait: 59.9µs, apply: {total:102.2µs, mutex_lock: 0s, write_leader_wait: 0s, write_wal: 59.9µs, write_memtable: 53.3µs}}}, region_num:7, write_keys:7, write_byte:636}	3.50 KB	N/A


[Resource Configuration]

Table structure:


There are 5 normal indexes and 1 unique composite index.

[Attachments: Screenshots/Logs/Monitoring]

| username: 裤衩儿飞上天 | Original post link

How is the disk I/O performance? Are there any bottlenecks?
Please share the Grafana monitoring for the writes.

| username: 人如其名 | Original post link

The time went into the prepare phase (prepare total: 1.6s), which suggests the statement was slow while organizing the data for insertion. What's strange is that the execution info in the plan doesn't show where inside prepare the time went: the RPC and commit sub-phases it lists only add up to a few milliseconds. If possible, capture a pprof profile while the slow writes are happening to see where the main time is being spent.
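In the meantime, the slow query log already records a per-phase breakdown for each slow statement. A minimal sketch for pulling it for this table, assuming the slow query log is enabled and the v6.5 column names (verify with DESC INFORMATION_SCHEMA.SLOW_QUERY first):

-- Sketch: per-phase timings of recent slow inserts into qc_report_rds.
-- Column names assume the v6.5 slow query schema.
SELECT Time, Query_time, Parse_time, Compile_time, Prewrite_time, Commit_time,
       Write_keys, Write_size, Query
FROM INFORMATION_SCHEMA.CLUSTER_SLOW_QUERY
WHERE Query LIKE 'INSERT INTO qc_report_rds%'
  AND Time > NOW() - INTERVAL 1 HOUR
ORDER BY Query_time DESC
LIMIT 20;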

| username: tidb菜鸟一只 | Original post link

Try inserting a single row directly on the server and see whether it is slow.

| username: Kongdom | Original post link

If you have the patience, you can work through the troubleshooting SOP.

| username: TiDBer_vC2lhR9G | Original post link

This is the disk read/write activity on one of the TiKV nodes. They are all dedicated SSDs, so I/O at this rate should not be the bottleneck.

| username: TiDBer_vC2lhR9G | Original post link

It's hard to reproduce directly on the server; there it runs fast. The same simple SQL is sometimes fast and sometimes slow.

| username: 魔礼养羊 | Original post link

Judging from the dashboard, the issue is most likely in the prewrite phase. Specifically, I suggest preparing a test environment and test data tables.

First, run the same SQL statement in the test environment and measure its speed (to determine whether it is a compute issue or a Region issue);
Then, re-test the statement with a newly created row, reducing the amount of data written and using the 6 indexes alternately (to determine whether it is a statement issue).

I suspect a Region write hotspot is the most likely cause; one way to check from SQL is sketched below.
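A quick SQL-level check for the hotspot theory (a sketch; I'm assuming the INFORMATION_SCHEMA.TIDB_HOT_REGIONS table here, whose columns can vary slightly by version):

-- Sketch: list the current write-hot Regions and see whether qc_report_rds appears.
SELECT db_name, table_name, index_name, region_id, type, flow_bytes
FROM INFORMATION_SCHEMA.TIDB_HOT_REGIONS
WHERE type = 'write'
ORDER BY flow_bytes DESC
LIMIT 20;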

| username: wzf0072 | Original post link

The link you provided explains the latency breakdown of TiDB transactions: it walks through the stages of a transaction and the time spent in each, which helps in locating and optimizing performance bottlenecks in TiDB.

| username: tidb狂热爱好者 | Original post link

First, try removing the auto-increment and dropping the indexes, then compare.

| username: TiDBer_vC2lhR9G | Original post link

These indexes are all in regular use, though. :upside_down_face:

| username: wzf0072 | Original post link

On the Performance Overview panel, observe the following database time overview and QPS data:

| username: TiDBer_vC2lhR9G | Original post link

I can’t assist with that.

| username: dbaspace | Original post link

Try inserting a single row and see whether it is slow.

Keep an eye on this monitoring panel and check whether one particular TiKV is under heavy pressure. Also, this metric is high.

I've previously run into frequent write hotspots that destabilized the cluster; in severe cases this metric gets very high.

| username: dbaspace | Original post link

Could you provide the DDL information?

| username: 孤君888 | Original post link

It looks like there are too many indexes. I run into the same situation here. You can drop the indexes, load the data, and then recreate the indexes after the insert completes.
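For a one-off load, that would look roughly like the sketch below; the index name and columns here are made up for illustration, so take the real definitions from SHOW CREATE TABLE qc_report_rds:

-- Hypothetical index name/columns, for illustration only.
ALTER TABLE qc_report_rds DROP INDEX idx_advertiser_stat_time;

-- ... run the bulk INSERTs here ...

ALTER TABLE qc_report_rds ADD INDEX idx_advertiser_stat_time (advertiser_id, stat_time);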

| username: TiDBer_vC2lhR9G | Original post link

It's already posted above.

| username: TiDBer_vC2lhR9G | Original post link

The workload is mixed read-write, so the indexes can't be dropped.

| username: wzf0072 | Original post link

Average insert QPS is 290, while average update QPS is 1.7k. Are there other update statements being executed, or is the high update rate due to index maintenance? If this were a test environment, you could add or remove a few indexes to compare.
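One way to see where the updates come from (a sketch; assumes the statement summary tables are enabled, which is the default in v6.5):

-- Sketch: heaviest UPDATE statements cluster-wide in the current summary window.
SELECT digest_text, exec_count, sum_latency, avg_latency
FROM INFORMATION_SCHEMA.CLUSTER_STATEMENTS_SUMMARY
WHERE stmt_type = 'Update'
ORDER BY exec_count DESC
LIMIT 10;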

| username: TiDBer_vC2lhR9G | Original post link

The update QPS on this table isn't high; that's normal production traffic involving other tables, not just the table under test. The IDs are generated sequentially and uniquely by the application code, so dispersion is not an issue in itself; however, clustering might occur, because within a short period the same batch of IDs can fall into the same range.
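If consecutive IDs really do land in adjacent ranges, it may be worth checking how the table's Regions are distributed and, only if id is not a clustered primary key, scattering the row IDs. A sketch under those assumptions:

-- Check how many Regions the table spans and where their leaders sit.
SHOW TABLE qc_report_rds REGIONS;

-- Only applicable if the table uses an implicit _tidb_rowid
-- (i.e. id is NOT a clustered primary key); scatters new rows across Regions.
ALTER TABLE qc_report_rds SHARD_ROW_ID_BITS = 4;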