How to Troubleshoot Long Prewrite Time and Commit Time

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: prewrite time commit time 时间长 怎么定位问题 (prewrite/commit time is long; how to locate the problem)

| username: TiDBer_20QjYTLl

[Test Environment] Testing environment
[TiDB Version] 5.1.0
[Reproduction Path] Using DM to synchronize data, during the loading stage, a large number of tables are created and data is imported
[Encountered Problem: Phenomenon and Impact]
The create table statements are slow, usually taking a few seconds to tens of seconds, and replace into statements are slow as well. The slow query log shows that prewrite_time and commit_time are very long. How can I pinpoint the cause?
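As a first step, TiDB exposes the slow log as a queryable system table, so a slow statement's time can be broken down into its commit phases. A sketch along these lines (column names follow TiDB's slow log fields; verify against your version's schema):

```sql
-- Break down where slow statements spend their time.
-- prewrite_time / commit_time are the two phases of two-phase commit;
-- write_keys / write_size hint at transaction size.
SELECT time, query_time, prewrite_time, commit_time,
       write_keys, write_size, query
FROM information_schema.slow_query
ORDER BY query_time DESC
LIMIT 10;
```

Large write_keys or write_size values point to oversized transactions, which inflate both phases.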

| username: Billmay表妹 | Original post link

[Attachment: Screenshot/Log/Monitoring]
Do you have any relevant screenshots, logs, or monitoring data you can share?

| username: TiDBer_20QjYTLl | Original post link

Here are some of the data analyzed by the dashboard. What other monitoring data do you need?

| username: dba-kit | Original post link

DDL in TiDB 5.x is still executed serially, so create table statements queue up behind one another, which can take a long time. Long prewrite_time and commit_time for replace into are also expected: these are the two phases of two-phase commit, and both involve locking and writing to disk. If you are importing a lot of data, including one or two particularly large tables, this behavior is normal.
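To confirm that create table statements are queuing behind the single DDL owner, the DDL job queue can be inspected with TiDB's admin statements:

```sql
-- Show recent and queued DDL jobs; many jobs sitting in a queued state
-- confirm that table creation is being serialized
ADMIN SHOW DDL JOBS;

-- Show the current DDL owner and the job currently running
ADMIN SHOW DDL;
```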

PS: If you want to quickly stand up a TiDB cluster from MySQL data, consider using tidb-lightning's Physical Import Mode to write directly into TiKV, and then use DM for incremental replication.

PPS: In this mode, the node running tidb-lightning needs generous hardware; the more resources it has, the faster the import. Also note that with TiDB 5.1.0, a physical import makes the TiDB cluster unwritable for the duration. It is recommended to use a recent TiDB (6.1) together with tidb-lightning from 6.2, using incremental mode for the import (or simply use 6.5).
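For reference, a minimal tidb-lightning configuration sketch for Physical Import Mode; the paths are placeholders, and key names can differ between versions (for example, incremental-import was later renamed parallel-import):

```toml
[tikv-importer]
# "local" selects Physical Import Mode: lightning builds SST files
# and ingests them into TiKV directly, bypassing the SQL layer
backend = "local"
# must point to an EMPTY directory on fast local disk;
# used for externally sorting key-value pairs before ingest
sorted-kv-dir = "/data/sorted-kv"   # placeholder path

[mydumper]
# directory containing the exported data files
data-source-dir = "/data/export"    # placeholder path
```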

| username: TiDBer_20QjYTLl | Original post link

Thank you for your reply. Our current MySQL version is 5.7. Will there be compatibility issues if we use TiDB 6.x? We are using TiDB 5.1.0 because it is compatible with MySQL 5.7.

| username: xingzhenxiang | Original post link

TiDB has always targeted MySQL 5.7 compatibility, and TiDB 6.x is no different.

| username: tidb狂热爱好者 | Original post link

There are no compatibility issues, so it is recommended to use the latest version.

| username: Billmay表妹 | Original post link

It is recommended to use 6.5.0 in the test environment.

| username: TiDBer_20QjYTLl | Original post link

Additionally, I have another question. You mentioned that during the import process in TiDB version 5.1.0, the TiDB cluster is not writable. What does this mean? Does it cause write blocking, or will it result in an error?

| username: xingzhenxiang | Original post link

[The original reply was an image; no text is available in the translation.]

| username: TiDBer_20QjYTLl | Original post link

  1. The documentation for physical mode says sorted-kv-dir must be an empty directory. What happens if it is not empty? Also, does this directory need to be as large as the dump directory?
  2. If incremental-import is set to true, what happens when the target database already contains data?

| username: Hacker007 | Original post link

If there is data, a primary key conflict may occur.

| username: TiDBer_20QjYTLl | Original post link

Will conflicts stop the import? I once tested importing with Lightning into a database that already contained data. There were definitely conflicts, but no errors were reported. Then I found a bug when querying: select count(*) from table returned correct results, but select count(*), sum(field) from table returned incorrect data. However, when the column inside sum() was the first column of the table, the result was correct. How can I troubleshoot this? I have since cleared the target database and re-imported, and now select count(*), sum(field) from table returns correct results again.
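One hedged first step for this symptom: when an aggregate over a non-leading column disagrees with count(*) after a physical import, the row data and the indexes may have diverged. TiDB's built-in consistency check can confirm or rule that out (the table name is a placeholder):

```sql
-- Verify that each index of the table is consistent with the row data;
-- returns an error if an inconsistency is found
ADMIN CHECK TABLE my_table;
```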

| username: dba-kit | Original post link

It depends on which column you are counting. count(column) ignores rows where that column is NULL, so it is best to use count(1) or count(*) rather than a specific column name.
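To illustrate the NULL-handling difference (standard SQL semantics, not TiDB-specific):

```sql
CREATE TABLE t (id INT, val INT);
INSERT INTO t VALUES (1, 10), (2, NULL), (3, 30);

SELECT COUNT(*)   FROM t;  -- 3: counts every row
SELECT COUNT(val) FROM t;  -- 2: the row with val = NULL is skipped
SELECT COUNT(1)   FROM t;  -- 3: equivalent to COUNT(*)
SELECT SUM(val)   FROM t;  -- 40: SUM also ignores NULLs
```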

| username: Tank001 | Original post link

In our case, conflicts caused the task to stop.

| username: Hacker007 | Original post link

The default for on-duplicate is replace, which behaves like replace into. With any other setting, a primary key conflict will raise an error. Given the symptoms you describe, could your table be missing a primary key?
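For context, on-duplicate is a setting of tidb-lightning's logical (tidb) backend; a sketch of the relevant fragment (it does not apply to Physical Import Mode, which handles conflicts differently):

```toml
[tikv-importer]
# logical import: rows are written through the TiDB SQL layer
backend = "tidb"
# conflict handling for the tidb backend:
#   "replace" (default) - overwrite conflicting rows, like REPLACE INTO
#   "error"             - abort the import on conflict
#   "ignore"            - keep existing rows, like INSERT IGNORE
on-duplicate = "replace"
```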

| username: TiDBer_20QjYTLl | Original post link

What I said is to use the physical mode.

| username: yilong | Original post link

| username: TiDBer_20QjYTLl | Original post link

I am not importing into the production TiDB cluster; I am repeatedly testing. The previously imported data has not been deleted. We have many tables, and it’s quite troublesome to drop the database before each test. Moreover, the test environment is still in use.