Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: modify column执行时间过长
[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.1
[Reproduction Path] Operations performed that led to the issue
Modified the length of a field in this table
[Encountered Issue: Problem Phenomenon and Impact]
Execution time is too long, and the ROW_COUNT in ddl jobs is much larger than the Row_count in STATS_META.
Table health is 95
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachment: Screenshot/Log/Monitoring]
Is the data volume large?
More than 2.6 billion rows
What kind of modify operation are you doing that requires data backfill? This is causing it to be slow.
Modify the length of a field.
This is slow, estimated to take about ten hours, just have to wait.
It has already been running for more than 60 hours, and why is the row count in show ddl jobs
much larger than the actual row count?
The table statistics are inaccurate. Run analyze on the table after the DDL is completed.
Okay, for now, we still have to wait, right?
select count(*) is also more than 2.6 billion, the statistical information should be accurate, right?

Please send the SQL for the modification.
Could it be that records were inserted in the meantime? After all, it has been running for over 60 hours.
The ROW_COUNT()
function returns the number of rows affected by the last executed SQL statement. The execution logic of the ROW_COUNT()
function is as follows:
- If the last statement was a DDL statement, the
ROW_COUNT()
function will return 0. For example, CREATE TABLE
, DROP TABLE
, etc.
- If the last statement was an
UPDATE
, INSERT
, DELETE
, ALTER TABLE
, or LOAD DATA
statement, the ROW_COUNT()
function will return the number of affected rows.
- If the last statement was a
SELECT
statement that returns a result set, the ROW_COUNT()
function will return -1.
- If the last statement was not a
SELECT
statement that returns a result set, the ROW_COUNT()
function will return the number of affected rows. For example: SELECT * FROM t1 INTO OUTFILE 'file_name'
.
Since when did DDL take so long? I use TiDB, and apart from indexing, I’ve never seen anything take this long.
This is a bit unusual. We modified the fields of 1.1 billion records, and it was completed very quickly.
This backfill will generate twice the amount of data for the entire table, so with this operation, modify_column will probably result in about twice the total data volume, which is around 5.2 billion rows.
It’s been more than 5 days, and it still hasn’t finished running. There are only more than 2.6 billion rows in the table, but the current row count is over 8 billion.
ALTER TABLE xxxx_dw
.xxxxx_sku
MODIFY COLUMN xxx_price
decimal(16, 4) NULL DEFAULT NULL COMMENT ‘Price’ AFTER xxxx_num
Could it be that there is an interruption and retry in the middle, causing the line count to keep accumulating?
It shouldn’t be possible, TiDB was up 5 days ago.