Adding NOT NULL Constraint to Large Table Takes Time

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 大表增加not null约束耗时

| username: 心在飞翔

In TiDB version 7.5.1, modifying a column from null to not null on a table of 50GB size takes about 30 seconds. Is this normal?

| username: zhanggame1 | Original post link

What type of column?

| username: 心在飞翔 | Original post link

First, execute alter table test add name1 char(10) default ''; This returns in seconds.

Then, execute alter table test modify name1 char(10) not null default ''; This statement takes more than 30 seconds to execute.

| username: 小龙虾爱大龙虾 | Original post link

Normally, it should return within seconds without data backfilling. Did you encounter an MDL lock? You can try with another table.

| username: Kongdom | Original post link

Could it be the index column?

| username: 心在飞翔 | Original post link

Not an index column

| username: 心在飞翔 | Original post link

This TiDB setup hasn’t run any business yet, so there are no MDL locks.

| username: 心在飞翔 | Original post link

Your table has a small amount of data, so you might not be able to see the effect. Based on my tests, modifying data types, adding NOT NULL constraints, and adding indexes are relatively time-consuming operations, while other operations are relatively fast.

| username: TiDBer_刚 | Original post link

Why do I feel like it’s filling in null values, so it’s slow?

| username: zhaokede | Original post link

The speed is normal. For a 50GB table, it’s roughly around 100 million rows of data.

| username: 小龙虾爱大龙虾 | Original post link

The amount of data is not the issue; the main concern is whether there is a data backfill operation or if it’s just modifying metadata. Modifying only metadata is very fast.

| username: h5n1 | Original post link

I encountered the same issue during testing. There is no data backfill, but the modify operation is slow, and there are no query DMLs on the table.
image

| username: 心在飞翔 | Original post link

Yes, my test results are the same. Although I didn’t see the backfill information in INFORMATION_SCHEMA.DDL_JOBS, it didn’t return in seconds.

| username: FutureDB | Original post link

Were there any actual NULL values in the table fields before changing the field to non-NULL?

| username: h5n1 | Original post link

Although it doesn’t seem to backfill data, there might be similar data validation operations when changing null to not null.

| username: zhaokede | Original post link

To verify the data, it should be sufficient to return as soon as one non-compliant entry is found, and it shouldn’t take that long, right?

| username: h5n1 | Original post link

The specific tasks are unclear, but it might involve similar operations.

| username: WinterLiu | Original post link

Learned
Could it be performing data validation, checking for null values?

| username: zhh_912 | Original post link

Normally, this might be related to the table structure. You need to check how many columns there are.

| username: 友利奈绪 | Original post link

The speed is almost the same, that’s all I can say.