Error in Incremental Data Import with Lightning

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: lightning增量数据导入报错

| username: TiDBer_QHSxuEa1

I need to incrementally import data into the target database. TiDB Lightning hit an error during the import. After I fixed the cause and restarted it, Lightning reported that the checkpoints of several tables had to be handled first. If I follow the prompt and run --checkpoint-error-destroy, it drops those tables, so my existing data would be lost as well. How should I handle this situation?

| username: Miracle | Original post link

Can’t you just re-import the missing data?

| username: WalterWj | Original post link

Do not use --checkpoint-error-destroy; using --checkpoint-error-ignore works as well. The official documentation explains this: TiDB Lightning Checkpoints | PingCAP Docs
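
As a rough sketch of what the two options look like on the command line, the Python snippet below only assembles a tidb-lightning-ctl invocation and prints it; it does not run anything. The config path tidb-lightning.toml and the table name `test`.`t1` are placeholders of mine, not values from this thread, and the helper build_ctl_command is hypothetical.

```python
import shlex

# Checkpoint-handling flags of tidb-lightning-ctl discussed in this thread.
ACTIONS = {
    "ignore": "--checkpoint-error-ignore",    # clear the error status, keep the table
    "destroy": "--checkpoint-error-destroy",  # drop the failed table and its checkpoint
}


def build_ctl_command(action, table="all", config="tidb-lightning.toml"):
    """Return the argv list for a tidb-lightning-ctl checkpoint command.

    `table` is either "all" or a quoted name such as "`schema`.`table`".
    `config` is a placeholder path (an assumption); point it at the same
    file the failed import was started with.
    """
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return ["tidb-lightning-ctl", "--config", config, f"{ACTIONS[action]}={table}"]


if __name__ == "__main__":
    # Hypothetical table name, used only for illustration.
    cmd = build_ctl_command("ignore", table="`test`.`t1`")
    # Print a shell-quoted command line instead of executing it.
    print(shlex.join(cmd))
```

Printing rather than executing is deliberate: it lets you review which tables the command will touch before you run it yourself, which matters most for the destroy variant.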

| username: TiDBer_QHSxuEa1 | Original post link

Originally, the table had 10 rows of data, and this time Lightning imported an additional 5 rows. Using destroy drops the table outright, which deletes the original 10 rows as well. If you mean backing up the table data again as of the checkpoint and then re-importing it, that's a bit primitive… I came to ask whether there is a more reasonable method, because I would rather avoid that. Logically, checkpoint resumption should take this situation into account.

| username: TiDBer_QHSxuEa1 | Original post link

I see, may I ask if using ignore directly carries the risk of data loss?

| username: WalterWj | Original post link

In most cases, there is no risk. Lightning runs a checksum at the end of the import, so any missing data should be caught there.

| username: Miracle | Original post link

Sorry, I misread and thought the table was empty.
You can try using ignore as mentioned above.

| username: zhang_2023 | Original post link

There’s no problem with the database, just ignore it and continue importing.

| username: dba远航 | Original post link

You can use the --checkpoint-error-ignore option.

| username: redgame | Original post link

Yes, ignoring is feasible.

| username: 小于同学 | Original post link

You can do it without using --checkpoint-error-destroy.

| username: dba-kit | Original post link

Normally, Lightning resumes from the checkpoint automatically. Did you change the configuration file of the Lightning task after the error occurred?

| username: TiDBer_QHSxuEa1 | Original post link

No, it hasn't changed. Resuming from a checkpoint is only triggered by an abnormal interruption, correct? If the import fails because of issues like unique key conflicts or inconsistent table structures, then even after the errors are fixed, you still need to clear or ignore the failed checkpoints before the import task can continue, right?

| username: dba-kit | Original post link

Indeed, as you said, only non-data interruptions such as an OOM or manually killing the process will trigger automatic resumption from the checkpoint. Errors caused by data issues need to be handled manually.

| username: WalterWj | Original post link

Actually, it depends on whether it is in the load phase or the import phase.

| username: 郑旭东石家庄 | Original post link

I suggest deleting and starting over. If you have the original data, what are you afraid of? If you continue from the last import, you still have to consider whether there are data loss issues.
