When importing table data using 4 instances of TiDB Lightning, the process ended with the message "tidb lightning exit successfully" on 3 instances, but on one instance it continues to run. On that instance, the message [DEBUG] [checksum.go:415] ["update PD safePoint limit with TTL"] [currnet_ts=442641659888402494] is repeated in the logs. I found on the Internet that the command ./pd-ctl service-gc-safepoint shows information about 'safe_point'.
Here’s what this command shows on the PD that this Lightning instance is connecting to:
{
  "service_gc_safe_points": [
    {
      "service_id": "gc_worker",
      "expired_at": 9223372036854775807,
      "safe_point": 442642849380171776
    },
    {
      "service_id": "lightning-8bd078d3-6410-40d1-8370-5b9c20b05885",
      "expired_at": 1688549529,
      "safe_point": 442641659888402494
    }
  ],
  "gc_safe_point": 442641339889483776
}
Can you tell me how to fix this so that the Lightning instance can successfully complete the import?
TiDB version: v7.1.0
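To read the output above, it can help to convert the "safe_point" values into wall-clock time so they can be compared with "expired_at", which looks like a Unix timestamp in seconds. The sketch below assumes the usual TiDB TSO layout, where the low 18 bits are a logical counter and the remaining high bits are the physical time in milliseconds. The function names are made up for illustration and are not part of pd-ctl.

```shell
#!/bin/sh
# Helpers for reading TSO values such as "safe_point" in the pd-ctl output.
# Assumption: a TSO is (physical_ms << 18) | logical.

# Physical part of the TSO, in Unix milliseconds.
tso_physical_ms() { echo $(( $1 >> 18 )); }

# Logical counter stored in the low 18 bits.
tso_logical() { echo $(( $1 & ((1 << 18) - 1) )); }

# Physical part in whole Unix seconds, comparable with "expired_at".
tso_unix_seconds() { echo $(( ($1 >> 18) / 1000 )); }

# Human-readable UTC time (relies on GNU date's "-d @SECONDS" form).
tso_to_utc() { date -u -d "@$(tso_unix_seconds "$1")" '+%Y-%m-%d %H:%M:%S UTC'; }
```

For example, `tso_unix_seconds 442641659888402494` gives 1688543929 for the Lightning entry, which is 5600 seconds earlier than its "expired_at" value of 1688549529.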
Based on the information you provided, it seems that the TiDB Lightning process on one instance is still running and not exiting successfully. The repeated message in the logs indicates that the process is updating the PD safePoint limit with TTL.
To check the safe_point information, you can use the command ./pd-ctl service-gc-safepoint. However, the information you provided seems to be incomplete. Could you please provide the complete output of the command?
In general, if the TiDB Lightning process is still running, you can try to stop it manually by sending a SIGTERM signal to the process. You can find the process ID (PID) of the TiDB Lightning process with the ps command, and then use the kill command to send the SIGTERM signal to it. For example:
ps aux | grep tidb-lightning
kill -15 <PID>
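If you want to confirm that the process has actually exited before restarting it, the two commands above can be wrapped in a small helper. This is only a sketch: stop_gracefully is a hypothetical name, and the 30-second default timeout is an arbitrary assumption, not a value recommended by TiDB Lightning.

```shell
#!/bin/sh
# Send SIGTERM to a PID and wait until the process is gone.
# Usage: stop_gracefully <PID> [timeout_seconds]
stop_gracefully() {
  local pid="$1" timeout="${2:-30}" waited=0

  # If the signal cannot be delivered, treat the process as already gone.
  kill -15 "$pid" 2>/dev/null || return 0

  # "kill -0" only checks whether the PID still exists; it sends no signal.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "PID $pid is still running after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

Call it as `stop_gracefully <PID>`, using the PID found with ps. A non-zero return means the process did not exit within the timeout, which is worth noting when you share the logs.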
After stopping the TiDB Lightning process, you can check the logs to see if there are any error messages or warnings that might indicate the cause of the issue. If you see errors in the log that result in data inaccuracies, you can follow the steps described in [1] to resolve the issue.
If you need further assistance, please provide more information about the issue and the complete output of the ./pd-ctl service-gc-safepoint command.
[1]: TiDB Lightning Distributed Import - Checkpoint Error Handling
Before your message, I stopped the process with the command kill -2 PID and started it again. The problem has not been solved. Now pd-ctl service-gc-safepoint returns:
{
  "service_gc_safe_points": [
    {
      "service_id": "gc_worker",
      "expired_at": 9223372036854775807,
      "safe_point": 442644998856114176
    },
    {
      "service_id": "lightning-8bd078d3-6410-40d1-8370-5b9c20b05885",
      "expired_at": 1688557529,
      "safe_point": 442641659888402494
    },
    {
      "service_id": "lightning-8ffffaba-aed5-4121-bd12-dfe8d8aaadc5",
      "expired_at": 1688557815,
      "safe_point": 442645142786277378
    }
  ],
  "gc_safe_point": 442641339889483776
}
Next, I stopped the restarted process. I am attaching files with errors from the tidb-lightning logs:
error_after_kill_15 (15.8 KB)
last_1000_rows_from_log (248.1 KB)