Error When Exporting Data with Dumpling (Part 1)

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: dumpling导出数据报错一

| username: jingyesi3401

[TiDB Usage Environment] Production Environment

[TiDB Version] v5.1.0

[Encountered Issue] When exporting data with the dumpling command, the following error occurs (tikv_gc_life_time has already been set to 10h). The same data can be queried normally through Navicat.

  1. Export command:

  2. Error message:
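
For reference, in v5.1 the GC lifetime mentioned above is typically checked or extended through the mysql.tidb table; a minimal sketch, with a hypothetical tidb-server address and credentials:

    # Check the current GC lifetime
    mysql -h 127.0.0.1 -P 4000 -u root -p -e \
      "SELECT VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time';"

    # Extend it to 10h so a long export's snapshot is not garbage-collected mid-run
    mysql -h 127.0.0.1 -P 4000 -u root -p -e \
      "UPDATE mysql.tidb SET VARIABLE_VALUE = '10h' WHERE VARIABLE_NAME = 'tikv_gc_life_time';"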

| username: hey-hoho | Original post link

I suspect that the export concurrency is too high, which crashed the TiDB node.

  1. Try reducing the number of threads (a minimal sketch follows below).
  2. Check whether the TiDB node has restarted.
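
A minimal sketch of a lower-concurrency export (host, credentials, and output path are hypothetical):

    # -t controls the number of dump threads; fewer threads means less
    # concurrent query load on the tidb-server
    dumpling -h <tidb-host> -P 4000 -u root -p '<password>' \
      -t 4 -o /data/export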

| username: jingyesi3401 | Original post link

The cluster status is normal. Let me try reducing the number of threads.

| username: jingyesi3401 | Original post link

The number of threads has already been adjusted to 16, but the error still persists.

| username: hey-hoho | Original post link

Please upload the TiDB logs from the time of the error.

| username: jingyesi3401 | Original post link

I ran the dumpling export on the relay server. I'm sending the TiDB logs now; please help analyze the cause. Thank you! tidb.rar (2.0 MB)

| username: hey-hoho | Original post link

What is the configuration of the machine running dumpling? If it is not well provisioned, it is recommended to drop the thread count back to the default of 4.

Additionally, how many rows of data meet the export conditions? Try adding the -r 10000 parameter to export in batches.

From the existing log information, it appears that dumpling closed the connection.
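
Putting the two suggestions together, the invocation would look roughly like this (connection details and paths are hypothetical):

    # 4 threads, and split each table into chunks of about 10000 rows
    # so a single huge query is not held open for the whole export
    dumpling -h <tidb-host> -P 4000 -u root -p '<password>' \
      -t 4 -r 10000 -o /data/export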

| username: jingyesi3401 | Original post link

Relay server configuration: 32-core CPU, 64 GB memory. The thread count has been changed to 4 (it used to be 32) and -r is set to 1000, but the problem persists.
Image 1: Export command

Image 2: Export log 1

Image 3: Export log 2

| username: TammyLi | Original post link

From the error message, it looks like TiDB restarted during the dumpling export (e.g., due to an OOM). You can confirm this by checking TiDB uptime and TiDB memory in Grafana. How much memory do your TiDB nodes have? Also, it is best if the dumpling version is >= the TiDB version.
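
Besides Grafana, a restart can also be confirmed from the command line; a sketch, assuming default deploy paths (adjust to your own):

    # A new "Welcome to TiDB" line in the log means the tidb-server process restarted
    grep -n 'Welcome to TiDB' /data/tidb-deploy/tidb-4000/log/tidb.log | tail

    # The kernel logs OOM-killer activity, which would explain such a restart
    dmesg -T | grep -i 'out of memory'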

| username: jingyesi3401 | Original post link

TiDB: 3 nodes, each with 16 cores and 32 GB of memory.
Image 1: uptime

TiDB version: v5.1.0
dumpling version: 5.1.0

| username: jingyesi3401 | Original post link

Moreover, I tested exporting just a single statement and hit the same error, so it does not appear to be related to the cluster parameters.
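
For context, a single-statement export uses dumpling's --sql flag; a sketch with a hypothetical query and connection details:

    # Export the result set of one query; the docs pair --sql with CSV output
    dumpling -h <tidb-host> -P 4000 -u root -p '<password>' \
      --sql 'SELECT * FROM db1.t1 WHERE create_time >= "2022-01-01"' \
      --filetype csv -o /data/export-sql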

Image 1: Export command

Image 2: Error message

Image 3: Cluster parameter information.

config show all
{
  "client-urls": "http://0.0.0.0:2379",
  "peer-urls": "http://0.0.0.0:2380",
  "advertise-client-urls": "http://x.x.x.x:2379",
  "advertise-peer-urls": "http://x.x.x.x:2380",
  "name": "pd-x.x.x.x-2379",
  "data-dir": "/data/tidb-data/pd-2379",
  "force-new-cluster": false,
  "enable-grpc-gateway": true,
  "initial-cluster": "pd-x.x.x.x-2379=http://x.x.x.x:2380,pd-x.x.x.x-2379=http://x.x.x.x:2380,pd-x.x.x.x-2379=http://x.x.x.x:2380",
  "initial-cluster-state": "new",
  "initial-cluster-token": "pd-cluster",
  "join": "",
  "lease": 3,
  "log": {
    "level": "",
    "format": "text",
    "disable-timestamp": false,
    "file": {
      "filename": "/data/tidb-deploy/pd-2379/log/pd.log",
      "max-size": 300,
      "max-days": 0,
      "max-backups": 0
    },
    "development": false,
    "disable-caller": false,
    "disable-stacktrace": false,
    "disable-error-verbose": true,
    "sampling": null
  },
  "tso-save-interval": "3s",
  "tso-update-physical-interval": "50ms",
  "enable-local-tso": false,
  "metric": {
    "job": "pd-x.x.x.x-2379",
    "address": "",
    "interval": "15s"
  },
  "schedule": {
    "max-snapshot-count": 48,
    "max-pending-peer-count": 3,
    "max-merge-region-size": 20,
    "max-merge-region-keys": 200000,
    "split-merge-interval": "1h0m0s",
    "enable-one-way-merge": "false",
    "enable-cross-table-merge": "true",
    "patrol-region-interval": "100ms",
    "max-store-down-time": "30m0s",
    "leader-schedule-limit": 4,
    "leader-schedule-policy": "count",
    "region-schedule-limit": 2048,
    "replica-schedule-limit": 64,
    "merge-schedule-limit": 8,
    "hot-region-schedule-limit": 4,
    "hot-region-cache-hits-threshold": 3,
    "store-limit": {
      "1": {
        "add-peer": 15,
        "remove-peer": 15
      },
      "2": {
        "add-peer": 15,
        "remove-peer": 15
      },
      "4376258": {
        "add-peer": 15,
        "remove-peer": 15
      },
      "5": {
        "add-peer": 15,
        "remove-peer": 15
      },
      "56": {
        "add-peer": 30,
        "remove-peer": 30
      },
      "57": {
        "add-peer": 30,
        "remove-peer": 30
      },
      "6": {
        "add-peer": 15,
        "remove-peer": 15
      },
      "7": {
        "add-peer": 15,
        "remove-peer": 15
      }
    },
    "tolerant-size-ratio": 0,
    "low-space-ratio": 0.8,
    "high-space-ratio": 0.7,
    "region-score-formula-version": "v2",
    "scheduler-max-waiting-operator": 5,
    "enable-remove-down-replica": "true",
    "enable-replace-offline-replica": "true",
    "enable-make-up-replica": "true",
    "enable-remove-extra-replica": "true",
    "enable-location-replacement": "true",
    "enable-debug-metrics": "false",
    "enable-joint-consensus": "true",
    "schedulers-v2": [
      {
        "type": "balance-region",
        "args": null,
        "disable": false,
        "args-payload": ""
      },
      {
        "type": "balance-leader",
        "args": null,
        "disable": false,
        "args-payload": ""
      },
      {
        "type": "hot-region",
        "args": null,
        "disable": false,
        "args-payload": ""
      },
      {
        "type": "label",
        "args": null,
        "disable": false,
        "args-payload": ""
      }
    ],
    "schedulers-payload": {
      "balance-hot-region-scheduler": null,
      "balance-leader-scheduler": {
        "name": "balance-leader-scheduler",
        "ranges": [
          {
            "end-key": "",
            "start-key": ""
          }
        ]
      },
      "balance-region-scheduler": {
        "name": "balance-region-scheduler",
        "ranges": [
          {
            "end-key": "",
            "start-key": ""
          }
        ]
      },
      "label-scheduler": {
        "name": "label-scheduler",
        "ranges": [
          {
            "end-key": "",
            "start-key": ""
          }
        ]
      }
    },
    "store-limit-mode": "manual"
  },
  "replication": {
    "max-replicas": 3,
    "location-labels": "",
    "strictly-match-label": "false",
    "enable-placement-rules": "true",
    "isolation-level": ""
  },
  "pd-server": {
    "use-region-storage": "true",
    "max-gap-reset-ts": "24h0m0s",
    "key-type": "table",
    "runtime-services": "",
    "metric-storage": "",
    "dashboard-address": "http://x.x.x.x:2379",
    "flow-round-by-digit": 3
  },
  "cluster-version": "5.1.0",
  "labels": {},
  "quota-backend-bytes": "8GiB",
  "auto-compaction-mode": "periodic",
  "auto-compaction-retention-v2": "1h",
  "TickInterval": "500ms",
  "ElectionInterval": "3s",
  "PreVote": true,
  "security": {
    "cacert-path": "",
    "cert-path": "",
    "key-path": "",
    "cert-allowed-cn": null,
    "redact-info-log": false,
    "encryption": {
      "data-encryption-method": "plaintext",
      "data-key-rotation-period": "168h0m0s",
      "master-key": {
        "type": "plaintext",
        "key-id": "",
        "region": "",
        "endpoint": "",
        "path": ""
      }
    }
  },
  "label-property": {},
  "WarningMsgs": null,
  "DisableStrictReconfigCheck": false,
  "HeartbeatStreamBindInterval": "1m0s",
  "LeaderPriorityCheckInterval": "1m0s",
  "dashboard": {
    "tidb-cacert-path": "",
    "tidb-cert-path": "",
    "tidb-key-path": "",
    "public-path-prefix": "",
    "internal-proxy": false,
    "enable-telemetry": true,
    "enable-experimental": false
  },
  "replication-mode": {
    "replication-mode": "majority",
    "dr-auto-sync": {
      "label-key": "",
      "primary": "",
      "dr": "",
      "primary-replicas": 0,
      "dr-replicas": 0,
      "wait-store-timeout": "1m0s",
      "wait-sync-timeout": "1m0s",
      "wait-async-timeout": "2m0s"
    }
  }
}

| username: db_user | Original post link

The error says the connection is broken. On the machine running dumpling, check whether you can connect normally using the same connection information you pass to dumpling.
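
A quick way to do that is to connect with the exact host, port, and user that dumpling is given (values hypothetical):

    # If this fails or hangs, dumpling will fail the same way
    mysql -h <tidb-host> -P 4000 -u <user> -p -e 'SELECT tidb_version();'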

| username: jingyesi3401 | Original post link

Remote login to the server works, and telnet to the database port also succeeds.

| username: jingyesi3401 | Original post link

I exported data from another database without any issues (with threads set to 32 and -F set to 10240Mi), and I also tested running dumpling on other servers, resulting in the same outcome.

| username: db_user | Original post link

Could you copy the table to another database and then run the same command against both databases to compare? This situation shouldn't normally occur.

| username: jingyesi3401 | Original post link

I ran the same query command in both the old and new environments (same table but different data volumes). It works in the old environment but not in the new one. Could it be that the large data volume in this table is causing the issue?

| username: db_user | Original post link

Well, with a table that large, I have a few guesses:

  1. An OOM occurred during the query. You can check the TiDB logs for "Welcome to TiDB" lines, which are printed at startup and therefore indicate a restart. I don't have rar installed, so I can't decompress and check your upload.
  2. The query took too long and dumpling's connection timed out. You can check on TiDB how long the statement actually takes (see the sketch after this list).
  3. The table structure might be corrupted. However, checking a file that's over 10TB is quite a hassle…
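
For guess 2, one way to see how long the export statement actually runs is the slow query table (a sketch; connection details hypothetical):

    # Recent slow statements, with their durations in query_time;
    # look for the SELECT that dumpling issues
    mysql -h <tidb-host> -P 4000 -u root -p -e \
      "SELECT time, query_time, query FROM information_schema.slow_query ORDER BY time DESC LIMIT 10;"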

| username: xingzhenxiang | Original post link

Try switching to mysqldumper.

| username: Min_Chen | Original post link

Hello, please send the tidb-server logs when exporting a single statement using dumpling. If there are multiple tidb-servers and the export is done through load balancing, please send the logs of all tidb-servers.

| username: jingyesi3401 | Original post link

Exporting with mysqldumper does work.