Pump unable to generate binlog

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pump无法产生binlog

| username: liyuntang

[TiDB Usage Environment]
Production environment deployed in k8s, with more than 1 million data tables

[TiDB Version]
v6.5.0

[Reproduction Path]
Start the pump program to synchronize binlog

[Encountered Issues: Problem Phenomenon and Impact]
1. The pump cannot generate binlog in the specified directory
2. When the pump service is shut down, all TiDB node pods crash. Why do they all crash?
3. What is the [DBStats] process, and what are the processing steps?

[Resource Configuration]

[Attachments: Screenshots/Logs/Monitoring]

TiDB node pod status when pump is shut down

TiDB node pod startup logs:
[2023/04/17 18:25:13.422 +08:00] [INFO] [client.go:328] [“[pumps client] write binlog to available pumps all failed, will try unavailable pumps”]
[2023/04/17 18:25:13.422 +08:00] [WARN] [session.go:2218] [“run statement failed”] [conn=7929359001148457417] [schemaVersion=3496320] [error=“[global:3]critical error write binlog failed, the last error no available pump to write binlog”] [session=“{\n "currDBName": "mysql",\n "id": 7929359001148457417,\n "status": 2,\n "strictMode": true,\n "user": {\n "Username": "root",\n "Hostname": "100.64.32.201",\n "CurrentUser": false,\n "AuthUsername": "root",\n "AuthHostname": "%"\n }\n}”]
[2023/04/17 18:25:13.422 +08:00] [FATAL] [conn.go:1138] [“critical error, stop the server”] [conn=7929359001148457417] [error=“[global:3]critical error write binlog failed, the last error no available pump to write binlog”] [stack=“github.com/pingcap/tidb/server.(*clientConn).Run\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1138\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:625”]

Pump logs:

[2023/04/17 17:31:29.571 +08:00] [INFO] [version.go:50] [“Welcome to Pump”] [“Release Version”=v6.5.0] [“Git Commit Hash”=589d79bcc4f0e9fa982847192baf6dd3eb3a0f41] [“Build TS”=“2022-12-16 08:19:30”] [“Go Version”=go1.19.3] [“Go OS/Arch”=linux/amd64]
[2023/04/17 17:31:29.571 +08:00] [INFO] [main.go:48] [“start pump…”] [config=“{"log-level":"info","node-id":"","addr":"http://10.40.224.48:8250","advertise-addr":"http://10.40.224.48:8250","socket":"","pd-urls":"http://36.0.3.246:2379,http://36.0.14.116:2379,http://36.0.13.229:2379","EtcdDialTimeout":5000000000,"data-dir":"/data/data.pump","heartbeat-interval":2,"gc":"7","log-file":"/data/pump.log","security":{"ssl-ca":"","ssl-cert":"","ssl-key":"","cert-allowed-cn":null},"gen-binlog-interval":3,"MetricsAddr":"","MetricsInterval":15,"storage":{"sync-log":null,"kv_chan_cap":0,"slow_write_threshold":0,"kv":null,"stop-write-at-available-space":null}}”]
[2023/04/17 17:31:29.571 +08:00] [INFO] [client.go:397] [“[pd] create pd client with endpoints”] [pd-address=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”]
[2023/04/17 17:31:29.578 +08:00] [INFO] [base_client.go:360] [“[pd] update member urls”] [old-urls=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”] [new-urls=“[http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-1.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-2.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379]”]
[2023/04/17 17:31:29.578 +08:00] [INFO] [base_client.go:378] [“[pd] switch leader”] [new-leader=http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379] [old-leader=]
[2023/04/17 17:31:29.578 +08:00] [INFO] [base_client.go:105] [“[pd] init cluster id”] [cluster-id=7194634827558957893]
[2023/04/17 17:31:29.578 +08:00] [INFO] [client.go:690] [“[pd] tso dispatcher created”] [dc-location=global]
[2023/04/17 17:31:29.578 +08:00] [INFO] [server.go:132] [“get clusterID success”] [clusterID=7194634827558957893]
[2023/04/17 17:31:29.578 +08:00] [INFO] [client.go:397] [“[pd] create pd client with endpoints”] [pd-address=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”]
[2023/04/17 17:31:29.583 +08:00] [INFO] [base_client.go:360] [“[pd] update member urls”] [old-urls=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”] [new-urls=“[http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-1.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-2.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379]”]
[2023/04/17 17:31:29.583 +08:00] [INFO] [base_client.go:378] [“[pd] switch leader”] [new-leader=http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379] [old-leader=]
[2023/04/17 17:31:29.583 +08:00] [INFO] [base_client.go:105] [“[pd] init cluster id”] [cluster-id=7194634827558957893]
[2023/04/17 17:31:29.583 +08:00] [INFO] [client.go:690] [“[pd] tso dispatcher created”] [dc-location=global]
[2023/04/17 17:31:29.584 +08:00] [INFO] [store.go:75] [“new store”] [path=“tikv://36.0.13.229:2379,36.0.14.116:2379,36.0.3.246:2379?disableGC=true”]
[2023/04/17 17:31:29.584 +08:00] [INFO] [client.go:397] [“[pd] create pd client with endpoints”] [pd-address=“[36.0.13.229:2379,36.0.14.116:2379,36.0.3.246:2379]”]
[2023/04/17 17:31:29.589 +08:00] [INFO] [base_client.go:360] [“[pd] update member urls”] [old-urls=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”] [new-urls=“[http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-1.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-2.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379]”]
[2023/04/17 17:31:29.589 +08:00] [INFO] [base_client.go:378] [“[pd] switch leader”] [new-leader=http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379] [old-leader=]
[2023/04/17 17:31:29.589 +08:00] [INFO] [base_client.go:105] [“[pd] init cluster id”] [cluster-id=7194634827558957893]
[2023/04/17 17:31:29.589 +08:00] [INFO] [client.go:690] [“[pd] tso dispatcher created”] [dc-location=global]
[2023/04/17 17:31:29.590 +08:00] [INFO] [store.go:81] [“new store with retry success”]
[2023/04/17 17:31:29.590 +08:00] [INFO] [storage.go:138] [NewAppendWithResolver] [options=“{"ValueLogFileSize":524288000,"Sync":true,"KVChanCapacity":1048576,"SlowWriteThreshold":1,"StopWriteAtAvailableSpace":10737418240,"KVConfig":null}”]
[2023/04/17 17:31:29.590 +08:00] [INFO] [storage.go:1408] [“open metadata db”] [config=“{"block-cache-capacity":8388608,"block-restart-interval":16,"block-size":4096,"compaction-L0-trigger":8,"compaction-table-size":67108864,"compaction-total-size":536870912,"compaction-total-size-multiplier":8,"write-buffer":67108864,"write-L0-pause-trigger":24,"write-L0-slowdown-trigger":17}”]
[2023/04/17 17:31:29.595 +08:00] [INFO] [storage.go:220] [“Append info”] [gcTS=0] [maxCommitTS=0] [headPointer=“{"Fid":0,"Offset":0}”] [handlePointer=“{"Fid":0,"Offset":0}”]
[2023/04/17 17:31:29.600 +08:00] [INFO] [server.go:440] [“register success”] [NodeID=ksc_epc:8250]
[2023/04/17 17:31:29.601 +08:00] [INFO] [server.go:457] [“start to server request”] [addr=http://10.40.224.48:8250]
[2023/04/17 17:31:39.597 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":461,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:31:39.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853829665947650]
[2023/04/17 17:31:49.596 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":1038,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:31:49.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853832024981509]
[2023/04/17 17:31:59.597 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":1530,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:31:59.599 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853834384277507]
[2023/04/17 17:32:09.596 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":2101,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:32:09.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853837530005507]
[2023/04/17 17:32:19.596 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats="{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":2678,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"

| username: xfworld

Use TiCDC instead. Don't use the binlog component anymore; it's not compatible.

| username: tidb菜鸟一只

With version 6.5, don’t use binlog anymore.

| username: db_user

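All the TiDB nodes crashed because you shut down Pump without adjusting TiDB's binlog-related parameters. With binlog enabled, TiDB checks that every binlog write succeeds, and when no Pump is available it treats the failure as critical and stops the server. That is the FATAL "critical error, stop the server" line in your TiDB log. Just turn off the binlog parameters on TiDB.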

| username: WalterWj

Configure ignore-error in the binlog section of the TiDB configuration, so that a failed binlog write does not stop the server.
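
For reference, a minimal sketch of that section of the TiDB configuration file, assuming binlog is meant to stay enabled. The two keys are the standard TiDB binlog options; how the file reaches the TiDB pods in a k8s deployment depends on your setup and is not shown here.

```toml
# Sketch of the binlog section of the TiDB config file (not a complete config).
[binlog]
# Keep writing binlog to Pump.
enable = true
# On a binlog write failure, stop writing binlog instead of stopping the TiDB server.
ignore-error = true
```

Note that with ignore-error enabled, a write failure means binlog is no longer being recorded, so downstream replication would have gaps until the problem is fixed and TiDB is restarted.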

| username: 胡杨树旁

Is the binlog enabled in the configuration file? How is the pump set up?
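
One way to check this from the Pump side: the "server info tick" lines in the Pump log above report writeBinlogCount, and it stays at 0 in every tick, which suggests Pump never received a write from TiDB. Below is a rough Python sketch that pulls those counters out of a log. The function names are made up for illustration, and the regex assumes the tick format shown in the posted log.

```python
import re

# Matches the three counters in a Pump "server info tick" line,
# in the order they appear in the log posted above.
TICK_RE = re.compile(
    r"writeBinlogCount=(\d+)\] \[alivePullerCount=(\d+)\] \[MaxCommitTS=(\d+)\]"
)


def parse_ticks(log_text):
    """Return one (writeBinlogCount, alivePullerCount, MaxCommitTS) tuple per tick."""
    return [tuple(int(g) for g in m.groups()) for m in TICK_RE.finditer(log_text)]


def pump_received_binlog(log_text):
    """True if any tick reports a non-zero writeBinlogCount."""
    return any(count > 0 for count, _, _ in parse_ticks(log_text))


# Two tick lines copied from the Pump log in this thread.
sample = "\n".join([
    '[2023/04/17 17:31:39.598 +08:00] [INFO] [server.go:563] ["server info tick"] '
    '[writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853829665947650]',
    '[2023/04/17 17:31:49.598 +08:00] [INFO] [server.go:563] ["server info tick"] '
    '[writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853832024981509]',
])

print(parse_ticks(sample))
print(pump_received_binlog(sample))
```

If this stays False while the application is writing data, the problem is more likely on the TiDB side (binlog not enabled, or TiDB not reaching Pump) than in Pump's data directory.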

| username: liuis

It looks like an operational error. You need to disable binlog first, and then restart the pump.