Checkpoint TSO Cannot Be Updated in Real-Time

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: checkpoint tso无法实时更新

| username: 答辩潮人

【TiDB Usage Environment】Production Environment
【TiDB Version】4.0.15
【Encountered Problem: Phenomenon and Impact】The drainer's checkpoint TSO monitoring item is not updated in real time during synchronization, which can trigger the binlog_drainer_checkpoint_high_delay alert. Also, is there detailed documentation explaining the fields of the tidb_log.checkpoint table?

| username: tidb狂热爱好者 | Original post link

The version is too old.

| username: 答辩潮人 | Original post link

:sob: I didn’t encounter any issues during my testing.

| username: Raymond | Original post link

I have looked at the code for this before, and I am curious as well. If the drainer's downstream is a file, the commit TSO in the savepoint file is updated as follows: a DDL updates the savepoint file immediately, while a fake binlog or DML binlog updates it only if more than 3 seconds have passed since the last update. As a result, even when no data is being written to the cluster, the savepoint file keeps being updated roughly every 3 seconds. That is why I am also curious why the commit TSO is not being updated here. If I have time later, I will look into it further. For now, I suspect the machine hosting the drainer is under some pressure, such as high memory or CPU usage.
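To make the rule above concrete, here is a minimal Go sketch of the decision as I described it. It is not the drainer source: `saveInterval` and `shouldSave` are hypothetical names, and the 3-second value is simply the interval mentioned above.

```go
package main

import (
	"fmt"
	"time"
)

// saveInterval mirrors the roughly 3-second interval described in the post.
// The name is hypothetical and not taken from the drainer source.
const saveInterval = 3 * time.Second

// shouldSave reports whether the savepoint should be written for a binlog.
// isDDL says whether the binlog carries a DDL; sinceLastSave is the time
// elapsed since the savepoint was last written.
func shouldSave(isDDL bool, sinceLastSave time.Duration) bool {
	if isDDL {
		return true // a DDL persists the savepoint immediately
	}
	// A fake binlog or DML binlog persists it only after the interval has passed.
	return sinceLastSave > saveInterval
}

func main() {
	fmt.Println(shouldSave(true, 0))              // DDL
	fmt.Println(shouldSave(false, 1*time.Second)) // DML shortly after the last save
	fmt.Println(shouldSave(false, 4*time.Second)) // fake binlog on an idle cluster
}
```

Under this rule an idle cluster should still advance the savepoint, because fake binlogs keep arriving and each one is checked against the interval.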

| username: Raymond | Original post link

Another possibility is that data retrieval from the pump is slow. The drainer reaches the savepoint-update logic only after it has obtained data from the pump.

| username: Raymond | Original post link

It should be what I described above.
Another possibility is that data is being fetched slowly from the pump.
You can check this in Grafana:
look at the “tidb-binlog-drainer-Pump Handle TSO” monitoring graph
and see whether the curve is flat during the time period of the alert.
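If you export the samples from that panel for the alert window, "flat" can be checked mechanically. A small Go sketch, assuming the samples are already available as a slice of TSO values in time order (`isFlat` is a hypothetical helper, not part of any TiDB tool):

```go
package main

import "fmt"

// isFlat reports whether a time-ordered series of TSO samples never advances.
// With fewer than two samples nothing can be concluded, so it returns false.
func isFlat(samples []uint64) bool {
	if len(samples) < 2 {
		return false
	}
	for i := 1; i < len(samples); i++ {
		if samples[i] > samples[i-1] {
			return false // the TSO moved forward at least once
		}
	}
	return true
}

func main() {
	fmt.Println(isFlat([]uint64{100, 100, 100})) // no progress in the window
	fmt.Println(isFlat([]uint64{100, 100, 101})) // progress was made
}
```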

| username: redgame | Original post link

Sure, I can update it.

| username: 答辩潮人 | Original post link

Sorry, I wasn’t online over the weekend.

There is no pressure on the pump. Whether or not the anomaly is caused by configuration or data, what I can confirm is that the TSO does not update when there is no data and updates immediately when there is data. Could you take a look at the content above and see whether anything looks wrong?

| username: Raymond | Original post link

"What can be confirmed is that when there is no data, the TSO will not update" ----> Is your drainer’s downstream target a file?
If it is a file, this conclusion does not hold, because the pump generates fake binlogs.
After the drainer receives a fake binlog generated by the pump, it updates the commitTS in the savepoint file approximately every 3 seconds. You can observe this yourself and easily confirm it.
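To see whether the saved commitTS is actually advancing, it helps to convert it to wall-clock time. A TiDB TSO keeps the physical time in Unix milliseconds in its high bits and an 18-bit logical counter in the low bits. A minimal Go sketch follows; the helper names are hypothetical and the sample value is built by hand for illustration, not taken from a real cluster.

```go
package main

import (
	"fmt"
	"time"
)

// logicalBits is the number of low bits a TiDB TSO reserves for the logical counter.
const logicalBits = 18

// tsoPhysicalMillis extracts the physical part of a TSO as Unix milliseconds.
func tsoPhysicalMillis(tso uint64) int64 {
	return int64(tso >> logicalBits)
}

// checkpointLagMillis returns how far the checkpoint TSO is behind nowMillis.
func checkpointLagMillis(tso uint64, nowMillis int64) int64 {
	return nowMillis - tsoPhysicalMillis(tso)
}

func main() {
	// Illustrative TSO: a known physical timestamp plus a logical counter of 5.
	physical := int64(1700000000000)
	tso := uint64(physical)<<logicalBits | 5

	fmt.Println(time.UnixMilli(tsoPhysicalMillis(tso)).UTC())
	// Pretend "now" is 4 seconds after the checkpoint.
	fmt.Println(checkpointLagMillis(tso, physical+4000))
}
```

Reading the commitTS twice a few seconds apart and comparing the decoded times shows directly whether the checkpoint is moving while the cluster is idle.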

| username: 答辩潮人 | Original post link

It’s not a file; it is synchronizing from one TiDB cluster to another. Normally the information in the savepoint table should keep being updated during synchronization, but now the savepoint table is updated only when there is a data update.
