How to obtain the TSO of a TiDB cluster for use as the start-ts in a CDC changefeed

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 怎么获取tidb 集群的tso 用作cdc changefeed 的start-ts

| username: Raymond

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Issue Phenomenon and Impact]
[Resource Configuration]
[Attachments: Screenshots / Logs / Monitoring]
Dear teachers, when using ticdc to synchronize data from one TiDB cluster to another TiDB cluster or Kafka, the start-ts is required when starting the changefeed. How can I obtain the current TSO of the TiDB cluster to use as the start-ts? Is there any command for this?

| username: dba-kit | Original post link

If you want to use the current TiDB backup to set up a new TiDB cluster, there are two methods:

  1. Dumpling + TiDB-Lightning method: Similar to mydumper, you can find the TSO at the time of backup in the meta file of dumpling.
  2. BR backup + BR restore method: After a successful BR backup, you can find the TSO value in the standard output.

If you only want to synchronize the latest data to the new cluster and do not care about historical data, you can use the current timestamp (in milliseconds) and shift it left by 18 bits to get the current TSO value.

MySQL []> select REPLACE(unix_timestamp(current_timestamp(3)),'.','') <<18 as current_tso;
+--------------------+
| current_tso        |
+--------------------+
| 438595539700809728 |
+--------------------+
1 row in set (0.000 sec)
| username: dba-kit | Original post link

In the PD page of Grafana monitoring under the cluster category, you can also see the current TSO information
image

PS: This value is a converted timestamp and not the actual TSO value, so just ignore it.

| username: dba-kit | Original post link

I found another way to get TSO

MySQL [(none)]> begin;select @@tidb_current_ts;
Query OK, 0 rows affected (0.000 sec)

+--------------------+
| @@tidb_current_ts  |
+--------------------+
| 438595309234028545 |
+--------------------+
1 row in set (0.000 sec)

| username: weixiaobing | Original post link

If it’s not a full initialization, the default start-ts is the current time, so there’s no need to specify it explicitly.

| username: 裤衩儿飞上天 | Original post link

May I ask if it is possible to directly specify in a time format, such as “2023-01-01 00:00:00”?

| username: weixiaobing | Original post link

TiDB has a dedicated function for converting time to TSO, tidb_parse_tso

| username: Raymond | Original post link

The official website explains that it is the TSO of the current transaction. Can it be used as the TSO for the entire transaction?

| username: Raymond | Original post link

Yes, this is correct.

  1. If it is a cluster that already has data, there will be a TSO when backing up using the backup tool.
  2. If it is a cluster without data, you can use CDC for synchronization without specifying TS, and it can be synchronized from the current start-ts.
| username: dba-kit | Original post link

According to the official documentation, TSO is a timestamp that represents a point in time. If you need to start synchronization from the current point, it doesn’t matter whether it’s the TSO of a transaction or the cluster; you can just choose a point in time. Any changes made by transactions greater than this TSO will be synchronized. (Of course, as mentioned above, for your scenario, you can simply not specify start-ts.)

TSO stands for Time Stamp Oracle, which is a monotonically increasing timestamp provided by PD (Placement Driver) for each transaction.

| username: Raymond | Original post link

Yes, thank you for your reply.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.