Is the timestamp of RowChangedEvent in TiCDC the same as the timestamp of a TiDB transaction?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiCDC 中 RowChangedEvent 的时间戳是否为 TiDB 事务的时间戳

| username: Tsunaou

[TiDB Usage Environment] Test environment set up with TiUP
[TiDB Version] v6.5.1
[Encountered Issue: Phenomenon and Impact]
In an optimistic transaction, two consecutive update operations are performed on the same row, and the timestamps corresponding to these two operations are captured in the RowChangedEvent in TiCDC (TiFlow).

In the experiment, I created a table to represent key-value mappings:

CREATE TABLE IF NOT EXISTS txn0 (k INT UNIQUE NOT NULL PRIMARY KEY, v INT NOT NULL);

Using MySQL’s ON DUPLICATE KEY UPDATE syntax to represent a write operation w(key, value)

INSERT INTO txn0 (k, v) VALUES (key, value) ON DUPLICATE KEY UPDATE v = value

However, during the experiment, I observed two rather peculiar phenomena:

Phenomenon 1: Failed transaction operations are also captured by RowChangedEvent

In such a failed transaction: a transaction conflict occurred

{
        :type :fail, 
        :f :txn, 
        :value [[:r 38 nil] [:w 47 3]], 
        :time 7791358885, 
        :process 29, 
        :error [:write-conflict 
            "Write conflict, 
            txnStartTS=440395008611188895,
            conflictStartTS=440395008611188878, 
            conflictCommitTS=440395008611188926, 
            key={tableID=140, tableName=test.txn0, indexID=1, indexValues={47, }}, 
            originalKey=74800000000000008c5f69800000000000000103800000000000002f, 
            primary={
                tableID=140, 
                tableName=test.txn0, indexID=1, indexValues={47, }
            }, 
            originalPrimaryKey=74800000000000008c5f69800000000000000103800000000000002f, 
            reason=Optimistic [try again later]"
        ], 
        :index 1044
}

However, we observed this update operation in the RowChangedEvent


The corresponding log output is

[2023/03/28 11:22:40.913 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row="{\"start-ts\":440395041365295225,\"commit-ts\":440395041365295255,\"row-id\":47,\"table\":{\"db-name\":\"test\",\"tbl-name\":\"txn0\",\"tbl-id\":143,\"is-partition\":false},\"column-infos\":[{\"ID\":1,\"IsPKHandle\":true,\"VirtualGenCol\":false,\"Ft\":{\"Tp\":3,\"Flag\":4103,\"Flen\":11,\"Decimal\":0,\"Charset\":\"binary\",\"Collate\":\"binary\",\"Elems\":null,\"ElemsIsBinaryLit\":null}},{\"ID\":2,\"IsPKHandle\":false,\"VirtualGenCol\":false,\"Ft\":{\"Tp\":3,\"Flag\":4097,\"Flen\":11,\"Decimal\":0,\"Charset\":\"binary\",\"Collate\":\"binary\",\"Elems\":null,\"ElemsIsBinaryLit\":null}}],\"columns\":[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":47,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3,\"default\":null}],\"pre-columns\":null}"]

What confuses me here is that pre-columns is empty, but the log clearly states that it conflicted with the transaction with start_ts 440395008611188878. After investigation, it was indeed this transaction.
image
So, does this mean that the definition of pre-columns is not the previous state before the modification?
Moreover, I did not enable the transaction retry feature, but the captured start_ts here is 440395041365295225, which is inconsistent with the txnStartTS=440395008611188895 in the transaction conflict error. Does this mean that the timestamps of RowChangedEvent and the transaction timestamps do not match?

Phenomenon 2: The timestamps corresponding to the RowChangedEvent of two operations in the same transaction are inconsistent

Similarly, in the previous execution, there was a successful transaction:

{
        :type :ok, 
        :f :txn, 
        :value [[:w 77 10] [:w 77 11]], 
        :time 9705289105, 
        :process 3, 
        :index 1982
}

It updated w(77,10) and w(77,11) respectively. Since they are the same key, there should be no data partitioning issue.
However, the corresponding timestamps are inconsistent, and the pre-columns information does not match, which is quite puzzling.

image

[Attachments: Screenshots/Logs/Monitoring]
cdc.json (138.4 KB)
cdc-out.log (974.2 KB)
history.edn (964.1 KB)

| username: Billmay表妹 | Original post link

Based on the information you provided, you performed two consecutive update operations on the same row within an optimistic transaction and captured the timestamps of the corresponding RowChangedEvent in TiCDC (TiFlow). In the experiment, you observed two rather peculiar phenomena:

  1. The failed transaction operation was also captured by RowChangedEvent, but the pre-columns were empty, and the timestamp did not match the transaction’s timestamp.
  2. The timestamps corresponding to the RowChangedEvent of the two operations within the same transaction were inconsistent, and the pre-columns information did not match either.

For the first issue, the definition of pre-columns is the previous state before the modification, so it is expected that pre-columns are empty. As for the timestamp mismatch, it might be due to the implementation mechanism of TiCDC (TiFlow). TiCDC (TiFlow) captures data changes by listening to TiKV’s Raft logs, and the timestamps in the Raft logs are the Leader node’s timestamps, not TiDB’s timestamps. Therefore, a timestamp mismatch might occur.

For the second issue, the inconsistent timestamps corresponding to the RowChangedEvent of the two operations within the same transaction might also be due to the implementation mechanism of TiCDC (TiFlow). TiCDC (TiFlow) captures data changes by listening to TiKV’s Raft logs, and the timestamps in the Raft logs are the Leader node’s timestamps, not TiDB’s timestamps. Therefore, a timestamp inconsistency might occur. As for the mismatch in pre-columns information, it might be because TiCDC (TiFlow) generates separate RowChangedEvents for multiple operations within the same transaction, and the pre-columns information of these events might be inconsistent.

In summary, TiCDC (TiFlow) captures data changes by listening to TiKV’s Raft logs, and the timestamps in the Raft logs are the Leader node’s timestamps, not TiDB’s timestamps. Therefore, a timestamp mismatch might occur when using TiCDC (TiFlow). Additionally, for multiple operations within the same transaction, TiCDC (TiFlow) generates separate RowChangedEvents, and the pre-columns information of these events might be inconsistent.

| username: Tsunaou | Original post link

Thank you very much for your answer!

According to what you said, the Raft log does not actually save the timestamp information of the transaction, or TiCDC’s monitoring does not utilize this part of the data.

My understanding of RowChangedEvent comes from this series of articles:

Column - TiCDC Source Code Interpretation (1) – Overview of TiCDC Architecture | TiDB Community

I initially assumed that the StartTs and CommitTs in the RowChangedEvent structure were the timestamps of the transaction. Of course, as far as I know, there is indeed a TiDB timestamp information recorded in the Raft log, but whether the two are consistent still needs further confirmation.

Currently, I am trying to test TiDB snapshot isolation based on this idea. According to what you said, I will re-read the TiFlow code, then optimize my timestamp acquisition process to see if there is a bug or if I have obtained data that does not belong to the current test.
Haha, actually, it would be great if the RowChangedEvent contained a transaction ID, as this would allow for unique differentiation (but I remember that the transaction ID of TiDB 2PC actually comes from the PD timestamp).

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.