Why is synchronizing columnar data so fast?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 请问同步列存数据为什么这么快

| username: TiDBer_JlY1JCJ5

I followed the steps in the official operation documentation to test with sample data, at the address:

Regarding the last step, Step 4: Synchronize Columnar Data, I have some questions:

  1. Why doesn't TiFlash automatically synchronize TiKV data after deployment? Why do I have to manually specify each table to enable it?
  2. For a table with millions of rows, why is synchronization to columnar storage so fast? After executing ALTER TABLE test.customer SET TIFLASH REPLICA 1; the replica appears to be synchronized almost immediately. Is synchronization from row storage to columnar storage really that fast?
| username: 小龙虾爱大龙虾 | Original post link

  1. Not every table has analytical (AP) requirements, and keeping a columnar replica of every table would waste space. TiDB is positioned as an HTAP database.
  2. Synchronization is complete when PROGRESS in information_schema.tiflash_replica reaches 1, not when the ALTER TABLE statement returns.
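A minimal sketch of that check, assuming the test.customer table from the question. AVAILABLE and PROGRESS are columns of information_schema.tiflash_replica; re-run the query until PROGRESS reaches 1.

```sql
-- Request one TiFlash replica; this statement returns before the data is copied.
ALTER TABLE test.customer SET TIFLASH REPLICA 1;

-- Poll replication status. PROGRESS goes from 0 to 1;
-- AVAILABLE = 1 means the replica can serve queries.
SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 'customer';
```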
| username: TiDBer_JlY1JCJ5 | Original post link

Thank you for the explanation. I tested the example from the documentation, in which the lineitem table has 6 million rows. After executing ALTER TABLE, I ran SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 'lineitem'; to check PROGRESS. It took about 5 seconds to reach 1, so the synchronization really is fast. Inserting the 6 million rows into the table took me almost half an hour, yet synchronizing them from row storage to columnar storage took only 5 seconds. Is it really that fast? Can you briefly explain why?

| username: 小龙虾爱大龙虾 | Original post link

You can run ANALYZE on the table and then check information_schema to see how big it is, or look at the size of the table's Regions. Loading 6 million rows shouldn't take half an hour, though; your test data generation is too slow, and you can investigate that through the slow queries. Building a TiFlash replica involves scanning the table's Regions in TiKV and sending them to TiFlash as snapshots. 6 million rows is not that much data.
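The size check suggested here might look like the following. It is a sketch that assumes the test.lineitem table from this thread; both results are estimates, and APPROXIMATE_SIZE in tikv_region_status is reported in MB.

```sql
-- Refresh statistics so the row count and size estimates are current.
ANALYZE TABLE test.lineitem;

-- Estimated row count and data size (bytes) from table statistics.
SELECT TABLE_ROWS, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 'lineitem';

-- Approximate size of the Regions that hold the table's row data.
SELECT COUNT(*) AS region_count,
       SUM(APPROXIMATE_SIZE) AS approx_size_mb,
       SUM(APPROXIMATE_KEYS) AS approx_keys
FROM information_schema.tikv_region_status
WHERE DB_NAME = 'test' AND TABLE_NAME = 'lineitem' AND IS_INDEX = 0;
```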

| username: TiDBer_小阿飞 | Original post link

Because TiFlash only needs to read the index (idx) value of the Raft log to synchronize, whereas writing table data into TiKV requires scheduling and balancing across nodes through the TiDB server, PD, and TiKV, and must also retain multiple versions.

| username: TiDBer_JlY1JCJ5 | Original post link

Thank you. You mentioned that storing all tables in columnar format wastes space. How can I see how much space is used, and how much is occupied by columnar storage and by row storage respectively?

| username: TiDBer_JlY1JCJ5 | Original post link

Thank you. You said that TiFlash only needs to read the idx value of the Raft log to synchronize. Where is this synchronization process documented? I would like to understand it in detail.

| username: zhanggame1 | Original post link

A few million rows may seem like a lot, but it's actually only a few megabytes.

| username: TiDBer_JlY1JCJ5 | Original post link

Yes, but the command from the documentation, tiup bench tpch --sf=1 prepare, takes nearly half an hour to generate the 6 million rows.

| username: 小龙虾爱大龙虾 | Original post link

The exact size really can't be checked, because the underlying storage is SST files and it is impossible to tell which table an SST file belongs to. The data size in tikv_region_status is also only an estimate, and compression is involved.
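For a rough look at the columnar side, one option is the information_schema.tiflash_tables system table. This is a sketch under the assumption that your TiDB version exposes that table with the columns used below; treat the numbers as approximate, for the same reasons given above.

```sql
-- Per-instance view of the TiFlash replica of test.lineitem.
-- Assumption: the TIDB_DATABASE, TIDB_TABLE, TOTAL_ROWS, TOTAL_SIZE and
-- TIFLASH_INSTANCE columns are available in the TiDB version in use.
SELECT TIDB_DATABASE, TIDB_TABLE, TOTAL_ROWS, TOTAL_SIZE, TIFLASH_INSTANCE
FROM information_schema.tiflash_tables
WHERE TIDB_DATABASE = 'test' AND TIDB_TABLE = 'lineitem';
```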

| username: zhanggame1 | Original post link

Replicating data at the database layer is much faster than executing SQL.

| username: 有猫万事足 | Original post link

TiFlash uses DeltaTree.

It is specifically optimized for high-frequency writes. If you're interested, you can check out the source code reading series. If reading the articles is tiring, there are also dedicated source code walkthrough videos on Bilibili.

| username: Kongdom | Original post link

:yum: What I understand is that for row storage, 1000 identical records need to be stored 1000 times, while for column storage, 1000 identical records only need to be stored once. From this perspective, it is definitely faster and takes up less space.

| username: TiDBer_小阿飞 | Original post link

I feel that the main reason is that there are fewer things to query and associate, so only one IDX is needed, haha :joy:

| username: andone | Original post link

TiFlash only needs to read the idx value of the raft log to synchronize.

| username: swino | Original post link

Come and learn
