Why is the same file read by multiple TiKV nodes during BR recovery? Is it because of multiple replicas?

translator_bot · June 20, 2024, 12:51pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: BR恢复时为什么同一个文件会被多个tikv节点读取？是因为多副本的原因吗？

| username: 滴滴嗒嘀嗒

As shown in the figure below, the file 94_103_6ff34803059b51f62fa4bddae19647c8e0db22c9db17931ba4cc39e4f5f188d1_1718349131300_default.sst has read requests on both nodes 21.3 and 21.4:

translator_bot · June 20, 2024, 12:51pm

| username: zhaokede

With your configuration, if the backup data is in a shared folder, then because there are multiple replicas the same file will certainly be read by multiple nodes.

translator_bot · June 20, 2024, 12:51pm

| username: forever

Because the data in a single backup file can belong to multiple regions, and each node may host several regions.
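A minimal sketch of that idea, assuming hypothetical key ranges and a hypothetical region-to-node layout (only the node names 21.3 and 21.4 appear in this thread; node 21.5, the key ranges, and all function names are made up for illustration). It is not BR's actual code. It only shows that a file is needed by every node that hosts a region whose key range overlaps the file's key range.

```python
from typing import NamedTuple


class Region(NamedTuple):
    region_id: int
    start: str    # inclusive start key (hypothetical)
    end: str      # exclusive end key (hypothetical)
    nodes: tuple  # nodes hosting this region (hypothetical layout)


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def nodes_needing_file(file_start, file_end, regions):
    """Collect every node hosting a region that overlaps the file's key range."""
    needed = set()
    for region in regions:
        if overlaps(file_start, file_end, region.start, region.end):
            needed.update(region.nodes)
    return needed


# Hypothetical layout: three regions, each on a different node.
regions = [
    Region(1, "a", "h", ("21.3",)),
    Region(2, "h", "p", ("21.4",)),
    Region(3, "p", "z", ("21.5",)),
]

# A file covering keys [c, k) spans regions 1 and 2, so two nodes need it.
print(sorted(nodes_needing_file("c", "k", regions)))
```

With this layout, a file whose keys fall entirely inside one region would be needed by only one node, while a file that straddles a region boundary is needed by more than one.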

translator_bot · June 20, 2024, 12:51pm

| username: lemonade010

You're not using shared storage, right? In that case the backup files end up spread across the individual servers. It's best to gather them in one place for recovery.

translator_bot · June 20, 2024, 12:51pm

| username: 我是吉米哥

Regions are automatically scattered during recovery.

translator_bot · June 20, 2024, 12:51pm

| username: 滴滴嗒嘀嗒

The data has to be stored together; otherwise BR reports an error and exits when it can't find a file it needs. In this case, the recovery succeeded.

translator_bot · June 20, 2024, 12:51pm

| username: 滴滴嗒嘀嗒

What does “automatically scattered” mean?

translator_bot · June 20, 2024, 12:51pm

| username: 滴滴嗒嘀嗒

Is it possible for the same piece of data in a file to belong to multiple regions? According to the logs, both nodes read 131072 bytes of data starting at offset 0 of the file.

According to the official documentation, different regions should contain different data.

translator_bot · June 20, 2024, 12:51pm

| username: 滴滴嗒嘀嗒

Do you mean that node2 holds a replica of node1's region1, so during recovery the same piece of data in the file is written to both region1 and its replica, similar to the diagram below:

translator_bot · June 20, 2024, 12:51pm

| username: 我是吉米哥

A single SST contains data for multiple regions, and these regions are automatically scattered to different stores.

translator_bot · June 20, 2024, 12:51pm

| username: 滴滴嗒嘀嗒

So both nodes here have read the same piece of data (131072 bytes starting at offset 0 of the file):

Does that mean the same region within a single SST has been scattered across different stores?

translator_bot · June 20, 2024, 12:51pm

| username: 我是吉米哥

The region is the smallest unit. An SST contains multiple regions. For example, regions 1/4/7 from the SST can be placed on store1, 2/5/8 on store2, and 3/6/9 on store3.
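The example placement above can be reproduced with a short sketch. Round-robin assignment is only an assumption chosen because it yields the 1/4/7, 2/5/8, 3/6/9 split; this thread does not describe how the real placement is decided, and `place_round_robin` is a hypothetical helper, not part of BR or TiKV.

```python
from collections import defaultdict


def place_round_robin(region_ids, store_count):
    """Assign regions to stores in rotation (illustrative assumption only)."""
    placement = defaultdict(list)
    for index, region_id in enumerate(region_ids):
        store = index % store_count + 1  # stores are numbered from 1
        placement[f"store{store}"].append(region_id)
    return dict(placement)


placement = place_round_robin(range(1, 10), 3)
for store, region_ids in sorted(placement.items()):
    print(store, region_ids)
# store1 [1, 4, 7]
# store2 [2, 5, 8]
# store3 [3, 6, 9]
```

In this model each region lands on exactly one store, which is why it does not by itself explain two nodes reading the same byte range of the same file.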

translator_bot · June 20, 2024, 12:51pm

| username: 滴滴嗒嘀嗒

By that explanation, this behavior doesn't make sense. How should we interpret different nodes reading the same block of data in the same file?

translator_bot · June 20, 2024, 12:51pm

| username: forever

My understanding is that each node reads the file in order to find the data that belongs to that node.
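A toy model of that interpretation, under loud assumptions: the "backup file" here is a plain text file of `key=value` lines rather than a real SST, the key ranges are invented, and `read_head` and `records_for_node` are hypothetical helpers. It does not reflect TiKV's implementation. It only shows how two nodes can issue the identical read (131072 bytes from offset 0, the size seen in the logs) and still keep different data.

```python
import os
import tempfile

READ_SIZE = 131072  # bytes read from offset 0, the size reported in the logs


def read_head(path, offset=0, size=READ_SIZE):
    """Read up to `size` bytes starting at `offset`, as each node does in this model."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def records_for_node(path, start, end):
    """Read the same leading block, then keep only keys in [start, end)."""
    lines = read_head(path).decode().splitlines()
    return [line for line in lines if start <= line.split("=")[0] < end]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_backup.txt")  # stand-in for the shared file
    with open(path, "w") as f:
        f.write("apple=1\nkiwi=2\nmango=3\npear=4\n")

    # Both nodes perform the same read but own different (hypothetical) key ranges.
    node_a = records_for_node(path, "a", "l")
    node_b = records_for_node(path, "l", "z")

print(node_a)  # ['apple=1', 'kiwi=2']
print(node_b)  # ['mango=3', 'pear=4']
```

If this model holds, identical read requests in the logs would not imply that the same data ends up in two different regions.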

translator_bot · June 20, 2024, 12:51pm

| username: 小于同学

It's because of multiple regions.