TiFlash 6.1 Multi-Replica Storage Imbalance

| username: polars

[TiDB Usage Environment]
TiDB 6.1

[Overview] Scenario + Problem Overview
TiKV has 3 replicas and TiFlash has 3 replicas. The test database has 3 TiFlash replicas enabled, but the stored data is concentrated on tiflash-0, with very little on tiflash-1 and tiflash-2. MPP queries also execute only on tiflash-0.

[Screenshot: table sharding situation]

| username: ShawnYan

Could you please clarify the topology? Are TiKV and TiFlash deployed on separate machines or co-located, and how exactly are they laid out? Do the three TiFlash nodes have different specifications?

| username: ddhe9527

Are there a total of 3 TiFlash nodes? Does each table with TiFlash replicas have 3 replicas?

| username: polars

Physical machines, 5 nodes in total, with TiKV and TiFlash co-deployed.

| username: polars

There are three TiFlash nodes, and the test database is configured with three TiFlash replicas.

| username: polars

At the beginning, I only used one table for testing, with 3 nodes and 3 replicas. The storage was basically balanced across the 3 nodes, and with MPP enabled for queries, all 3 nodes could be used simultaneously. Later, I added a few more tables, and encountered some issues. After performing a scale-in and then a scale-out, the replicas of the subsequent tables became unbalanced, with data only on the tiflash-0 node.

| username: ddhe9527

The capacity and availability of tiflash-0 and the other two nodes differ slightly. You can check the Store Region Score on the PD → Balance page in Grafana. Also, how large is the test database approximately?
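If Grafana is not handy, a similar comparison can be made from the JSON printed by the `store` command of pd-ctl. The snippet below is only a rough sketch: the sample data, store IDs, and addresses are hypothetical, and it assumes each entry carries `store.labels` (with `engine=tiflash` for TiFlash stores) and `status.region_count`, which is how that output is usually shaped.

```python
import json

# Hypothetical sample shaped like the JSON printed by pd-ctl's `store` command.
# IDs, addresses, and numbers are made up for illustration only.
SAMPLE = """
{
  "count": 4,
  "stores": [
    {"store": {"id": 1, "address": "tikv-0:20160"},
     "status": {"region_count": 12000, "region_score": 800000.0}},
    {"store": {"id": 101, "address": "tiflash-0:3930",
               "labels": [{"key": "engine", "value": "tiflash"}]},
     "status": {"region_count": 9000, "region_score": 950000.0}},
    {"store": {"id": 102, "address": "tiflash-1:3930",
               "labels": [{"key": "engine", "value": "tiflash"}]},
     "status": {"region_count": 300, "region_score": 31000.0}},
    {"store": {"id": 103, "address": "tiflash-2:3930",
               "labels": [{"key": "engine", "value": "tiflash"}]},
     "status": {"region_count": 600, "region_score": 62000.0}}
  ]
}
"""


def tiflash_region_counts(pd_store_json):
    """Map each TiFlash store address to its region_count."""
    data = json.loads(pd_store_json)
    counts = {}
    for item in data.get("stores", []):
        store = item.get("store", {})
        # Stores without labels (typically TiKV here) are skipped.
        labels = {l.get("key"): l.get("value") for l in (store.get("labels") or [])}
        if labels.get("engine") == "tiflash":
            counts[store.get("address")] = item.get("status", {}).get("region_count", 0)
    return counts


def imbalance_ratio(counts):
    """Largest region count divided by the smallest (inf if a store has none)."""
    if not counts:
        return 0.0
    lowest, highest = min(counts.values()), max(counts.values())
    return float("inf") if lowest == 0 else highest / lowest


counts = tiflash_region_counts(SAMPLE)
print(counts)
print("max/min region count:", imbalance_ratio(counts))
```

A ratio far above 1 suggests the same kind of skew that the Store Region Score panel would show.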

| username: polars

The test database currently has 4 replicas, using almost 1TB.

| username: ddhe9527

The Store Region Score of tiflash-0 and the other two TiFlash nodes differ significantly. PD is likely trying to schedule, so you should check the PD → Operator page on Grafana to see if any operators have been generated. Additionally, it seems that you have deployed tiflash-1 and tiflash-2 on the same machine, which is highly discouraged. Moreover, this machine has only about 100GB of remaining space, and PD will not schedule onto it if there is insufficient remaining space.
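To make the remaining-space point concrete, here is a simplified sketch of the check. It assumes PD's `low-space-ratio` setting is at its default of 0.8, meaning a store whose used space exceeds 80% of capacity is treated as low on space. The disk sizes are hypothetical, since the total capacity of that machine has not been posted, and PD's real scheduling logic takes more factors into account.

```python
GIB = 1024 ** 3


def used_ratio(capacity_bytes, available_bytes):
    """Fraction of the store's capacity that is already used."""
    if capacity_bytes <= 0:
        raise ValueError("capacity must be positive")
    return (capacity_bytes - available_bytes) / capacity_bytes


def is_low_space(capacity_bytes, available_bytes, low_space_ratio=0.8):
    """True if usage exceeds the threshold (0.8 assumed as PD's default)."""
    return used_ratio(capacity_bytes, available_bytes) > low_space_ratio


# Hypothetical disks: both have 100 GiB free but differ in total capacity.
for capacity_gib in (1024, 400):
    low = is_low_space(capacity_gib * GIB, 100 * GIB)
    print(f"{capacity_gib} GiB disk, 100 GiB free -> low space: {low}")
```

The same 100GB of free space can therefore be fine on a small disk and below the threshold on a large one, so the ratio matters more than the absolute number.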

| username: polars

After restarting PD, TiKV, and TiFlash, many scheduling tasks appeared, so scheduling seems to have started. It is strange that nothing was scheduled before the restart.

| username: ddhe9527

Let’s observe a bit longer then.
