Under the storage-compute separation architecture, data on S3 is not cleaned up after scaling in TiFlash nodes

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 存算分离架构下,在scale-in缩容tiflash节点后,S3上数据不会被清理掉 (Under the storage-compute separation architecture, data on S3 is not cleaned up after scaling in TiFlash nodes)

| username: dba-kit

After executing ALTER TABLE ... SET TIFLASH REPLICA 0, the information_schema.tiflash_replica system table and the output of pd-ctl's config placement-rules show immediately stop listing the previously configured tables. However, the TiFlash replicas themselves are deleted asynchronously: even after the TiFlash write_node status becomes Tombstone, many data files still remain on disk.

For regular TiFlash nodes, executing tiup cluster prune actively deletes the local data directory, which clears the TiFlash data. In the storage-compute separation architecture, however, the data on S3 persists and is not deleted automatically; it has to be cleaned up manually.
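For reference, a minimal sketch of the scale-in flow described above. The cluster name (mycluster), table (test.t), TiFlash node address (10.0.0.1:9000), TiDB/PD addresses, and the pd-ctl version tag are all placeholders:

```bash
# 1. Drop the TiFlash replica (the actual replica removal is asynchronous)
mysql -h 127.0.0.1 -P 4000 -u root -e "ALTER TABLE test.t SET TIFLASH REPLICA 0;"

# 2. The table immediately disappears from the replica table and placement rules
mysql -h 127.0.0.1 -P 4000 -u root -e \
  "SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test';"
tiup ctl:v7.5.0 pd -u http://127.0.0.1:2379 config placement-rules show

# 3. Scale in the TiFlash node and, once it reaches Tombstone, prune it
tiup cluster scale-in mycluster --node 10.0.0.1:9000
tiup cluster display mycluster   # wait for the node to show Tombstone
tiup cluster prune mycluster     # clears the local data dir; data on S3 is NOT touched
```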

| username: xfworld | Original post link

It’s Boss Wei’s work again…

| username: flow-PingCAP | Original post link

The cleanup mechanism is roughly as follows:

  1. One GC master is elected from among the write nodes to do the work.
  2. The GC master periodically checks whether there are files that can be deleted. A file is deleted or compacted when:
     - its efficiency is below 50% (a file may contain some data that has already been deleted while other parts are still useful);
     - its last modification time is more than 1 hour ago.
  3. There are two methods for file deletion, controlled by profiles.default.remote_gc_method (a sketch of changing this setting follows at the end of this post):
     - 1 means relying on the tagging of S3 objects and the bucket's lifecycle settings to do the deletion;
     - 2 means using S3's ListObjects to scan and delete by itself.

So, if you take the write node offline as you mentioned above, there will be no GC master to do the work…
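As mentioned in item 3, here is a minimal sketch of switching the GC method via tiup, assuming a cluster named mycluster and that the option is carried under the tiflash section of server_configs as a dotted key, like other profiles.default.* settings; verify the exact key location for your version:

```bash
# Switch the remote GC method (1 = S3 tag + bucket lifecycle, 2 = self scan via ListObjects)
tiup cluster edit-config mycluster
# In the editor, under server_configs, add (assumed key layout):
#   server_configs:
#     tiflash:
#       profiles.default.remote_gc_method: 2
tiup cluster reload mycluster -R tiflash   # roll the change out to the TiFlash nodes
```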

| username: kkpeter | Original post link

It seems that for now I can only write a script to clean it up automatically. :grinning:

| username: JaySon-Huang | Original post link

Two additional points:

  1. If there are multiple TiFlash write nodes and one of them goes offline, another write node will become the GC master. Once it detects that the old store_id is already in Tombstone state, it will also delete that store_id's data on S3.
  2. If profiles.default.remote_gc_method is left at its default value of 1, deletion relies on S3 object tagging plus the bucket's lifecycle settings, and the minimum expiration time for an S3 lifecycle rule is 1 day. Under this mechanism, "deleted" files are therefore cleaned up by the S3 lifecycle at least 1 day later. Since the storage cost of AWS S3 is lower than the cost of API calls, this mechanism is suitable for AWS S3. If you use self-hosted MinIO or similar, where API calls are cheap and you want space reclaimed faster, you can set remote_gc_method to 2; under that mechanism, files are cleaned up as soon as 1 hour after being "marked for deletion". (A sketch of the corresponding bucket lifecycle rule follows this list.)
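For point 2, a minimal sketch of the bucket lifecycle rule that the tag-based method relies on, using the AWS CLI. The bucket name and especially the tag key/value (tiflash_deleted=true) are assumptions; check which tag your TiFlash version actually sets on deleted objects before applying this:

```bash
# Expire objects tagged as deleted after 1 day (the S3 lifecycle minimum)
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-tiflash-deleted-objects",
      "Status": "Enabled",
      "Filter": { "Tag": { "Key": "tiflash_deleted", "Value": "true" } },
      "Expiration": { "Days": 1 }
    }
  ]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-tiflash-bucket \
  --lifecycle-configuration file://lifecycle.json
```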

If all the write nodes are offline and nothing is left to perform GC on S3, then for now you can only write a script yourself or clean it up on the management page (a sketch of such a script follows).
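A minimal sketch of such a cleanup script, assuming the data of each write node lives under a per-store prefix in the bucket (shown here as s<store_id>/, which is an assumption; list the bucket and confirm the actual layout before deleting anything) and that the corresponding store is already Tombstone:

```bash
BUCKET=my-tiflash-bucket   # placeholder bucket name
STORE_ID=123               # store_id of the tombstoned TiFlash write node (see pd-ctl store)

# Inspect what would be removed first
aws s3 ls "s3://${BUCKET}/s${STORE_ID}/" --recursive | head

# Then delete the orphaned objects for that store
aws s3 rm "s3://${BUCKET}/s${STORE_ID}/" --recursive
```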

| username: dba-kit | Original post link

I verified the scale-in process, and it does indeed keep deleting the data on S3.

| username: WinterLiu | Original post link

Learned, :+1: :+1: :+1: :+1: :+1:

| username: dba-kit | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.