Do TiDB cluster server upgrades require configuration file modifications?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB集群服务器升级是否需要修改配置文件?

| username: SummerGu

Hello, may I ask: if I upgrade the server memory from 16 GB to 32 GB, do I need to change the memory-related settings in the configuration file, or will they be adjusted to the optimal state automatically during the upgrade?

| username: tidb菜鸟一只 | Original post link

Manual adjustments are required.

| username: Kongdom | Original post link

If it hasn’t been set before, there’s no need to modify it. If it has been set before, then the corresponding modifications need to be made.

| username: 小龙虾爱大龙虾 | Original post link

Some default values are expressed as percentages, which scale with the machine and don't need to be changed. Values that were set as fixed sizes, however, may need to be adjusted.

| username: zhanggame1 | Original post link

If it hasn’t been modified, generally there’s no need to change it.

| username: 随缘天空 | Original post link

Some parameters may be set according to the machine’s configuration, but many people generally use the default values during installation. If you haven’t set them before, you can ignore them. If you have set them before, just configure them proportionally.

| username: Jellybean | Original post link

If you are upgrading the memory of a TiKV node machine, it is recommended to appropriately increase TiKV's block cache capacity after the upgrade. Note that this should be modified with `tiup cluster edit-config`, and after the modification the configuration needs to be applied with `tiup cluster reload`, which restarts the affected TiKV instances.

If it is a TiDB node machine, you can appropriately increase the transaction memory limit. The modification method is similar to that of TiKV mentioned above.

If it is a PD node machine, no adjustment is needed; you can keep the current settings.

If it is a TiCDC node machine, you can appropriately increase the per-table memory quota, using the same method as for TiKV.

Of course, you can choose not to adjust, but making adjustments can fully utilize resources and provide a better performance experience.
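The adjustments above can be sketched as a `tiup cluster edit-config` fragment. This is a minimal, illustrative sketch, not a recommendation: the values are assumed for a dedicated 32 GB node, and parameter names and availability vary by TiDB/TiCDC version, so verify them against the docs for your release.

```yaml
# Fragment of `tiup cluster edit-config <cluster-name>` (values illustrative).
server_configs:
  tikv:
    # Shared RocksDB block cache; a common rule of thumb is roughly 45%
    # of node memory on a host dedicated to TiKV (assumed here: 32 GB node).
    storage.block-cache.capacity: "14GB"
  tidb:
    # Upper limit on a single transaction's size, in bytes (assumed value: 1 GiB).
    performance.txn-total-size-limit: 1073741824
  cdc:
    # Memory quota per table for the TiCDC sorter, in bytes (assumed value: 10 MiB;
    # this key is version-dependent and deprecated in newer TiCDC releases).
    per-table-memory-quota: 10485760
```

After saving, apply the change with `tiup cluster reload <cluster-name> -R tikv` (and likewise for the other roles you modified); reload restarts the affected instances.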

| username: Soysauce520 | Original post link

If there are no issues with TiDB and PD, keep them unchanged. For TiKV, you can appropriately increase `storage.block-cache.capacity`. For TiFlash, check whether `max_memory_usage_for_all_queries` is configured (the default is 0). If it is configured, you can increase it; if not, leave it as is.
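For reference, the TiFlash setting mentioned above lives under the `profiles.default` section. A hedged sketch of the corresponding `tiup cluster edit-config` fragment (the 0.8 value is an assumption, meaning a fraction of node memory; confirm the key path for your TiFlash version):

```yaml
# TiFlash fragment of `tiup cluster edit-config <cluster-name>`.
server_configs:
  tiflash:
    # 0 = no limit (the default). Can be set to a byte count, or to a
    # value between 0 and 1 meaning a fraction of node memory.
    profiles.default.max_memory_usage_for_all_queries: 0.8
```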

| username: dba远航 | Original post link

Parameters that were tuned for the old hardware need to be adjusted.

| username: 哈喽沃德 | Original post link

If the resources shrink, there's nothing to tune; if the resources grow, tune the settings to take advantage of them.

| username: 春风十里 | Original post link

It would be better to make adjustments. Let me ask: is there a way to manually collect system statistics, and does increasing memory require recollecting them?

| username: zhanggame1 | Original post link

Increasing memory does not require recollecting statistics. You can manually run `ANALYZE TABLE XXXX`, and use the `SHOW STATS_HEALTHY` statement to check statistics health.
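The two statements above can be run from any MySQL client connected to TiDB. A short sketch, where `test.t` is a hypothetical table name:

```sql
-- Recollect statistics for one table (not required after a memory upgrade,
-- only when the data distribution has changed significantly).
ANALYZE TABLE test.t;

-- Check statistics health; 100 means the statistics are fully up to date.
SHOW STATS_HEALTHY WHERE Db_name = 'test' AND Table_name = 't';
```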

| username: 江湖故人 | Original post link

Statistics describe the data distribution of a table and are stored on disk, so increasing memory does not require recollecting them. If memory/IO performance improves significantly, it might affect the optimizer's judgment; in that case you could adjust the underlying cost factors, but in practice hardly anyone does. :face_with_open_eyes_and_hand_over_mouth:

| username: 烂番薯0 | Original post link

It won't adjust automatically.

| username: 春风十里 | Original post link

Thank you for the reply. What I actually want to know is how the underlying cost factors are calculated. As I understand it, Oracle supports manually collecting system statistics, from which it derives the hardware's single-block IO capability and multi-block read throughput to decide between a full table scan and an index scan. Generally speaking, increasing memory may not directly affect cost calculation. I also understand that in a distributed system, cost calculation should include network costs in addition to IO and CPU, compared with a standalone system. However, I haven't found the logic of cost factor calculation in TiDB's documentation. If anyone knows, please share a link.

| username: 江湖故人 | Original post link

TiDB and Oracle don't expose this concept; the cost factor comes from the PostgreSQL family. The general idea is to assign a per-row coefficient for CPU, random read, sequential read, index read, and so on. When calculating cost, the row count from the statistics is multiplied by these coefficients, and the optimizer ultimately selects the access path with the lowest total cost. After significant hardware changes, tuning these coefficients can make the optimizer as accurate as possible.