If possible, set up a rollback cluster so that you can fall back to it if the upgrade runs into problems. I think this is the more reliable approach. The business side should also test against the new version in advance.
If you have the opportunity to shut everything down and take a cold backup, that can serve as a fallback plan as well. I suggest testing the whole procedure several times in a test environment.
Here are some excerpts from the upgrade instructions:
If the original cluster runs a version earlier than 6.2, the upgrade to 6.2 or later may get stuck in some scenarios. See "How to solve the problem of upgrade getting stuck".
Take special care when upgrading TiFlash from a version earlier than v6.3.0 to v6.3.0 or later: when TiFlash is deployed on Linux AMD64 hardware, the CPU must support the AVX2 instruction set; when it is deployed on Linux ARM64 hardware, the CPU must support the ARMv8 architecture. For details, see the v6.3.0 Release Notes.
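To make the AVX2 requirement concrete, here is a minimal sketch that looks for the `avx2` flag in `/proc/cpuinfo`-style text. The function names `cpu_flags` and `supports_avx2` are my own, not part of TiDB or TiUP, and the sketch only covers the AMD64 case; it does not verify the ARMv8 requirement.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels label the line "flags"; ARM kernels use "Features".
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def supports_avx2(cpuinfo_text: str) -> bool:
    """True if the avx2 feature flag is listed."""
    return "avx2" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("AVX2 supported:", supports_avx2(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo here; run this on the Linux host that will run TiFlash.")
```

Run it on every host that will carry a TiFlash node, not only on the machine you operate the upgrade from.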
If the directory does not exist, TiDB creates it automatically at startup. If the directory cannot be created, or TiDB does not have read/write permission on it, Fast Online DDL may behave unpredictably at runtime.
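The directory condition can be checked before the upgrade. The following is only a sketch: `check_dir_rw` is a hypothetical helper, and the path you pass in must be whatever directory your cluster is actually configured to use. Run it as the same OS user that runs TiDB, otherwise the permission result says nothing about TiDB itself.

```python
import os
import tempfile


def check_dir_rw(path: str) -> list[str]:
    """Return a list of problems; an empty list means the directory looks usable."""
    problems: list[str] = []
    if not os.path.isdir(path):
        # Per the note above, TiDB would try to create it at startup.
        problems.append(f"{path} is not an existing directory")
        return problems
    if not os.access(path, os.R_OK | os.X_OK):
        problems.append(f"{path} is not readable by the current user")
    try:
        # Creating and removing a real file is a more reliable write test
        # than inspecting permission bits.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append(f"cannot write to {path}: {exc}")
    return problems
```

An empty result only shows that the current user can read and write the directory at this moment; it does not check free disk space.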
It is recommended to upgrade TiDB to a stable intermediate version first and then to the target version. For example, upgrade to a 6.2.x or 6.3.x release first, and then to 6.5.7.
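As a rough illustration of the two-step path and the version caveats quoted above, the sketch below orders the hops and lists which caveats apply to a given upgrade. All function names are hypothetical, the choice of `v6.2.0` as the intermediate version in the usage lines is only an example, and the rules encode nothing beyond the excerpts in this post.

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Turn 'v6.1.0' or '6.3' into a 3-part tuple such as (6, 1, 0)."""
    parts = [int(p) for p in v.lstrip("vV").split("-")[0].split(".")]
    parts += [0] * (3 - len(parts))  # pad '6.3' to (6, 3, 0)
    return tuple(parts)


def plan_hops(current: str, target: str, intermediate: str) -> list[str]:
    """Return the upgrade steps in order, checking that versions only go up."""
    cur, mid, tgt = (parse_version(v) for v in (current, intermediate, target))
    if not cur < mid < tgt:
        raise ValueError("expected current < intermediate < target")
    return [intermediate, target]


def upgrade_notes(current: str, target: str, has_tiflash: bool = False) -> list[str]:
    """List the caveats from the quoted instructions that apply to this upgrade."""
    cur, tgt = parse_version(current), parse_version(target)
    notes: list[str] = []
    if cur < (6, 2, 0) <= tgt:
        notes.append("upgrade may get stuck in some scenarios; read the stuck-upgrade guidance first")
    if has_tiflash and cur < (6, 3, 0) <= tgt:
        notes.append("TiFlash needs AVX2 on Linux AMD64 or ARMv8 on Linux ARM64")
    return notes


if __name__ == "__main__":
    for step in plan_hops("v6.1.0", "v6.5.7", "v6.2.0"):
        print("upgrade to", step)
    for note in upgrade_notes("v6.1.0", "v6.5.7", has_tiflash=True):
        print("note:", note)
```

Treat the output as a checklist to go through in the test environment, not as a substitute for reading the upgrade instructions of each version you pass through.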