Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 从V6.2升级到V6.3版本时,TIFLASH启动不成功 (TiFlash fails to start when upgrading from V6.2 to V6.3)
In the test environment, the same upgrade from V6.2 to V6.3 completed without issues. In the production environment, however, the upgrade failed at TiFlash.
Manually executing systemctl start tiflash-9000.service did not report any errors, but nothing was listening on port 9000 and the TiFlash process did not start.
Comparing the production and test environments, the installation and deployment configuration parameters are the same (all defaults, no custom parameters). The only difference is the upgrade history: the test environment was upgraded step by step from V4.* to V5.*, while the production environment was upgraded step by step from V5.* to V6.*.
Could you please advise on how I should proceed? Thank you!!
Currently, the only thing that works is manually replacing the BIN directory with the original V6.2 binaries and then manually executing systemctl start tiflash-9000.service. After that, TiFlash starts successfully, and port 9000 and the process status are normal.
I don’t know what went wrong with the upgrade to 6.3.
Please advise. Thank you!
[2022/10/11 00:06:36.103 +08:00] [WARN] [StorageConfigParser.cpp:241] ["Application: The configuration "path" is deprecated. Check [storage] section for new style."] [thread_id=1]
[2022/10/11 00:07:13.504 +08:00] [WARN] [StorageConfigParser.cpp:241] ["Application: The configuration "path" is deprecated. Check [storage] section for new style."] [thread_id=1]
[2022/10/11 00:07:23.699 +08:00] [ERROR] [] ["Application: null context when constructing CivetServer. Possible problem binding to port."] [thread_id=1]
[2022/10/11 00:07:39.357 +08:00] [WARN] [StorageConfigParser.cpp:241] ["Application: The configuration "path" is deprecated. Check [storage] section for new style."] [thread_id=1]
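For anyone digging through a longer log of the same shape, here is a minimal sketch that keeps only the lines at a given level. It assumes every line follows the "[timestamp] [LEVEL] [source] [message...]" layout seen in the excerpt above; the function name filter_by_level is made up for illustration and is not part of any TiDB tooling.

```python
import re

# Assumed layout, taken from the excerpt: [timestamp] [LEVEL] [source] rest-of-line
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<source>[^\]]*)\] (?P<rest>.*)$"
)


def filter_by_level(lines, levels=("ERROR", "FATAL")):
    """Return (timestamp, level, source, rest) tuples for lines at the given levels.

    Lines that do not match the assumed layout are skipped silently.
    """
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") in levels:
            hits.append(
                (m.group("ts"), m.group("level"), m.group("source"), m.group("rest"))
            )
    return hits
```

Passing an open file object (for example the TiFlash log under your deployment's log directory) as lines works, since the function only iterates over it. This is only a triage aid; it does not replace posting the complete log.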
Are you using 6.3 in production? That thing is a test version.
I recommend the stable version. The 6.3 version doesn’t even have a dashboard.
There is a dashboard that you can check out. But for production, use the stable version and don’t be a guinea pig.
Check if the Prometheus port is being occupied.
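One way to test this, sketched below, is to try binding each port yourself while TiFlash is stopped: if the bind fails, something else is probably holding the port (or the account lacks permission to bind it). Port 9000 comes from the service name in this thread; 8234 is only an assumed value for the metrics port and should be replaced with whatever your topology file specifies. The names can_bind and report are made up for this example.

```python
import socket


def can_bind(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind host:port right now, False otherwise."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use", but permission errors land here too.
            return False
        return True


# 9000 is from the thread; 8234 is an ASSUMPTION, check your own topology.
PORTS_TO_CHECK = [9000, 8234]


def report(ports=PORTS_TO_CHECK):
    """Map each port to whether it could be bound at the time of the call."""
    return {port: can_bind(port) for port in ports}
```

Calling report() on the TiFlash host with the service stopped gives a quick yes/no per port. A False result only says the bind failed; finding the process that owns the port is a separate step.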
Switch to the tidb account on that machine and run the start command manually to see what error it returns.
It’s impossible. The port hasn’t started.
It should be a BUG.
Hearing you say that, it scared me quite a bit. Every time a new version comes out, I first upgrade it on the test environment, run it for a few days without any issues, and then upgrade the production version as well.
I will report it to the relevant staff to see whether it is a bug.
Hahaha~ There probably isn’t any version without bugs!
Without bugs, there would be no DBAs~
The error above looks like a configuration issue. Could you please provide the complete log after the startup failure?
The configuration can’t be wrong, because production is currently running version 6.2 with it. Both environments use the same default parameters: upgrading the test machine from 6.2 to 6.3 worked fine, but upgrading the production server from 6.2 to 6.3 failed with an error saying that TiFlash couldn’t start. Checking port 9000 showed that nothing was listening on it. Manually starting TiFlash didn’t work either.
Later, I manually rolled back the TiFlash binaries to the original 6.2 version, and a manual restart of TiFlash then succeeded. Therefore, I suspect there’s a bug in the 6.3 version of TiFlash.
Regardless, I don’t want to risk it in the production environment. Currently, the test machine is running version 6.3, while the production environment has been manually rolled back to version 6.2, and no issues have been found so far.
Hello, could you please send the complete log after the startup failure?
How do I roll back the version?
I recall there were compatibility issues with TiFlash that caused upgrade failures. The troubleshooting method is to set the number of replicas on TiFlash to 0, and after synchronization is complete (all replicas are deleted), try upgrading again. I’m not sure if your production environment allows this operation.
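A rough sketch of that procedure: list the tables that still have TiFlash replicas via information_schema.tiflash_replica, then set the replica count to 0 for each with ALTER TABLE ... SET TIFLASH REPLICA 0. The helper below only builds the SQL text and does not connect to anything; its function names and the example table are hypothetical.

```python
def quote_ident(name):
    """Quote a MySQL/TiDB identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


# Run this first to see which tables currently have TiFlash replicas.
LIST_REPLICAS_SQL = (
    "SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT "
    "FROM information_schema.tiflash_replica;"
)


def drop_replica_statements(tables):
    """Build one 'SET TIFLASH REPLICA 0' statement per (schema, table) pair."""
    return [
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} SET TIFLASH REPLICA 0;"
        for schema, table in tables
    ]


# Hypothetical input: rows copied from the result of LIST_REPLICAS_SQL.
for stmt in drop_replica_statements([("test", "orders")]):
    print(stmt)
```

Once LIST_REPLICAS_SQL returns no rows, the replicas are gone and the upgrade can be retried. Whether removing replicas is acceptable in production is for the operator to decide.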
It is not recommended to roll back; you can refer to the suggestions from @数据小黑.
TiFlash was never actually used, and no replicas were created. It was only deployed and started during installation.
I have now tried upgrading to version 6.5, and the upgrade fails in the same way. This time, no error messages are reported.