Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
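Original topic: 俗话说常在河边走,哪能不湿鞋。作为dba,来说说你手抖引发的血案 (As the saying goes, walk by the river often enough and your shoes will get wet. As a DBA, tell us about the disasters your slip of the hand has caused.)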
At a client’s (“dad’s”) company, continuous overtime had left me in a daze. At noon I was supposed to perform a backup, so I first checked the existing backup. My hand slipped and I deleted one of their backups. Fortunately, the impact was minor, and I took the backup again.
I was extremely anxious the entire afternoon and became even more dazed. That evening, I had to perform an operation to add a partition. I first checked the original partition, and somehow, I ended up dropping it…
And then there was nothing more
Did it give you a cold sweat?
I was completely dumbfounded at that moment…
Didn’t even have time to break into a cold sweat.
I once executed a shrink operation on an Oracle database, which triggered an Oracle bug and caused the database to be down for several hours.
All highly risky operations are double-checked by two people, and no major issues have occurred so far.
Not too bad; at worst it’s a case of deleting the database and running away.
After rebuilding a specific TiKV pod, I intended to check the PVC status. I used Ctrl+R to recall a command from my shell history and planned to execute kubectl describe pvc xxx-pvc -n xxx. However, I mistakenly executed kubectl delete pvc xxx-pvc -n xxx, so I had to rebuild it again.
I wanted to delete a specific pod, tikv-1, to restart it. But I copied the wrong name, tidb-1, and ended up executing kubectl delete pod tidb-1 -n xxx. A TiKV restart would have been imperceptible to the business, but instead a TiDB node restarted, briefly dropping the business connections.
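Both kubectl slips above come down to a destructive verb or a wrong name going through unchecked. A minimal sketch of one possible guard, assuming commands have the simple shape kubectl VERB KIND NAME [flags]: the operator has to retype the resource name before a destructive verb is allowed. The helper names and the verb list are hypothetical; nothing here is part of kubectl itself.

```python
import shlex

# Assumed list of verbs worth a second look; extend as needed.
DESTRUCTIVE_VERBS = {"delete", "drain"}


def parse_kubectl(command: str):
    """Return (verb, name) for a simple 'kubectl VERB KIND NAME [flags]' command.

    Assumes flags come after the resource name, as in the posts above.
    """
    parts = shlex.split(command)
    if len(parts) < 2 or parts[0] != "kubectl":
        raise ValueError("not a kubectl command")
    verb = parts[1]
    positional = []
    for token in parts[2:]:
        if token.startswith("-"):
            break  # first flag ends the positional arguments
        positional.append(token)
    name = positional[-1] if positional else ""
    return verb, name


def is_confirmed(command: str, typed_name: str) -> bool:
    """Allow read-only verbs; allow destructive ones only if the name was retyped."""
    verb, name = parse_kubectl(command)
    if verb not in DESTRUCTIVE_VERBS:
        return True
    return typed_name == name
```

With this in place, kubectl delete pod tidb-1 -n xxx would be refused when the operator types tikv-1 as the pod they meant to restart, and a describe recalled from history would pass through untouched.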
We once had an S-level production incident when upgrading from MySQL 5.7 to MySQL 8, caused by character set collation issues. Everyone was stunned…
Executing commands on the production database as if it were a test database.
Entered the wrong cluster and deleted the wrong database.
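For the wrong-environment slips above, one possible safeguard is to compare the host you are actually connected to against the host you believe you are on before any destructive statement runs. This is only a sketch: the function name, the list of destructive keywords, and the host names in the usage note are all made up for illustration.

```python
import re

# Statements treated as destructive here; an assumed, incomplete list.
DESTRUCTIVE_SQL = re.compile(r"^\s*(drop|truncate|delete|alter)\b", re.IGNORECASE)


def check_target(statement: str, connected_host: str, expected_host: str) -> None:
    """Raise if a destructive statement is aimed at a host other than the expected one."""
    if DESTRUCTIVE_SQL.match(statement) and connected_host != expected_host:
        raise RuntimeError(
            f"refusing to run on {connected_host!r}: expected {expected_host!r}"
        )
```

Calling check_target("DROP DATABASE demo", "prod-db.example.internal", "test-db.example.internal") raises, while the same statement with matching hosts, or a plain SELECT on any host, passes.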
The WHERE condition did not match the appropriate data, resulting in a full table update.
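A rough sketch of one way to catch a runaway UPDATE like this before it is committed: run it inside a transaction, compare the affected row count with what you expected, and roll back if it is too high. The example uses Python's built-in sqlite3 module and a made-up orders table purely so it is self-contained; guarded_update is a hypothetical helper, and other databases and drivers handle transactions differently.

```python
import sqlite3


def guarded_update(conn, sql, params, max_rows):
    """Run an UPDATE, then commit only if it touched at most max_rows rows."""
    cur = conn.execute(sql, params)
    if cur.rowcount > max_rows:
        conn.rollback()  # undo the oversized update
        raise RuntimeError(
            f"{cur.rowcount} rows affected, expected at most {max_rows}; rolled back"
        )
    conn.commit()
    return cur.rowcount


# Throwaway in-memory table for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("new",)] * 5)
conn.commit()
```

An update that targets a single id goes through, while one whose WHERE clause matches far more rows than the expected limit is rolled back and reported instead of silently rewriting the table.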
This shouldn’t be considered a slip of the hand; this kind of issue falls under unexpected failures.
This is what a professional DBA does.
This is due to inadequate testing and research.