Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: TIDB数据库在某省妇幼业务系统应用

Business Background
The construction content of a provincial maternal and child health management system includes: maternal and child health information database; maternal and child health information collection system, maternal and child health information management and analysis system; mother and child health management APP, maternal and child health management and analysis APP, and other 62 functional modules.
Original Database Architecture
Original Technical Architecture and Pain Points
We chose Starrocks as the analysis layer and implemented real-time + offline data synchronization through the DataX + CloudCanal model. As the business iterates, this architecture is not very suitable for the development needs of maternal and child business.
The architecture is generally divided into four parts from bottom to top:
- Data Layer: The source data sources are mainly MySQL relational databases. Five sets (10 units) of master-slave setups are built to undertake 14 sub-databases and 1 main database, with over 700 tables in the MySQL sub-databases.
- Processing Layer: Using CloudCanal + DataX for real-time and offline data synchronization.
Offline: Constructing DM theme data marts for reports, large screens, and data exchange services in an offline manner using the DataX tool.
Real-time: Using CloudCanal to synchronize MySQL data 1:1 to Starrocks.
-
Analysis Layer: The analysis layer will store calculated indicator data and intermediate result data for accelerated queries.
-
Using three 32C128G machines to build an SR cluster, corresponding to report business, large screen business, data exchange services, and query acceleration.
-
Pain Points:
-
Frequent business changes: Initially, to meet business needs, database structure changes were made at least twice a week. After negotiating with the business unit, it was controlled to once a week. Changes require DBAs to adjust the 14 city MySQL databases, Starrocks, and data scheduling.
-
Server resource waste: MySQL uses a master-slave architecture for data security, and the resources of the slave database are not effectively utilized.
-
Impact of business updates on business: The application layer uses a microservice architecture, involving many developers who only focus on their own business modules. They submit DDL statements through the SQL audit platform, which are finally executed by the DBA. The DBA is not familiar with the business and cannot check whether the DDL statements meet the actual business needs. This causes frequent anomalies the day after the platform is released (some issues that cannot be reproduced in the test environment).
-
Inability to improve SQL optimization skills of R&D and DBA personnel: Since MySQL data is synchronized 1:1 to Starrocks, all complex queries are replaced by MPP. SQL optimization and reasonable table splitting are no longer focused on (previously thought to be a good thing to improve R&D efficiency). This problem will erupt uniformly after the MPP bottleneck, and can only be solved by upgrading server configurations, not fundamentally solving the problem.
-
DBAs feel overwhelmed: This architecture is a bit exhausting. At a frequency of once a week, DBAs perform DDL operations after 6 PM during business off-peak hours. DBAs need to maintain 30 sets of databases, nearly 20,000 tables. After the operation, they release the program and then follow up with testing. If problems occur, rolling back operations at midnight greatly impacts the business.
-
Database segmentation by city is not well compatible with cross-city business services, such as reports that can only be displayed after summarizing databases through wide tables (increasing number of wide tables) and duplicate records, inability to query service records across cities, etc.
-
Unable to apply automated database audit platform, complex decentralized database operations, high difficulty in automation implementation.
Based on the above pain points, we are looking for a better technical architecture…
Database Consolidation
Total Tables | Over 100,000 Tables | Over 1,000,000 Tables | Over 10,000,000 Tables |
792 | 156 | 58 | 5 |
Looking for a database solution to ensure write and Online-DDL operations after database consolidation.
Tested Polardb, TiDB, OceanBase, and finally chose TiDB
TiDB Features
- Highly compatible with MySQL, in most cases, code can be easily migrated from MySQL to TiDB without modification. Even MySQL clusters with sharding can be migrated in real-time using TiDB’s migration tools.
- Horizontal scalability, easily achieve horizontal expansion by adding new nodes, expand throughput or storage as needed, and easily handle high concurrency and massive data scenarios.
- Distributed transactions, TiDB 100% supports standard ACID transactions.
- Truly financial-grade high availability, compared to traditional master-slave (M-S) replication schemes, the majority election protocol based on Raft can provide financial-grade 100% data strong consistency guarantee, and can achieve automatic failover without losing most replicas, without manual intervention.
TiDB Database Architecture Introduction
Technical Architecture
New Database Architecture
The new architecture removes the Starrocks analysis layer, leveraging TiDB’s TiFlash component OLAP functionality. Data exchange uses CloudCanal, merging the analysis layer and business layer, with business as wide tables, uniformly accelerated through TiFlash. This brings significant improvements in operational costs and business expansion.
Introducing Yearning Automated SQL Audit Platform
Yearning has the following features:
- Automated SQL statement review, can automatically detect and execute SQL
- Automatically generate rollback statements after executing DDL/DML statements
- Review/query audit functions
- Supports LDAP login/DingTalk and email message push
- Supports custom review workflows
- Supports fine-grained permission allocation
“Subsequent special training will explain the database audit platform”
About SQL Optimization
DBAs record, track, and continuously optimize slow statements.
Benefits After Analysis Layer Transformation
After replacing Starrocks with TiFlash, the overall query efficiency (compared to Starrocks with city-based sharding) did not significantly improve. After SQL optimization, it basically meets expectations.
Comparison of common queries between TiDB and Starrocks (same data volume and structure)
Query Response Time Comparison (limited by editor SQL pasted images)
Built-in Monitoring TiDB Dashboard
QPS
SQL Statement Analysis
Slow Query
Traffic Visualization
Database/Table Deletion Recovery
Examples
- Restore the test database deleted by DROP: DROP DATABASE test; FLASHBACK DATABASE test;
- Restore the test database deleted by DROP and rename it to test1: DROP DATABASE test; FLASHBACK DATABASE test TO test1;
–Restore table deletion example
- Restore the table data deleted by DROP: DROP TABLE t; FLASHBACK TABLE t;
- Restore the table data truncated by TRUNCATE. Since the truncated table still exists, the restored table needs to be renamed, otherwise, an error will be reported that table t already exists. TRUNCATE TABLE t; FLASHBACK TABLE t TO t1;
Backup/Restore Exceeds Expectations
TiDB provides BR cluster snapshot backup function, directly backing up logs to MinIO. Currently, the provincial maternal and child health system takes snapshots twice a day at 0:00 and 12:00. Due to storage limitations, only snapshots within one day can be retained, and log backups are not done. (Full snapshot + real-time log backup) can ensure no data loss. BR restores data beyond expectations, restoring 300G data in 20 minutes (v7.1.3). The latest official version v7.6 improves BR restore capability by 10 times.
Trial of One Location Two Centers (Banking Business: At Least Two Locations Three Centers)
Considering the importance of maternal and child data, a one-location two-center setup is implemented on the government cloud. Through TiCDC, the main cluster writes data to the secondary cluster in real-time, while the secondary cluster handles report business and R&D test database environment.
pt-kill application captures running SELECT|ALTER and other DML/DDL queries consuming excessive resources, filters them, then kills them (optional) and sends email/WeChat alerts to DBAs and relevant developers to avoid prolonged slow SQL execution causing harm to the database.
https://github.com/hcymysql/pt-kill
Background operation
shell> nohup php pt-kill.php -u username -p pwd -h 10.10.159.0 -P 3306 -B 10 -I 15 --match-info=‘select|alter’ --match-user=‘dbuser’ --kill --mail --weixin --daemon 1 &
Killed slow statements will be retained in the background log file, and the next step is to optimize the SQL.
New Architecture Effect Explanation
Reasonable Utilization of Server Resources
Comparison | Data Layer | Processing Layer | Analysis Layer |
Original Architecture | 10 units | 2 units | 3 units |
New Architecture | 11 units | 1 unit | 0 units |
Release of Human Resources
In the original architecture, DBA personnel performed DDL operations and adjustments and scheduling on the database according to the release cycle in the data layer and processing layer. The new architecture eliminates the maintenance work of scheduling and introduces an SQL audit platform to achieve automated DDL. However, DBAs need to pay more attention to various indicators of TiDB.
Reduction of Operational Costs
TiDB deployment does not require the support of big data components, making deployment and operation very simple. TiDB is compatible with the MySQL ecosystem, and business use can directly connect using MySQL JDBC, without worrying about SQL syntax differences. The original MySQL master-slave architecture, SR cluster, and scheduling task maintenance work can all be eliminated.
Future Plans
Currently, we have two data architectures, MySQL + Starrocks and TiDB. Each architecture has its advantages (they can also be used in combination). Based on the needs of the business unit, we will determine which architecture to use according to the requirements of different business lines.