What configuration does a TiDB cluster need to batch process 200 million records daily?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 每日2亿数据进行跑批,tidb集群需要什么配置?

| username: huazai0803

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] v6.1
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue] We run a daily batch job over 200 million records and have decided to use TiDB. We would like recommendations from experts on cluster configuration.
[Resource Configuration]
[Attachments: Screenshots / Logs / Monitoring]

| username: tidb狂热爱好者

A data volume of 200 million records is not a problem. A single-node machine with an EPYC 7542 CPU, 4TB of memory, and 10 M.2 SSDs is enough. A second-hand Dell machine costs around 3000, and you can add memory and hard drives yourself. The total cost does not exceed 20,000.
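To put the daily volume in perspective, here is a rough back-of-the-envelope sketch. The batch window lengths and the 1 KiB average row size are assumptions for illustration only (the thread does not state them), and the helper names are hypothetical. The replica count of 3 matches TiDB's default, and compression is ignored.

```python
def rows_per_second(rows_per_day: int, window_hours: float) -> float:
    """Average write rate needed to finish the batch within the window."""
    return rows_per_day / (window_hours * 3600)


def daily_raw_gib(rows_per_day: int, avg_row_bytes: int, replicas: int = 3) -> float:
    """Uncompressed bytes written per day across all replicas, in GiB."""
    return rows_per_day * avg_row_bytes * replicas / 2**30


if __name__ == "__main__":
    ROWS = 200_000_000  # daily record count from the question

    # Assumed batch windows; the real window is not given in the thread.
    for hours in (24, 8, 4):
        print(f"{hours:>2} h window: {rows_per_second(ROWS, hours):,.0f} rows/s")

    # Assumed average row size of 1 KiB; adjust to the real schema.
    print(f"raw data/day, 3 replicas: {daily_raw_gib(ROWS, 1024):,.0f} GiB")
```

Even with a short window, the average rate stays in the low tens of thousands of rows per second, so disk capacity and retention are likely to matter more for sizing than raw write throughput.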

| username: tidb狂热爱好者

This is the most cost-effective DIY configuration. If the company has the budget, it can buy Dell's pre-built machines directly.

| username: huazai0803

We are planning to purchase resources from Alibaba Cloud.

| username: alfred

The configuration does not need to be very high-end.

| username: tidb狂热爱好者

No need; that is not much data.
