A TiFlash binary recompiled from source performs much worse than the native TiFlash binary in TPC-H tests

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 重新编译tiflash源码,使用tpch测试,性能与原生tiflash进程差异很大

| username: programmer

[TiDB Usage Environment] CentOS Stream release 8/Linux 5.18.19
[TiDB Version] v6.5.0
[Reproduction Path]
Compile tiflash from the v6.5.0 source code, replace the native tiflash binary installed by tiup cluster deploy with the compiled one, import 100 GB of data with the tiup bench tpch prepare command, and then run the tpch SQL statements as the test.
[Encountered Problem: Phenomenon and Impact]
The recompiled tiflash process performs far worse than the native one. Flame graphs and perf top output show that the two binaries follow significantly different processing flows. Do certain parameters need to be turned on or off when compiling the tiflash source code? If not, what explains such a large performance difference?
[Resource Configuration]
tidb cluster deployment
tidb.yaml (1.2 KB)
tiflash compilation parameters: cmake … -DCMAKE_BUILD_TYPE=DEBUG -DCMAKE_C_COMPILER=/usr/bin/clang-15 -DCMAKE_CXX_COMPILER=/usr/bin/clang++-15
Test data for the recompiled process:
tpch sql1 query time: 14.20s
perf top results:

Flame graph:

Test data for the native process:
tpch sql1 query time: 3.57s
perf top results:

Flame graph:
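
For a sense of scale, the two sql1 timings reported above can be turned into a slowdown factor. This is only a sketch of the arithmetic; the variable names are illustrative and the two numbers are the ones quoted in this post.

```python
# tpch sql1 query times quoted above, in seconds.
recompiled_s = 14.20
native_s = 3.57

# Ratio of the recompiled binary's query time to the native binary's.
slowdown = recompiled_s / native_s

print(f"recompiled / native = {slowdown:.2f}x")  # prints "recompiled / native = 3.98x"
```

In other words, the recompiled binary takes roughly four times as long on this query.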

| username: Billmay表妹 | Original post link

Why recompile? Isn’t the native TiFlash sufficient?

| username: programmer | Original post link

My initial goal was to introduce other compression algorithms into TiFlash to see whether they would improve performance. During testing, however, I discovered this issue: even when still using the LZ4 compression algorithm and simply recompiling TiFlash, there was a significant performance difference. When I later verify whether a new compression algorithm improves performance, this factor will distort the results and make it impossible to draw accurate conclusions.

| username: xfworld | Original post link

Check whether there are optimization parameters you have not yet enabled during compilation… They can help improve the final performance.

| username: 海石花47 | Original post link

Speaking of compression algorithms… Borrowing this thread to ask a question: does TiDB have anything like the MySQL Archive engine that can compress archive tables?

| username: flow-PingCAP | Original post link

You compiled in debug mode (-DCMAKE_BUILD_TYPE=DEBUG). For performance testing or production, please refer to this script:
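
Independently of that script, one quick way to confirm which mode a build directory was configured in is to read the CMAKE_BUILD_TYPE entry that CMake records in CMakeCache.txt. The following is a minimal sketch, not part of TiFlash or tiup: the function names `read_build_type` and `is_optimized` are hypothetical, and the sample cache text simply mirrors the flags from the original post.

```python
import re
from typing import Optional


def read_build_type(cache_text: str) -> Optional[str]:
    """Return the CMAKE_BUILD_TYPE value found in CMakeCache.txt contents, or None."""
    # Cache entries are written one per line as NAME:TYPE=VALUE.
    match = re.search(r"^CMAKE_BUILD_TYPE:[A-Z]+=(.*)$", cache_text, re.MULTILINE)
    return match.group(1).strip() if match else None


def is_optimized(build_type: Optional[str]) -> bool:
    """True for CMake's standard optimized configurations (case-insensitive)."""
    return (build_type or "").upper() in {"RELEASE", "RELWITHDEBINFO", "MINSIZEREL"}


# Sample cache contents (an assumption for illustration, based on the flags in this thread).
sample_cache = (
    "CMAKE_BUILD_TYPE:STRING=DEBUG\n"
    "CMAKE_C_COMPILER:FILEPATH=/usr/bin/clang-15\n"
)

build_type = read_build_type(sample_cache)
print(build_type, is_optimized(build_type))  # prints "DEBUG False"
```

To check a real build, pass the contents of the CMakeCache.txt in the build directory, for example `read_build_type(open("CMakeCache.txt").read())`. A result of DEBUG means a benchmark against the native binary is not a like-for-like comparison.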
