Does TiKV support manually releasing cached data in the block cache?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv支持手动释放block cache里的缓存么?

| username: Lystorm

We are using Spark to test big data algorithms against TiDB v6.1, with table A as the test table. On the first run of the test, none of the data is cached, so execution takes longer. We then added data to table A to increase the test data volume, but because TiKV already holds cached data for table A, the second run does not accurately reflect the real execution time. Besides restarting the TiKV service, is there a way to manually release the TiKV cache?

| username: OnTheRoad | Original post link

Not supported

| username: TiDBer_CEVsub | Original post link

It seems like it is not supported.

| username: Lystorm | Original post link

Would it work if I directly cleared all the caches on the Linux system?

| username: 张雨齐0720 | Original post link

There seems to be no relevant documentation, so it is probably not supported.

| username: OnTheRoad | Original post link

You can try it in a test environment, but it is not recommended to forcibly release memory in a production environment.

| username: forever | Original post link

You can add SQL_NO_CACHE to the SQL statement to prevent the data from being cached. I’m not very familiar with Spark, but if it also uses SQL, you can give it a try.

| username: 半瓶醋仙 | Original post link

Not supported

| username: buddyyuan | Original post link

If it’s a testing environment, I think you can try this and see.

| username: Lystorm | Original post link

Is this setting written in the TiKV configuration file or configured as a parameter in the database?

| username: wisdom | Original post link

This is probably not supported.

| username: zhouzeru | Original post link

I don’t know if Spark has such an operator.

| username: forever | Original post link

It goes in the SQL statement being executed.

| username: 人如其名 | Original post link

Boss, after I set sql_no_cache, the execution plan still hits the cache, right?

| username: 数据小黑 | Original post link

If you are using TiSpark, here are a few small suggestions:

  1. TiSpark reads data directly from TiKV, and I believe the block cache is in the TiDB Server.
  2. Using the mysql client to run explain may not represent the actual execution process in Spark, so it is recommended to use Explain in Spark.
  3. Depending on your Spark application scenario, there might be a shuffle cache, which could affect your algorithm’s computation. It is advisable to take this into account.

| username: alfred | Original post link

Would reducing the block cache also achieve the goal?

| username: forever | Original post link

sql_no_cache means the query does not use the cache, but it does not clear the cache. Starting from the first execution that uses it, subsequent executions will not use the cache either.

| username: forever | Original post link

The block cache belongs to TiKV, or more precisely, to RocksDB.

| username: Lystorm | Original post link

Well, Spark SQL does not support this kind of syntax, but JDBC does.
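
As a rough illustration of where the hint goes in the statement, the sketch below rewrites a plain SELECT so that it carries SQL_NO_CACHE. The resulting string could then be sent over a MySQL-compatible connection such as JDBC. The helper name `add_no_cache_hint` is hypothetical and not part of TiDB, Spark, or any driver. The table name A comes from the question above, and the sketch only handles statements that begin with SELECT.

```python
import re

# Hypothetical helper (not part of TiDB, Spark, or any driver): it places the
# SQL_NO_CACHE modifier right after the leading SELECT keyword.
_SELECT_RE = re.compile(r"^\s*SELECT\b", re.IGNORECASE)
_HINT_RE = re.compile(r"^\s*SELECT\s+SQL_NO_CACHE\b", re.IGNORECASE)


def add_no_cache_hint(query: str) -> str:
    """Return `query` with SQL_NO_CACHE placed after the leading SELECT."""
    if not _SELECT_RE.match(query):
        raise ValueError("only statements starting with SELECT are handled")
    if _HINT_RE.match(query):
        return query  # hint already present, leave the statement unchanged
    # Replaces the leading keyword (and any whitespace before it) exactly once.
    return _SELECT_RE.sub("SELECT SQL_NO_CACHE", query, count=1)


if __name__ == "__main__":
    # Table A is the test table from the question; SELECT * is an assumption.
    print(add_no_cache_hint("SELECT * FROM A"))
```

Since the hint does not clear anything already cached, as noted earlier in the thread, it would presumably need to be present from the very first run of a test for the timings to be comparable.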