Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: 统计信息Count-Min Sketch描述问题

The Count-Min Sketch section of the statistics documentation says:
- Modify the two parameters WITH NUM CMSKETCH DEPTH and WITH NUM CMSKETCH WIDTH mentioned in Statistics Collection - Manual Collection. These two parameters affect the number of hash buckets and the probability of collisions. Increasing them appropriately can reduce the probability of conflicts, but it will also affect the memory usage of the statistics. You can adjust them according to the specific situation. In TiDB, the default value for DEPTH is 5, and the default value for WIDTH is 2048.
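For reference, the roles of the two parameters in the passage above can be illustrated with a textbook Count-Min Sketch. This is only a minimal sketch and not TiDB's actual implementation: the class name `CountMinSketch`, its method names, and the SHA-256-based hashing are assumptions made for illustration. Only the defaults (depth 5, width 2048) come from the quoted documentation.

```python
import hashlib


class CountMinSketch:
    """Textbook Count-Min Sketch (hypothetical, not TiDB's code)."""

    def __init__(self, depth=5, width=2048):
        self.depth = depth  # number of rows, one hash function per row
        self.width = width  # number of counters (buckets) in each row
        # Memory use grows with depth * width counters.
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Derive a different hash per row by prefixing the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, item, count=1):
        # Every row records the item in exactly one of its buckets.
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the smallest counter
        # across rows is the least-inflated estimate.
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.depth)
        )


sketch = CountMinSketch(depth=5, width=2048)
for value in ["a", "b", "a"]:
    sketch.add(value)
print(sketch.estimate("a"))  # never below the true count of 2
```

In this textbook form, the width sets how likely two values are to share a bucket within a single row, while the depth sets how many independently hashed rows are consulted. The estimate for a value is inflated only if that value collides with others in every row.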
Question: The passage says that increasing both DEPTH and WIDTH reduces the probability of hash collisions. Is that correct? I understand that increasing WIDTH reduces the probability of hash collisions, but why would increasing DEPTH reduce it? Shouldn't a larger DEPTH increase the probability of hash collisions instead?