Calculation Issues of Single Row Value Size in TiDB

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关于TiDB单行value size的计算问题

| username: EricSong

[TiDB Usage Environment] Testing
[TiDB Version] v5.3.0
[Encountered Issue: Problem Phenomenon and Impact]
An "entry too large" error occurs during an UPDATE. After checking the documentation, it appears to be caused by writing too much data at once. Even after adjusting the related parameters (txn-entry-size-limit, txn-total-size-limit, max_allowed_packet), the error still occurs, because the BLOB written by the UPDATE already exceeds the 120 MB limit on a single entry (378224214 / (2 * 1024 * 1024) ≈ 180.3 MB).

How should this be optimized and resolved in this case? Can it be achieved by modifying TiDB parameters?
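
For reference, a minimal sketch of how these three limits can be inspected from a SQL client, assuming v5.3.0, where txn-entry-size-limit and txn-total-size-limit are TiDB configuration-file items rather than system variables:

-- Inspect the TiDB config items (the exact dotted names may differ slightly by version).
SHOW CONFIG WHERE type = 'tidb' AND name LIKE '%txn-entry-size-limit%';
SHOW CONFIG WHERE type = 'tidb' AND name LIKE '%txn-total-size-limit%';

-- max_allowed_packet is an ordinary system variable.
SHOW VARIABLES LIKE 'max_allowed_packet';

Per the TiDB documentation, txn-entry-size-limit is capped at 125829120 bytes (120 MB), which matches the error in the log below, so raising parameters alone cannot accommodate a single value of roughly 180 MB.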

Additionally, while checking the MySQL documentation, I saw the following:

  • The internal representation of a MySQL table has a maximum row size limit of 65,535 bytes, even if the storage engine is capable of supporting larger rows. BLOB and TEXT columns only contribute 9 to 12 bytes toward the row size limit because their contents are stored separately from the rest of the row.

It seems that when MySQL calculates the row length, BLOB content is counted only as a pointer-like reference rather than at its full size. Does TiDB adopt the same strategy? If so, my operation theoretically should not exceed the maximum length limit (this is not a wide table). If not, is there a way to enable such a strategy?

[Attachment: Screenshot/Log/Monitoring]
Log

[txn_mode=PESSIMISTIC] [err="[kv:8025]entry too large, the max entry size is 125829120, the size of data is 189200959

Table Structure

CREATE TABLE `table_optimizing_process` (
  ...
  `rewrite_input` longblob DEFAULT NULL COMMENT 'rewrite files input',
  ...
  PRIMARY KEY (`process_id`) /*T![clustered_index] CLUSTERED */,
  KEY `table_index` (`table_id`,`plan_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='History of optimizing after each commit';

SQL Statement
UPDATE table_optimizing_process SET rewrite_input = ...(len: 378224214)

| username: tidb狂热爱好者 | Original post link

You are an expert who uses the database as storage.

| username: xingzhenxiang | Original post link

No, you can only modify the program.

| username: EricSong | Original post link

It is a requirement from the business side; I want to know whether a simple implementation is possible :joy:

| username: TiDBer_jYQINSnf | Original post link

It seems that Titan is designed for this kind of situation. I haven’t studied it in detail, but you can check if it works for you.

| username: cassblanca | Original post link

Are you storing plain text here? If you are storing files, it is better not to use large LOB columns: store only the file's address in the database and keep the files themselves on high-performance external storage. If you are storing text, you can split it across separate tables.
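
A minimal sketch of this "store a reference, not the file" idea; the column name rewrite_input_uri and the example URI are hypothetical:

-- Hypothetical column holding only the location of the payload in external storage.
ALTER TABLE table_optimizing_process
    ADD COLUMN rewrite_input_uri varchar(1024) DEFAULT NULL
    COMMENT 'location of rewrite input in external storage';

-- The application uploads the payload elsewhere and records only its address.
UPDATE table_optimizing_process
SET rewrite_input_uri = 's3://example-bucket/optimizing/12345/rewrite_input.bin'  -- hypothetical URI
WHERE process_id = 12345;  -- hypothetical id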

| username: zhanggame1 | Original post link

It can only be resolved at the application layer.

| username: redgame | Original post link

Consider splitting the data: if possible, split the large BLOB into smaller pieces for storage and updates, so that each row stays under the single-entry size limit.
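
A hedged sketch of what such splitting could look like; the chunk table and its layout are hypothetical, and the application is responsible for slicing the payload into pieces comfortably below the entry limit (for example, a few MB each):

-- Hypothetical side table: one large value stored as many small rows.
CREATE TABLE table_optimizing_process_chunk (
    process_id bigint   NOT NULL,
    chunk_no   int      NOT NULL,
    chunk_data longblob NOT NULL,
    PRIMARY KEY (process_id, chunk_no) /*T![clustered_index] CLUSTERED */
);

-- The application writes one slice per row (and per statement), so no single
-- entry comes anywhere near the 120 MB limit.
INSERT INTO table_optimizing_process_chunk (process_id, chunk_no, chunk_data)
VALUES (?, ?, ?);  -- bound by the application, executed once per slice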

| username: 像风一样的男子 | Original post link

If you execute it manually, can you try UPDATE table_optimizing_process SET rewrite_input = xxx LIMIT 100 to update in segments?

| username: EricSong | Original post link

I have already explained to the business side, and they will split and compress the Blob. The platform side indeed cannot support it.
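
For completeness, the read path against the hypothetical chunk table sketched earlier would simply fetch the pieces in order; concatenation and decompression happen in the application:

SELECT chunk_no, chunk_data
FROM table_optimizing_process_chunk  -- hypothetical table from the earlier sketch
WHERE process_id = ?
ORDER BY chunk_no;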

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.