Failed to Add TiFlash

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 添加tiflash失败

| username: Hacker_Sl3AGswc

[TiDB Usage Environment] Test
[TiDB Version] v6.5.4
[Reproduction Path] Operations performed that led to the issue
[Encountered Issue: Issue Phenomenon and Impact]
When adding a TiFlash node, the log reports the error: [Sync schema failed by DB::Exception: Wrong precision:0]
The TiFlash log shows that it was synchronizing the table rep.t_2605 at that point, but I cannot find this table in the database.


| username: Hacker_Sl3AGswc | Original post link

tiflash_error.log repeatedly reports:
[ERROR] [Server.cpp:1212] ["Bootstrap failed because sync schema error: DB::Exception: Wrong precision:0 we will sleep for 3 seconds and try again"] [thread_id=1]

| username: dba远航 | Original post link

Something went wrong early on, when TiFlash was creating its database schema.

| username: Hacker_Sl3AGswc | Original post link

So how should this be handled? I tried removing all the TiFlash nodes and re-adding them, but the problem still persists.

| username: DBAER | Original post link

Is it based on this?

| username: Jasper | Original post link

Is it only this table that has a problem? Use admin check table to check, and also share the table structure so everyone can take a look.

| username: Hacker_Sl3AGswc | Original post link

When scaling out, I followed this method. Initially, the project team added 3 TiFlash nodes, but none of them could start and their status was abnormal. So, I used scale-in -N IP:9000 to scale them in, but when I tried to scale out the TiFlash nodes again, the same error occurred.

| username: Hacker_Sl3AGswc | Original post link

Checking tiflash.log, I found that right after the line “creating table rep.t_2605” the next line is an [error] message. However, I cannot find this table in the database, whether I search by tables.tidb_table_id or run desc rep.t_2605 directly.

| username: Jasper | Original post link

Let’s look at the results of these two commands.

| username: Jasper | Original post link

Is the data source for this cluster written directly by the business? Or is it synchronized to this cluster through some synchronization tool?

| username: Hacker_Sl3AGswc | Original post link

The results of these two SQL queries are both empty.

| username: Hacker_Sl3AGswc | Original post link

The business application writes to the cluster directly.

| username: JaySon-Huang | Original post link

It should be a partitioned table. Execute the following SQL to see which table it is, and then post the table structure information:

select table_schema, table_name, '' as partition_name, tidb_table_id
from information_schema.`tables`
where tidb_table_id = 2605
union
select table_schema, table_name, partition_name, tidb_partition_id
from information_schema.`partitions`
where tidb_partition_id = 2605;

| username: Hacker_Sl3AGswc | Original post link

[Image attachment not available]

| username: JaySon-Huang | Original post link

Was this cluster upgraded from an earlier version? In version 6.5.4, it should be impossible to create an illegal column of type decimal(0,0); it would be corrected to decimal(10,0) :rofl:

TiDB> create table abc (d decimal(0,0) default null);
TiDB> show create table abc;
+-------+-------------------------------------------------------------+
| Table | Create Table                                                |
+-------+-------------------------------------------------------------+
| abc   | CREATE TABLE `abc` (                                        |
|       |   `d` decimal(10,0) DEFAULT NULL                            |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+-------------------------------------------------------------+
TiDB> select tidb_version() \G
***************************[ 1. row ]***************************
tidb_version() | Release Version: v6.5.4
Edition: Community
Git Commit Hash: d7ce2f2faa1da3177a0f0a7e825f6e8fccd13ec8
Git Branch: heads/refs/tags/v6.5.4
UTC Build Time: 2023-08-23 08:32:40
GoVersion: go1.19.12
Race Enabled: false
TiKV Min Version: 6.2.0-alpha
Check Table Before Drop: false
Store: tikv

| username: JaySon-Huang | Original post link

Try changing the remain_mature_days and collbl_int_ovdue_days columns in the table to the decimal(10,0) type, then check whether TiFlash can start:
alter table tbl_name modify column col_name decimal(10,0);
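
To check whether other tables have the same problem, a query along the following lines may help. This is only a sketch: it assumes that information_schema.columns reports the affected columns with a column_type beginning with decimal(0, which has not been confirmed in this thread, so adjust the filter if your cluster prints the type differently.

```sql
-- Sketch: list DECIMAL columns that were stored with precision 0.
-- Assumption: such columns appear with column_type like 'decimal(0,0)'.
SELECT table_schema, table_name, column_name, column_type
FROM information_schema.columns
WHERE data_type = 'decimal'
  AND column_type LIKE 'decimal(0,%'
ORDER BY table_schema, table_name, column_name;
```

Each row returned would then need its own alter table ... modify column statement like the one above. Note that modify column redefines the whole column, so repeat any NOT NULL, DEFAULT, or COMMENT attributes you want to keep.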

| username: Hacker_Sl3AGswc | Original post link

There is no problem after removing decimal(0,0), but this cluster has not been upgraded, and these tables with decimal(0,0) were created last Friday. I don’t know how they were created.

| username: JaySon-Huang | Original post link

These tables with decimal(0,0) were created last Friday.

Can you confirm how these abnormal columns were created? For example, was decimal(0,0) specified directly when using CREATE TABLE, or was it modified through DDL modify column, or some other way?

You can refer to this document: ADMIN SHOW DDL [JOBS|JOB QUERIES] | PingCAP Documentation Center

-- Find the job-id of the DDL task
ADMIN SHOW DDL JOBS [NUM] [WHERE where_condition];
-- Find the executed DDL statement through the job-id
ADMIN SHOW DDL JOB QUERIES <JOB-ID>;

| username: Hacker_Sl3AGswc | Original post link

When creating a table with decimal(0), it becomes decimal(0,0).
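
A minimal reproduction sketch of the behavior described above. The table name decimal_repro is hypothetical, and the statements assume a test database exists. Per this thread, v6.5.4 reportedly stores the column as decimal(0,0); other versions may behave differently.

```sql
-- Sketch: declare a column as DECIMAL(0) and inspect what was actually stored.
-- decimal_repro is a made-up name used only for this check.
CREATE TABLE test.decimal_repro (d DECIMAL(0) DEFAULT NULL);

-- Look at the column type in the output of this statement.
SHOW CREATE TABLE test.decimal_repro;

-- Clean up so the test table is not synchronized to TiFlash later.
DROP TABLE test.decimal_repro;
```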

| username: JaySon-Huang | Original post link

Reproduced, thanks! We will check if this issue can be automatically fixed when creating the table.