[TiDB Usage Environment] PoC
[TiDB Version] v7.2.0
Reproduction test case:
First create the tables, add TiFlash replicas, and then add varchar columns. In this order every table, including the gbk one, can be queried normally through both TiKV and TiFlash.
create table tgbk (id int) charset = gbk;
create table tascii (id int) charset = ascii;
create table tbinary (id int) charset = binary;
create table tlatin1 (id int) charset = latin1;
create table tutf8mb4 (id int) charset = utf8mb4;
create table tutf8 (id int) charset = utf8;
create table t (id int);
alter table t set tiflash replica 1;
alter table tascii set tiflash replica 1;
alter table tbinary set tiflash replica 1;
alter table tgbk set tiflash replica 1;
alter table tlatin1 set tiflash replica 1;
alter table tutf8 set tiflash replica 1;
alter table tutf8mb4 set tiflash replica 1;
insert t select 1;
insert tascii select 1;
insert tbinary select 1;
insert tgbk select 1;
insert tlatin1 select 1;
insert tutf8 select 1;
insert tutf8mb4 select 1;
set @@session.tidb_isolation_read_engines = 'tikv';
-- set @@session.tidb_isolation_read_engines = 'tiflash';
select * from t ;
select * from tascii ;
select * from tbinary ;
select * from tgbk ;
select * from tlatin1 ;
select * from tutf8 ;
select * from tutf8mb4 ;
set @@session.tidb_isolation_read_engines = 'tikv,tiflash';
alter table t add column c varchar(1);
alter table tascii add column c varchar(1);
alter table tbinary add column c varchar(1);
alter table tgbk add column c varchar(1);
alter table tlatin1 add column c varchar(1);
alter table tutf8 add column c varchar(1);
alter table tutf8mb4 add column c varchar(1);
update t set c = 'a';
update tascii set c = 'a';
update tbinary set c = 'a';
update tgbk set c = 'a';
update tlatin1 set c = 'a';
update tutf8 set c = 'a';
update tutf8mb4 set c = 'a';
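To confirm that the gbk table is really served by TiFlash after the column is added, something like the following could be run in the same session. This is only a sketch: it assumes the replica for tgbk has already become available (otherwise the TiFlash-only query is expected to fail), and the exact task labels in the plan can vary by version.

```sql
-- Restrict this session to TiFlash, using the same variable as in the test case.
SET @@session.tidb_isolation_read_engines = 'tiflash';

-- If TiFlash serves the read, the task column of the plan should show a
-- tiflash task (for example cop[tiflash] or mpp[tiflash]).
EXPLAIN SELECT * FROM tgbk;
SELECT * FROM tgbk;

-- Restore the setting used earlier in the test case.
SET @@session.tidb_isolation_read_engines = 'tikv,tiflash';
```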
However, if the tables are first created with character columns, data is inserted, and only then the TiFlash replicas are added, the gbk table is the only one that fails to create a TiFlash replica.
create schema yandb2;
use yandb2;
create table tgbk (id int, c varchar(10)) charset = gbk;
create table tascii (id int, c varchar(10)) charset = ascii;
create table tbinary (id int, c varchar(10)) charset = binary;
create table tlatin1 (id int, c varchar(10)) charset = latin1;
create table tutf8mb4 (id int, c varchar(10)) charset = utf8mb4;
create table tutf8 (id int, c varchar(10)) charset = utf8;
create table t (id int, c varchar(10));
insert t select 1,'b';
insert tascii select 1,'b';
insert tbinary select 1,'b';
insert tgbk select 1,'b';
insert tlatin1 select 1,'b';
insert tutf8 select 1,'b';
insert tutf8mb4 select 1,'b';
alter table tascii set tiflash replica 1;
alter table tbinary set tiflash replica 1;
alter table tgbk set tiflash replica 1;
alter table tlatin1 set tiflash replica 1;
alter table tutf8 set tiflash replica 1;
alter table tutf8mb4 set tiflash replica 1;
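The outcome of the statements above can be checked in information_schema.tiflash_replica. As a minimal sketch: tables whose replica request was accepted should appear here, while a table whose ALTER was rejected (tgbk in this case) should be absent.

```sql
-- List the TiFlash replica status of every table in the yandb2 schema.
-- AVAILABLE = 1 and PROGRESS = 1 indicate that replication has finished.
SELECT TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = 'yandb2'
ORDER BY TABLE_NAME;
```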
The default value of tidb_enable_clustered_index is INT_ONLY, which means that only tables with integer primary keys will use clustered indexes. If you want to enable clustered indexes for all tables, you need to set tidb_enable_clustered_index to ON.
Actually, my confusion is whether the limitations mentioned in this document are for older versions of TiDB. After TiDB started supporting GBK, the document was not updated, and I also couldn’t find where the restrictions were added in the TiFlash code.
The protocol between TiDB and TiFlash does not allow tables with GBK columns to create replicas, which is why there is no explicit restriction in the TiFlash code. In your example, creating a table without string columns and then adding a TiFlash replica did not report an error because the table really had no GBK columns at that point. However, the fact that the later ALTER TABLE ADD COLUMN also reported no error does not conform to the agreement between TiDB and TiFlash. In addition, this restriction is not specific to older versions; TiFlash still does not support GBK as of now.
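To find tables that ended up in this state (a TiFlash replica plus a GBK column added afterwards), a query along these lines may help. It is a sketch that relies only on the information_schema.columns and information_schema.tiflash_replica tables; no rows returned means no such table exists.

```sql
-- Find columns that use the gbk character set on tables that have a TiFlash replica.
SELECT c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME, r.REPLICA_COUNT, r.AVAILABLE
FROM information_schema.columns AS c
JOIN information_schema.tiflash_replica AS r
  ON r.TABLE_SCHEMA = c.TABLE_SCHEMA
 AND r.TABLE_NAME = c.TABLE_NAME
WHERE c.CHARACTER_SET_NAME = 'gbk'
ORDER BY c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME;
```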
Tables with the GBK character set cannot be synchronized to TiFlash, and the following error will be reported:
ERROR 8200 (HY000): Unsupported ALTER table replica for table contain gbk charset
This means that a table that uses the GBK character set cannot use TiFlash.
Currently, the character sets supported by TiFlash are UTF8, UTF8MB4, ASCII, Latin1, Binary.
Therefore, it is generally not recommended to use the GBK character set when creating tables. For a migration or reconstruction project, it is recommended to convert the data to UTF8MB4.
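For a small table, the conversion could look roughly like the sketch below, which copies the gbk table from the example above into a new utf8mb4 table and then adds the replica. The table name tgbk_utf8mb4 is made up for illustration, and a large table would more likely be copied in batches or with a migration tool instead of a single INSERT ... SELECT.

```sql
-- Hypothetical target table with the same columns as tgbk, but using utf8mb4.
CREATE TABLE tgbk_utf8mb4 (id int, c varchar(10)) CHARSET = utf8mb4;

-- Copy the rows from the gbk table into the utf8mb4 table.
INSERT INTO tgbk_utf8mb4 SELECT id, c FROM tgbk;

-- The new table has no GBK columns, so a TiFlash replica can be requested.
ALTER TABLE tgbk_utf8mb4 SET TIFLASH REPLICA 1;
```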