Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: FlinkCDC连接tidb乱码 (Garbled text when connecting Flink CDC to TiDB)
【TiDB Version】
6.1.0
【Problem Encountered】
new TiKVSnapshotEventDeserializationSchema<String>() {
    @Override
    public void deserialize(
            Kvrpcpb.KvPair record, Collector<String> out)
            throws Exception {
        System.out.println("==============");
        System.out.println(record.getValue().toStringUtf8());
        int serializedSize = record.getSerializedSize();
        out.collect(RowKey.decode(record.getValue().toByteArray()).toString());
    }
    @Override
    public TypeInformation<String> getProducedType() {
        return BasicTypeInfo.STRING_TYPE_INFO;
    }
})
The printed statement is garbled: e !"#$&'( " @ A ] c o { � � � � � [{�������1s{��0000004c0f0446aab996f270ef2371ef�7Nantong Engineering Co., Ltd. A[“Henan Provincial Department of Transportation”] Provincial Credit Rating Enterprise Honor Credit Rating Highway and Waterway Transportation Category Henan Province�A Highway Construction Enterprise Credit Evaluation Nantong Engineering Co., Ltd. won the 2020 Credit Rating Highway and Waterway Transportation Category Af48405721ba019df0526adefbfcf94872021-04-08 P�2020�a�aHenan Provincial Department of Transportation ���Yang f���
【Reproduction Path】
Connect Flink CDC to TiDB
【Problem Phenomenon and Impact】
Need to obtain correct text or JSON format data
【Attachment】
The original version does not have transcoding operations, you can try it out.
Also, what encoding is used for the database cluster?
Yes, I also copied it directly from the tutorial. If you do it this way, the output will be like this:
The database encoding:
Encountered the same problem, spent the whole day without solving it, planning to try the SQL method
The SQL method works, but after reading data for a while, it reported a java.io.EOFException error. So I thought of switching to the API method to debug the cause.
You can try a few more encoding conversions inside to rule it out.
I read the data through Flink CDC's API, and the output does not look like a character-encoding problem: it appears to be data in TiKV's own encoded form rather than garbled text. I am running into this issue as well and do not yet know how to restore the data obtained from TiKV to its original form.
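To illustrate why printing an encoded value with toStringUtf8() looks garbled, here is a minimal sketch using only the Java standard library. The byte layout produced by fakeEncodedRow is hypothetical and is NOT TiKV's real row format; the class and method names are invented for this example. It only shows that decoding binary bytes as UTF-8 yields replacement characters mixed with the readable text, much like the output in the question.

```java
import java.nio.charset.StandardCharsets;

class BinaryAsUtf8Demo {
    // Hypothetical layout for illustration only (not the real TiKV encoding):
    // four binary header bytes followed by the UTF-8 bytes of the text.
    static byte[] fakeEncodedRow(String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        byte[] row = new byte[payload.length + 4];
        row[0] = (byte) 0x80;           // lone continuation byte, invalid in UTF-8
        row[1] = (byte) 0x00;           // NUL control character
        row[2] = (byte) 0xFF;           // byte value that never occurs in UTF-8
        row[3] = (byte) payload.length; // length prefix, shows up as a control char
        System.arraycopy(payload, 0, row, 4, payload.length);
        return row;
    }

    // Decodes raw bytes as UTF-8; malformed bytes become U+FFFD.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String printed = decodeAsUtf8(fakeEncodedRow("Nantong"));
        // The text survives, but the binary prefix turns into junk characters.
        System.out.println(printed.contains("\uFFFD"));
        System.out.println(printed.endsWith("Nantong"));
    }
}
```

If this is what is happening, trying other character sets will not help, because the bytes are not plain text in any encoding; they have to be decoded with the table schema.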
The frustrating part is that record.getValue() and record.getKey() cannot be directly parsed with toString!
Refer to the source code of RowDataTiKVSnapshotEventDeserializationSchema and make some modifications.
Object[] tikvValues =
        decodeObjects(
                record.getValue().toByteArray(),
                RowKey.decode(record.getKey().toByteArray()).getHandle(),
                tableInfo);
I hope the official team can fix this issue soon! The usability is quite poor.
May I ask, did the method you posted at the end solve the garbled text issue in the title?
What you receive is actually just a raw Kvrpcpb.KvPair, so its key and value are still encoded bytes rather than text. The official documentation does not explain this, so you need to refer to RowDataTiKVSnapshotEventDeserializationSchema to see how to parse it.
How should I do this, boss? Is there an example to refer to?
Handle it like this:
Map<String, String> map = new HashMap<>();
map.put("tikv.grpc.timeout_in_ms", "30000");
map.put("tikv.grpc.keepalive_time", "30000");
TiConfiguration tiConfiguration = TDBSourceOptions.getTiConfiguration("localhost:2379", map);
TiSession session = TiSession.create(tiConfiguration);
TiTableInfo tableInfo = session.getCatalog().getTable("database_name", "table_name");
// keyByteArray and valueByteArray are record.getKey().toByteArray() and record.getValue().toByteArray()
Object[] objArray = TableCodec.decodeObjects(valueByteArray, RowKey.decode(keyByteArray).getHandle(), tableInfo);
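Since the original goal was text or JSON output, here is a rough sketch of turning the decoded values into a JSON string. It uses only the Java standard library and assumes you already have the column names, in table order, as a String[] (for example collected from tableInfo) next to the Object[] returned by decodeObjects. The class RowToJson and its methods are made up for this example and are not part of Flink CDC or the TiKV client.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RowToJson {
    // Pairs each column name with the decoded value at the same position.
    static Map<String, Object> toMap(String[] columnNames, Object[] values) {
        if (columnNames.length != values.length) {
            throw new IllegalArgumentException("column/value count mismatch");
        }
        Map<String, Object> row = new LinkedHashMap<>(); // keeps column order
        for (int i = 0; i < values.length; i++) {
            row.put(columnNames[i], values[i]);
        }
        return row;
    }

    // Minimal JSON writer: numbers and booleans are written bare, null as null,
    // everything else as an escaped string. NaN and infinities are not handled.
    static String toJson(Map<String, Object> row) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v == null) sb.append("null");
            else if (v instanceof Number || v instanceof Boolean) sb.append(v);
            else sb.append('"').append(escape(v.toString())).append('"');
        }
        return sb.append('}').toString();
    }

    // Escapes quotes, backslashes and control characters for a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"') sb.append("\\\"");
            else if (c == '\\') sb.append("\\\\");
            else if (c == '\n') sb.append("\\n");
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical column names and values standing in for a decoded row.
        String[] names = {"id", "company", "year"};
        Object[] values = {1L, "Nantong Engineering Co., Ltd.", null};
        System.out.println(toJson(toMap(names, values)));
    }
}
```

Inside deserialize(...) you would then pass the resulting string to out.collect(...) instead of calling toString on the raw key or value. A real job would more likely use a JSON library; the hand-written writer is only there to keep the sketch free of dependencies.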
Hello, could you please provide a more complete example? I tried it here but it didn’t work.
Great master, I’ve learned a lot.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.