Does exchSenderExec consider backpressure issues caused by slow consumption speed?

translator_bot · June 22, 2024, 4:16am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: exchSenderExec中是否考虑消费速度过慢导致的背压问题？

| username: 德布劳钦

As the title says: after the producer and consumer establish a connection, the producer starts a goroutine that runs EstablishMPPConnectionWithStoreID, where the data-sending logic is as follows:

var sendError error
for sendError == nil {
	// Retrieve data from the data channel
	chunk, err := tunnel.RecvChunk()
	...
	if chunk == nil {
		// All data has been retrieved
		break
	}
	res := tipb.SelectResponse{
		Chunks: []tipb.Chunk{*chunk},
	}
	raw, err := res.Marshal()
	...
	sendError = server.Send(&mpp.MPPDataPacket{Data: raw})
}

This loop continuously reads from the data channel and sends each chunk to the receiver. If the producer is fast and the consumer is slow, won't data pile up on the consumer side and, with a large enough data volume, cause an OOM (out of memory)?

How is this handled here, and does TiFlash have any other mechanism for controlling the sending rate?

| username: 有猫万事足

The documentation lists settings for speeding up replica synchronization. I think you could try lowering them to see whether that solves the issue.

-- The default values for these two parameters are both 100MiB, meaning the maximum disk bandwidth occupied by snapshots for replica synchronization does not exceed 100MiB/s.
SET CONFIG tikv `server.snap-io-max-bytes-per-sec` = '300MiB';
SET CONFIG tiflash `raftstore-proxy.server.snap-max-write-bytes-per-sec` = '300MiB';

Especially the two parameters above.

| username: 德布劳钦

Thank you, but replica synchronization is a different scenario from my issue. exchSenderExec is what the exchange operator uses to send data to downstream operators.

| username: 有猫万事足

Alright, I thought this was the same as another issue, where TiFlash crashed during data import.

| username: redgame

Set the max-server-memory parameter to limit the memory usage of the TiFlash instance.

| username: 德布劳钦

Can this issue only be solved through operational means? Is it considered a kernel defect?

| username: knull

First of all, this “slow processing” will inevitably lead to OOM, right?
Secondly, the best way to avoid the issue is, of course, to speed up consumption; if that is not possible, operational measures may be the only option.