TiDB Node Panic

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb节点panic

| username: TiDBer_TuuZWV0A

[TiDB Usage Environment] Production Environment
[TiDB Version] v5.3.1
[Encountered Problem] Node panic and disconnection
[Reproduction Path] Operations performed that led to the problem
[Problem Phenomenon and Impact]

Related error logs

    goroutine 146089 [running]:
runtime.throw(0x3d8e07c, 0x15)
	/usr/local/go/src/runtime/panic.go:1117 +0x72 fp=0xc0815e0728 sp=0xc0815e06f8 pc=0x12b9dd2
runtime.mapassign_faststr(0x39e8fc0, 0xc049305e60, 0x3d7c21d, 0x10, 0x6186720)
	/usr/local/go/src/runtime/map_faststr.go:291 +0x3d8 fp=0xc0815e0790 sp=0xc0815e0728 pc=0x1296118
github.com/pingcap/tidb/telemetry.BuiltinFunctionsUsage.Inc(...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/telemetry/data_window.go:128
github.com/pingcap/tidb/planner/core.(*expressionRewriter).newFunction(0xc0808cb790, 0xc013b3a610, 0xe, 0xc06c3081d8, 0xc0644ac9b0, 0x1, 0x1, 0x203012, 0x203012, 0xc072e70c60, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/expression_rewriter.go:1216 +0x288 fp=0xc0815e0850 sp=0xc0815e0790 pc=0x28bfe08
github.com/pingcap/tidb/planner/core.(*expressionRewriter).funcCallToExpression(0xc0808cb790, 0xc06c3081c0)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/expression_rewriter.go:1798 +0x8db fp=0xc0815e0990 sp=0xc0815e0850 pc=0x28c8c7b
github.com/pingcap/tidb/planner/core.(*expressionRewriter).Leave(0xc0808cb790, 0x42d1a80, 0xc06c3081c0, 0x42e93a8, 0xc07a1d4bd0, 0xc05ed6af01)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/expression_rewriter.go:1077 +0xa25 fp=0xc0815e0bc0 sp=0xc0815e0990 pc=0x28bdf05
github.com/pingcap/tidb/parser/ast.(*FuncCallExpr).Accept(0xc06c3081c0, 0x4295838, 0xc0808cb790, 0x4300598, 0xc0808cb6c0, 0xc0815e0cd8)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/parser/ast/functions.go:532 +0x18e fp=0xc0815e0c20 sp=0xc0815e0bc0 pc=0x1ca044e
github.com/pingcap/tidb/planner/core.(*PlanBuilder).rewriteExprNode(0xc0808afba0, 0xc0808cb790, 0x42e9568, 0xc06c3081c0, 0xc0808cb601, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/expression_rewriter.go:199 +0x9c fp=0xc0815e0cb0 sp=0xc0815e0c20 pc=0x28b0c9c
github.com/pingcap/tidb/planner/core.(*PlanBuilder).rewriteWithPreprocess(0xc0808afba0, 0x42b9ca0, 0xc000052088, 0x42e9568, 0xc06c3081c0, 0x4300598, 0xc0808cb6c0, 0x0, 0x0, 0x1, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/expression_rewriter.go:145 +0x14c fp=0xc0815e0d30 sp=0xc0815e0cb0 pc=0x28b060c
github.com/pingcap/tidb/planner/core.(*PlanBuilder).rewrite(0xc0808afba0, 0x42b9ca0, 0xc000052088, 0x42e9568, 0xc06c3081c0, 0x4300598, 0xc0808cb6c0, 0x0, 0x1, 0x0, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/expression_rewriter.go:113 +0xb9 fp=0xc0815e0dc8 sp=0xc0815e0d30 pc=0x28b0439
github.com/pingcap/tidb/planner/core.rewriteAstExpr(0x4303f18, 0xc02d2cf600, 0x42e9568, 0xc06c3081c0, 0xc0a4c82f50, 0xc031a486e0, 0x2b, 0x2b, 0x10, 0x3c3a2c0, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/expression_rewriter.go:80 +0x36c fp=0xc0815e0f40 sp=0xc0815e0dc8 pc=0x28affec
github.com/pingcap/tidb/expression.RewriteSimpleExprWithNames(0x4303f18, 0xc02d2cf600, 0x42e9568, 0xc06c3081c0, 0xc0a4c82f50, 0xc031a486e0, 0x2b, 0x2b, 0xc02d2cf7d0, 0x1, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/expression/simple_rewriter.go:114 +0x8e fp=0xc0815e0fb0 sp=0xc0815e0f40 pc=0x23c5c4e
github.com/pingcap/tidb/expression.ParseSimpleExprsWithNames(0x4303f18, 0xc02d2cf600, 0xc0a8cf5c80, 0x1b, 0xc0a4c82f50, 0xc031a486e0, 0x2b, 0x2b, 0x7feac4be28b0, 0xc057929ce0, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/expression/simple_rewriter.go:103 +0x35e fp=0xc0815e10b8 sp=0xc0815e0fb0 pc=0x23c597e
github.com/pingcap/tidb/planner/core.makePartitionByFnCol(0x4303f18, 0xc02d2cf600, 0xc031a48420, 0x2b, 0x2b, 0xc031a486e0, 0x2b, 0x2b, 0xc021764960, 0x14, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/rule_partition_processor.go:902 +0xd8 fp=0xc0815e1160 sp=0xc0815e10b8 pc=0x2a041f8
github.com/pingcap/tidb/planner/core.(*partitionProcessor).pruneRangePartition(0x6184328, 0x4303f18, 0xc02d2cf600, 0xc021151400, 0x42f5278, 0xc029421340, 0xc05c823c20, 0x5, 0x5, 0xc031a48420, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/rule_partition_processor.go:826 +0x125 fp=0xc0815e12d0 sp=0xc0815e1160 pc=0x2a03245
github.com/pingcap/tidb/planner/core.PartitionPruning(0x4303f18, 0xc02d2cf600, 0x42f5278, 0xc029421340, 0xc05c823c20, 0x5, 0x5, 0x6184328, 0x0, 0x0, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/planner/core/partition_prune.go:42 +0x4b0 fp=0xc0815e13b8 sp=0xc0815e12d0 pc=0x29566d0
github.com/pingcap/tidb/executor.partitionPruning(0x4303f18, 0xc02d2cf600, 0x42f5278, 0xc029421340, 0xc05c823c20, 0x5, 0x5, 0x6184328, 0x0, 0x0, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/builder.go:4474 +0x106 fp=0xc0815e1538 sp=0xc0815e13b8 pc=0x3246f66
github.com/pingcap/tidb/executor.prunePartitionForInnerExecutor(0x4303f18, 0xc02d2cf600, 0x42f20c8, 0xc029421340, 0xc05756ecd0, 0xc031a492e0, 0xc05afc0800, 0x4d, 0x80, 0x3, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/builder.go:3133 +0x126 fp=0xc0815e18b0 sp=0xc0815e1538 pc=0x3235a86
github.com/pingcap/tidb/executor.(*dataReaderBuilder).buildIndexLookUpReaderForIndexJoin(0xc058dfbfc0, 0x42b9c68, 0xc06a76f340, 0xc031a491e0, 0xc05afc0800, 0x4d, 0x80, 0xc060ca9ac0, 0x1, 0x1, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/builder.go:3951 +0x376 fp=0xc0815e1a00 sp=0xc0815e18b0 pc=0x3240656
github.com/pingcap/tidb/executor.(*dataReaderBuilder).buildProjectionForIndexJoin(0xc058dfbfc0, 0x42b9c68, 0xc06a76f340, 0xc048cbbe40, 0xc05afc0800, 0x4d, 0x80, 0xc060ca9ac0, 0x1, 0x1, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/builder.go:3990 +0x148 fp=0xc0815e1b70 sp=0xc0815e1a00 pc=0x3240e08
github.com/pingcap/tidb/executor.(*dataReaderBuilder).buildExecutorForIndexJoinInternal(0xc058dfbfc0, 0x42b9c68, 0xc06a76f340, 0x42eb7f8, 0xc048cbbe40, 0xc05afc0800, 0x4d, 0x80, 0xc060ca9ac0, 0x1, ...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/builder.go:3633 +0x345 fp=0xc0815e1cc8 sp=0xc0815e1b70 pc=0x323bc05
github.com/pingcap/tidb/executor.(*dataReaderBuilder).buildExecutorForIndexJoin(...)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/builder.go:3612
github.com/pingcap/tidb/executor.(*innerWorker).fetchInnerResults(0xc083891ba0, 0x42b9c68, 0xc06a76f340, 0xc0561da100, 0xc05afc0800, 0x4d, 0x80, 0x0, 0x0)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:702 +0x165 fp=0xc0815e1e38 sp=0xc0815e1cc8 pc=0x328d405
github.com/pingcap/tidb/executor.(*innerWorker).handleTask(0xc083891ba0, 0x42b9c68, 0xc06a76f340, 0xc0561da100, 0x0, 0x0)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:541 +0x19a fp=0xc0815e1ed8 sp=0xc0815e1e38 pc=0x328b3ba
github.com/pingcap/tidb/executor.(*innerWorker).run(0xc083891ba0, 0x42b9c68, 0xc06a76f340, 0xc06c7206d0)
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:515 +0x158 fp=0xc0815e1fc0 sp=0xc0815e1ed8 pc=0x328af98
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc0815e1fc8 sp=0xc0815e1fc0 pc=0x12f5081
created by github.com/pingcap/tidb/executor.(*IndexLookUpJoin).startWorkers
	/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/index_lookup_join.go:213 +0x205
| username: xfworld | Original post link

  1. What operations led up to it? What triggered the panic?

  2. How many TiDB nodes are there in total? Is every node like this?

| username: TiDBer_TuuZWV0A | Original post link

  1. Various SQL statements are running in the production environment, and we have not yet identified which one triggers the restart.
  2. There are about 20 nodes in total, and most of them have restarted.
| username: xfworld | Original post link

Enable resource profiling and set execution time and memory limits. That can help you capture the SQL and pinpoint where the problem is.

| username: WalterWj | Original post link

You can raise an issue on the TiDB GitHub, or try upgrading :thinking:

| username: TiDBer_CEVsub | Original post link

My guess is that the SQL is rather complex and is causing a deadlock.

| username: 饭光小团 | Original post link

This is a process-level panic, so the SQL will not be written to the logs.

| username: 饭光小团 | Original post link

After following the stack information and reading the source code, could this be related to pingcap/tidb Pull Request #33979, "*: fix index join on partition table data race" by tiancaiamao?
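For readers unfamiliar with this failure mode: the top frames of the trace (`runtime.throw` called from `runtime.mapassign_faststr`, reached via `BuiltinFunctionsUsage.Inc`) are consistent with the Go runtime aborting on unsynchronized concurrent writes to a map. The sketch below is only an illustration of that general pattern and one common way to avoid it. The names `usageCounter` and `runWorkers` are hypothetical, and this is not the actual TiDB code or the change made in the pull request.

```go
package main

import (
	"fmt"
	"sync"
)

// usageCounter is a hypothetical stand-in for a shared map of
// per-function usage counts. It is not TiDB's real type.
type usageCounter struct {
	mu     sync.Mutex
	counts map[string]uint32
}

func newUsageCounter() *usageCounter {
	return &usageCounter{counts: make(map[string]uint32)}
}

// Inc increments the counter for name. Without the mutex, several
// goroutines writing to the same map can make the Go runtime abort
// the whole process with a fatal error that recover() cannot catch.
func (u *usageCounter) Inc(name string) {
	u.mu.Lock()
	defer u.mu.Unlock()
	u.counts[name]++
}

// Get returns the current count for name (0 if never incremented).
func (u *usageCounter) Get(name string) uint32 {
	u.mu.Lock()
	defer u.mu.Unlock()
	return u.counts[name]
}

// runWorkers starts `workers` goroutines that each call Inc
// `perWorker` times, loosely mimicking parallel inner workers of an
// index join that all touch the same counter.
func runWorkers(u *usageCounter, workers, perWorker int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				u.Inc("plus")
			}
		}()
	}
	wg.Wait()
}

func main() {
	u := newUsageCounter()
	runWorkers(u, 8, 1000)
	fmt.Println(u.Get("plus")) // prints 8000
}
```

If the lock in `Inc` is removed, running this with `go run -race` would be expected to report the race, and the program may crash in the same way as the trace above. Because such a crash is a runtime fatal error rather than an ordinary panic, it takes down the whole process.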

| username: 饭光小团 | Original post link

Hi, could you please tell me whether our problem is related to the issue fixed in that pull request?

| username: xfworld | Original post link

The environment and configuration in use are unclear, so it’s impossible to make a judgment.

| username: 饭光小团 | Original post link

What do you mean by the environment? My business scenario? If you mean the configuration, I can post it here.

| username: xfworld | Original post link

Please start a new thread and describe the issue according to the following format:
【TiDB Usage Environment】Production / Testing / PoC
【TiDB Version】
【Issue Description and Impact】
【Reproduction Steps】What operations were performed that led to the issue
【Resource Configuration】

Provide as much relevant background information as possible. Many issues may have different solutions depending on the scenario and business context. If you don’t explain clearly, it will be difficult for others to offer help.

| username: 饭光小团 | Original post link

Oh, I understand what you mean. Actually, I am a colleague of the original poster. We traced the issue along the stack and believe it might be related to pingcap/tidb Pull Request #33979, "*: fix index join on partition table data race" by tiancaiamao. However, the original poster did not include this information in the post, so I registered an account to join the discussion.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.