Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 如何快速知道各个慢 sql 的来源
Question:
Currently, multiple teams within a department are using the same TiDB cluster. When slow SQL queries are discovered through the dashboard, it is difficult to trace which team member or program executed the slow SQL.
What solutions are generally used to resolve this issue?
Our current approach is to assign each team a user, and then check which user executed the slow SQL. However, this still feels inconvenient and doesn’t allow us to quickly identify which service executed the slow SQL.
Reply:
One common solution is to use SQL comments or hints to tag queries with additional metadata, such as the team or service name. This way, when a slow query is logged, the metadata can help quickly identify the source. Another approach is to use connection pooling with different connection strings for each service, allowing you to trace the origin of the queries more easily. Additionally, you can leverage TiDB’s built-in auditing and logging features to capture more detailed information about query execution.
Check the IP on the slow SQL page of the dashboard. It’s also in the slow SQL logs.
Because HAProxy is used as a load balancing proxy, the IP addresses are all the same.
You can configure HAProxy for pass-through, but after configuring it, you won’t be able to directly connect to the TiDB nodes. Please take note of this.
Refer to this article:
I feel that distinguishing based on usernames is the most straightforward. You can add a business prefix to the username to indicate different services.
Another method is to add your own exclusive comments to your SQL. This way, you will know exactly which business it is.
I also think it’s more convenient and straightforward to use different users.
Uh, if you know the business, wouldn’t it be easy to identify the type of business directly from the SQL content? I usually ask directly based on the business content… Of course, it’s best to first find a way to optimize the SQL, then take the optimized SQL to them and say, “Look, your SQL would be better and faster if written this way.” Generally, no one would disagree, right?
I feel this is better, as it quickly identifies which service it is. However, adding specific annotations to this, I don’t know how to manage it uniformly.
Why not use SLB? That way you can know which IP is connecting.
I feel that just knowing which IP is connected is not enough, as there may be multiple services on one machine.
Finally, I plan to do it this way: assign a unique identifier to each service. When each service executes SQL, automatically add a comment to the SQL through some middleware. This comment is the unique identifier assigned earlier. This way, when slow SQL is detected, it can be quickly identified which service executed it.
Yes, adding comments is a smart approach.
In addition to using different users as mentioned above, different applications can also use different databases. You can see that the corresponding databases are different from the table_schema.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.