How to Quickly Identify the Source of Each Slow SQL Query

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 如何快速知道各个慢 sql 的来源

| username: withseid

Currently, multiple teams within a department are using the same TiDB cluster. When slow SQL queries are discovered through the dashboard, it is difficult to trace which team member or program executed the slow SQL.
What solutions are generally used to resolve this issue?

Our current approach is to assign each team a user, and then check which user executed the slow SQL. However, this still feels inconvenient and doesn’t allow us to quickly identify which service executed the slow SQL.

One common solution is to use SQL comments or hints to tag queries with additional metadata, such as the team or service name. This way, when a slow query is logged, the metadata can help quickly identify the source. Another approach is to use connection pooling with different connection strings for each service, allowing you to trace the origin of the queries more easily. Additionally, you can leverage TiDB’s built-in auditing and logging features to capture more detailed information about query execution.

| username: h5n1 | Original post link

Check the IP on the slow SQL page of the dashboard. It’s also in the slow SQL logs.

| username: withseid | Original post link

Because HAProxy is used as a load balancing proxy, the IP addresses are all the same.

| username: caiyfc | Original post link

You can configure HAProxy for pass-through, but after configuring it, you won’t be able to directly connect to the TiDB nodes. Please take note of this.
Refer to this article:

| username: withseid | Original post link

Try this tomorrow.

| username: hey-hoho | Original post link

I feel that distinguishing based on usernames is the most straightforward. You can add a business prefix to the username to indicate different services.

| username: buddyyuan | Original post link

Another method is to add your own exclusive comments to your SQL. This way, you will know exactly which business it is.

| username: 我是咖啡哥 | Original post link

I also think it’s more convenient and straightforward to use different users.

| username: tidb菜鸟一只 | Original post link

Uh, if you know the business, wouldn’t it be easy to identify the type of business directly from the SQL content? I usually ask directly based on the business content… Of course, it’s best to first find a way to optimize the SQL, then take the optimized SQL to them and say, “Look, your SQL would be better and faster if written this way.” Generally, no one would disagree, right?

| username: withseid | Original post link

I feel this is better, as it quickly identifies which service it is. However, adding specific annotations to this, I don’t know how to manage it uniformly.

| username: 特雷西-迈克-格雷迪 | Original post link

Why not use SLB? That way you can know which IP is connecting.

| username: withseid | Original post link

I feel that just knowing which IP is connected is not enough, as there may be multiple services on one machine.

Finally, I plan to do it this way: assign a unique identifier to each service. When each service executes SQL, automatically add a comment to the SQL through some middleware. This comment is the unique identifier assigned earlier. This way, when slow SQL is detected, it can be quickly identified which service executed it.

| username: 特雷西-迈克-格雷迪 | Original post link

Yes, adding comments is a smart approach.

| username: Hacker_xUwtuKxa | Original post link

In addition to using different users as mentioned above, different applications can also use different databases. You can see that the corresponding databases are different from the table_schema.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.