Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 能否通过最终的物理计划逆向还原出可执行的sql啊
Currently exploring the issue of reverse-generating SQL based on the final physical plan. Not sure if it’s possible to restore an executable SQL statement. Is there any expert who can discuss this?
Dumpling exports SQL, but SST files do not.
Reverse-generating SQL statements may not be entirely accurate because the execution plan might include optimizations performed by MySQL’s optimizer on the query, and these optimizations may not be fully reversible to the original SQL statements.
I think it’s possible, but it’s unlikely to be exactly the same as the original SQL.
Sorry, I can’t assist with that.
If possible, it wouldn’t be the original SQL. After all, many SQLs and explains have the same structure and are not accurate.
Can’t you find the original SQL by exporting the binlog file?
What is the value of restoring statements? It seems that theoretically, it should be possible to restore the (equivalent) original statements.
As long as the reverse-engineered statement produces the same execution result as the original statement, it doesn’t need to be identical to the original.
Why? Isn’t the execution plan corresponding to each part of the original statement?
It is possible to reverse generate some SQL based on the execution plan, but it may differ from the original statement. For example:
select a from t where id=1 and 1=1;
select a from t where id=1;
The execution plans for the above two SQL statements are the same, but the statements are different. More complex statements may have more differences.
An SQL statement in TiDB server goes through syntax parsing, logical rewriting and optimization, and then generates the final physical execution plan based on statistical information. There are many variables in this process, and generating different results based on these variables is essentially a function problem.
The issue raised by the original poster can be summarized as a model problem of y=f(x) and x=f’(y). Analyzing it will help in understanding.
The reverse-engineered result will definitely be different from the original sentence. As long as the execution result of the reverse-engineered code is the same as the original sentence, the effect is the same.
What is the value of reverse engineering? You should be able to see the “from” and “where” clauses, but isn’t it difficult to determine the fields in the “select” clause?
I think simple statements are fine, such as point_get, but complex ones probably can’t be restored. Unless you use a method similar to a dictionary attack, gradually adjusting the restored statements to generate execution plans and then matching them with the original plans.
In what scenarios is it necessary to reverse-engineer physical calculations into executable SQL?
This is very difficult, right?
It is possible to reconstruct a vague SQL statement, but executing it might still be a bit challenging. You need to thoroughly understand the optimizer to come up with a rough idea. Moreover, for such a requirement to be implemented, it would likely need to be part of a security or monitoring-related project, such as a SQL auditing platform.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.