Is it possible to reverse-engineer the executable SQL from the final physical plan?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 能否通过最终的物理计划逆向还原出可执行的sql啊

| username: TiDBer_U58GZgGJ

Currently exploring the issue of reverse-generating SQL based on the final physical plan. Not sure if it’s possible to restore an executable SQL statement. Is there any expert who can discuss this?

| username: Fly-bird | Original post link

Dumpling exports SQL, but SST files do not.

| username: ti-tiger | Original post link

Reverse-generating SQL statements may not be entirely accurate because the execution plan might include optimizations performed by MySQL’s optimizer on the query, and these optimizations may not be fully reversible to the original SQL statements.

| username: zhanggame1 | Original post link

I think it’s possible, but it’s unlikely to be exactly the same as the original SQL.

| username: TiDBer_小阿飞 | Original post link

Sorry, I can’t assist with that.

| username: 大飞哥online | Original post link

If possible, it wouldn’t be the original SQL. After all, many SQLs and explains have the same structure and are not accurate.

| username: 像风一样的男子 | Original post link

Can’t you find the original SQL by exporting the binlog file?

| username: yiduoyunQ | Original post link

What is the value of restoring statements? It seems that theoretically, it should be possible to restore the (equivalent) original statements.

| username: TiDBer_U58GZgGJ | Original post link

As long as the reverse-engineered statement produces the same execution result as the original statement, it doesn’t need to be identical to the original.

| username: andone | Original post link

It probably won’t work.

| username: TiDBer_U58GZgGJ | Original post link

Why? Isn’t the execution plan corresponding to each part of the original statement?

| username: Jellybean | Original post link

It is possible to reverse generate some SQL based on the execution plan, but it may differ from the original statement. For example:

select a from t where id=1 and 1=1;
select a from t where id=1;

The execution plans for the above two SQL statements are the same, but the statements are different. More complex statements may have more differences.

An SQL statement in TiDB server goes through syntax parsing, logical rewriting and optimization, and then generates the final physical execution plan based on statistical information. There are many variables in this process, and generating different results based on these variables is essentially a function problem.

The issue raised by the original poster can be summarized as a model problem of y=f(x) and x=f’(y). Analyzing it will help in understanding.

| username: TiDBer_U58GZgGJ | Original post link

The reverse-engineered result will definitely be different from the original sentence. As long as the execution result of the reverse-engineered code is the same as the original sentence, the effect is the same.

| username: 春风十里 | Original post link

What is the value of reverse engineering? You should be able to see the “from” and “where” clauses, but isn’t it difficult to determine the fields in the “select” clause?

| username: 双开门变频冰箱 | Original post link

I don’t think so.

| username: Kongdom | Original post link

I think simple statements are fine, such as point_get, but complex ones probably can’t be restored. Unless you use a method similar to a dictionary attack, gradually adjusting the restored statements to generate execution plans and then matching them with the original plans.

| username: FutureDB | Original post link

In what scenarios is it necessary to reverse-engineer physical calculations into executable SQL?

| username: 小于同学 | Original post link

This is very difficult, right?

| username: TiDBer_5cwU0ltE | Original post link

It is possible to reconstruct a vague SQL statement, but executing it might still be a bit challenging. You need to thoroughly understand the optimizer to come up with a rough idea. Moreover, for such a requirement to be implemented, it would likely need to be part of a security or monitoring-related project, such as a SQL auditing platform.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.