Unable to call explain analyze in pyspark?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pyspark 中无法调用 explain analyze?

| username: ShawnYan

In PySpark, explain works, but explain analyze fails with the ParseException below. Is this a limitation of PySpark?

>>> spark.sql("explain analyze select * from test.t1").collect()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/shawnyan/spark-3.3.1-bin-hadoop2/python/pyspark/sql/session.py", line 1034, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self)
  File "/home/shawnyan/spark-3.3.1-bin-hadoop2/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1322, in __call__
  File "/home/shawnyan/spark-3.3.1-bin-hadoop2/python/pyspark/sql/utils.py", line 196, in deco
    raise converted from None
pyspark.sql.utils.ParseException:
Syntax error at or near 'select'(line 1, pos 16)

== SQL ==
explain analyze select * from test.t1
----------------^^^

| username: Billmay表妹

Try spark.sql("explain select * from test.t1") or spark.sql("explain cost select * from test.t1"). This is Spark SQL, not TiDB SQL, so the syntax differs: Spark's EXPLAIN accepts EXTENDED, CODEGEN, COST, or FORMATTED, but not ANALYZE.
https://spark.apache.org/docs/latest/sql-ref-syntax-qry-explain.html
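
As a rough illustration of the point above, here is a minimal sketch that builds an EXPLAIN statement and rejects modes Spark SQL does not accept. The helper name build_explain and the constant SPARK_EXPLAIN_MODES are hypothetical and not part of PySpark; the mode list is taken from the EXPLAIN syntax page linked above.

```python
# Modes accepted by Spark SQL's EXPLAIN statement, per the linked syntax page.
# SPARK_EXPLAIN_MODES and build_explain are hypothetical names for this sketch.
SPARK_EXPLAIN_MODES = {"EXTENDED", "CODEGEN", "COST", "FORMATTED"}


def build_explain(query, mode=None):
    """Return a Spark SQL EXPLAIN statement for `query`.

    Raises ValueError for modes Spark's parser would reject, such as
    ANALYZE, which is TiDB syntax rather than Spark SQL syntax.
    """
    if mode is None:
        return f"EXPLAIN {query}"
    mode = mode.upper()
    if mode not in SPARK_EXPLAIN_MODES:
        raise ValueError(f"unsupported Spark EXPLAIN mode: {mode}")
    return f"EXPLAIN {mode} {query}"


# Usage in a PySpark shell, where `spark` is the active SparkSession:
#   plan = spark.sql(build_explain("select * from test.t1", "cost")).collect()[0][0]
#   print(plan)
# The DataFrame API offers the same plans without writing SQL:
#   spark.table("test.t1").explain(mode="formatted")
```

Checking the mode up front turns the parser's "Syntax error at or near 'select'" into a clearer message, but it is only a convenience; calling spark.sql with a valid EXPLAIN statement directly works just as well.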

| username: ShawnYan

Yes. I will also add a note about this to the blog.
