[CT-279] Delta tables missing column schema in show table extended #295

Closed
jtcohen6 opened this issue Feb 24, 2022 · 2 comments
Labels: bug (Something isn't working), Stale

Comments

@jtcohen6
Contributor

Context: When generating the catalog for documentation, we switched to using one show table extended in databasename like '*', instead of a separate describe table extended for each individual table. See:

(We could still meaningfully speed that up by passing the specific table names into show table extended instead of *. This would benefit users with many thousands of objects per Spark database, where dbt is concerned with only a small subset.)
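A rough sketch of that idea, assuming the table names are plain identifiers that need no escaping. The function name `build_show_table_extended` is hypothetical and is not part of dbt-spark; it relies only on Spark's pattern syntax for this statement, where `*` is a wildcard and `|` separates alternatives.

```python
from typing import Iterable


def build_show_table_extended(schema: str, table_names: Iterable[str]) -> str:
    """Build one SHOW TABLE EXTENDED statement limited to the given tables.

    Hypothetical helper: assumes names are simple identifiers (no quoting).
    """
    # De-duplicate and sort so the generated statement is deterministic.
    names = sorted(set(table_names))
    # Fall back to the catch-all pattern when no specific tables are known.
    pattern = "|".join(names) if names else "*"
    return f"show table extended in {schema} like '{pattern}'"


print(build_show_table_extended("analytics", ["orders", "customers"]))
```

This keeps the single-statement approach while asking Spark only about the relations dbt actually manages.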

Complication: We believe that open source Delta tables fail to return their column schema in show table extended. This originally came up in:

We believe this is the root cause of dbt-labs/dbt-docs#236. Columns are missing or have null names, which the dbt-docs code does not expect to be possible.
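To illustrate how the missing schema could be detected before it reaches dbt-docs, here is a minimal sketch. It assumes the information column of show table extended lists columns as lines like ` |-- id: integer (nullable = true)`; that layout, the sample strings, and the names `parse_columns` and `tables_missing_schema` are assumptions for illustration, not dbt-spark's actual code.

```python
import re
from typing import Iterable, List, Tuple

# Assumed shape of one schema line inside the `information` text, e.g.
#  |-- id: integer (nullable = true)
COLUMN_LINE = re.compile(
    r"^[ \t]*\|-- (.+?): (.+) \(nullable = (?:true|false)\)",
    re.MULTILINE,
)


def parse_columns(information: str) -> List[Tuple[str, str]]:
    """Return (column name, data type) pairs found in the information text."""
    return [(m.group(1), m.group(2)) for m in COLUMN_LINE.finditer(information)]


def tables_missing_schema(rows: Iterable[Tuple[str, str]]) -> List[str]:
    """Given (table name, information) rows, list tables with no parsed columns."""
    return [table for table, info in rows if not parse_columns(info)]


# Hypothetical sample rows: the second one has no schema lines at all.
rows = [
    ("orders", "Table: orders\nSchema: root\n |-- id: integer (nullable = true)\n"),
    ("events", "Table: events\nProvider: delta\n"),
]
print(tables_missing_schema(rows))
```

A check like this would let catalog generation warn about affected tables instead of emitting columns with null names.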

I'm not sure what action we can take here, beyond opening an issue in https://github.com/delta-io/delta. I'd prefer to avoid a return to the much slower describe table extended approach, or trying to finagle support for both. I also understand that Apache Spark will eventually add support for a real information schema :)

@zsvoboda

zsvoboda commented Jun 28, 2022

Actually, I'm using Spark with Iceberg tables and have the same issue.

show table extended in warehouse like '*';

SQL Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: SHOW TABLE EXTENDED is not supported for v2 tables.;
ShowTableExtended *, [namespace#21, tableName#22, isTemporary#23, information#24]
+- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@7929bdd7

at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:325)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:230)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:79)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:63)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:230)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:225)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:239)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)

Caused by: org.apache.spark.sql.AnalysisException: SHOW TABLE EXTENDED is not supported for v2 tables.;
ShowTableExtended *, [namespace#21, tableName#22, isTemporary#23, information#24]
+- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@7929bdd7

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
