fix: adhoc metrics #30202

Open
wants to merge 1 commit into
base: master

Conversation

betodealmeida
Member

SUMMARY

Rewrite has_table_query to use sqlglot instead of sqlparse.

Part of SIP-117.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A

TESTING INSTRUCTIONS

The current tests pass. Also added a regression test that was not passing with sqlparse.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@dosubot bot added the change:backend label (Requires changing the backend) on Sep 9, 2024
Member

@villebro villebro left a comment

Really excited to see this happening! Minor nit + question on some of the removed logic

@@ -1177,46 +1106,31 @@ class InsertRLSState(StrEnum):
     FOUND_TABLE = "FOUND_TABLE"


-def has_table_query(token_list: TokenList) -> bool:
+def has_table_query(expression: str, engine: str) -> bool:
Member

nit: I'd maybe use statement instead of expression in this context

Member Author

In this case it can be an actual expression, since we use it for validating adhoc metrics; it could be COUNT(*), for example.

Member Author

Actually, let's rewrite this to take a statement, and we can wrap it in a SELECT () where we have only an expression.
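
To illustrate the wrapping idea from this comment, here is a minimal sketch of how a bare adhoc-metric expression could be turned into a full statement before being handed to a statement-level check. The helper name wrap_expression_as_statement is hypothetical and does not come from the PR; the actual parsing and table detection would still be done by the parser and is not shown here.

```python
def wrap_expression_as_statement(expression: str) -> str:
    """Turn a bare SQL expression into a SELECT statement.

    Hypothetical helper for illustration only: it strips surrounding
    whitespace and a trailing semicolon, then wraps the expression in
    `SELECT (...)` so that a statement-level check can parse it.
    """
    # Drop whitespace and any trailing semicolon the user typed.
    cleaned = expression.strip().rstrip(";").strip()
    return f"SELECT ({cleaned})"


if __name__ == "__main__":
    # A harmless adhoc metric.
    print(wrap_expression_as_statement("COUNT(*)"))
    # An adhoc metric that smuggles in a subquery with a table reference.
    print(wrap_expression_as_statement("(SELECT 1 FROM t);"))
```

With this shape, the table check only ever sees statements, and callers that hold an expression do the wrapping themselves.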

Comment on lines -1201 to -1217
-        # Recurse into child token list
-        if isinstance(token, TokenList) and has_table_query(token):
-            return True
-
-        # Found a source keyword (FROM/JOIN)
-        if imt(token, m=[(Keyword, "FROM"), (Keyword, "JOIN")]):
-            state = InsertRLSState.SEEN_SOURCE
-
-        # Found identifier/keyword after FROM/JOIN
-        elif state == InsertRLSState.SEEN_SOURCE and (
-            isinstance(token, sqlparse.sql.Identifier) or token.ttype == Keyword
-        ):
-            return True
-
-        # Found nothing, leaving source
-        elif state == InsertRLSState.SEEN_SOURCE and token.ttype != Whitespace:
-            state = InsertRLSState.SCANNING
Member

Are these no longer needed for injecting RLS state?

Member Author

No, confusingly here we're just using the InsertRLSState enum to do the table scan.

(I also have a PR almost ready where I move all the RLS functions to sqlglot.)
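
For readers following the thread, the removed scan can be pictured as a two-state machine: it stays in a scanning state until it sees FROM or JOIN, then reports a table as soon as an identifier or keyword follows. The sketch below is a simplified, standard-library-only illustration of that idea. ScanState, has_table_reference, and the (kind, value) token tuples are hypothetical stand-ins, not sqlparse or Superset APIs, and the recursion into child token lists is left out.

```python
from enum import Enum


class ScanState(Enum):
    """Hypothetical stand-in for the two enum states used by the scan."""

    SCANNING = "SCANNING"
    SEEN_SOURCE = "SEEN_SOURCE"


def has_table_reference(tokens: list[tuple[str, str]]) -> bool:
    """Return True if an identifier or keyword follows FROM/JOIN.

    `tokens` is an assumed, pre-tokenized flat list of (kind, value)
    pairs, where kind is a label such as "keyword", "identifier",
    "whitespace", "number", or "punctuation".
    """
    state = ScanState.SCANNING
    for kind, value in tokens:
        # Found a source keyword (FROM/JOIN).
        if kind == "keyword" and value.upper() in {"FROM", "JOIN"}:
            state = ScanState.SEEN_SOURCE
        # Found an identifier/keyword right after FROM/JOIN.
        elif state is ScanState.SEEN_SOURCE and kind in {"identifier", "keyword"}:
            return True
        # Found something else (not whitespace): leave the source state.
        elif state is ScanState.SEEN_SOURCE and kind != "whitespace":
            state = ScanState.SCANNING
    return False


if __name__ == "__main__":
    select_from_t = [
        ("keyword", "SELECT"), ("whitespace", " "), ("number", "1"),
        ("whitespace", " "), ("keyword", "FROM"), ("whitespace", " "),
        ("identifier", "t"),
    ]
    count_star = [
        ("identifier", "COUNT"), ("punctuation", "("),
        ("wildcard", "*"), ("punctuation", ")"),
    ]
    print(has_table_reference(select_from_t))
    print(has_table_reference(count_star))
```

Nothing in this scan touches row-level security itself, which matches the reply above: the enum was only borrowed to track where the scan is relative to a FROM/JOIN keyword.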

@betodealmeida changed the title from "fix: adhoc queries" to "fix: adhoc metrics" on Sep 10, 2024
Labels
change:backend (Requires changing the backend), preset-io, size/L
2 participants