correctly integrate cancelAllRuns audit events with job runs #1281

Open: 15 commits to merge into base branch 0900_release

neilbest-db
Contributor

This is a reboot of #1174 to resolve #480. After extensive refactoring in #1223, #1224, and #1253 it was necessary to abandon #1174 and bring its new logic in manually.

gueniai and others added 15 commits May 31, 2024 15:03
commit a6a13fe
Author: Neil Best <neil.best@databricks.com>
Date:   Thu May 23 16:39:58 2024 -0500

    improve TransformationDescriberTest

commit 1f145aa
Author: Neil Best <neil.best@databricks.com>
Date:   Thu May 23 15:25:29 2024 -0500

    Add descriptive job group IDs and named transformations

    This makes the Spark UI more developer-friendly when analyzing
    Overwatch runs.

    Job group IDs have the form <workspace name>:<OW module name>

    Any use of `.transform( df => df)` may be replaced with
    `.transformWithDescription( nt)` after instantiating a `val nt =
    NamedTransformation( df => df)` as its argument.

    This commit contains one such application of the new extension method.
    (See `val jobRunsAppendClusterName` in `WorkflowsTransforms.scala`.)

    Some logic in `GoldTransforms` falls through to elements of the
    special job-run-action form of Job Group IDs emitted by the platform
    but the impact is minimal relative to the benefit to Overwatch
    development and troubleshooting.  Even so this form of Job Group ID is
    still present in initial Spark events before OW ETL modules begin to
    execute.
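
For readers unfamiliar with the pattern described in the commit message above, here is a minimal sketch of how a named transformation and a `<workspace name>:<OW module name>` job group ID could be wired into the Spark UI. It is not the code from this PR: the `NamedTransformation` case class shown here (with an explicit `name` field), the `SparkUiLabels` object, and the `jobGroupId` and `labelModule` helpers are hypothetical stand-ins, and the real implementation may derive names and set labels differently. Only `SparkContext.setJobGroup`, `SparkContext.setJobDescription`, and `Dataset.transform` are actual Spark APIs.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical stand-in for the PR's NamedTransformation: a DataFrame => DataFrame
// function paired with a human-readable name. The real class may obtain the name differently.
final case class NamedTransformation(name: String, transformation: DataFrame => DataFrame)

object SparkUiLabels {

  // Job group IDs of the form <workspace name>:<OW module name>, as stated in the commit message.
  def jobGroupId(workspaceName: String, moduleName: String): String =
    s"$workspaceName:$moduleName"

  // Assumed helper: labels every Spark job subsequently started from the current thread.
  def labelModule(spark: SparkSession, workspaceName: String, moduleName: String): Unit =
    spark.sparkContext.setJobGroup(jobGroupId(workspaceName, moduleName), moduleName)

  // Extension method in the spirit of `.transformWithDescription(nt)`.
  implicit class DescribedDataFrame(private val df: DataFrame) extends AnyVal {
    def transformWithDescription(nt: NamedTransformation): DataFrame = {
      // Transformations are lazy, so the description attaches to jobs triggered by
      // later actions on this thread, until another description overwrites it.
      df.sparkSession.sparkContext.setJobDescription(nt.name)
      df.transform(nt.transformation)
    }
  }
}
```

With `import SparkUiLabels._` in scope, a call such as `df.transform(f)` would become `df.transformWithDescription(NamedTransformation("someName", f))`. Because the description is only picked up when an action runs, placing a caching action inside the named step is one way to keep the label meaningful.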

commit da0c55a
Author: Guenia <guenia.izquierdo@databricks.com>
Date:   Wed May 8 19:43:29 2024 -0400

    Initial commit
Removed a level of indirection and unnecessary conditional branching
in definition of chained `lookupWhen` transformations.

Moved definitions to have references to `PipelineTable` objects in
scope rather than passing them by argument.

(cherry picked from commit efdd63f)
- enable auto-optimized shuffle for module 2011

- move caching action to previous `NamedTransformation` for more
  meaningful Spark UI labels
for greater visibility in Spark UI. `NamedTransformation` type name
now appears in labels' second position.
prevent certain regressions when the Job Group labels set by the platform are no longer available for parsing.  Labels set by the platform contain tokens that are necessary to preserve referential integrity under certain conditions.  (Which conditions?)
from branch `480_workflows_support_cancelAllRuns-FUBAR` that was the
original dev branch, renamed when automated merge screwed everything up.
neilbest-db added the "enhancement" (New feature or request) and "data quality" (There is a data quality issue here) labels Aug 27, 2024
neilbest-db added this to the 0.9.0.0 milestone Aug 27, 2024
sonarcloud bot commented Aug 27, 2024

neilbest-db linked an issue Aug 27, 2024 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

[FEAT] - Workflows - Add support for "cancelAllRuns"
4 participants