Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sept 23 upstream 5 0.2.122 #306

Merged
merged 17 commits into from
Sep 19, 2023
Merged

Sept 23 upstream 5 0.2.122 #306

merged 17 commits into from
Sep 19, 2023

Conversation

illusional
Copy link

Depends on: #305

Last update to get up to date 😌

daniel-goldstein and others added 15 commits September 5, 2023 21:41
…13500)

CHANGELOG: In Query-on-Batch, the client-side Python code will not try
to list every job when a QoB batch fails. This could take hours for
long-running pipelines or pipelines with many partitions.

I also added API types to the list jobs end point because I have to go
hunting for this every time anyway. Seems better to have this
information at our digital fingertips.
Remove the code computing loop regions in lir. They aren't currently
being used, and the recursive function is causing stack overflows.
I also had to fix issues that pyright discovered.
CHANGELOG: The `n` parameter of `MatrixTable.tail` is deprecated in
favor of a new `n_rows` parameter.

MT.head already does this.
This appers to have been lost in hail-is#8268. There is no explicit mention of
it and it was inconsistently removed (e.g. loftee in GRCh38 still has
the `&`). I assume it was a mistake.
Consider, for example, this deploy: https://ci.hail.is/batches/7956812.
`test-dataproc-37` succeeded but `test-dataproc-38` failed (it timed out
b/c the master failed to come online).

You can see the error logs for the cluster here:
https://cloudlogging.app.goo.gl/t1ux8oqy11Ba2dih7

It states a certain file either did not exist or we did not have
permission to access it.

[`test_dataproc-37`](https://batch.hail.is/batches/7956812/jobs/193) and
[`test_dataproc-38`](https://batch.hail.is/batches/7956812/jobs/194)
started around the same time and both uploaded four files into:


gs://hail-30-day/hailctl/dataproc/ci_test_dataproc/0.2.121-7343e9c368dc/

And then set it to public read/write. The public read/write means that
permissions are not the issue.

Instead, the issue is that there must be some sort of race condition in
GCS which means that if you "patch" (aka overwrite) an existing file, it
is possible that a concurrent reader will see the file as not existing.

Unfortunately, I cannot confirm this with audit logs of the writes and
read because [public objects do not generate audit
logs](https://cloud.google.com/logging/docs/audit#data-access).
> Publicly available resources that have the Identity and Access
Management policies
[allAuthenticatedUsers](https://cloud.google.com/iam/docs/overview#allauthenticatedusers)
or [allUsers](https://cloud.google.com/iam/docs/overview#allusers) don't
generate audit logs. Resources that can be accessed without logging into
a Google Cloud, Google Workspace, Cloud Identity, or Drive Enterprise
account don't generate audit logs. This helps protect end-user
identities and information.
All this `HAIL_AZURE_OAUTH_SCOPE` and `HAIL_IDENTITY_PROVIDER_JSON` are
now injected by batch workers by default so we don't need to specify
them explicitly in `build.yaml`.

Unrelatedly, I don't think `create_dummy_oauth2_client_secret` does
anything as `auth-oauth2-client-secret` already exists inside new
namespaces.
In particular, struct field names which clash with methods on
`StructExpression`.

close hail-is#13495

CHANGELOG: Fix a bug where field names can shadow methods on the
StructExpression class, e.g. "items", "keys", "values". Now the only way
to access such fields is through the getitem syntax, e.g.
"some_struct['items']". It's possible this could break existing code
that uses such field names.
hail-is#13585)

or domain specified. The key bug here was a check for `'domain' not in
config` instead of `config['domain'] is None`
@illusional illusional changed the base branch from main to sept-23-upstream-4-dc1f086 September 8, 2023 00:42
@illusional illusional force-pushed the sept-23-upstream-5-0.2.122 branch 2 times, most recently from 9cc40a8 to a56649f Compare September 12, 2023 23:38
@jmarshall
Copy link

LGTM up until where you asked me to review it.

I don't understand what that last s/google_storage_bucket/module/g commit is about. Did we change this infra/gcp/main.tf file in our branch somewhere along the line?

@illusional
Copy link
Author

@jmarshall, broad don't use this file, just us. So they made changes reflecting what's supposed to happen, but there was a typo. I worked with Daniel to resolve this upstream

@illusional illusional marked this pull request as ready for review September 19, 2023 20:57
Base automatically changed from sept-23-upstream-4-dc1f086 to main September 19, 2023 21:21
@illusional illusional merged commit 73e49d6 into main Sep 19, 2023
5 checks passed
@illusional illusional deleted the sept-23-upstream-5-0.2.122 branch September 19, 2023 22:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants