Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add generate_library.sh without post processing #1916

Merged
merged 52 commits into from
Aug 28, 2023

Conversation

JoeWang1127
Copy link
Collaborator

@JoeWang1127 JoeWang1127 commented Aug 10, 2023

Fixes #1922 ☕️

@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Aug 10, 2023
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Aug 11, 2023
@JoeWang1127
Copy link
Collaborator Author

JoeWang1127 commented Aug 15, 2023

The differences in the comparison is because the entries in gapic_metadata.json and package-info.java are in the same order of proto files in proto_library target in BUILD. I sorted the proto files with respect to file name to get a deterministic result (proto_files=$(find "$proto_path" -type f -name "*.proto" | sort)).

However, in a recent submit (cl/536794833), model_garden_service.proto is added after model_service.proto in BUILD. The generate_library.sh still generates library since model_garden_service.proto is before model_service.proto upon sorted and causes the difference.

This is not a source code difference and there are some approaches

  1. change the proto files's order in google3
  2. ignore the difference in gapic_metadata.json and package-info.java, though this is risky as real differences, lack of an entry for example, will be ignored.
  3. refactor generate_library.sh such that list all proto files in proto_path as the same order in BUILD. This will increase the input list of generate_library.sh and seems redundant since we have proto_path as an input. Of course, here the assumption is all proto files in proto_path should be generated, but why put it there if it's not for code generation?
  4. modify the generator to generate gapic_metadata.json and package-info.java with sorted proto files.

@meltsufin @blakeli0 @suztomo WDYT?
cc: @diegomarquezp

@blakeli0
Copy link
Collaborator

in a recent submit (cl/536794833), model_garden_service.proto is added after model_service.proto in BUILD

Can you take a look at the commit history? It might be manually added.
Either way, if the current generation process is assuming the protos are sorted, then we should add them as input as well, because we can not control the order of the protos, but the result is dependent on them so it makes the generation process not hermetic.

@JoeWang1127
Copy link
Collaborator Author

JoeWang1127 commented Aug 15, 2023

Can you take a look at the commit history? It might be manually added.

The CL's title suggested that it's auto-generated but I think the proto file is added manually.

Either way, if the current generation process is assuming the protos are sorted, then we should add them as input as well, because we can not control the order of the protos, but the result is dependent on them so it makes the generation process not hermetic.

I think by sorting the proto files, the result is deterministic and it is also hermetic as given the same proto files and dependencies, the generated code is the same. I added an approach in my previous comment that the generator can generate the two files with sorted proto files, is that possible?

If we choose to list the proto files as BUILD does, we are tied with BUILD and we need to change the list whenever a new proto files is added (or deleted) in proto_path unless I find an example such that a proto file is not generated in proto_path.

@blakeli0
Copy link
Collaborator

change the proto files's order in google3

This is not a sustainable approach as there could be many manual changes.

ignore the difference in gapic_metadata.json and package-info.java, though this is risky as real differences, lack of an entry for example, will be ignored.
modify the generator to generate gapic_metadata.json and package-info.java with sorted proto files.

In order for us to make a decision, we need to understand the usage of gapic_metadata.json and package-info.java. If the order of entries in these two files do not matter to customers, then we can change gapic-generator-java to always generated sorted gapic_metadata.json and package-info.java.

refactor generate_library.sh such that list all proto files in proto_path as the same order in BUILD. This will increase the input list of generate_library.sh and seems redundant since we have proto_path as an input. Of course, here the assumption is all proto files in proto_path should be generated, but why put it there if it's not for code generation?

This would be the only approach if the order matters in gapic_metadata.json and package-info.java. This does not mean that we are tied to BUILD file, the input could be a list of proto files. Similar to other parameters like transport, it is from BUILD file but it does mean we are tied to it.

@JoeWang1127
Copy link
Collaborator Author

In order for us to make a decision, we need to understand the usage of gapic_metadata.json and package-info.java. If the order of entries in these two files do not matter to customers, then we can change gapic-generator-java to always generated sorted gapic_metadata.json and package-info.java.

The package-info.java has mainly comments and I guess the difference in comments do not matter.

The gapic_metadata.json lists metadata of the grpc services. I think typically json file is parsed into map-like structure so the order of the key does not matter.

Of course there maybe other use cases and I'll do some research.

library_generation/README.md Outdated Show resolved Hide resolved
library_generation/README.md Outdated Show resolved Hide resolved
library_generation/README.md Show resolved Hide resolved
library_generation/utilities.sh Show resolved Hide resolved
library_generation/README.md Outdated Show resolved Hide resolved

## Prerequisite
In order to generate a GAPIC library, you need to pull `google/` from [googleapis](https://github.com/googleapis/googleapis)
and put it into the directory containing `generate_library.sh` since protos in `google/`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to copy the whole google/ folder? Or maybe only the common protos?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some protos reference google/api, google/iam, google/type. I'm not sure whether we have a exhausted list of protos such that only those protos can be referenced. So copy google folder is a safe option.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not part of this PR, but I think we should come up with an exhaustive list before we switch to use it in our generation process. As google/ folder contains protos from all services, which is a lot more than what we need, and it could easily confuse people in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that, but do we have a guidance that a proto should not reference a proto outside of a list? If not, the list is likely changing and we are back to copy google.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically, the service team can reference protos from anywhere, but cross service proto depedencies are not allowed per https://google.aip.dev/213, so the service protos should only reference common protos from a few selected folders. I think we may have two services that have cross service proto dependencies but all other services should obey it.

Now regarding common protos, we don't have a guidance on where they should be and it's possible that new common protos are added to new locations. But if it happens, I think it would be better to explicitly add it, similar to explicitly declare a new dependency vs getting a new dependency transitively from a existing dependency in Maven's world.

We can discuss more in next project meeting.

library_generation/utilities.sh Show resolved Hide resolved
## An example to generate a client library
```
library_generation/generate_library.sh \
-p google/cloud/confidentialcomputing/v1 \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add comments or more examples regarding proto_path that all protos referenced in the path has to be in the current working directory(assuming my understanding is correct from another comment)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an example of generating showcase client.

@sonarcloud
Copy link

sonarcloud bot commented Aug 25, 2023

[gapic-generator-java-root] Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@sonarcloud
Copy link

sonarcloud bot commented Aug 25, 2023

[java_showcase_integration_tests] Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@sonarcloud
Copy link

sonarcloud bot commented Aug 25, 2023

[java_showcase_unit_tests] Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@JoeWang1127 JoeWang1127 merged commit ffc058a into main Aug 28, 2023
39 checks passed
@JoeWang1127 JoeWang1127 deleted the feat/generate-aiplatform-without-postprocessing branch August 28, 2023 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a shell script to generate a java library hermetically (without post processing)
6 participants