RFC: Execution Environments #274

Open: wants to merge 5 commits into base: main

Member

@hone hone commented Feb 1, 2023

@hone hone requested a review from a team February 1, 2023 08:10
@buildpack-bot
Member

Maintainers,

As you review this RFC please queue up issues to be created using the following commands:

/queue-issue <repo> "<title>" [labels]...
/unqueue-issue <uid>

Issues

(none)

Signed-off-by: Terence Lee <hone02@gmail.com>

In order to support additional execution environments an `exec-env` key will be added to various TOML tables in the project. The value can be any string with `all` having special meaning. `all` will apply to all execution environments and will be the default if not specified. This should make it backwards compatible and optional. When `exec-env` is not set to `all`, the table settings will only be applied to that execution environment.
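The matching rule in this paragraph can be sketched in a few lines of Python. This is only an illustration of the semantics as worded here, not lifecycle code: the `table_applies` helper and the sample tables are hypothetical, and the tables are plain dicts standing in for parsed TOML.

```python
from typing import Mapping

ALL = "all"  # the special value named in the RFC text


def table_applies(table: Mapping[str, object], current_env: str) -> bool:
    """Return True if a table's settings apply to current_env.

    A missing `exec-env` key is treated as `all`, which keeps
    existing files (without the key) working unchanged.
    """
    declared = table.get("exec-env", ALL)
    return declared == ALL or declared == current_env


# Hypothetical tables, as they might look after parsing a TOML file.
tables = [
    {"name": "A"},                            # no exec-env: defaults to "all"
    {"name": "B", "exec-env": "all"},
    {"name": "C", "exec-env": "test"},
    {"name": "D", "exec-env": "production"},
]

selected = [t["name"] for t in tables if table_applies(t, "test")]
print(selected)  # ['A', 'B', 'C']
```

Treating the missing key as `all` is what makes the change optional: a file written before this RFC selects the same tables in every execution environment.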

### Project Descriptor - `project.toml` (App Developers)
Contributor

Can you add an example of a full project.toml that is used for producing a test image and a production image?

Member Author

Just for Project Descriptor?

@samj1912
Member

samj1912 commented Feb 1, 2023

I'm having difficulty grasping how the environment is used by the different TOML files. @hone could you please provide examples of places where you imagine it being used and how that information will be leveraged by buildpacks/platform?

Signed-off-by: Terence Lee <hone02@gmail.com>
Signed-off-by: Terence Lee <hone02@gmail.com>
* test
* development

### Buildpack API
Contributor

Can we perhaps get a conceptual example of what a buildpack author might do with this new execution environment information?

I have in my head a language buildpack author may...

  • install test group dependencies
  • create ruby-tests process / ruby-tests-verbose process
  • Skip cleaning up things they might clean up otherwise for production images
  • Set env vars to test mode operation (RAILS_ENV or similar)

@loewenstein
Contributor

Frankly, I don't get this proposal aligned with my mental model of what buildpacks do and how they help.

For me, container images are just another software artefact nowadays produced to ship software. Like a while back one was to produce JAR and/or WAR files that got delivered and finally deployed into some execution environment.
Buildpacks are a great tool to create those, but I don't see how or why I would create a test container that was separate from the production ones. I would want my tests to validate exactly the artefact that I am about to deliver. Otherwise, how do I know I checked the right thing?

From my POV buildpacks should focus on that task, creating the container image, and leave the other CI tasks to specialized tools.

Please help me to adjust my mental model, why's this added complexity in the buildpack spec worth it?

@jabrown85
Contributor

> Please help me to adjust my mental model, why's this added complexity in the buildpack spec worth it?

I'm not the author but I can speak for my own mental model.

One use case is from the app developer side. Today they can `pack build <img>` to get a production artifact.

If we imagine the resulting image is ruby + nodejs, the developer has to set up both of those environments locally to develop and AGAIN on, say, GitHub Actions. The way those installations happen can differ. The GitHub action may install a different minor version of node or ruby to run the tests, for instance.

Setting up CI with buildpacks could make this experience easier. `pack test` would use the same buildpacks and therefore test a more prod-like experience, as the buildpack would install the same versions. This is especially important if your buildpack has options like DO_THIS_GARBAGE_COLLECTION_SETTING that you would have to know and replicate in your CI environment.

Another future-proofing thing here may be around future ARM support. Building and running an ARM test image via pack test seems tractable.

# Unresolved Questions
[unresolved-questions]: #unresolved-questions

- "env" is overloaded as a word since we also use it for environment variables. Is there a better word here?
Member

How about "mode"?

Member

I like "context", but that's just as overloaded. How about "purpose = test" or "intent = test"?


### `builder.toml` (Builder Authors)

The only table that `exec-env` will be added to is `[[order.group.env]]`.
Member

Do you mean [[order.group]]?

With the test OCI Image, a platform can execute the tests in the pipeline as they see fit. This means a bulk of the responsibilities are platform concerns:

- Set which environment to build for
- Decide which buildpacks to execute
Member

Isn't this a lifecycle concern? i.e., when processing a group within an order, should it skip buildpacks that do not declare exec-env matching the desired env?

Member Author

When talking to @jabrown85, I thought the Platform/Builder provide the order.toml to lifecycle?

Member Author

From WG, lifecycle will decide which buildpacks to run based on the execution environment being passed along by the platform.

- How to execute the tests
- What is the test result format like [TAP](https://en.wikipedia.org/wiki/Test_Anything_Protocol)?
- How to process the test results
- What to do with the results
Member

Is this ever something we would want to spec? Having a consistent way for buildpacks to e.g., dump test output could help ensure portability across platforms.

Member Author

I could see us wanting to do that, but I wasn't sure how much we wanted to impose standards there.

- Should the execution environments be an enum or flexible as a string?
- enums will help encourage standardization across buildpacks and platforms.
- strings can help account for use cases we haven't thought of yet.
- Should buildpacks be allowed to specify an allowlist of execution environments?
Member

@natalieparellano natalieparellano Feb 2, 2023

Are there any downsides to doing this? It would be more flexible and possibly avoid some duplication within orders (we could keep all as a special value)

## Development Environments
The specifics of creating development environments are out of scope of this RFC, but it's not hard to extrapolate how these kinds of changes can assist in creating Buildpacks for development environments.

# How it Works
Member

It would be nice to see the pack flow as well

Member

Should we add anything to the app image to designate it as being built for a particular environment? To avoid users accidentally deploying a test image in production...

I could see folks wanting to use the same tag when re-building a test image for production, in order to use previously cached dependencies.

Member Author

@natalieparellano that's a good question. Do you think cached dependencies should be shared b/t different execution environments?

Member Author

Is a label the best place to designate the execution environment?

@kanaksinghal

I have a use-case for this.
In our company we use a variety of languages/stacks, and I wanted to write a generic CI pipeline where buildpacks can help us build the container image by detecting the stack. But then, to run unit tests, I need to detect the stack again myself and run specific commands for nodejs/go/java etc.
In a typical CI pipeline we would do unit-test, run sonar quality checks, and build container image.

@jama22

jama22 commented Feb 9, 2023

Really excited for this proposal, and I think there's a lot of interesting ways to make use of it.

In the context of a self-hosted PaaS
I've seen a common dynamic where the "operator" needs to establish fine grained controls over what the final containers look like. These configurations are also prone to change as the artifact itself moves into different environments (service connectors, certificates, environment variables, access to secrets managers, etc.)

On the developer side, most folks don't ever see production-like environments, so being able to debug their applications in production-like environments becomes incredibly difficult. In the past, I've seen some of these per-environment configurations being managed through GitOps and config files. This gives the user some control over managing the movement of the workload through the system, but even still they may not be able to build that container and interact with it directly

Moving the environment configuration opens up a lot of possibilities in this use case. I could imagine an operator encoding their per-environment configurations into the builder, and giving the developer the ability to simulate how their application will change in behavior across configurations.

For CI/CD
I think there have been some interesting ideas already discussed above and in the proposal. But a common use case I can come up with is that if you're building a library with buildpacks, you'll want to test on multiple OS-distro and architecture combinations (e.g. x86 vs win vs arm, ubuntu vs debian vs rhel). Those env configurations are commonly configured in CI itself or possibly externalized to some other config. Moving them back into the buildpack may provide for an improved UX.

Changing what "prod-like" means
I think there's a pretty reasonable use case here where operators may want to give developers the ability to change what "prod-like" means. For example, the GCP buildpacks have GOOGLE_DEVMODE, which can be enabled for faster changes. I can see it being used as a developer-oriented configuration while maintaining guardrails in prod-like envs.


There were some flaws in this design. Though it's clean to separate production and test code paths, they end up sharing a lot of code. Many of the bash based Heroku buildpacks would just [call `bin/compile`](https://github.com/heroku/heroku-buildpack-nodejs/blob/main/bin/test-compile#L24) with different parameters/env vars.

## [GOOGLE_DEVMODE](https://cloud.google.com/docs/buildpacks/service-specific-configs#google_devmode)

Similar to devmode, I just ran across live reloading in the Paketo python buildpacks https://paketo.io/docs/howto/python/#using-bp_live_reload_enabled


## `exec-env` key in TOML

In order to support additional execution environments an `exec-env` key will be added to various TOML tables in the project. The value can be any string with `all` having special meaning. `all` will apply to all execution environments and will be the default if not specified. This should make it backwards compatible and optional. When `exec-env` is not set to `all`, the table settings will only be applied to that execution environment.
Member

@joshwlewis joshwlewis Mar 23, 2023

What would consume this new key? If a platform or user ran `pack build -e CNB_EXEC_ENV=test my-app:test`, a buildpack might contribute new launch entries flagged for test, but how would a platform or user determine which of the launch entries were `all` or `test`? My guess is that, at the very least, we'd want this key to show up in the `io.buildpacks.build.metadata` process-types label data?

hone and others added 2 commits April 12, 2023 13:37
Co-authored-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Terence Lee <hone02@gmail.com>
Co-authored-by: Josh W Lewis <josh.w.lewis@gmail.com>
Signed-off-by: Terence Lee <hone02@gmail.com>
@cz4rny

cz4rny commented Jul 11, 2023

Two questions:

  1. How would one extract artifacts from the test env? Running unit tests or any static-analysis tools produces outputs separate from the final OCI image. Currently pack performs the build in a tmp directory, which is cleaned out at the end. How would one access the results of all the checks done in the dev env?

  2. How would I ensure that the code produced by another pack build, for the production environment, is tested? Having CNB_EXEC_ENV set could introduce a completely different set of layers and hence a different output. So I would test the source code and the binary produced from the test environment, but that doesn't mean I've tested the exact same binary, source code, and artifacts produced in the production env.

@jabrown85
Contributor

> Two questions:
>
> 1. How would one extract artifacts from the test env? Running unit tests or any static-analysis tools produces outputs separate from the final OCI image. Currently pack performs the build in a tmp directory, which is cleaned out at the end. How would one access the results of all the checks done in the dev env?

This RFC aims to only produce a test image artifact. A test platform would then execute the image and collect the results (TAP/json/etc).

> 2. How would I ensure that the code produced by another pack build, for the production environment, is tested? Having CNB_EXEC_ENV set could introduce a completely different set of layers and hence a different output. So I would test the source code and the binary produced from the test environment, but that doesn't mean I've tested the exact same binary, source code, and artifacts produced in the production env.

There would be no such guarantee. Production images vary quite a bit from stack to stack and there is no one size fits all solution for testing. For a practical example, a ruby or node app will often have test environment specific dependencies that should not make their way into the final production images. Another example is a simple go app on a scratch/tiny stack. The go tooling itself, go test, should not be packaged into the final production image.

Thinking out loud, a test pipeline could grab the SBOM/digests/shasum of the things it cares about and compare them to the production image that was built.
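A rough sketch of that comparison idea in Python, assuming (hypothetically) that the pipeline has already exported the filesystems of the test image and the production image to two local directories. The helper names `sha256_of` and `mismatched` are made up for illustration; nothing here is part of pack or the lifecycle.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large binaries are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched(test_root: Path, prod_root: Path, rel_paths: Iterable[str]) -> List[str]:
    """Return the relative paths that are missing from either tree or differ in content."""
    bad = []
    for rel in rel_paths:
        a, b = test_root / rel, prod_root / rel
        if not (a.is_file() and b.is_file()) or sha256_of(a) != sha256_of(b):
            bad.append(rel)
    return bad
```

A pipeline could call `mismatched(test_dir, prod_dir, paths_it_cares_about)` and fail the run if the result is non-empty. This only checks the files the pipeline names, so it narrows the gap raised in the second question without closing it.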

@cz4rny

cz4rny commented Jul 13, 2023

Right, so right now the image produced is a self-executing app build, whether it's npm, or go or whatever.

The test image would not execute anything but would contain the app with all of the, keeping to the npm example, dev dependencies. Maybe it's even built. And the pipeline would use that image, run its tools on top of that (static analysis, test, etc.), grab the produced outputs, and publish them.

@jabrown85
Contributor

The way I understand this is the produced image would have test processes contributed by buildpacks.

A golang buildpack would create a layer that contains go testing tools and maybe even run `go build ./...` during the build phase so you can fail early. The go buildpack could contribute a test process to launch.toml that runs `go test ./...`. That would be the default process launched when the image is executed in a testing pipeline. As you said, the pipeline could also choose to run static analysis as well as any other process on the image (maybe lint in this example) and capture the results.

@keskad

keskad commented Oct 6, 2023

Hi,

I wanted to comment on the concept with a use case. I don't see the point in creating production images in environment X while testing in environment Y. That's not a consistent way of doing CD pipelines (current behavior) 🙂

I give buildpacks to teams, they set e.g. BP_NODE_VERSION to 16.1 and the build will run with Node 16.1, while the tests will still run with a hardcoded Node 16.0 because I have to maintain additional images. Using buildpacks in this scenario stops making sense for me 🙂 A more consistent way is to just use a Dockerfile with the base image used in tests, because it has a consistent Maven and Java version inside.

Due to the nature of buildpacks (preparing the environment by installing and setting up the required tooling), I think the test phase is even a necessary scenario for implementing a valid CD pipeline and getting a reproducible environment every time. The tests need to run on something that is CLOSE TO PRODUCTION 🙂

So I see that build phase as necessary to implement a valid CD pipeline, and I like this proposal.

The case is similar to the principle that we should not build an artifact twice when promoting it to the next environment, e.g. dev -> prod, but just reuse the artifact. Here we should use the same build parameters and the same tooling.

It's difficult to maintain e.g. Node 14, Node 15, Java 8, Java 11, and 15 other base images just for tests, while giving teams the possibility to set a Node or Java version in buildpacks.

In my opinion, a testing phase would bring these benefits:

  • not maintaining base images internally just for running tests (in bigger organizations it's a huge effort)
  • having a consistent way of setting up both test & prod tooling (versions and parameters, e.g. a team sets Node v16.1 and it works in both test & prod)
  • not having to remember the test command for each project (a buildpack test phase would know the test command for each technology, just as it knows how to run the build)
