Implement new changelog process #67335

Closed
5 tasks done
pugnascotia opened this issue Jan 12, 2021 · 9 comments
Labels
:Delivery/Build Build or test infrastructure >enhancement Team:Delivery Meta label for Delivery team v8.2.0

Comments

@pugnascotia
Contributor

pugnascotia commented Jan 12, 2021

Summary

This issue concerns implementing a new process and tooling for generating the release notes for an Elasticsearch release. Instead of pulling data from GitHub directly, each PR will add a file to the repository that contains all the required data.

  • Document in the team repo:
    • The new process for generating the notes
    • Changes to the version bump / release day processes
    • The changelog file format
  • Build new tooling to verify the changelog files and generate the outputs

Background

Our current process for generating the release notes / changelog for Elasticsearch relies on pulling information directly from GitHub. There are issues with this approach:

  • If a change is backed out of a release branch but we forget to remove a label from the original PR, then the release notes can include a change that was actually removed
  • There's no opportunity to review the associated release notes, release highlights or breaking changes description while reviewing the code changes in a PR
  • PRs can be missed from the release notes if they are merged after the release notes are generated and we don't go back to add the PR

Some years ago, we experimented with including a changelog file in the ES repository, which was updated with each PR. This approach was quickly abandoned due to the number of merge conflicts that it generated.

Instead, we now propose that each release branch has a dedicated directory (for example changelog, but it doesn't have to be that) which is populated with one file per PR, each containing all the information necessary to generate the release notes. These files need to use a structured format so that they can easily be processed with tools.

File format

Each changelog file must contain all the information required to generate the notes. This probably includes, but is not limited to, the following.

  • PR number
  • Associated issue numbers
  • Type of change, e.g. enhancement, feature, bugfix etc.
  • The change area, e.g. Core/Features or Search/Mapping
  • Whether it is a breaking change
  • One-line summary of the change

We could simply have a labels field that mirrors the PR's GitHub labels. However, the current process uses the GitHub labels directly, which means the ES release point person often has to choose an area and change type for PRs that are labelled with several of each. It would be better to move this burden to PR authors, who are better placed to make these decisions.

The obvious file formats are JSON or YAML. YAML has the advantage of being easier for humans to read and edit.

We should enforce the existence and validity of changelog entries for PRs whose labels do not exempt them from the changelog (e.g. test fixes, build-related work).

A prototype generator exists, written in Python. We should rewrite this in Java as a Gradle task, so that any team member can update it. This task must carefully verify the input files to ensure that all required fields are present and there are no typos in the key names.
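
As a rough illustration of the verification step described above, here is a minimal Python sketch that checks an already-parsed changelog entry for unknown and missing keys. The key names are taken from the example file in this issue. Which of them are required, and the name `validate_changelog`, are assumptions for illustration only, not the final schema or the eventual Gradle task.

```python
from typing import Any, Dict, List

# Assumption: these sets mirror the example file in this issue.
# The real schema may allow or require different keys.
ALLOWED_KEYS = {
    "pr", "issues", "area", "type", "summary",
    "version", "highlight", "breaking",
}
REQUIRED_KEYS = {"pr", "area", "type", "summary"}


def validate_changelog(entry: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one parsed changelog entry."""
    errors: List[str] = []

    # Unknown keys usually indicate a typo in a key name.
    for key in sorted(set(entry) - ALLOWED_KEYS):
        errors.append(f"unknown key: {key}")

    # Every required key must be present.
    for key in sorted(REQUIRED_KEYS - set(entry)):
        errors.append(f"missing required key: {key}")

    # The PR number must be a plain integer (bool is rejected on purpose).
    if "pr" in entry and type(entry["pr"]) is not int:
        errors.append("pr must be an integer")

    return errors
```

With this sketch, an entry that misspells `area` as `aera` is reported twice: once for the unknown key and once for the missing required key.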

Open questions

  1. How should we handle backports? Should the changelog entries go only to the earliest release branch? Or should each changelog file contain the list of releases at which the PR was targeted, so that the generation process can omit changelog entries that have already been released?
  2. When should the process for generating the asciidoc documents from the changelog files be run?
    • We could, for instance, check in the precommit task that running the generation step results in no changes in the current checkout. This would require whoever adds a changelog file to also run the generation step and commit the result, but this will result in merge conflicts in the generated outputs, leaving us in the same position as having a single changelog file.
    • We could run the generation step as part of the unified build. This would mean that the build was committing to the repo, and therefore changing the commit hash that is released.
    • The unified build could regenerate the release notes and check that they are unchanged. If they have changed, the build would fail and someone would need to regenerate and commit the files.
    • Note that the prototype generator appears to delete changelog files after generating the release notes. This would make it impossible to re-run the generator, e.g. on successive build candidates. We should instead remove all changelog files from a branch once further commits for a given version are impossible. This applies to the development branch(es) after a release branch is taken, and to release branches once a release has happened and the git tag is pushed.
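
To make the second option in question 1 more concrete, here is a small Python sketch of one possible interpretation: each entry lists the versions it targets, and the generator skips an entry if any of its other target versions has already been released. The function name `entries_for_release`, the `released` set, and the exact omission rule are assumptions for illustration. The question itself is still open.

```python
from typing import Any, Dict, List, Set


def entries_for_release(
    entries: List[Dict[str, Any]],
    target: str,
    released: Set[str],
) -> List[Dict[str, Any]]:
    """Select the entries to include in the notes for `target`.

    Assumed rule: an entry is included only if it targets `target` and
    none of its other target versions has already been released.
    """
    selected = []
    for entry in entries:
        versions = entry.get("version", [])
        if target not in versions:
            continue  # not aimed at this release at all
        if any(v in released for v in versions if v != target):
            continue  # already announced in an earlier release
        selected.append(entry)
    return selected
```

Under this rule, an entry with `version: [7.11.0, 8.0.0]` would appear in the 7.11.0 notes and be omitted from the 8.0.0 notes once 7.11.0 is in `released`.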

Possible file example

pr: 63899
issues: [63055]
area: Machine Learning
type: enhancement
summary: Add new flag `exclude_generated` that removes generated fields in GET config APIs
version: [7.11.0, 8.0.0]

highlight:
  notable: true
  title: New API flag `exclude_generated` when fetching ML configs
  body: |
    When exporting and cloning ML configurations in a cluster it can be
    frustrating to remove all the fields that were generated by
    the plugin. Especially as the number of these fields change
    from version to version.

    This flag, `exclude_generated`, allows the GET config APIs to return
    configurations with these generated fields removed.
  
breaking:
  area: ML API changes
  title: "`for_export` parameter changed to `exclude_generated`"
  notable: true
  anchor: get_config_param_exclude_generated
  body: |
    Some descriptive text about the change of parameter name. Blah blah
    blah, you know, for release notes.
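
For illustration, entries like the example above could be turned into release notes roughly as follows. This Python sketch groups already-parsed entries by `type` and then by `area`, and emits a simple AsciiDoc-style listing. The output layout and the name `render_release_notes` are assumptions, not the format the real generator produces.

```python
from collections import defaultdict
from typing import Any, Dict, List


def render_release_notes(entries: List[Dict[str, Any]]) -> str:
    """Render parsed changelog entries as a simple AsciiDoc-style listing."""
    # Group as: change type -> area -> list of entries.
    grouped: Dict[str, Dict[str, List[Dict[str, Any]]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for entry in entries:
        grouped[entry["type"]][entry["area"]].append(entry)

    lines: List[str] = []
    for change_type in sorted(grouped):
        lines.append(f"=== {change_type}")
        lines.append("")
        for area in sorted(grouped[change_type]):
            lines.append(f"{area}::")
            # Sort by PR number so the output is stable across runs.
            for entry in sorted(grouped[change_type][area], key=lambda e: e["pr"]):
                issues = ", ".join(f"#{n}" for n in entry.get("issues", []))
                suffix = f" (issues: {issues})" if issues else ""
                lines.append(f"* {entry['summary']} #{entry['pr']}{suffix}")
            lines.append("")
    return "\n".join(lines)
```

Sorting the groups and the entries keeps the output deterministic, which matters if a build is going to compare regenerated notes against the committed ones.
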
@pugnascotia pugnascotia added >enhancement :Delivery/Build Build or test infrastructure v8.0.0 labels Jan 12, 2021
@elasticmachine elasticmachine added the Team:Delivery Meta label for Delivery team label Jan 12, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-delivery (Team:Delivery)

@pugnascotia pugnascotia self-assigned this Jan 12, 2021
@nrichers
Contributor

nrichers commented Jan 12, 2021

Subscribing, since this is of interest to Cloud as well. We currently generate release notes from PRs, including the actual content from a section in the root comment, and collate them by category in individual files. It's somewhat similar to what you seem to be planning, with some differences since our repo is private.

  • If a change is backed out of a release branch but we forget to remove a label from the original PR, then the release notes can include a change that was actually removed

We're susceptible to this same problem, where changes are backed out but can still appear in the release notes.

@Aran-K something for us to chat about when you're back from PTO.

EDIT: Also of interest to @gtback

@gtback
Member

gtback commented Jan 12, 2021

++ I'm also interested in this.

I am working to help standardize the changelog/release notes across products to help present them in a unified interface. My goal is to define a simple JSON format that all teams can use (and generate however they'd like) for some automation to slurp up and put in an App Search instance. I'm hoping to have a version ready to share shortly.

I'm equally fine with YAML, though I don't necessarily think the format you're describing is the one we'll use directly (for example, we'll probably want to convert the PR number into a full URL).

I don't have a strong preference between one-item-per-file and multiple items in one file.

@pugnascotia
Contributor Author

I updated the example in the description following some work to prototype the process in the Elasticsearch repo. Most notably, I stopped defining multiple documents in the YAML as this turned out to actually make things harder.

@jrodewig
Contributor

@rjernst raised an idea I wanted to share here in case it could be incorporated.

We currently add a coming::[n.x.x] tag to release notes for unreleased versions. This creates a minor but rote cleanup chore of removing the tag after release. For example, see #68779.

It would be great if the new generator and/or process avoided this cleanup.

pugnascotia added a commit that referenced this issue Jul 28, 2021
Part of #67335.

Add tasks for generating release notes, using information stored in files
in the repository:

   * `generateReleaseNotes` - generates new release notes, release
     highlights and breaking changes
   * `validateChangelogs` - validates that all the changelog YAML files are
     well-formed (conform to schema, have required fields depending on the
     `type` value)

I also changed `Version` to allow a `v` prefix in relaxed mode
ywangd pushed a commit to ywangd/elasticsearch that referenced this issue Jul 30, 2021
Part of elastic#67335.

Add tasks for generating release notes, using information stored in files
in the repository:

   * `generateReleaseNotes` - generates new release notes, release
     highlights and breaking changes
   * `validateChangelogs` - validates that all the changelog YAML files are
     well-formed (conform to schema, have required fields depending on the
     `type` value)

I also changed `Version` to allow a `v` prefix in relaxed mode
@arteam arteam added v8.1.0 and removed v8.0.0 labels Jan 12, 2022
@mark-vieira mark-vieira added v8.2.0 and removed v8.1.0 labels Feb 2, 2022
@jirihradil

Hi guys, thank you for this gem! However, not having a simple CHANGELOG.md file in this gem's root is just weird and we're always scared about what could (potentially) break. For example, when we saw today's version 8 release, the first thing we did was to ensure we won't upgrade :)
Just my 2 cents.

@pugnascotia
Contributor Author

@jirihradil if you are looking for a changelog for the Ruby client, you can find one in the dedicated repository:

https://github.com/elastic/elasticsearch-ruby/blob/main/CHANGELOG.md

@jirihradil

@pugnascotia Thank you for both the link and PR!

@pugnascotia
Contributor Author

We're using the new process for 8.1.0, so I think we can close this.

I filed a follow-up for possibly also adopting this setup in elastic/ml-cpp#2217.
