Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design doc: forward path to a future builder #8190

Merged
merged 5 commits into from
May 24, 2022

Conversation

humitos
Copy link
Member

@humitos humitos commented May 18, 2021

This document is a continuation of Santos' work about "Explicit Builders" (see #8103), decisions about the final goal, proposing a clear direction to move forward with intermediate steps keeping backward and forward compatibility.

Most of this is a dump of my talk to Anthony about these ideas and some extra work done on the whiteboard after receiving some feedback from him.

The most important feedback here would be to know if this is a place where we would like to be sooner than later and in that case, make sure that what we are deciding and building on #8103 is compatible with the direction and final goal proposed here.

Rendered version: https://docs--8190.org.readthedocs.build/en/8190/development/design/future-builder.html

This document is a continuation of Santos' work about "Explicit Builders" (see
decisions about the final goal, proposing a clear direction to move forward with
intermediate steps keeping backward and forward compatibility.

Most of this is a dump of my talk to Anthony about these ideas and some extra
work done on the whiteboard after receiving some feedback from him.

The most important feedback here would be to know if this is a place where we
would like to be sooner than later and in that case, make sure that what we are
deciding and building on #8103 is compatible with the direcion and final goal
proposed here.

.. code:: yaml

# metadata.yaml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like all this information can be in the rtd config file, I don't see why you would want to make this dynamically generated to change at runtime.

Copy link
Member Author

@humitos humitos May 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put some static data here just as an example. However, I'm reserving this file to act as a place to specify dynamic metadata that's only known at build time by the doctool: something that it's known after the tool has built the documentation or while it's building it.

I'm proposing to create the pipeline in a generic way to be prepared for these cases. It may not be needed to generate this file dynamically for some tools/users but it must be required for other tools (*).

Besides, I want to separate it from readthedocs.yaml since I imagine this could eventually be useful for uploading pre-built documentation where you upload a project-html.zip + metadata.yaml and Read the Docs gets everything needed from the metadata.yaml without the needing of a readthedocs.yaml (which is currently only useful to configure the build process and we won't have to build the docs in this case).

I also envision readthedocs.yaml as a way to "configure the build process in the platform" (in the future we may be able to configure other platform features than just the build, tho). On the other hand, I see metadata.yaml as an "integration point between the build's output and Read the Docs' features/integrations (flyout, search, warning banners, hosting, etc)"

(*) For example, the PDF filename in Sphinx is generated based on the latex_documents config (and -jobname argument in latexmk command) and it would be handy to modify it in one place and keep everything working without manually updating the readthedocs.pdf_output field in the metadata.yaml as well.

#. Define the environment variables required (e.g. some from ``html_context``) and execute all commands with them
#. Build documentation using this contract
#. Leave ``readthedocs-sphinx-ext`` as the only package installed and extension install in ``conf.py.tmpl``
#. Add ``build.builder: 2`` config without any *magic*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sold on the idea of versioning this, as it would be confusing having two versions (one for the config file and other for how the docs are build?). Also, I'd call it build.version as builder: 2 looks like you are requiring more builders.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I understand that it may be confusing having the global version and build.version. However, we need a way to differentiate the "builder that does some magic" (current behavior) from the "builder that does nothing". What do you think could be better for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to resolve this conversation, I believe we settled on not versioning this. If the user does nothing, they get our normal build commands. If the user wants to alter the pre/post hooks, they use build.jobs, and if the user needs to override a build command, they are explicitly opting into our new builder. There doesn't need to be a version in this case.

Comment on lines 91 to 93
html_output: ./_build/html/
pdf_output: ./_build/pdf/myproject.pdf
epub_output: ./_build/pdf/myproject.epub
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could simplify this structure to be like the following to support other formats like man (see #4458):

readthedocs:
  output:
    html:
    pdf:
    epub:
    man:

Besides, each of these keys could also be a list so we can support multiple PDF output (see #2045):

readthedocs:
  output:
    pdf:
      - ./_build/pdf/myproject.pdf
      - ./_build/pdf/tutorial.pdf
      - ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on both counts. I like the nested structure more and the list of files is a feature I've wanted for a while.

* Keep the current builder working as-is
* Keep backward and forward (with intermediate steps) compatibility
* Define a clear support for newbie, intermediate and advanced users
* Allow users to override a command, run pre/post hook commands or define all commands by themselves
Copy link
Member Author

@humitos humitos Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note there are people already doing this in hacky ways: #6662 (comment)

@mcarans
Copy link

mcarans commented Feb 23, 2022

It looks like this will cover running custom steps with dependencies. Essentially I want to be able to do something like the following:

.readthedocs.yml

build:
  builder: 2
  jobs:
    pre_build: 
      - pip install pydoc-markdown
      - pydoc-markdown --build --site-dir "$PWD/_build/html"

Is there a way to specify requirements in the new builder syntax or must they be pip installed as above? eg.

python:
  version: 3.7
  install:
  - packages:
    - pydoc-markdown

The above would replace what I currently do using the readthedocs-custom-steps library as follows:

.readthedocs.yml

version: 2
mkdocs: {}  # tell readthedocs to use mkdocs
python:
  version: 3.7
  install:
  - requirements: docs/.readthedocs-requirements.txt

docs/.readthedocs-requirements.txt

pydoc-markdown
readthedocs-custom-steps

pyroject.toml:

[tool.readthedocs-custom-steps]
script = """
pydoc-markdown --build --site-dir "$PWD/_build/html"
"""

@stsewd
Copy link
Member

stsewd commented Mar 10, 2022

Noting a couple of things from this doc and the implementation from #9002:

  • post checkout steps should be run in the container where the other build steps are done, not on the container used to clone the repo (that other container will be lost, could be confusing for the user).
  • Some steps seen to be a little redundant/are executed in the same "step", like: post checkout / pre system dependencies, post system dependencies / pre create environment, post create environment / pre install. We could expose just a few to start, like: pre-system-dep, pre-install, pre-build, post-build.
  • Are we allowing users to override the build html/pdf/etc commands? Looking at the code looks like yes, users may be loosing some features or some things may not work as expected, since we are overriding things and generating some additional metadata in our builds (our sphinx extension specially). Maybe this could be allowed later.
  • Would it make more sense for the users to just use this new option instead of combining it with python.install? that would reduce the number of "hooks" (could also be confusing that we mention both options, and we will need to do more validation so users don't combine both options).
  • I also like the way github/circle do things, they allow the user to use "keywords" for the "built-in steps" (like checkout), so instead of defining hooks, users would just define "steps" and they will decide the order. This could also be a little more error-prone, users omitting a required step or putting the order wrong, so not really sure about that implementation.

@humitos
Copy link
Member Author

humitos commented Mar 10, 2022

@stsewd thanks for the feedback! I will reply to each point below:

post checkout steps should be run in the container where the other build steps are done, not on the container used to clone the repo (that other container will be lost, could be confusing for the user).

Good point! We can avoid pre_/post_ checkout in the initial implementation to keep things simple. pre_checkout is not currently possible, so I'm fine skipping post_checkout as well for now.

Some steps seen to be a little redundant/are executed in the same "step", like: post checkout / pre system dependencies, post system dependencies / pre create environment, post create environment / pre install. We could expose just a few to start, like: pre-system-dep, pre-install, pre-build, post-build.

I thought about this already and I think that, even if it feels redundant now, I think it's better to have "post command" associated with the proper step than with the "pre command" of the following step. This will allow us to re-order the command in the future if we have to. For example, instead of system_dependencies followed by create_environment we could do it in the other way.

To me, it's also better UX when writing the config file. You don't need to think about the order of the steps, you just know that you want to do something at post_create_environment and you don't care too much if that's pre_system_dependencies or pre_build.

Are we allowing users to override the build html/pdf/etc commands? Looking at the code looks like yes, users may be loosing some features or some things may not work as expected, since we are overriding things and generating some additional metadata in our builds (our sphinx extension specially). Maybe this could be allowed later.

We are not allowing that in the initial implementation. Overriding our own commands is a lot more work and we thought that it could be better to go directly with the build.commands for that case. However, I'm not sure about this yet but I'm not thinking about implementing this possibility for now due to its complications.

To allow this, I think we will require the metadata.yaml contract established and defined first, which we suppose it's the following step after build.jobs

Would it make more sense for the users to just use this new option instead of combining it with python.install? that would reduce the number of "hooks" (could also be confusing that we mention both options, and we will need to do more validation so users don't combine both options).

By "this new option" do you mean the "overwriting the install step"? In that case, we are thinking that build.jobs and build.commands are mutually exclusive and they can't be used together. So, you "override all the commands" (build.commands), or you "override just the ones you need to" (build.jobs)

I also like the way github/circle do things, they allow the user to use "keywords" for the "built-in steps" (like checkout), so instead of defining hooks, users would just define "steps" and they will decide the order. This could also be a little more error-prone, users omitting a required step or putting the order wrong, so not really sure about that implementation.

I think this is good, but I'm not sure what would be the use case to change the order of the steps? However, I like the idea of implementing a mixture of "keywords for built-in steps" and custom commands. In that case, those could go on build.commands. Something like:

build:
  commands:
    - checkout
    - echo 12345 # execute my custom command here
    - create_environment
    - install
    - sed ... # change something in my files
    - build

In any case, this won't be part of the implementation of build.jobs and we can come to this later when working on build.commands I guess.

@stsewd
Copy link
Member

stsewd commented Mar 10, 2022

In that case, we are thinking that build.jobs and build.commands are mutually exclusive and they can't be used together.

I mean the current setting we have https://docs.readthedocs.io/en/stable/config-file/v2.html#python-install, since users could just put a pip install -r requirements.txt line now.

@agjohnson
Copy link
Contributor

I'd like to start wrapping up this design doc and close up conversations here. We had a meeting on this, and I'm not sure the hackmd notes even made their way back to this draft. Is there anything besides technical design discussion that is holding this design doc back?

I've opened some issues for continuing technical discussion on granular points of this implementation, but we'll need some more issues on top I think.

This PR is still in draft, but and we're already on to implementation, so it feels like this draft will linger on if we don't finalize discussion soon. After our last discussion, it sounded like we agreed on the overall high level decisions, and need to worry more about technical implementation now.

Anyone feel strongly that we can't proceed with what we've discussed? I'm sure we have more discussion on technical pieces, but can probably put that discussion on to individual issues instead.

@agjohnson
Copy link
Contributor

I'm pushing design discussion to:

Can we wrap up and merge all of these design docs and meeting notes?

Link to other issues with more accurate discussions and up-to-date information.
@humitos humitos marked this pull request as ready for review May 23, 2022 13:35
@humitos humitos requested a review from a team as a code owner May 23, 2022 13:35
@humitos
Copy link
Member Author

humitos commented May 23, 2022

I added a note at the top of the document commenting that many things have changed since this document was written with a lot of links with the discussions and the current state of this. I'll merge it once tests pass.

@humitos humitos enabled auto-merge May 23, 2022 13:36
@humitos humitos disabled auto-merge May 24, 2022 08:17
@humitos humitos merged commit 1d32da6 into main May 24, 2022
@humitos humitos deleted the humitos/design-doc-future-builder branch May 24, 2022 08:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needed: design decision A core team decision is required
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants