[Bug] sitemap contains redirect page entries #8821

Closed
filzrev opened this issue Jun 2, 2023 · 3 comments
Labels: bug (A bug to fix), static-site (Produce static HTML output ready for publishing to hosts like GitHub pages)

Comments

@filzrev
Contributor

filzrev commented Jun 2, 2023

Describe the bug

sitemap.xml contains entries for redirect pages.
Redirect pages should not be included in the sitemap.

To Reproduce

Steps to reproduce the behavior:

  1. Get the sitemap.xml content (https://dotnet.github.io/docfx/sitemap.xml).
  2. Check that the redirect page (https://dotnet.github.io/docfx/tutorial/walkthrough/walkthrough_create_a_docfx_project_2.html) is listed in it.
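
The check in the steps above can be scripted. The following is a minimal sketch that uses only the Python standard library and an inline sample sitemap; the function name `sitemap_locations` and the example.com URLs are made up for illustration and are not part of docfx.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locations(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Illustrative sample only; a real run would use the downloaded sitemap.xml.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/index.html</loc></url>
  <url><loc>https://example.com/docs/old-page.html</loc></url>
</urlset>"""

if __name__ == "__main__":
    redirect_page = "https://example.com/docs/old-page.html"
    # True means the redirect page is (wrongly) present in the sitemap.
    print(redirect_page in sitemap_locations(SAMPLE))
```

To run this against the published site, pass the text downloaded from the sitemap URL in step 1 to `sitemap_locations` and test for the URL from step 2.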

Expected behavior

Redirect pages do not appear in the sitemap entries.

How to fix this problem

It seems the SitemapGenerator post-processor can't access FileModel metadata directly, as described in the documentation below.

https://dotnet.github.io/docfx/tutorial/howto_add_a_customized_post_processor.html#step3-process-all-the-files-generated-by-docfx

Post-processor aims to process the output files, so the FileModel can't be accessed in this phase. If some metadata is needed here,
an option is to save it in FileModel.ManifestProperties in build phase, then access it through ManifestItem.Metadata.
Another option is to save it somewhere in output files, like HTML's Tag.


I'd like to make changes along the following lines, but if you have a better idea, please let me know.

  1. Add a new HtmlPostProcessor handler that checks for the existence of a <meta http-equiv="refresh" tag and saves the result as metadata in OutputFileInfo.
  2. Have SitemapGenerator use this metadata to filter out redirect pages.
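
As a language-neutral illustration of these two steps, here is a rough Python sketch rather than docfx code. The names `is_redirect_page`, `flag_redirects`, `sitemap_entries` and the `is_redirect` key are hypothetical stand-ins for the proposed handler, the OutputFileInfo metadata, and the SitemapGenerator filter.

```python
from html.parser import HTMLParser


class _MetaRefreshFinder(HTMLParser):
    """Records whether the document contains <meta http-equiv="refresh">."""

    def __init__(self) -> None:
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("http-equiv", "").lower() == "refresh":
            self.found = True


def is_redirect_page(html: str) -> bool:
    finder = _MetaRefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.found


def flag_redirects(outputs: dict[str, str]) -> dict[str, dict]:
    """Step 1 (hypothetical handler): store a flag per output file."""
    return {path: {"is_redirect": is_redirect_page(html)}
            for path, html in outputs.items()}


def sitemap_entries(base_url: str, metadata: dict[str, dict]) -> list[str]:
    """Step 2 (hypothetical filter): skip files flagged as redirects."""
    return [base_url.rstrip("/") + "/" + path
            for path, meta in metadata.items()
            if not meta["is_redirect"]]
```

With this split, only the HTML handler parses the output files; the sitemap step reads the stored flag, which matches the second point under Pros.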

Pros

  • No need to modify DocumentProcessors.
  • No need to open HTML file at SitemapGenerator.

Cons

@filzrev filzrev added the bug A bug to fix label Jun 2, 2023
@yufeih
Contributor

yufeih commented Jun 2, 2023

I'd prefer we check redirect_url metadata over <meta http-equiv, as many redirections are implemented server side.

The SitemapGenerator does not have access to metadata; maybe we can change the DocumentType of redirections to Redirection instead of Conceptual here?
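
A rough sketch of that idea, written in Python for brevity and not taken from the docfx code base: the document type is decided during the build from the `redirect_url` metadata, and the sitemap step filters on the type alone. `ManifestItem`, `classify`, and `sitemap_paths` here are simplified, hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class ManifestItem:
    """Hypothetical, simplified stand-in for a manifest entry."""
    path: str
    document_type: str


def classify(path: str, metadata: dict) -> ManifestItem:
    # Build phase: metadata is still available here, so a page that
    # declares redirect_url gets its own document type.
    doc_type = "Redirection" if metadata.get("redirect_url") else "Conceptual"
    return ManifestItem(path=path, document_type=doc_type)


def sitemap_paths(items: list[ManifestItem]) -> list[str]:
    # Post-processing phase: no metadata needed, only the type.
    return [item.path for item in items if item.document_type != "Redirection"]
```

Because the filter depends only on the type, it also covers redirections implemented server side, where the output contains no <meta http-equiv tag to detect.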

@filzrev
Contributor Author

filzrev commented Jun 2, 2023

Thanks for the reply.

we can change the DocumentType of redirections to Redirection instead of Conceptual

If the DocumentType can be changed to Redirection, that would be a cleaner and more extensible architecture.

@yufeih yufeih added the static-site Produce static HTML output ready for publishing to hosts like GitHub pages label Jun 5, 2023
@filzrev
Contributor Author

filzrev commented Jul 1, 2023

This issue is fixed by #8892

@filzrev filzrev closed this as completed Jul 1, 2023