Add support for custom sitemap.xml generated by the user #6938

Closed
humitos opened this issue Apr 21, 2020 · 5 comments
Labels
Accepted (accepted issue on our roadmap), Good First Issue (good for new contributors), Improvement (minor improvement to code), Sprintable (small enough to sprint on)

Comments

@humitos
Member

humitos commented Apr 21, 2020

Our current sitemap.xml covers only the most basic case. However, some users have needs that require a more customized sitemap.xml.

We should implement the same mechanism that we use for the custom 404.html: we provide the most basic one (Maze found), with the ability to override it by adding your own 404.html to the output generated by Sphinx.

So, we could check first for sitemap.xml in the default version and serve that one as is if it exists. Here, extensions like sphinx-sitemap may be a great help for the user.
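A minimal sketch of that lookup order, assuming the built output lives on a local filesystem with one directory per version. The function names, the directory layout, and the `generate_default` callback are hypothetical and are not Read the Docs internals:

```python
from pathlib import Path
from typing import Callable, Optional


def resolve_sitemap(build_root: Path, default_version: str) -> Optional[Path]:
    """Return the user's sitemap.xml from the default version's output, if any.

    Assumes a hypothetical layout of <build_root>/<version>/sitemap.xml.
    """
    candidate = build_root / default_version / "sitemap.xml"
    return candidate if candidate.is_file() else None


def serve_sitemap(
    build_root: Path,
    default_version: str,
    generate_default: Callable[[], str],
) -> str:
    """Serve the user's sitemap.xml as is, else fall back to the generated one."""
    custom = resolve_sitemap(build_root, default_version)
    if custom is not None:
        # The user-provided file wins and is returned unmodified.
        return custom.read_text(encoding="utf-8")
    # No custom file in the default version: use the basic generated sitemap.
    return generate_default()
```

This mirrors the custom 404.html behavior described above: the platform default is only used when the build output does not already contain the file.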

This would be a better direction (and easier to implement) than locking all users into exactly the same sitemap.xml.

Reference: #6903 #5391 #6841

@eliasdabbas

I just tried it, looks good but doesn't produce the <lastmod> tag :)
There is an open issue for it...

@humitos
Member Author

humitos commented Apr 22, 2020

I just realized that, since robots.txt can already be customized, you can change the Sitemap: entry in that file to point the robots to a customized sitemap.xml if you prefer. This does not fix this issue, but at least it's a workaround.
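As a rough illustration of this workaround, the sketch below defines a robots.txt whose Sitemap: entry points at a custom sitemap, then reads it back with Python's standard `urllib.robotparser` (`site_maps()` requires Python 3.8+). The `docs.example.com` URL is a placeholder, not a real project:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the domain and path are placeholders.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://docs.example.com/en/latest/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the list of Sitemap: URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

A crawler that honors the Sitemap: directive would fetch the URL listed there instead of guessing the default location.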

@humitos added the Accepted, Good First Issue, and Sprintable labels on Apr 22, 2020
@eliasdabbas

This is not a workaround. It's the more comprehensive solution actually :)

Customization of crawling happens in robots.txt, where you signal different rules to different bots, link to sitemaps, exclude URL patterns, set crawl delays, etc. Sitemaps, on the other hand, are simple lists of URLs with pretty much no functionality other than informing search engines about the URLs and some metadata.
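To make the kinds of rules listed above concrete, here is a small sketch that parses a hypothetical robots.txt with Python's standard `urllib.robotparser`. The bot names, paths, and domain are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one bot entirely, exclude a URL pattern and set a
# crawl delay for everyone else, and link to a sitemap.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /en/old/
Crawl-delay: 5

Sitemap: https://docs.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "https://docs.example.com"
blocked_bot = parser.can_fetch("ExampleBot", base + "/en/latest/")
other_bot_latest = parser.can_fetch("OtherBot", base + "/en/latest/")
other_bot_old = parser.can_fetch("OtherBot", base + "/en/old/page.html")
delay = parser.crawl_delay("OtherBot")
```

Here `blocked_bot` and `other_bot_old` are `False`, `other_bot_latest` is `True`, and `delay` is `5`, which shows per-bot rules, URL exclusion, and crawl delay all living in the one file.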

So, my updated recommendation is to include all the URLs in the sitemap, while giving users the option to customize the robots.txt file.

I shared a suggested approach in #6903
Please let me know if that makes sense, and if you have any other thoughts.

@Pradhvan
Contributor

Pradhvan commented Feb 24, 2021

@humitos, if no one is working on this, I would love to pick it up. 😄

Update: I just checked #6903. IDK if this issue is still relevant?

@humitos
Member Author

humitos commented Feb 25, 2021

@Pradhvan thanks for your interest! Yeah, considering that we already support a custom robots.txt and that you can change the sitemap.xml location from there, I don't think this issue is still relevant. In the end, if you want a custom sitemap right now, you can just modify the robots.txt and use a third-party extension. I think there is no need to add another feature to RTD that does exactly the same thing.
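For readers who want to try that combination, a Sphinx `conf.py` along these lines is one possible starting point. It assumes the sphinx-sitemap extension is installed, and the base URL and the `_extra` directory name are placeholders; check the extension's documentation for how it builds URLs on your hosting setup:

```python
# conf.py (sketch): generate a sitemap.xml with sphinx-sitemap and ship a
# custom robots.txt alongside the built HTML.

# sphinx-sitemap is enabled under the module name "sphinx_sitemap".
extensions = ["sphinx_sitemap"]

# sphinx-sitemap uses html_baseurl as the prefix for the URLs it lists.
# Placeholder value: replace with the project's real documentation URL.
html_baseurl = "https://docs.example.com/"

# Files under these paths are copied as is into the HTML output root.
# "_extra" is a hypothetical directory that would contain robots.txt.
html_extra_path = ["_extra"]
```

The robots.txt placed in `_extra` would then carry a Sitemap: entry pointing at the generated file.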

I'm closing this issue for now, and we can revisit it if someone comes with stronger opinions about why this is needed and why using the robots.txt file is not enough.


3 participants