Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix metadata intrasite links #2288

Merged
merged 4 commits into from
Feb 9, 2018
Merged

Fix metadata intrasite links #2288

merged 4 commits into from
Feb 9, 2018

Conversation

charlesfleche
Copy link
Contributor

Squashed version of previous PR #2226

@charlesfleche
Copy link
Contributor Author

Ah, just noticed I probably could have pushed to the previous branch with git push origin +name-of-branch...

@justinmayer justinmayer changed the title Fix metadata intrasite links (squashed) Fix metadata intrasite links Feb 9, 2018
@justinmayer justinmayer added this to the 3.8.0 milestone Feb 9, 2018
@justinmayer
Copy link
Member

Hehe. No worries. I was just about to let you know that you could have pushed to the previous branch/PR, but it seems you noticed that on your own.

Thanks again for the contribution!

@justinmayer justinmayer merged commit 2d24d6b into getpelican:master Feb 9, 2018
@mosra mosra mentioned this pull request Jul 23, 2018
if 'summary' in metadata:
self._summary = metadata['summary']
if len(self._context.get('filenames', [])) > 0:
self.refresh_metadata_intersite_links()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charlesfleche @justinmayer As I commented in #2162 (comment), this line causes link replacement to be dependent on the order in which files are processed -- because it's done during parsing of a single article/page and not later when everything is ready (so e.g. links to pages in metadata fields that were not yet processed are reported as Unable to find).

I investigated further and realized refresh_metadata_intersite_links() is called again later below, after all articles/pages are processed -- and in that second step the links that were previously Unable to find are then replaced correctly. I don't know the code very well, but in my case when I commented the above two lines out, the warnings were gone (and links in metadata fields still correctly replaced). I imagine removing the above might cause some plugin to cease working (since e.g. the summary metadata field won't be present yet), so not really sure how to fix that. On the other hand, I don't think calling refresh_metadata_intersite_links() twice on everything is a good idea (at least from a perf PoV, but I could imagine some potential issues with {attach} as well).

Any ideas how to fix this?

Copy link
Member

@avaris avaris Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is not ideal.

Removing these won't make summary vanish, but it'll be unprocessed if one accesses before everything is processed. And that is actually fine, since accessing .summary in a plugin earlier than all_generators_finalized is a "bug" in itself for this same reason (it'll spew Unable to find warnings).

Another issue is that, this removed _summary. Even though it's "internal", some plugins used that.

From what I can see, this modification is all for preventing summary being processed twice.
Edit: Everything is processed twice anyway. I don't know what this is for.

I'd say, reverting this bit and get_summary should be enough.

And maybe excluding summary from refresh_metadata_intersite_links, if one is worried enough about the double processing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally got back to this (sorry for the delay), the proposed change is in #2406.

And maybe excluding summary from refresh_metadata_intersite_links, if one is worried enough about the double processing.

Not sure what you mean, reverting the above bit is (as far as I can see) enough to stop the double processing.

Copy link
Member

@avaris avaris Oct 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant, if you revert the above part and get_summary then summary will be processed twice.

  • Once when refresh_metadata_intersite_links is called
  • Once when content.summary is accessed (or any other way .get_summary is called).

Furthermore, reverting only this part is pointless, since you end up with a ._summary attribute that's not used anywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Um, I'm afraid I lost all the context over the time we're discussing this 🙈

Furthermore, reverting only this part is pointless, since you end up with a ._summary attribute that's not used anywhere.

I did that only to fix this:

Another issue is that, this removed _summary. Even though it's "internal", some plugins used that.

So, what I ended up doing in #2406 is:

  • keeping _summary, because some plugins used that, as you said
  • not calling refresh_metadata_intersite_links too soon

I could also just delete this part altogether (and thus _summary would be no longer). Can we move the discussion to #2406?

Copy link
Member

@avaris avaris Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but _summary is not used in get_summary anymore. This PR also changes that. I mean those should be reverted together.

Plugins modifying _summary do so in order to change .summary, since you can't directly access .summary too early. Now they will access and change something irrelevant. Modifying _summary will leave .summary unchanged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Now I get you. I thought _summary was accessed only for read, never for write. Now it all makes sense, sorry for my denseness.

#2406 is updated (the CI is failing for an unrelated reason, same on master), but instead of just reverting the second part I realized the changes to _summary won't be picked up by refresh_metadata_intersite_links() and the _summary handling would need to creep there as well. So now I introduced a _summary property getter/setter that's basically just a "view" onto metadata["summary"]. Also, "because some plugins used that" I assume it's not a workflow worth preserving so the getter/setter now prints a deprecation warning when the property is used.

I hope this is finally getting closer to how it should be :) And sorry for the delays, I'm extremely busy.

@avaris
Copy link
Member

avaris commented Oct 2, 2018

Also... Did anyone test this metadata intrasite links with RELATIVE_URLS = True? I have a strong suspicion that it won't work...

That's practically the whole reason for these 'link-containing' fields (content and summary) are dynamic and made to be properties rather than simple attributes.

@mosra
Copy link
Contributor

mosra commented Oct 3, 2018

You mean because now it's processed only once (and exactly once) while before it was processed on each access and thus being able to generate different values based on where the content/summary is accessed from?

(Disclaimer: I personally only tried enabling RELATIVE_URLS once (long before this PR was merged) and realized it didn't work well in my case where I have URLs like /blog/category/article/: it didn't generate proper amount of ../ parts in the relative URLs and I didn't bother further.)

@avaris
Copy link
Member

avaris commented Oct 3, 2018

Exactly.

PS: If you're getting invalid links with RELATIVE_URLS = True, either there is a config error or a bug. You should report if it is the latter :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants