Convert Wordpress caption to figure #3141

Merged
merged 1 commit into from
Oct 24, 2023


@mart-e mart-e commented Jun 4, 2023

In WordPress, inserting an image with a caption can produce any of the following shortcodes:

[caption id="attachment_42" caption="Image Description"]<a ...><img ... /></a>[/caption]
[caption id="attachment_42"]<a ...><img ... /></a> Image Description[/caption]
[caption id="attachment_42"]<img ... > Image Description[/caption]

Replace these with an HTML <figure> tag.
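The conversion described above could be sketched roughly as follows. This is a minimal illustration, not the regex merged in this PR: the pattern names and the `caption_to_figure` helper are hypothetical, and the sketch assumes each shortcode wraps a single `<img>` tag, optionally inside an `<a>` link.

```python
import re

# Hypothetical patterns, not the ones used in the PR.
# Matches a whole [caption ...]...[/caption] shortcode.
CAPTION_RE = re.compile(
    r"\[caption(?P<attrs>[^\]]*)\](?P<body>.*?)\[/caption\]", re.DOTALL
)
# Matches the optional caption="..." attribute of the shortcode.
CAPTION_ATTR_RE = re.compile(r'\bcaption="(?P<text>[^"]*)"')
# Matches a single <img> tag, optionally wrapped in an <a> link.
IMAGE_RE = re.compile(r"(?:<a\b[^>]*>\s*)?<img\b[^>]*>(?:\s*</a>)?", re.DOTALL)


def caption_to_figure(html):
    """Replace WordPress caption shortcodes with <figure> elements."""

    def replace(match):
        body = match.group("body")
        image = IMAGE_RE.search(body)
        if image is None:
            # No image found: leave the shortcode untouched.
            return match.group(0)
        attr = CAPTION_ATTR_RE.search(match.group("attrs"))
        if attr:
            # Caption given as an attribute of the shortcode.
            text = attr.group("text")
        else:
            # Caption given as trailing text after the image.
            text = body[image.end():].strip()
        return f"<figure>{image.group(0)}<figcaption>{text}</figcaption></figure>"

    return CAPTION_RE.sub(replace, html)
```

Text without any caption shortcode passes through unchanged, so the helper can be applied to a whole post body.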

Pull Request Checklist

  • Ensured tests pass and (if applicable) updated functional test output
  • Conformed to code style guidelines by running appropriate linting tools
  • Added tests for changed code
  • Updated documentation for changed code

PS: feel free to suggest a better regex; this one was tested with a few examples taken from my blog.

@mart-e mart-e force-pushed the wp-caption-to-figure branch 12 times, most recently from 09653f1 to 467e3da Compare June 4, 2023 16:10
@mart-e
Contributor Author

mart-e commented Jun 4, 2023

I don't get why the build is failing... When raising an error to see the output, it's as if 1b360ac was not merged 🤔
https://github.com/getpelican/pelican/actions/runs/5169977273/jobs/9312580636

@justinmayer
Member

It seems clear that 1b360ac was indeed merged. It appears in master as well as in the branch attached to this PR: https://github.com/mart-e/pelican/commits/wp-caption-to-figure

@mart-e
Contributor Author

mart-e commented Jun 5, 2023

@justinmayer OK, after a lot more debugging I finally understood: it's due to the pandoc version used in GitHub Actions (2.9.2) being different from the one on my machine (3.0.1).

If you check the output in the action below, you see that

<p><figure>
<img src="/theme/img/xpelican.png.pagespeed.ic.Rjep0025-y.png"/>
<figcaption>This is a pelican</figcaption>
</figure></p>

is converted using pandoc -f html+raw_html --to=gfm-smart --wrap=none and results in:

![This is a pelican](/theme/img/xpelican.png.pagespeed.ic.Rjep0025-y.png)

https://github.com/mart-e/pelican/actions/runs/5174375780/jobs/9320618059#step:8:78
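To reproduce this locally, the same pandoc invocation can be wrapped in a small script. This is only a sketch: `html_to_markdown` is a hypothetical helper, not a Pelican function, and the result depends on whichever pandoc version is on the PATH.

```python
import shutil
import subprocess

# Same options as the command quoted above.
PANDOC_ARGS = ["pandoc", "-f", "html+raw_html", "--to=gfm-smart", "--wrap=none"]


def html_to_markdown(html):
    """Return pandoc's Markdown for html, or None when pandoc is not installed."""
    if shutil.which("pandoc") is None:
        return None
    # pandoc reads from stdin and writes to stdout when no file is given.
    result = subprocess.run(
        PANDOC_ARGS, input=html, capture_output=True, text=True, check=True
    )
    return result.stdout
```

Running it on the `<figure>` snippet above under different pandoc versions should make the difference visible.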

I may have been a bit optimistic about #3114 solving all issues.

Only pandoc 3.0+ has proper support for <figure> tags:
https://pandoc.org/releases.html#pandoc-3.0-2023-01-18

Markdown writer: figures are output as implicit figures if possible, via HTML if the raw_html extension is enabled, and as Div elements otherwise.
HTML reader: <figure> elements are parsed as figures, with the caption taken from the respective <figcaption> elements.
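A version check along these lines could tell the two behaviours apart. It is a sketch that assumes the first line of `pandoc --version` has the form `pandoc X.Y.Z`; both function names are hypothetical.

```python
import re


def parse_pandoc_version(version_output):
    """Extract the version tuple from the output of `pandoc --version`."""
    match = re.search(r"pandoc(?:\.exe)?\s+(\d+(?:\.\d+)*)", version_output)
    if match is None:
        raise ValueError("could not find a pandoc version number")
    return tuple(int(part) for part in match.group(1).split("."))


def has_native_figures(version):
    """True for pandoc 3.0 and later, where <figure> is parsed as a figure."""
    return version[0] >= 3
```

For example, `has_native_figures(parse_pandoc_version("pandoc 2.9.2"))` is false, which matches the behaviour seen in the GitHub Actions run.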

The easiest fix for my issue would be to upgrade the required version of pandoc, but that's probably not the best option, as even the latest Ubuntu still ships 2.17.

I see a few better solutions:

  • modify my test to be a bit less strict and just check that [caption] tags are converted (to whatever pandoc allows)
  • instead of using <figure>...<figcaption>Caption Text</figcaption></figure>, convert to <p><img /></p><p>Caption Text</p>
  • adapt the test to fit the 2.X behaviour and update it again once we upgrade to pandoc 3.X (but tests will fail locally for users with a different pandoc)

I tend to prefer the first one, although it produces a different result depending on the version of pandoc installed.
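The first option could look roughly like this. It is a sketch, not the test added in the second commit: the helper name is hypothetical, and the pandoc 3.x sample string is an assumed shape rather than captured output.

```python
def caption_shortcodes_converted(markdown):
    """True when no WordPress caption shortcode survives in the output."""
    return "[caption" not in markdown and "[/caption]" not in markdown


# Output reported in this thread for pandoc 2.9.2.
pandoc_2_output = (
    "![This is a pelican](/theme/img/xpelican.png.pagespeed.ic.Rjep0025-y.png)"
)
# Assumed shape of a pandoc 3.x result that keeps the raw HTML figure.
pandoc_3_output = (
    "<figure>\n"
    '<img src="/theme/img/xpelican.png.pagespeed.ic.Rjep0025-y.png" />\n'
    "<figcaption>This is a pelican</figcaption>\n"
    "</figure>"
)

for output in (pandoc_2_output, pandoc_3_output):
    # The check is version independent: it only requires that the
    # shortcode is gone and the caption text is still present.
    assert caption_shortcodes_converted(output)
    assert "This is a pelican" in output
```

Checking only for the absence of the shortcode and the presence of the caption text keeps the test stable across pandoc versions, at the cost of not pinning down the exact markup.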

Let me know if ok with you.

edit: did that in the second commit, will merge them if ok

@mart-e mart-e marked this pull request as ready for review June 5, 2023 10:27
@justinmayer
Member

I agree that the first option sounds better. 👍


@justinmayer justinmayer left a comment

Many thanks for your work on this, Martin. We can revisit this topic once Pandoc 3.0+ is ubiquitous, but until then your solution strikes a good balance. 👍

@justinmayer justinmayer merged commit 620139c into getpelican:master Oct 24, 2023
10 checks passed
@mart-e mart-e deleted the wp-caption-to-figure branch October 24, 2023 08:33