Regression: words missing using text-align: justify in pdfbox 2.0.21 #543

Closed
lagar84 opened this issue Aug 28, 2020 · 3 comments

lagar84 commented Aug 28, 2020

Hello,

I noticed that "open-dev-v1" upgraded PDFBOX to version 2.0.21.

I was upgrading some libraries, so I gave PDFBOX 2.0.21 a try too.

Unfortunately, I found a regression.

I created a reproducer to illustrate the issue:

minimal_reproducer_missing_words.txt

Using openhtml 1.0.4 with PDFBOX 2.0.20 works:
[Screenshot: correct output, openhtml 1.0.4 with PDFBOX 2.0.20]

But using openhtml 1.0.4 with PDFBOX 2.0.21, not all words appear in the generated PDF:
[Screenshot: wrong output, openhtml 1.0.4 with PDFBOX 2.0.21]

I am not sure whether this issue is on the openhtml or the PDFBOX side, so I am describing it here in the hope that someone with proper knowledge can weigh in.

Thanks for this great library. Hope you can continue this great work.

Best wishes,
lagar84.

danfickle added a commit that referenced this issue Aug 29, 2020
PDFBOX 2.0.21 is now reporting nbsp as having zero width (in built-in fonts), which means line breaking or justification involving this character will be wrong.

Previously, in 2.0.20, it was reporting nbsp as not existing, so we replaced it with a normal space.

With test kindly provided by @lagar84
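
As a rough illustration of the fallback described in the commit message (this is not the project's actual code), the idea of swapping nbsp for a normal space before measuring can be sketched in plain Java. The class and method names here are hypothetical, and the sketch only covers measurement, not how the text is later written to the PDF.

```java
public class NbspFallback {
    // U+00A0 NO-BREAK SPACE
    static final char NBSP = '\u00A0';

    /**
     * Hypothetical helper: returns a copy of the text in which every nbsp
     * is replaced by an ordinary space, so that a font which cannot measure
     * nbsp reliably is asked about a character it does know.
     */
    static String normalizeForMeasurement(String text) {
        return text.replace(NBSP, ' ');
    }

    public static void main(String[] args) {
        String original = "10\u00A0km";
        String normalized = normalizeForMeasurement(original);
        // The length is unchanged; only the space character differs.
        System.out.println(normalized.length() == original.length());
        System.out.println(normalized.indexOf(NBSP) < 0);
    }
}
```

The replacement keeps the character count the same, so any per-character bookkeeping done by a layout engine would still line up with the original string.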
@danfickle
Owner

Huge thanks to @lagar84.

It turns out that PDFBOX 2.0.21 is reporting non-breaking space as having zero width. This means that our project thinks the missing words should fit on the first line. When the text is actually output, the nbsp takes up width and pushes the missing words into the page margin.
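
To make the mismatch concrete, here is a small self-contained simulation. It does not call PDFBOX; the per-character widths (5 units each) and the 25-unit line width are made-up numbers, and `measure` is a hypothetical stand-in for a font metrics lookup. It only shows how a zero-width report for nbsp can make layout accept a line that overflows once rendered.

```java
public class NbspWidthDemo {
    // U+00A0 NO-BREAK SPACE
    static final char NBSP = '\u00A0';

    // Assumed metrics: every glyph is 5 units wide when rendered,
    // but the "broken" metrics report nbsp as 0 units.
    static double brokenWidth(char c)   { return c == NBSP ? 0.0 : 5.0; }
    static double renderedWidth(char c) { return 5.0; }

    /** Sums per-character widths using either the broken or the rendered metrics. */
    static double measure(String text, boolean broken) {
        double total = 0;
        for (char c : text.toCharArray()) {
            total += broken ? brokenWidth(c) : renderedWidth(c);
        }
        return total;
    }

    public static void main(String[] args) {
        String line = "a\u00A0b\u00A0c\u00A0d"; // 4 letters, 3 nbsp
        double lineWidth = 25.0;

        double measured = measure(line, true);  // 4 * 5 + 3 * 0 = 20
        double rendered = measure(line, false); // 7 * 5 = 35

        // prints "layout thinks it fits: true"
        System.out.println("layout thinks it fits: " + (measured <= lineWidth));
        // prints "actually overflows by: 10.0"
        System.out.println("actually overflows by: " + (rendered - lineWidth));
    }
}
```

With justified text the effect would be similar: the extra space to distribute is computed from the measured width, so an underestimate leaves the rendered line wider than the box.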

I have filed issue PDFBOX-4944 ("Built-in fonts are reporting nbsp char as having zero width") so that the PDFBOX team can address this. In the meantime, I have downgraded to PDFBOX 2.0.20 and added your test to our project.

Thanks again. You can leave this issue open until we come up with a new version of PDFBOX.

@danfickle
Owner

I have confirmed this is fixed with 2.0.22-SNAPSHOT.

@danfickle
Owner

Fixed with release of 1.0.6 and PDFBOX 2.0.22. Thanks again @lagar84.
