fix: RTL unicode issue in PDF (#884)
kesara authored Aug 29, 2022
1 parent 2c9dfaf commit 9821dc6
Showing 28 changed files with 247 additions and 167 deletions.
6 changes: 6 additions & 0 deletions test.py
@@ -491,6 +491,7 @@ def _pdfwriter(path):
except Exception as e:
print(e)
raise
cls.pdf_writer = elements_writer
cls.elements_root = elements_writer.root
cls.elements_pdfxml = xmldoc(None, bytes=elements_pdfdoc)

@@ -516,5 +517,10 @@ def test_included_fonts(self):
family = xml2rfc.util.fonts.get_noto_serif_family_for_script(script)
self.assertIn(family, font_families, 'Missing font match for %s' % script)

    def test_flatten_unicode_spans(self):
        input_html = '<body><p>f<span class="unicode">o</span>o<span class="unicode">ba</span>r</p></body>'
        output_html = self.pdf_writer.flatten_unicode_spans(input_html)
        self.assertEqual(output_html, '<body><p>foobar</p></body>')

if __name__ == '__main__':
    unittest.main()
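
The new test in test.py pins down what flatten_unicode_spans is expected to do: every <span class="unicode"> wrapper is removed and its inner text is kept in place. The following is only a minimal sketch of that behaviour, assuming a regex-based approach and a standalone function; it is not the xml2rfc implementation, which lives on the PDF writer and may work differently. The names _UNICODE_SPAN and the module-level flatten_unicode_spans are hypothetical.

```python
import re

# Hypothetical stand-in, NOT the xml2rfc implementation: matches a
# non-nested <span class="unicode">...</span> and captures its content.
_UNICODE_SPAN = re.compile(r'<span class="unicode">(.*?)</span>', re.DOTALL)


def flatten_unicode_spans(html: str) -> str:
    """Replace each <span class="unicode">...</span> with its inner text."""
    return _UNICODE_SPAN.sub(r'\1', html)


if __name__ == '__main__':
    sample = ('<body><p>f<span class="unicode">o</span>o'
              '<span class="unicode">ba</span>r</p></body>')
    # Same input and expected output as test_flatten_unicode_spans.
    print(flatten_unicode_spans(sample))  # <body><p>foobar</p></body>
```

A regex keeps the sketch short, but it assumes the spans are not nested and carry exactly the attribute shown in the test; a parser-based approach would be more robust for arbitrary HTML.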
12 changes: 6 additions & 6 deletions tests/valid/docfile.html
@@ -4,15 +4,15 @@
<meta charset="utf-8">
<meta content="Cherokee,Common,Greek,Latin" name="scripts">
<meta content="initial-scale=1.0" name="viewport">
<title>Xml2rfc Vocabulary Version 3 Schema xml2rfc release 3.13.1</title>
<title>Xml2rfc Vocabulary Version 3 Schema xml2rfc release 3.14.1</title>
<meta content="xml2rfc(1)" name="author">
<meta content="
This document provides information about the XML schema implemented in this release of xml2rfc, and the individual elements of that schema. The document is generated from the RNG schema file that is part of the xml2rfc distribution, so schema information in this document should always be in sync with the schema in actual use. The textual descriptions depend on manual updates in order to reflect the implementation.
" name="description">
<meta content="xml2rfc 3.13.1" name="generator">
<meta content="xml2rfc-docs-3.13.1" name="ietf.draft">
<meta content="xml2rfc 3.14.1" name="generator">
<meta content="xml2rfc-docs-3.14.1" name="ietf.draft">
<link href="tests/out/docfile.xml" rel="alternate" type="application/rfc+xml">
<link href="#copyright" rel="license">
<link href="xml2rfc.css" rel="stylesheet">
@@ -45,7 +45,7 @@
</dd>
</dl>
</div>
<h1 id="title">Xml2rfc Vocabulary Version 3 Schema<br>xml2rfc release 3.13.1</h1>
<h1 id="title">Xml2rfc Vocabulary Version 3 Schema<br>xml2rfc release 3.14.1</h1>
<section id="section-abstract">
<h2 id="abstract"><a href="#abstract" class="selfRef">Abstract</a></h2>
<p id="section-abstract-1">
@@ -367,7 +367,7 @@ <h2 id="name-introduction">
<p id="section-1-5">
The latest version of this documentation is available in HTML form at <span><a href="https://ietf-tools.github.io/xml2rfc/">https://ietf-tools.github.io/xml2rfc/</a></span>.<a href="#section-1-5" class="pilcrow"></a></p>
<p id="section-1-6">
This documentation applies to xml2rfc version 3.13.1.<a href="#section-1-6" class="pilcrow"></a></p>
This documentation applies to xml2rfc version 3.14.1.<a href="#section-1-6" class="pilcrow"></a></p>
</section>
<section id="section-2">
<h2 id="name-schema-version-3-elements">
@@ -6351,7 +6351,7 @@ <h2 id="name-xml2rfc-documentation-templ">
<p id="appendix-D-1">

The following variables are available for use in an xml2rfc
manpage Jinja2 template, as of xml2rfc version 3.13.1:<a href="#appendix-D-1" class="pilcrow"></a></p>
manpage Jinja2 template, as of xml2rfc version 3.14.1:<a href="#appendix-D-1" class="pilcrow"></a></p>
<span class="break"></span><dl class="dlNewline" id="appendix-D-2">
<dt id="appendix-D-2.1">{{ bare_latin_tags }}:</dt>
<dd style="margin-left: 1.5em" id="appendix-D-2.2"></dd>
8 changes: 6 additions & 2 deletions tests/valid/draft-miek-test.html
@@ -16,7 +16,7 @@
This version is adapted to work with "xml2rfc" version 2.x.
' name="description">
<meta content="xml2rfc 3.13.1" name="generator">
<meta content="xml2rfc 3.14.1" name="generator">
<meta content="RFC" name="keyword">
<meta content="Request for Comments" name="keyword">
<meta content="I-D" name="keyword">
@@ -26,7 +26,7 @@
<meta content="Extensible Markup Language" name="keyword">
<meta content="draft-gieben-writing-rfcs-pandoc-02" name="ietf.draft">
<!-- Generator version information:
xml2rfc 3.13.1
xml2rfc 3.14.1
Python 3.9.13
appdirs 1.4.4
ConfigArgParse 1.5.3
@@ -184,6 +184,7 @@
margin: 1em 0;
}
.alignCenter > *:first-child {
display: table;
border: none;
margin: 0 auto;
}
@@ -1078,6 +1079,9 @@
td {
border-top: 1px solid #ddd;
}
tr {
break-inside: avoid;
}
tr:nth-child(2n+1) > td {
background-color: #f8f8f8;
}
20 changes: 9 additions & 11 deletions tests/valid/draft-template-old.exp.xml
@@ -474,17 +474,15 @@ main(int argc, char *argv[])
<references title="Normative References">
<!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?-->
<reference anchor="RFC2119" target="https://www.rfc-editor.org/info/rfc2119" xml:base="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml" quote-title="true">
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author fullname="S. Bradner" initials="S" surname="Bradner"/>
<date month="March" year="1997"/>
<abstract>
<t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
</abstract>
</front>
<seriesInfo name="BCP" value="14"/>
<seriesInfo name="RFC" value="2119"/>
<seriesInfo name="DOI" value="10.17487/RFC2119"/>
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author initials="S." surname="Bradner" fullname="S. Bradner"><organization/></author>
<date year="1997" month="March"/>
<abstract><t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t></abstract>
</front>
<seriesInfo name="BCP" value="14"/>
<seriesInfo name="RFC" value="2119"/>
<seriesInfo name="DOI" value="10.17487/RFC2119"/>
</reference>

<reference anchor="min_ref" quote-title="true">
12 changes: 7 additions & 5 deletions tests/valid/draft-template-old.prepped.xml
@@ -1,6 +1,6 @@
<?xml version='1.0' encoding='utf-8'?>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" version="3" category="info" consensus="false" docName="draft-ietf-xml2rfc-template-05" indexInclude="true" ipr="trust200902" prepTime="2022-08-09T04:13:58" scripts="Common,Latin" sortRefs="true" submissionType="IETF" symRefs="true" tocDepth="4" tocInclude="true" xml:lang="en">
<!-- xml2rfc v2v3 conversion 3.13.1 -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" version="3" category="info" consensus="false" docName="draft-ietf-xml2rfc-template-05" indexInclude="true" ipr="trust200902" prepTime="2022-08-29T03:59:06" scripts="Common,Latin" sortRefs="true" submissionType="IETF" symRefs="true" tocDepth="4" tocInclude="true" xml:lang="en">
<!-- xml2rfc v2v3 conversion 3.14.1 -->



@@ -553,10 +553,12 @@ main(int argc, char *argv[])
<reference anchor="RFC2119" target="https://www.rfc-editor.org/info/rfc2119" xml:base="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml" quoteTitle="true" derivedAnchor="RFC2119">
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author fullname="S. Bradner" initials="S" surname="Bradner"/>
<date month="March" year="1997"/>
<author initials="S." surname="Bradner" fullname="S. Bradner">
<organization showOnFrontPage="true"/>
</author>
<date year="1997" month="March"/>
<abstract>
<t indent="0">In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
<t indent="0">In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
</abstract>
</front>
<seriesInfo name="BCP" value="14"/>
10 changes: 6 additions & 4 deletions tests/valid/draft-template-old.v2v3.xml
@@ -33,7 +33,7 @@
<?rfc multiple-initials="yes" ?>
<!-- end of list of popular I-D processing instructions -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="info" docName="draft-ietf-xml2rfc-template-05" ipr="trust200902" obsoletes="" updates="" submissionType="IETF" xml:lang="en" tocInclude="true" tocDepth="4" symRefs="true" sortRefs="true" version="3">
<!-- xml2rfc v2v3 conversion 3.13.1 -->
<!-- xml2rfc v2v3 conversion 3.14.1 -->
<?v3xml2rfc silence=".*[Pp]ostal address" ?>
<?v3xml2rfc silence="The document date .*? is more than 3 days away from today's date" ?>
<?v3xml2rfc silence="Found SVG with width or height specified" ?>
@@ -452,10 +452,12 @@ main(int argc, char *argv[])
<reference anchor="RFC2119" target="https://www.rfc-editor.org/info/rfc2119" xml:base="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml">
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author fullname="S. Bradner" initials="S" surname="Bradner"/>
<date month="March" year="1997"/>
<author initials="S." surname="Bradner" fullname="S. Bradner">
<organization/>
</author>
<date year="1997" month="March"/>
<abstract>
<t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
<t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
</abstract>
</front>
<seriesInfo name="BCP" value="14"/>
8 changes: 5 additions & 3 deletions tests/valid/draft-template.exp.xml
@@ -417,10 +417,12 @@ main(int argc, char *argv[])
<reference anchor="RFC2119" target="https://www.rfc-editor.org/info/rfc2119" xml:base="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml">
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author fullname="S. Bradner" initials="S" surname="Bradner"/>
<date month="March" year="1997"/>
<author initials="S." surname="Bradner" fullname="S. Bradner">
<organization/>
</author>
<date year="1997" month="March"/>
<abstract>
<t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
<t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
</abstract>
</front>
<seriesInfo name="BCP" value="14"/>
8 changes: 6 additions & 2 deletions tests/valid/draft-template.html
@@ -11,11 +11,11 @@
Insert an abstract: MANDATORY. This template is for creating an
Internet Draft.
" name="description">
<meta content="xml2rfc 3.13.1" name="generator">
<meta content="xml2rfc 3.14.1" name="generator">
<meta content="template" name="keyword">
<meta content="draft-ietf-xml2rfc-template-05" name="ietf.draft">
<!-- Generator version information:
xml2rfc 3.13.1
xml2rfc 3.14.1
Python 3.9.13
appdirs 1.4.4
ConfigArgParse 1.5.3
@@ -173,6 +173,7 @@
margin: 1em 0;
}
.alignCenter > *:first-child {
display: table;
border: none;
margin: 0 auto;
}
@@ -1067,6 +1068,9 @@
td {
border-top: 1px solid #ddd;
}
tr {
break-inside: avoid;
}
tr:nth-child(2n+1) > td {
background-color: #f8f8f8;
}
12 changes: 7 additions & 5 deletions tests/valid/draft-template.prepped.xml
@@ -1,6 +1,6 @@
<?xml version='1.0' encoding='utf-8'?>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" version="3" category="info" consensus="false" docName="draft-ietf-xml2rfc-template-05" indexInclude="true" ipr="trust200902" prepTime="2022-08-09T04:25:24" scripts="Common,Latin" sortRefs="true" submissionType="IETF" symRefs="true" tocDepth="4" tocInclude="true">
<!-- xml2rfc v2v3 conversion 3.13.1 -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" version="3" category="info" consensus="false" docName="draft-ietf-xml2rfc-template-05" indexInclude="true" ipr="trust200902" prepTime="2022-08-29T03:58:42" scripts="Common,Latin" sortRefs="true" submissionType="IETF" symRefs="true" tocDepth="4" tocInclude="true">
<!-- xml2rfc v2v3 conversion 3.14.1 -->



@@ -553,10 +553,12 @@ main(int argc, char *argv[])
<reference anchor="RFC2119" target="https://www.rfc-editor.org/info/rfc2119" xml:base="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml" quoteTitle="true" derivedAnchor="RFC2119">
<front>
<title>Key words for use in RFCs to Indicate Requirement Levels</title>
<author fullname="S. Bradner" initials="S" surname="Bradner"/>
<date month="March" year="1997"/>
<author initials="S." surname="Bradner" fullname="S. Bradner">
<organization showOnFrontPage="true"/>
</author>
<date year="1997" month="March"/>
<abstract>
<t indent="0">In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
<t indent="0">In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
</abstract>
</front>
<seriesInfo name="BCP" value="14"/>
12 changes: 6 additions & 6 deletions tests/valid/elements.bom.text
@@ -1327,8 +1327,8 @@ CMS References
Levkowetz, H., "Implementation notes for RFC7991,", Work
in Progress, Internet-Draft, draft-levkowetz-xml2rfc-v3-
implementation-notes-13, 16 September 2021,
<https://www.ietf.org/archive/id/draft-levkowetz-xml2rfc-
v3-implementation-notes-13.txt>.
<https://datatracker.ietf.org/api/v1/doc/document/draft-
levkowetz-xml2rfc-v3-implementation-notes/>.

[RFC5083] Housley, R., "Cryptographic Message Syntax (CMS)
Authenticated-Enveloped-Data Content Type", RFC 5083,
@@ -1420,10 +1420,9 @@ Internet-Draft Xml2rfc Vocabulary V3 Elements July 2018

[REPUBLIC] Πλάτων (Plato), "Πολιτεία", 375 BC.

[RFC0952] Harrenstien, K., Stahl, M K., and E J. Feinler, "DoD
Internet host table specification", RFC 952,
DOI 10.17487/RFC0952, October 1985,
<https://www.rfc-editor.org/info/rfc952>.
[RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet
host table specification", RFC 952, DOI 10.17487/RFC0952,
October 1985, <https://www.rfc-editor.org/info/rfc952>.

Appendix A. Some Back Matter

@@ -1453,6 +1452,7 @@ A.1. Contributors




Author, et al. Expires 13 January 2019 [Page 26]

Internet-Draft Xml2rfc Vocabulary V3 Elements July 2018
12 changes: 6 additions & 6 deletions tests/valid/elements.pages.text
@@ -1327,8 +1327,8 @@ CMS References
Levkowetz, H., "Implementation notes for RFC7991,", Work
in Progress, Internet-Draft, draft-levkowetz-xml2rfc-v3-
implementation-notes-13, September 16, 2021,
<https://www.ietf.org/archive/id/draft-levkowetz-xml2rfc-
v3-implementation-notes-13.txt>.
<https://datatracker.ietf.org/api/v1/doc/document/draft-
levkowetz-xml2rfc-v3-implementation-notes/>.

[RFC5083] Housley, R., "Cryptographic Message Syntax (CMS)
Authenticated-Enveloped-Data Content Type", RFC 5083,
@@ -1420,10 +1420,9 @@ Internet-Draft Xml2rfc Vocabulary V3 Elements July 2018

[REPUBLIC] Πλάτων (Plato), "Πολιτεία", 375 BC.

[RFC0952] Harrenstien, K., Stahl, M K., and E J. Feinler, "DoD
Internet host table specification", RFC 952,
DOI 10.17487/RFC0952, October 1985,
<https://www.rfc-editor.org/info/rfc952>.
[RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet
host table specification", RFC 952, DOI 10.17487/RFC0952,
October 1985, <https://www.rfc-editor.org/info/rfc952>.

Appendix A. Some Back Matter

@@ -1453,6 +1452,7 @@ A.1. Contributors




Author, et al. Expires January 13, 2019 [Page 26]

Internet-Draft Xml2rfc Vocabulary V3 Elements July 2018