Fix #290
Fixes include:
Correct the code to handle the difference in unicode strings between Python 2.7 and Python 3.4.
Build the abstract during the indexing pass so that any references in it are included on both passes through the document.
Add the start of a unicode test file.
 - Legacy-Id: 2013
jimsch committed Jul 18, 2015
1 parent 2a13e1d commit f6c0d33
Showing 3 changed files with 57 additions and 1 deletion.
49 changes: 49 additions & 0 deletions cli/tests/input/unicode.xml
@@ -0,0 +1,49 @@
<?xml version="1.0" encoding="UTF-8"?>
<rfc category="info" docName="draft-sample-input-00"
ipr="trust200902" submissionType="IETF">
<front>
<title abbrev="Abbreviated Title">Put Your Internet Draft Title</title>

<author fullname="John Doe" initials="J." role="editor"
surname="Doe">
<organization abbrev="Company">Company</organization>

<address>
<postal>
<street></street>
<city>Springfield</city>
<region>IL</region>
<country>US</country>
</postal>

<email>jdoe@example.com</email>
</address>
</author>

<date month="December" year="2010" day="10"/>

<abstract>
<t>Insert an abstract: MANDATORY. This template is for creating an
Internet-Draft. With some out of scope characters
in Chinese, by Xing Xing, 这里是中文译本
</t>
</abstract>
</front>

<middle>
<section title="Some unicode strings">
<t>
Text body needs to deal with funny characters
</t>
<t>
Pure out of scope 这里是中文译本
</t>
<t>
Some re-mapped characters are ¢ or ©
</t>
<t>
More re-mapped characters are ˜ and € and &#94;
</t>
</section>
</middle>
</rfc>
2 changes: 1 addition & 1 deletion cli/xml2rfc/utils.py
@@ -307,7 +307,7 @@ def _replace_unicode_characters(str):
     if match.group(1) in _unicode_replacements:
         str = re.sub(match.group(1), _unicode_replacements[match.group(1)], str)
     else:
-        entity = match.group(1).encode('ascii', 'xmlcharrefreplace')
+        entity = match.group(1).encode('ascii', 'xmlcharrefreplace').decode('ascii')
         str = re.sub(match.group(1), entity, str)
         xml2rfc.log.warn('Illegal character replaced in string: ' + entity)
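The reason for the added `.decode('ascii')` can be illustrated with a small standalone sketch. On Python 3, `str.encode` returns `bytes`, which cannot be concatenated with a `str` (as the `log.warn` call above does), so the character reference has to be turned back into text first. The helper name `to_char_ref` and the sample string are assumptions made for illustration; they are not part of xml2rfc.

```python
import re


def to_char_ref(ch):
    """Return the XML numeric character reference for ch as text.

    Hypothetical helper mirroring the changed line: encode() yields
    bytes on Python 3, and decode('ascii') converts them back to str.
    """
    return ch.encode('ascii', 'xmlcharrefreplace').decode('ascii')


# Without the decode step the result is a bytes object on Python 3.
raw = '\u00a9'.encode('ascii', 'xmlcharrefreplace')

# With the decode step the entity is ordinary text and can be used
# both as a replacement string and in string concatenation.
entity = to_char_ref('\u00a9')
replaced = re.sub('\u00a9', entity, 'Copyright \u00a9 2015')
message = 'Illegal character replaced in string: ' + entity

print(type(raw).__name__)
print(entity)
print(replaced)
```

Run under Python 3, `raw` is `bytes` while `entity` is `str`, which is the difference the one-line change in `utils.py` addresses.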

7 changes: 7 additions & 0 deletions cli/xml2rfc/writers/base.py
@@ -1022,6 +1022,13 @@ def _build_index(self):
         self.eref_count = 0
         self.pis = self.xmlrfc.getpis()

+        # Abstract
+        abstract = self.r.find('front/abstract')
+        if abstract is not None:
+            self.write_heading('Abstract', autoAnchor='rfc.abstract')
+            for t in abstract.findall('t'):
+                self.write_t_rec(t)
+
         # Middle sections
         middle = self.r.find('middle')
         if middle is not None:
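Why the abstract has to be visited during the indexing pass can be sketched with a simplified, self-contained model. The function `collect_targets`, the sample document, and the use of `xml.etree.ElementTree` are all assumptions for illustration; the real writer works through `write_heading` and `write_t_rec` and is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document with one reference in the abstract
# and one in the body.
DOC = """<rfc>
  <front>
    <abstract><t>See <xref target="RFC2119"/>.</t></abstract>
  </front>
  <middle>
    <section title="Intro"><t>Also <xref target="RFC8174"/>.</t></section>
  </middle>
</rfc>"""


def collect_targets(root, include_abstract):
    """Simplified stand-in for an index pass: gather xref targets."""
    targets = []
    if include_abstract:
        abstract = root.find('front/abstract')
        if abstract is not None:
            for t in abstract.findall('t'):
                targets.extend(x.get('target') for x in t.iter('xref'))
    middle = root.find('middle')
    if middle is not None:
        targets.extend(x.get('target') for x in middle.iter('xref'))
    return targets


root = ET.fromstring(DOC)
without_abstract = collect_targets(root, include_abstract=False)
with_abstract = collect_targets(root, include_abstract=True)
print(without_abstract)
print(with_abstract)
```

In this model, skipping the abstract means the reference to `RFC2119` is never seen by the index pass, which is the kind of omission the added block in `_build_index` appears intended to prevent.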