sub/sup regression in 3.7.0 TXT #634

Open
ietf-svn-bot opened this issue May 5, 2021 · 17 comments
Labels
medium text Issues in text output under_review

Comments

@ietf-svn-bot

type_defect | by cabo@tzi.org


Foo<sub>bar</sub>baz

renders correctly in HTML

had a recognizable surrogate in TXT in 3.5.0: Foo_(bar)baz

is munched up in TXT in 3.7.0: Foo_barbaz


Issue migrated from trac:634 at 2022-02-08 07:15:03 +0000

@ietf-svn-bot

@cabo@tzi.org commented


RFC 8949 says:

   superscript notation denotes exponentiation.  For example, 2 to the
   power of 64 is notated: 2^(64).  In the plain-text version of this
   specification, superscript notation is not available and therefore is
   rendered by a surrogate notation.  That notation is not optimized for
   this RFC; it is unfortunately ambiguous with C's exclusive-or (which
   is only used in the appendices, which in turn do not use
   exponentiation) and requires circumspection from the reader of the
   plain-text version.

which would now change in a re-rendering.

E.g.,

   *  an integer in the range -2^(64)..2^(64)-1 inclusive

would become

   *  an integer in the range -2^64..2^64-1 inclusive

which may or may not mean the same thing to readers.

@ietf-svn-bot

@cabo@tzi.org changed component from v3 vocabulary to Version_3_cli_txt

@ietf-svn-bot

@cabo@tzi.org commented


(fixed component not to be vocabulary, which doesn't seem to get worked on)

@ietf-svn-bot

@rjsparks@nostrum.com changed status from new to under_review

@ietf-svn-bot

@rjsparks@nostrum.com commented


Can you provide a real example where this has been a problem? This is the result of a requested change, driven by Martin and Lars, which was accepted by the CMT to simplify the text rendering. See #590. The reaction to this change has been positive.

This may be an edge case for which we need to consider creating different behavior (when text immediately follows the sub).

@ietf-svn-bot

@cabo@tzi.org commented


The problem is that we have a canonical form that makes the structure perfectly clear, and an HTML/PDF rendering that is not much worse.

So the TXT form is always going to be an afterthought, and each time you tweak it in one direction, it gets worse for something else.

Changing the canonical form/HTML/PDF to fix the TXT is a non-starter.
The only real solution will be adding a capability for the author to tweak the .TXT. That is against current RFCXML ideology.

@ietf-svn-bot

@martin.thomson@gmail.com commented


That ideology has not been strictly adhered to. Why bother pretending that it needs to be? There are a few other places where the XML contains instructions that are only executed for the text rendering. The same can apply here. <sup paren="true">...</sup> or equivalent seems totally inoffensive next to <ul indent="5">.

@ietf-svn-bot

@martin.thomson@gmail.com changed _comment0 which not transferred by tractive

@ietf-svn-bot

@cabo@tzi.org commented


Nice. Maybe more like <sup txtl="**" txtr=""> or <sup txtl="^(" txtr=")"> -- better give the author full control over the weirdness they need. Default could stay "^" and "" (new) or "^(" and ")" (old).

@ietf-svn-bot

@martin.thomson@gmail.com commented


I would not include the caret in txtl, or move it to a different attribute (with a default value).

Otherwise this seems like a good direction to me, though I caution that the defaults are not as simple as that. 2<sup>64</sup> renders as 2^64, but 2<sup>n+r</sup> renders as 2^(n+r). That means there can't be a fixed default for any attributes.
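
A minimal sketch of the content-dependent default described above. The function name `render_sup` and the test for "simple" content (letters, digits, and underscores only) are assumptions made for illustration; the actual xml2rfc rule may differ.

```python
import re


def render_sup(content: str) -> str:
    """Hypothetical helper: plain-text surrogate for <sup>content</sup>."""
    # Assumption: content made only of word characters needs no parens.
    if re.fullmatch(r"\w+", content):
        return "^" + content
    # Anything else (operators, spaces, ...) is wrapped in parens.
    return "^(" + content + ")"


print("2" + render_sup("64"))   # 2^64
print("2" + render_sup("n+r"))  # 2^(n+r)
```

Because the parens depend on the content, no single fixed attribute default reproduces both outputs.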

@ietf-svn-bot

@cabo@tzi.org commented


OK, <sup op="^" l="(" r=")">, where the default of op is "^" (for sup, and "_" for sub) and the default of l and r is #implied (i.e., to be computed in the preptool based on the complexity of the element content).
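
A rough sketch of how the proposed `op`, `l`, and `r` attributes could resolve, assuming that `None` stands for #implied and that "complexity" means "contains anything other than word characters". The function `render_script` is hypothetical and is not part of xml2rfc.

```python
import re
from typing import Optional

# Default operators as proposed: "^" for sup, "_" for sub.
DEFAULT_OP = {"sup": "^", "sub": "_"}


def render_script(tag: str, content: str,
                  op: Optional[str] = None,
                  l: Optional[str] = None,
                  r: Optional[str] = None) -> str:
    """Hypothetical text rendering of <sup>/<sub> with author overrides."""
    if op is None:
        op = DEFAULT_OP[tag]
    # Assumption: the implied l/r are empty for simple content, parens otherwise.
    simple = re.fullmatch(r"\w+", content) is not None
    if l is None:
        l = "" if simple else "("
    if r is None:
        r = "" if simple else ")"
    return op + l + content + r


print("2" + render_script("sup", "64"))                # 2^64
print("2" + render_script("sup", "64", l="(", r=")"))  # 2^(64)
print("2" + render_script("sup", "64", op="**"))       # 2**64
```

Explicit values always win; the computed values only fill in attributes the author left out.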

@ietf-svn-bot

@cabo@tzi.org changed _comment0 which not transferred by tractive

@ietf-svn-bot

@cabo@tzi.org commented


The brokenness of #590 is already being discussed there under "space".
It is not just the element content of the sub/sup that can cause a need for paren packing, it is also the incompatibility between the content and the right context.
That seems extremely simple to fix in the code that guesses the default values for l/r, without the (also desirable) vocabulary fixes that are being discussed here.
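
One way the guessing code could take the right context into account, as a sketch only. The names `implied_parens` and `render_sub` are hypothetical, and the rule "add parens when the following text starts with a word character" is an assumption, not the xml2rfc implementation.

```python
import re


def implied_parens(content: str, right_context: str) -> bool:
    """Hypothetical rule for whether a sub/sup needs parens in text output."""
    # Complex content always gets parens.
    if not re.fullmatch(r"\w+", content):
        return True
    # Assumption: simple content still needs parens when the text that
    # follows would otherwise run into it.
    return bool(right_context) and re.match(r"\w", right_context) is not None


def render_sub(content: str, right_context: str = "") -> str:
    if implied_parens(content, right_context):
        return "_(" + content + ")"
    return "_" + content


print("Foo" + render_sub("bar", "baz") + "baz")  # Foo_(bar)baz
print("a" + render_sub("b", " is") + " is")      # a_b is
```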

@ietf-svn-bot

@rjsparks@nostrum.com commented


Per CMT discussion today, we'll pursue the simple paren="true" mod on the existing implementation, but not try to go down the path of finer-grained control.

@ietf-svn-bot

@mahoney@nostrum.com commented


RFC 9043 (currently in AUTH48) would benefit from this paren="true" enhancement.

In RFC 9043, a_b is constructed as a<sub>b</sub> and represents the value of a sequence. slice_x is a variable name. It is unclear in the text file which is a subscript and which is not. (I also mentioned this in ticket #574.)

https://www.rfc-editor.org/v3test/rfc9043.xml
https://www.rfc-editor.org/v3test/rfc9043.txt

@ietf-svn-bot

@cabo@tzi.org commented


Re the CMT result (comment 11):

op="**" would have completely resolved the ugliness in RFC 8949.

I cannot agree with the decision not to provide that.

@ietf-svn-bot

@cabo@tzi.org commented


Re RFC-to-be 9043:
You undo the paren regression by inserting a U+2009 (zero-width space) into the subscript.
No paren= needed.
