Render integer values in <sup> simply #590
Comments
@rjsparks@nostrum.com commented Thanks for the patch. |
@martin.thomson@gmail.com uploaded file Render integer subscripts simply |
@martin.thomson@gmail.com commented Ahh, I didn't see that one. I can easily change the pattern matching here, but it's not clear what the rules would be for deciding. Are you thinking |
@lars@eggert.org commented Could we make the same change to |
@martin.thomson@gmail.com uploaded file Simple rendering for super- and sub-scripts |
@jennifer@painless-security.com changed status from |
@jennifer@painless-security.com changed owner from `` to |
@jennifer@painless-security.com commented One exceptional case that jumps out at me is the case where someone explicitly wants parentheses in the HTML output. E.g., As this is a problem with the current text renderer and solving this as a general problem is tricky, perhaps that's best left as a separate issue. (Or put aside entirely.) |
@jennifer@painless-security.com commented How would you feel if I simplify the pattern to |
@jennifer@painless-security.com changed _comment0 which not transferred by tractive |
@jennifer@painless-security.com commented Very sorry for the spam, but I just ran the tests with patched code and I'm not enamored of the results:
and
The examples are contrived, so in actual use things might turn out clearer. Looking at the subscript examples lars@eggert.org pointed to,
I wonder if it might be preferable to keep the parentheses except for integers. I'm happy to do it either way, just wanted to point this out to be sure the effect is what's desired. |
@martin.thomson@gmail.com commented I find that the W_cubic example is better, but it isn't clear why the values for t and RTT are underlined in that way. Mixing subscripts and other underscores in that way ends up looking odd, but that might be something Lars can work through. Changing the W_max example is probably something Lars can do though. This works very nicely for the numbers in QUIC. Much better than with the parentheses. I do think that maybe we could remove '_' from the set of characters that was otherwise in \w to avoid creating confusion in rendering, but otherwise, I think that this is good. I think that authors will simply need to be aware of how this renders in text and adjust. Just like they probably shouldn't mix a literal '^' and ''. |
@lars@eggert.org commented It's I'd prefer that italics in text form didn't get rendered with underscores and instead simply became plain text, but that needs a separate issue filed. |
@lars@eggert.org changed _comment0 which not transferred by tractive |
@lars@eggert.org changed _comment1 which not transferred by tractive |
@jennifer@painless-security.com commented This sounds good - makes sense that people will need to be careful, since there's only so much that can be done to typeset things unambiguously. I agree that keeping parentheses if the expression includes an underscore is a good idea. I think that the pattern |
@jennifer@painless-security.com changed status from |
@jennifer@painless-security.com changed resolution from `` to |
@jennifer@painless-security.com commented Fixed in 65f2676: Simplify text rendering of super/subscripts. Based on patch submitted by martin.thomson@gmail.com. Fixes #590. Commit ready for merge. |
@martin.thomson@gmail.com commented Hi Jennifer, You have:
I don't think that is good as it allows for some weird patterns. Like '^+.word', '^+', '^23.stuff', or the empty string: '^'. I would have thought that it would be better to keep numbers and words distinct and require at least one character:
This doesn't allow for an empty digit string in any position for a number, nor does it allow for the string overall to be empty as your pattern did. Not using \w means that this loses the ability to have a unicode character in super-/sub-script, which is probably worth noting. |
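The pattern proposed in this comment was not preserved in the migration. As an illustration only, a pattern with the properties described (numbers and words kept distinct, at least one character required, no empty digit string on either side of a decimal point, no `\w`) might look like the sketch below. The names `SIMPLE_TOKEN` and `is_simple` are hypothetical, and leaving out sign handling is an assumption made for brevity.

```python
import re

# Illustrative only: not the pattern from the lost comment or from xml2rfc.
SIMPLE_TOKEN = re.compile(
    r"""
    [0-9]+(?:\.[0-9]+)?   # number: digits, optionally a point followed by more digits
    | [A-Za-z]+           # word: ASCII letters only, so no underscore and no Unicode
    """,
    re.VERBOSE,
)


def is_simple(token: str) -> bool:
    """Return True if the whole token could be rendered without parentheses."""
    return SIMPLE_TOKEN.fullmatch(token) is not None


for token in ["15", "3.14", "max", "", "+", "+.word", "23.stuff", ".5"]:
    print(repr(token), is_simple(token))
```

Because the match must cover the whole token, the empty string and the odd cases listed above (`+.word`, `+`, `23.stuff`) are all rejected and would keep their parentheses.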
@jennifer@painless-security.com commented Yes, the empty string should be rejected. The other examples are wonky, but seem contrived. If someone is using notation like that, adding parentheses to the mix is as likely to confuse the meaning as to clarify it. The reason I accept those is because it also accepts things like So I think we should perhaps take a step back, decide what we would like to accept as a token first, then implement to that. A few cases that have come up - I'd appreciate your thoughts on these or any I've overlooked. Ones we seem to agree clearly do not need parentheses:
Things that may or may not need parentheses (but we don't clearly agree):
Things we seem to agree clearly do need parentheses:
Regarding unicode, I'm inclined to keep the parentheses - I'm not sure that there's a good way to know that a character is going to be confusing without them, so it seems prudent to assume the worst. It might be nice to handle common cases, such as Greek characters, but that seems like a big project to handle well. For decimal points without digits on one side, my inclination is to keep them. They're poor style, but I don't know that they are any less readable without the parentheses. I don't feel terribly strongly about this, though. I do think accepting signs for things like Sorry for the long message - I don't mean to draw this out, but it's a tricky feature and I think being deliberate will avoid revisiting it more than necessary. |
@martin.thomson@gmail.com commented Thanks Jennifer, that makes sense. On your questionable ones:
Prefer parens, I think, but only weakly.
Prefer no parens, yeah.
Prefer no parens on -, don't care about + (it's weird, so I'm OK either way).
Prefer no parens; we could just filter out underscore. The reason is to deal with the math stuff Lars is doing, where Does that help? |
@martin.thomson@gmail.com changed _comment0 which not transferred by tractive |
@lars@eggert.org commented Replying to ietf-svn-conversion/xml2rfc#590 (comment:13):
Given that |
@jennifer@painless-security.com commented Thanks for your thoughts. I'm sold on parenthesizing the bare decimal points and on accepting unicode words. I think accepting plus signs is worthwhile - it's not common, but comes up sometimes and basing the rule on its being a sign character seems to me less likely to be surprising. Rather than trying to write all this in a RE pattern, I've expanded the I have added a check that avoids doubling up if the expression is already delimited by parentheses (so that
To give you an idea of what this does, for the following input
it renders to
What do you think? |
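The input and output samples from this comment were lost, and the committed code is not reproduced here. The following is a rough sketch of a function-based check along the lines discussed in the thread: integers, signed numbers, and Unicode words drop the parentheses, while bare decimal points, underscores, and empty expressions keep them, and already parenthesized expressions are not wrapped again. The names `needs_parens` and `render_script` are hypothetical, and treating decimals with digits on both sides as simple is an assumption.

```python
def needs_parens(expr: str) -> bool:
    """Decide whether a super/subscript needs parentheses in text output (sketch)."""
    if not expr:
        return True  # empty expression: keep the parentheses
    if expr.startswith("(") and expr.endswith(")"):
        return False  # crude check: already delimited, so do not double up
    body = expr[1:] if expr[0] in "+-" else expr  # allow one leading sign
    if not body or "_" in body:
        return True  # bare sign, or underscore that could be confused with markup
    if body.isalpha():
        return False  # a word, including Unicode letters
    whole, point, frac = body.partition(".")
    if not point:
        return not (whole.isascii() and whole.isdigit())  # plain integer is simple
    # decimal: require ASCII digits on both sides of the point
    digits_ok = (whole.isascii() and whole.isdigit()
                 and frac.isascii() and frac.isdigit())
    return not digits_ok


def render_script(base: str, expr: str, marker: str = "^") -> str:
    """Render base plus a super- or subscript (marker '^' or '_') as plain text."""
    if needs_parens(expr):
        return f"{base}{marker}({expr})"
    return f"{base}{marker}{expr}"


print(render_script("2", "15"))
print(render_script("x", "n+1"))
print(render_script("x", "(n+1)"))
print(render_script("W", "max", "_"))
```

Writing the rules as ordinary conditionals rather than a single regular expression makes each case easy to adjust independently, which matches the motivation given in the comment.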
@martin.thomson@gmail.com commented Love it. Thanks for doing this. Given the leading +/- check, why not this ordering?
That would change |
@jennifer@painless-security.com commented I went back and forth on that. I'm happy to do it the other way. However, one thing I've realized while thinking about that is that we need to think about spaces. The issue:
becomes
which, in addition to looking like strange ascii art, is pretty ambiguous. I'm not sure how to handle this. The simple thing would be to change the
(spaces between sub/sup and before the sentence period) I suppose this is another case where we could leave it to the author to know that spaces are needed - certainly that'd be understood by LaTeX users. |
@jennifer@painless-security.com commented Ok - I had a look at the output of the HTML writer and found that its results without a space between factors also look a bit odd. With a space, they are much more readable. Based on that, I'm not going to worry about the lack of a trailing space in the text writer and leave it to the author to insert one. |
@jennifer@painless-security.com changed _comment0 which not transferred by tractive |
@jennifer@painless-security.com commented FYI, the additional work has now been committed in 28d2f44 |
@martin.thomson@gmail.com commented Thanks Jennifer, this is a nice improvement. |
@rjsparks@nostrum.com commented Fixed in 0979a66: Merged in 65f2676 and 28d2f44 from jennifer@painless-security.com: Simplify text rendering of super/subscripts. Based on patch submitted by martin.thomson@gmail.com and refinement from subsequent list discussion. Fixes #590. |
The attachments for these issues were lost in trac before the transition to github, and cannot be recovered. If the issue is still relevant, and the attachments can be reconstructed, please add them as new comments. |
owner:jennifer@painless-security.com
resolution_fixed
type_enhancement
| by martin.thomson@gmail.com
We have a number of places in QUIC that we are using `2^15` and similar. Using `2<sup>15</sup>` makes the HTML rendering much nicer, but the text then renders as `2^(15)`. A small tweak might improve rendering with no real loss of fidelity. Patch inbound.
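The original patch was lost with the trac attachments. As a minimal sketch of the tweak requested here, assuming a hypothetical helper `render_sup_text` (not an xml2rfc function), the text writer could drop the parentheses only when the superscript is a plain integer:

```python
import re

# Hypothetical helper for illustration; not the lost patch or xml2rfc's code.
_INTEGER = re.compile(r"[0-9]+")


def render_sup_text(base: str, exponent: str) -> str:
    """Render base<sup>exponent</sup> as text, simplifying integer exponents."""
    if _INTEGER.fullmatch(exponent):
        return f"{base}^{exponent}"  # integer: no parentheses needed
    return f"{base}^({exponent})"  # anything else keeps the parentheses


print(render_sup_text("2", "15"))
print(render_sup_text("2", "n+1"))
```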
Issue migrated from trac:590 at 2022-02-08 07:12:21 +0000