Bug Report: CRs are Converted to LFs When Pasting Text #931

Closed
JMichaelTX opened this issue Oct 26, 2017 · 5 comments

Comments

@JMichaelTX

JMichaelTX commented Oct 26, 2017

I believe I have found a bug when pasting text that contains the CR character (ASCII 13) into the Test String panel for a RegEx101.com snippet.

Running Google Chrome 61.0.3163.100 (3163.100) on macOS 10.11.6.

When text is pasted into this panel, it evidently converts all CR (ASCII 13) to LF (ASCII 10).
I have confirmed the proper data is being placed on the Clipboard by AppleScript, by using the following:
• Clipboard Viewer.app
• BBEdit.app
• Keyboard Maestro.app

The simple AppleScript is:

set sourceStr to "CR" & return & "LF" & linefeed & "Some Text with LF" & linefeed
set the clipboard to sourceStr

Note that the AppleScript "return" constant produces an ASCII 13 character, as confirmed by the clipboard viewer.

Please see this RegEx101.com snippet:
https://regex101.com/r/nxsiWT/1

TIA for resolving this issue.

@OnlineCop
Collaborator

OnlineCop commented Oct 26, 2017

#442 #257

According to the w3 spec:

When a textarea is mutable, its raw value should be editable by the user: the user agent should allow the user to edit, insert, and remove text, and to insert and remove line breaks in the form of "LF" (U+000A) characters.
...
For historical reasons, the element's value is normalised in three different ways for three different purposes. The raw value is the value as it was originally set. It is not normalized. The API value is the value used in the value IDL attribute. It is normalized so that line breaks use "LF" (U+000A) characters. Finally, there is the form submission value. It is normalized so that line breaks use U+000D CARRIAGE RETURN (CR) U+000A LINE FEED (LF) character pairs, and in addition, if necessary given the element's wrap attribute, additional line breaks are inserted to wrap the text at the given width.
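
To make the quoted normalization concrete, here is a minimal Python sketch of the API value and the form submission value. The function names `api_value` and `form_submission_value` are hypothetical, the wrap-based extra line breaks are ignored, and this only illustrates the rules as quoted; it is not the browser's or CodeMirror's actual code.

```python
import re


def api_value(raw: str) -> str:
    """Normalize every line break to LF, as described for the API value."""
    # Handle CRLF pairs first so they do not become two LFs, then lone CRs.
    return raw.replace("\r\n", "\n").replace("\r", "\n")


def form_submission_value(raw: str) -> str:
    """Normalize every line break to a CRLF pair (wrap handling omitted)."""
    return api_value(raw).replace("\n", "\r\n")


# Same content as the AppleScript string in the report.
raw = "CR\rLF\nSome Text with LF\n"
normalized = api_value(raw)

print(repr(normalized))
print(len(re.findall(r"\r", raw)))         # CRs in the raw text
print(len(re.findall(r"\r", normalized)))  # CRs left after normalization
```

If the test string is read back through a normalized value like this, a pattern containing `\r` has nothing left to match, which would be consistent with the behavior reported above.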

Are you aware of any other web forms (a textarea specifically) that retain the CR and LF characters you paste into them? I'd like to see how they handle various line endings. (I suspect the browser is entirely responsible for the CRLF/CR/LF changes.)

@TWiStErRob
Collaborator

@OnlineCop I'm a bit confused: that's a CodeMirror div as well, so where's the textarea?

@OnlineCop
Collaborator

This may be relevant: codemirror/codemirror5#3395

@JMichaelTX
Author

Thanks for reviewing and for the info, guys.
So, if this (changing of CR to LF) is some kind of w3 and/or browser standard, that means we can never test for CR using RegEx101.com, right?

How about this: Would the CRs be retained if they were in a file? If so, then could RegEx101 allow for setting the source test string from a file?
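
For what it's worth, outside the browser a file does keep its CRs as long as it is read without newline translation. The following is a minimal Python sketch under that assumption; the temporary file and variable names are made up for illustration, and it says nothing about how RegEx101 itself would handle an uploaded file.

```python
import os
import re
import tempfile

# Same bytes as the AppleScript string in the report.
data = b"CR\rLF\nSome Text with LF\n"

# Write the bytes to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

try:
    # Binary mode performs no newline translation at all.
    with open(path, "rb") as f:
        from_file = f.read()

    # Text mode with newline="" also returns line endings untranslated.
    with open(path, "r", encoding="ascii", newline="") as f:
        text = f.read()
finally:
    os.remove(path)

# Count the CRs that a regex can still see in the file contents.
cr_count = len(re.findall(rb"\r", from_file))
print(cr_count, repr(text))
```

Whether the CR would survive once that text is placed into the editor is a separate question, since the editor may normalize it again.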

@firasdib
Owner

firasdib commented Nov 1, 2017

@TWiStErRob CodeMirror is backed by a hidden textarea.

You cannot currently test CR on regex101.com.
