diff --git a/infra.bs b/infra.bs
index dd44555..8d0b9b7 100644
--- a/infra.bs
+++ b/infra.bs
@@ -26,6 +26,7 @@ Boilerplate: omit conformance, omit feedback-header, omit idl-index
 urlPrefix: https://tc39.github.io/ecma262/; spec: ECMA-262; type: dfn
     text: List; url: sec-list-and-record-specification-type
+    text: The String Type; url: sec-ecmascript-language-types-string-type
 
@@ -252,8 +253,11 @@ in parentheses. [[!UNICODE]]

In certain contexts code points are prefixed with "0x" instead of "U+". -

A scalar value is a code point that is not in the range -U+D800 to U+DFFF, inclusive. +

A surrogate code point is a code point that is in the range U+D800 to +U+DFFF, inclusive. + +

A scalar value is a code point that is not a +surrogate code point.
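
As an illustration of the two definitions above, here is a minimal sketch of the corresponding predicates. The function names are hypothetical, and the upper bound U+10FFFF for code points is assumed from the Unicode Standard; it is not part of this hunk.

```typescript
// Hypothetical helper: true if the code point is in U+D800 to U+DFFF, inclusive.
function isSurrogateCodePoint(codePoint: number): boolean {
  return codePoint >= 0xd800 && codePoint <= 0xdfff;
}

// Hypothetical helper: a scalar value is a code point that is not a surrogate.
// Assumption: code points are integers in the range U+0000 to U+10FFFF.
function isScalarValue(codePoint: number): boolean {
  const isCodePoint =
    Number.isInteger(codePoint) && codePoint >= 0 && codePoint <= 0x10ffff;
  return isCodePoint && !isSurrogateCodePoint(codePoint);
}
```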

An ASCII code point is a code point in the range U+0000 to U+007F, inclusive. @@ -294,11 +298,44 @@ inclusive.

Strings

-

A string is a sequence of code points. Strings are denoted by double -quotes and monospace font. +

A JavaScript string is a sequence of unsigned 16-bit integers, also known as +code units. + +

This is different from how the Unicode Standard defines "code unit". In particular, "code unit" here +refers exclusively to how the Unicode Standard defines it for Unicode 16-bit strings. [[UNICODE]] + +

A JavaScript string can also be interpreted as containing code points, per the +conversion defined in The String Type section of the JavaScript specification. [[!ECMA-262]] + +

This conversion process converts surrogate pairs into their corresponding +scalar value and maps isolated surrogates to their corresponding code point, leaving +them effectively as-is. + +
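
The conversion described above could be sketched as follows. This is an illustrative reading rather than the normative algorithm, and `toCodePoints` is a hypothetical name: surrogate pairs are combined into a single scalar value, and isolated surrogates are passed through unchanged.

```typescript
// Hypothetical helper: interprets a JavaScript string (a sequence of 16-bit
// code units) as a sequence of code points.
function toCodePoints(input: string): number[] {
  const codePoints: number[] = [];
  let i = 0;
  while (i < input.length) {
    const lead = input.charCodeAt(i);
    if (lead >= 0xd800 && lead <= 0xdbff && i + 1 < input.length) {
      const trail = input.charCodeAt(i + 1);
      if (trail >= 0xdc00 && trail <= 0xdfff) {
        // Surrogate pair: combine both code units into one scalar value.
        codePoints.push((lead - 0xd800) * 0x400 + (trail - 0xdc00) + 0x10000);
        i += 2;
        continue;
      }
    }
    // Non-surrogate code unit or isolated surrogate: kept as-is.
    codePoints.push(lead);
    i += 1;
  }
  return codePoints;
}
```

For well-formed input the result should match what the built-in string iterator produces; the manual loop is only there to make the handling of pairs and isolated surrogates visible.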

A scalar value string is a sequence of scalar values. + +

A scalar value string is useful for any kind of I/O or other kind of operation +where UTF-8 encode comes into play. + + +

String can be used to refer to either a JavaScript string or +scalar value string, when it is clear from the context which is meant or when the distinction +is immaterial. Strings are denoted by double quotes and monospace font.

"Hello, world!" is a string. +

To convert a JavaScript string into a +scalar value string, replace any surrogate code points with U+FFFD. +By definition these are isolated surrogates, as surrogate pairs have already been converted into scalar values. + + +
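
A minimal sketch of this conversion, working directly on code units, might look like the following. `toScalarValueString` is a hypothetical name, and the result is still held in a JavaScript string, which is an assumption about representation rather than something the text requires.

```typescript
// Hypothetical helper: replaces every isolated surrogate with U+FFFD and
// leaves surrogate pairs (which encode scalar values) untouched.
function toScalarValueString(input: string): string {
  let result = "";
  let i = 0;
  while (i < input.length) {
    const unit = input.charCodeAt(i);
    const isLead = unit >= 0xd800 && unit <= 0xdbff;
    const isTrail = unit >= 0xdc00 && unit <= 0xdfff;
    if (isLead && i + 1 < input.length) {
      const next = input.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Valid surrogate pair: copy both code units unchanged.
        result += input.slice(i, i + 2);
        i += 2;
        continue;
      }
    }
    // Any surrogate reaching this point is isolated.
    result += isLead || isTrail ? "\uFFFD" : input.charAt(i);
    i += 1;
  }
  return result;
}
```

Recent JavaScript engines also expose `String.prototype.toWellFormed()`, which performs the same replacement; the explicit loop is shown only to spell out the steps.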

A scalar value string can always be used as a JavaScript string implicitly +since it is a subset. The reverse is only possible if the JavaScript string is known to not +contain surrogate code points. (An implementation likely has to perform explicit conversion, +depending on how it actually ends up representing JavaScript and +scalar value strings. It is even fairly typical for implementations to have multiple +implementations of just JavaScript strings to improve performance and reduce memory +usage.) +

An ASCII string is a string whose code points are all ASCII code points.
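
For illustration, an ASCII string check over a JavaScript string could be sketched as below. `isASCIIString` is a hypothetical name. Checking code units is sufficient here, because any code unit above 0x7F (surrogates included) implies a code point outside the ASCII range.

```typescript
// Hypothetical helper: true if every code point is in U+0000 to U+007F.
function isASCIIString(input: string): boolean {
  for (let i = 0; i < input.length; i++) {
    if (input.charCodeAt(i) > 0x7f) {
      return false;
    }
  }
  return true;
}
```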