-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix line folding 2 #88
Changes from 1 commit
f3886e6
d46f402
c8c1695
7850c65
009c714
e047567
d94ba04
3837fec
c21c9ba
5c2f5c1
82a9796
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,32 +8,21 @@ import 'package:yaml/yaml.dart'; | |
import 'utils.dart'; | ||
|
||
/// Given [value], tries to format it into a plain string recognizable by YAML. | ||
/// If it fails, it defaults to returning a double-quoted string. | ||
/// | ||
/// Not all values can be formatted into a plain string. If the string contains | ||
/// an escape sequence, it can only be detected when in a double-quoted | ||
/// sequence. Plain strings may also be misinterpreted by the YAML parser (e.g. | ||
/// ' null'). | ||
String? _tryYamlEncodePlain(Object? value) { | ||
if (value is YamlNode) { | ||
AssertionError( | ||
'YamlNodes should not be passed directly into getSafeString!', | ||
); | ||
} | ||
|
||
assertValidScalar(value); | ||
|
||
if (value is String) { | ||
/// If it contains a dangerous character we want to wrap the result with | ||
/// double quotes because the double quoted style allows for arbitrary | ||
/// strings with "\" escape sequences. | ||
/// | ||
/// See 7.3.1 Double-Quoted Style | ||
/// https://yaml.org/spec/1.2/spec.html#id2787109 | ||
return isDangerousString(value) ? null : value; | ||
} | ||
|
||
return value.toString(); | ||
/// | ||
/// Returns `null` if [value] cannot be encoded as a plain string. | ||
String? _tryYamlEncodePlain(String value) { | ||
/// If it contains a dangerous character we want to wrap the result with | ||
/// double quotes because the double quoted style allows for arbitrary | ||
/// strings with "\" escape sequences. | ||
/// | ||
/// See 7.3.1 Double-Quoted Style | ||
/// https://yaml.org/spec/1.2/spec.html#id2787109 | ||
return isDangerousString(value) ? null : value; | ||
} | ||
|
||
/// Checks if [string] has unprintable characters according to | ||
|
@@ -48,41 +37,6 @@ bool _hasUnprintableCharacters(String string) { | |
return false; | ||
} | ||
|
||
/// Checks if a [string] has any unprintable characters or characters that | ||
/// should be explicitly wrapped in double quotes according to | ||
/// [unprintableCharCodes] and [doubleQuoteEscapeChars] respectively. | ||
/// | ||
/// It should be noted that this check excludes the `\n` (line break) | ||
/// character as it is encoded correctly when using [ScalarStyle.LITERAL] or | ||
/// [ScalarStyle.FOLDED]. | ||
bool _shouldDoubleQuote(String string) { | ||
if (string.isEmpty || string.trimLeft().length != string.length) return true; | ||
|
||
final codeUnits = string.codeUnits; | ||
|
||
return doubleQuoteEscapeChars.keys | ||
.whereNot((charUnit) => charUnit == 10) // Anything but line breaks | ||
.any(codeUnits.contains); | ||
} | ||
|
||
/// Returns the correct block chomping indicator for [ScalarStyle.FOLDED] | ||
/// and [ScalarStyle.LITERAL]. | ||
/// | ||
/// See https://yaml.org/spec/1.2.2/#8112-block-chomping-indicator | ||
String _getChompingIndicator(String string) { | ||
/// By default, we apply an indent to the string after every new line. | ||
/// | ||
/// Apply the `keep (+)` chomping indicator for trailing whitespace to be | ||
/// treated as content. | ||
/// | ||
/// [NOTE]: We only check for trailing whitespace rather than `\n `. This is | ||
/// a coin-toss approach. If there is a new line after this, it will be kept. | ||
/// If not, nothing happens. | ||
if (string.endsWith(' ') || string.endsWith('\n')) return '+'; | ||
|
||
return '-'; | ||
} | ||
|
||
/// Generates a YAML-safe double-quoted string based on [string], escaping the | ||
/// list of characters as defined by the YAML 1.2 spec. | ||
/// | ||
|
@@ -119,6 +73,24 @@ String? _tryYamlEncodeSingleQuoted(String string) { | |
return '\'$result\''; | ||
} | ||
|
||
/// Returns the correct block chomping indicator for [ScalarStyle.FOLDED] | ||
/// and [ScalarStyle.LITERAL]. | ||
/// | ||
/// See https://yaml.org/spec/1.2.2/#8112-block-chomping-indicator | ||
String _getChompingIndicator(String string) { | ||
/// By default, we apply an indent to the string after every new line. | ||
/// | ||
/// Apply the `keep (+)` chomping indicator for trailing whitespace to be | ||
/// treated as content. | ||
/// | ||
/// [NOTE]: We only check for trailing whitespace rather than `\n `. This is | ||
/// a coin-toss approach. If there is a new line after this, it will be kept. | ||
/// If not, nothing happens. | ||
if (string.endsWith(' ') || string.endsWith('\n')) return '+'; | ||
|
||
return '-'; | ||
} | ||
|
||
/// Attempts to encode a [string] as a _YAML folded string_ and apply the | ||
/// appropriate _chomping indicator_. | ||
/// | ||
|
@@ -140,7 +112,12 @@ String? _tryYamlEncodeSingleQuoted(String string) { | |
/// | ||
/// See: https://yaml.org/spec/1.2.2/#813-folded-style | ||
String? _tryYamlEncodeFolded(String string, int indentSize, String lineEnding) { | ||
if (_shouldDoubleQuote(string)) return null; | ||
if (_hasUnprintableCharacters(string)) return null; | ||
// A string that starts with space or newline followed by space can't be | ||
// encoded in folded mode. | ||
if (string.startsWith(' ') && string.startsWith('\n ')) return null; | ||
|
||
// TODO: Are there other strings we can't encode in folded mode? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There must be other things we can't represent in folded mode, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The leading space was the main culprit. Also the |
||
|
||
final indent = ' ' * indentSize; | ||
|
||
|
@@ -195,22 +172,27 @@ String? _tryYamlEncodeFolded(String string, int indentSize, String lineEnding) { | |
/// my literal | ||
/// string | ||
/// | ||
/// # With the "keep" chomping indicator | ||
/// key: |+ | ||
/// # Without chomping indicator | ||
/// key: | | ||
/// my literal | ||
/// string | ||
/// ``` | ||
/// | ||
/// See: https://yaml.org/spec/1.2.2/#812-literal-style | ||
String? _tryYamlEncodeLiteral( | ||
String string, int indentSize, String lineEnding) { | ||
if (_shouldDoubleQuote(string)) return null; | ||
if (_hasUnprintableCharacters(string)) return null; | ||
// A string that starts with space or newline followed by space can't be | ||
// encoded in literal mode. | ||
if (string.startsWith(' ') && string.startsWith('\n ')) return null; | ||
|
||
// TODO: Are there other strings we can't encode in literal mode? | ||
|
||
final indent = ' ' * indentSize; | ||
|
||
/// Simplest block style. | ||
/// * https://yaml.org/spec/1.2.2/#812-literal-style | ||
return '|${_getChompingIndicator(string)}\n$indent' | ||
return '|${string.endsWith('\n') ? '' : '-'}\n$indent' | ||
'${string.replaceAll('\n', lineEnding + indent)}'; | ||
} | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jonasfj I finally got around to look into this. Why the change? Literal strings are also subject to chomping as indicated here This causes anything with more than one trailing
That's why I went for a wildcard approach. We chomp so that it's always parsed correctly by |
||
|
@@ -222,22 +204,23 @@ String? _tryYamlEncodeLiteral( | |
String _yamlEncodeFlowScalar(YamlScalar yamlScalar) { | ||
final YamlScalar(:value, :style) = yamlScalar; | ||
|
||
final isString = value is String; | ||
if (value is! String) { | ||
return value.toString(); | ||
} | ||
|
||
switch (style) { | ||
/// Only encode as double-quoted if it's a string. | ||
case ScalarStyle.DOUBLE_QUOTED when isString: | ||
case ScalarStyle.DOUBLE_QUOTED: | ||
return _yamlEncodeDoubleQuoted(value); | ||
|
||
case ScalarStyle.SINGLE_QUOTED when isString: | ||
case ScalarStyle.SINGLE_QUOTED: | ||
return _tryYamlEncodeSingleQuoted(value) ?? | ||
_yamlEncodeDoubleQuoted(value); | ||
|
||
/// Cast into [String] if [null] as this condition only returns [null] | ||
/// for a [String] that can't be encoded. | ||
default: | ||
return _tryYamlEncodePlain(value) ?? | ||
_yamlEncodeDoubleQuoted(value as String); | ||
return _tryYamlEncodePlain(value) ?? _yamlEncodeDoubleQuoted(value); | ||
} | ||
} | ||
|
||
|
@@ -253,41 +236,37 @@ String _yamlEncodeBlockScalar( | |
final YamlScalar(:value, :style) = yamlScalar; | ||
assertValidScalar(value); | ||
|
||
final isString = value is String; | ||
|
||
if (isString && _hasUnprintableCharacters(value)) { | ||
return _yamlEncodeDoubleQuoted(value); | ||
if (value is! String) { | ||
return value.toString(); | ||
} | ||
|
||
switch (style) { | ||
/// Prefer 'plain', fallback to "double quoted". Cast into [String] if | ||
/// null as this condition only returns [null] for a [String] that can't | ||
/// be encoded. | ||
case ScalarStyle.PLAIN: | ||
return _tryYamlEncodePlain(value) ?? | ||
_yamlEncodeDoubleQuoted(value as String); | ||
return _tryYamlEncodePlain(value) ?? _yamlEncodeDoubleQuoted(value); | ||
|
||
// Prefer 'single quoted', fallback to "double quoted" | ||
case ScalarStyle.SINGLE_QUOTED when isString: | ||
case ScalarStyle.SINGLE_QUOTED: | ||
return _tryYamlEncodeSingleQuoted(value) ?? | ||
_yamlEncodeDoubleQuoted(value); | ||
|
||
/// Prefer folded string, try literal as fallback | ||
/// otherwise fallback to "double quoted" | ||
case ScalarStyle.FOLDED when isString: | ||
case ScalarStyle.FOLDED: | ||
return _tryYamlEncodeFolded(value, indentation, lineEnding) ?? | ||
_yamlEncodeDoubleQuoted(value); | ||
|
||
/// Prefer literal string, try folded as fallback | ||
/// otherwise fallback to "double quoted" | ||
case ScalarStyle.LITERAL when isString: | ||
case ScalarStyle.LITERAL: | ||
return _tryYamlEncodeLiteral(value, indentation, lineEnding) ?? | ||
_yamlEncodeDoubleQuoted(value); | ||
|
||
/// Prefer plain, fallback to "double quoted" | ||
default: | ||
return _tryYamlEncodePlain(value) ?? | ||
_yamlEncodeDoubleQuoted(value as String); | ||
return _tryYamlEncodePlain(value) ?? _yamlEncodeDoubleQuoted(value); | ||
} | ||
} | ||
|
||
|
@@ -298,16 +277,16 @@ String _yamlEncodeBlockScalar( | |
/// string scalar that starts with '>', a child having a block style | ||
/// parameters), in which case we will produce [value] with default styling | ||
/// options. | ||
String yamlEncodeFlowString(YamlNode value) { | ||
String yamlEncodeFlow(YamlNode value) { | ||
if (value is YamlList) { | ||
final list = value.nodes; | ||
|
||
final safeValues = list.map(yamlEncodeFlowString); | ||
final safeValues = list.map(yamlEncodeFlow); | ||
return '[${safeValues.join(', ')}]'; | ||
} else if (value is YamlMap) { | ||
final safeEntries = value.nodes.entries.map((entry) { | ||
final safeKey = yamlEncodeFlowString(entry.key as YamlNode); | ||
final safeValue = yamlEncodeFlowString(entry.value); | ||
final safeKey = yamlEncodeFlow(entry.key as YamlNode); | ||
final safeValue = yamlEncodeFlow(entry.value); | ||
return '$safeKey: $safeValue'; | ||
}); | ||
|
||
|
@@ -318,14 +297,14 @@ String yamlEncodeFlowString(YamlNode value) { | |
} | ||
|
||
/// Returns [value] with the necessary formatting applied in a block context. | ||
String yamlEncodeBlockString( | ||
String yamlEncodeBlock( | ||
YamlNode value, | ||
int indentation, | ||
String lineEnding, | ||
) { | ||
const additionalIndentation = 2; | ||
|
||
if (!isBlockNode(value)) return yamlEncodeFlowString(value); | ||
if (!isBlockNode(value)) return yamlEncodeFlow(value); | ||
|
||
final newIndentation = indentation + additionalIndentation; | ||
|
||
|
@@ -337,8 +316,7 @@ String yamlEncodeBlockString( | |
final children = value.nodes; | ||
|
||
safeValues = children.map((child) { | ||
var valueString = | ||
yamlEncodeBlockString(child, newIndentation, lineEnding); | ||
var valueString = yamlEncodeBlock(child, newIndentation, lineEnding); | ||
if (isCollection(child) && !isFlowYamlCollectionNode(child)) { | ||
valueString = valueString.substring(newIndentation); | ||
} | ||
|
@@ -353,10 +331,10 @@ String yamlEncodeBlockString( | |
return value.nodes.entries.map((entry) { | ||
final MapEntry(:key, :value) = entry; | ||
|
||
final safeKey = yamlEncodeFlowString(key as YamlNode); | ||
final safeKey = yamlEncodeFlow(key as YamlNode); | ||
final formattedKey = ' ' * indentation + safeKey; | ||
|
||
final formattedValue = yamlEncodeBlockString( | ||
final formattedValue = yamlEncodeBlock( | ||
value, | ||
newIndentation, | ||
lineEnding, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I'm realizing that we might be missing the point with folded mode.
The point with folded mode is that we can replace
a b
witha\nb
and still get the same value.If the string contains newlines then this becomes complicated, because a newline essentially have to be replaced by an empty-line, right?
But also, rather importantly: the point of a folded string is to be able to wrap long lines without affecting the end-result.
In this case (A) and (B) are the same. The reason for using folded strings is that you don't want VERY long lines.
So if we want to support folded string, I guess we ought to maybe wrap strings at 80 characters, or something like that?
This is perhaps something we can do in a follow up PR, if we just make the folded string logic work correctly, then perhaps improving the formatting is a follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fairly tricky (but achievable) implementation that's why I initially asked this in the comment. It seems I didn't frame it correctly. I meant it like,
"Should we leave it as-is with the formatting we do or mutate it to match
yaml
while still holding the same meaning" when parsed.\n
have to be\n\n
in folded to hold the same meaning unless the next line starts with leading space then we can let it slide.☝🏽 That makes it a bit hard to apply to the first line that starts with a leading space. Since at this point
package: yaml
won't have the indent information. That's why explicitly fall back todoubleQuoted
. Will work on it next.