Skip to content

Commit

Permalink
Tokenizer/PHP: retokenize the match double arrow to T_MATCH_ARROW
Browse files Browse the repository at this point in the history
The double arrow in PHP is used in a number of contexts:
* Long/short arrays with keys;
* Long/short lists with keys;
* In the `as` part of `foreach()` statements with keys;
* For `yield` statements with keys;
* For arrow functions as the scope opener;
* And now for `match` expressions to separate the cases from the return value (body).

As most of these constructs can be nested in each other - an arrow function in an array value, a match expression in a list key -, every sniff handling any of these constructs has to take a lot of care when searching for the double arrow for the construct they are handling, to prevent matching a double arrow belonging to another type of construct nested in the target construct.

This type of detection and special handling has to be done in each individual sniff which in one way or another has to deal with the `T_DOUBLE_ARROW` token and can cause quite some processing overhead.

With that in mind, the double arrow as a scope opener for arrow functions has previously already been retokenized to `T_FN_ARROW`.

Following the same reasoning, I'm proposing to retokenize the double arrow which separates `match` case expressions from the body expression to `T_MATCH_ARROW`.

This should make life easier for any sniff dealing with any of the above constructs and will prevent potential false positives being introduced for sniffs currently handling any of these constructs, but not yet updated to allow for match expressions.

Includes a set of dedicated unit tests verifying the tokenization of the double arrow operator in all currently supported contexts, including in combined (nested) contexts.
  • Loading branch information
jrfnl committed Feb 22, 2021
1 parent 0fa0e01 commit 8353436
Show file tree
Hide file tree
Showing 5 changed files with 502 additions and 0 deletions.
6 changes: 6 additions & 0 deletions package.xml
Original file line number Diff line number Diff line change
Expand Up @@ -204,6 +204,8 @@ http://pear.php.net/dtd/package-2.0.xsd">
<file baseinstalldir="" name="BitwiseOrTest.php" role="test" />
<file baseinstalldir="" name="DefaultKeywordTest.inc" role="test" />
<file baseinstalldir="" name="DefaultKeywordTest.php" role="test" />
<file baseinstalldir="" name="DoubleArrowTest.inc" role="test" />
<file baseinstalldir="" name="DoubleArrowTest.php" role="test" />
<file baseinstalldir="" name="GotoLabelTest.inc" role="test" />
<file baseinstalldir="" name="GotoLabelTest.php" role="test" />
<file baseinstalldir="" name="NamedFunctionCallArgumentsTest.inc" role="test" />
Expand Down Expand Up @@ -2112,6 +2114,8 @@ http://pear.php.net/dtd/package-2.0.xsd">
<install as="CodeSniffer/Core/Tokenizer/BitwiseOrTest.inc" name="tests/Core/Tokenizer/BitwiseOrTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/DefaultKeywordTest.php" name="tests/Core/Tokenizer/DefaultKeywordTest.php" />
<install as="CodeSniffer/Core/Tokenizer/DefaultKeywordTest.inc" name="tests/Core/Tokenizer/DefaultKeywordTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/DoubleArrowTest.php" name="tests/Core/Tokenizer/DoubleArrowTest.php" />
<install as="CodeSniffer/Core/Tokenizer/DoubleArrowTest.inc" name="tests/Core/Tokenizer/DoubleArrowTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/GotoLabelTest.php" name="tests/Core/Tokenizer/GotoLabelTest.php" />
<install as="CodeSniffer/Core/Tokenizer/GotoLabelTest.inc" name="tests/Core/Tokenizer/GotoLabelTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/NamedFunctionCallArgumentsTest.php" name="tests/Core/Tokenizer/NamedFunctionCallArgumentsTest.php" />
Expand Down Expand Up @@ -2192,6 +2196,8 @@ http://pear.php.net/dtd/package-2.0.xsd">
<install as="CodeSniffer/Core/Tokenizer/BitwiseOrTest.inc" name="tests/Core/Tokenizer/BitwiseOrTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/DefaultKeywordTest.php" name="tests/Core/Tokenizer/DefaultKeywordTest.php" />
<install as="CodeSniffer/Core/Tokenizer/DefaultKeywordTest.inc" name="tests/Core/Tokenizer/DefaultKeywordTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/DoubleArrowTest.php" name="tests/Core/Tokenizer/DoubleArrowTest.php" />
<install as="CodeSniffer/Core/Tokenizer/DoubleArrowTest.inc" name="tests/Core/Tokenizer/DoubleArrowTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/GotoLabelTest.php" name="tests/Core/Tokenizer/GotoLabelTest.php" />
<install as="CodeSniffer/Core/Tokenizer/GotoLabelTest.inc" name="tests/Core/Tokenizer/GotoLabelTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/NamedFunctionCallArgumentsTest.php" name="tests/Core/Tokenizer/NamedFunctionCallArgumentsTest.php" />
Expand Down
51 changes: 51 additions & 0 deletions src/Tokenizers/PHP.php
Original file line number Diff line number Diff line change
Expand Up @@ -373,6 +373,7 @@ class PHP extends Tokenizer
T_LOGICAL_OR => 2,
T_LOGICAL_XOR => 3,
T_MATCH => 5,
T_MATCH_ARROW => 2,
T_MATCH_DEFAULT => 7,
T_METHOD_C => 10,
T_MINUS_EQUAL => 2,
Expand Down Expand Up @@ -1371,6 +1372,15 @@ protected function tokenize($string)
&& is_array($tokens[$x]) === true
&& $tokens[$x][0] === T_DOUBLE_ARROW
) {
// Modify the original token stack for the double arrow so that
// future checks can disregard the double arrow token more easily.
// For match expression "case" statements, this is handled
// in PHP::processAdditional().
$tokens[$x][0] = T_MATCH_ARROW;
if (PHP_CODESNIFFER_VERBOSITY > 1) {
echo "\t\t* token $x changed from T_DOUBLE_ARROW to T_MATCH_ARROW".PHP_EOL;
}

$newToken = [];
$newToken['code'] = T_MATCH_DEFAULT;
$newToken['type'] = 'T_MATCH_DEFAULT';
Expand Down Expand Up @@ -2450,6 +2460,47 @@ protected function processAdditional()
echo "\t\t* cleaned parenthesis of token $i *".PHP_EOL;
}
}
} else {
// Retokenize the double arrows for match expression cases to `T_MATCH_ARROW`.
$searchFor = [
T_OPEN_CURLY_BRACKET => T_OPEN_CURLY_BRACKET,
T_OPEN_SQUARE_BRACKET => T_OPEN_SQUARE_BRACKET,
T_OPEN_PARENTHESIS => T_OPEN_PARENTHESIS,
T_OPEN_SHORT_ARRAY => T_OPEN_SHORT_ARRAY,
T_DOUBLE_ARROW => T_DOUBLE_ARROW,
];
$searchFor += Util\Tokens::$scopeOpeners;

for ($x = ($this->tokens[$i]['scope_opener'] + 1); $x < $this->tokens[$i]['scope_closer']; $x++) {
if (isset($searchFor[$this->tokens[$x]['code']]) === false) {
continue;
}

if (isset($this->tokens[$x]['scope_closer']) === true) {
$x = $this->tokens[$x]['scope_closer'];
continue;
}

if (isset($this->tokens[$x]['parenthesis_closer']) === true) {
$x = $this->tokens[$x]['parenthesis_closer'];
continue;
}

if (isset($this->tokens[$x]['bracket_closer']) === true) {
$x = $this->tokens[$x]['bracket_closer'];
continue;
}

// This must be a double arrow, but make sure anyhow.
if ($this->tokens[$x]['code'] === T_DOUBLE_ARROW) {
$this->tokens[$x]['code'] = T_MATCH_ARROW;
$this->tokens[$x]['type'] = 'T_MATCH_ARROW';

if (PHP_CODESNIFFER_VERBOSITY > 1) {
echo "\t\t* token $x changed from T_DOUBLE_ARROW to T_MATCH_ARROW".PHP_EOL;
}
}
}//end for
}//end if

continue;
Expand Down
1 change: 1 addition & 0 deletions src/Util/Tokens.php
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@
define('T_FN_ARROW', 'PHPCS_T_FN_ARROW');
define('T_TYPE_UNION', 'PHPCS_T_TYPE_UNION');
define('T_PARAM_NAME', 'PHPCS_T_PARAM_NAME');
define('T_MATCH_ARROW', 'PHPCS_T_MATCH_ARROW');
define('T_MATCH_DEFAULT', 'PHPCS_T_MATCH_DEFAULT');

// Some PHP 5.5 tokens, replicated for lower versions.
Expand Down
221 changes: 221 additions & 0 deletions tests/Core/Tokenizer/DoubleArrowTest.inc
Original file line number Diff line number Diff line change
@@ -0,0 +1,221 @@
<?php

function simpleLongArray($x) {
return array(
/* testLongArrayArrowSimple */
0 => 'Zero',
);
}

function simpleShortArray($x) {
return [
/* testShortArrayArrowSimple */
0 => 'Zero',
];
}

function simpleLongList($x) {
list(
/* testLongListArrowSimple */
0 => $a,
) = $x;
}

function simpleShortList($x) {
[
/* testShortListArrowSimple */
0 => $a,
] = $x;
}

function simpleYield($x) {
$i = 0;
foreach (explode("\n", $x) as $line) {
/* testYieldArrowSimple */
yield ++$i => $line;
}
}

function simpleForeach($x) {
/* testForeachArrowSimple */
foreach ($x as $k => $value) {}
}

function simpleMatch($x) {
return match ($x) {
/* testMatchArrowSimpleSingleCase */
0 => 'Zero',
/* testMatchArrowSimpleMultiCase */
2, 4, 6 => 'Zero',
/* testMatchArrowSimpleSingleCaseWithTrailingComma */
1, => 'Zero',
/* testMatchArrowSimpleMultiCaseWithTrailingComma */
3, 5, => 'Zero',
};
}

function simpleArrowFunction($y) {
/* testFnArrowSimple */
return fn ($y) => callMe($y);
}

function matchNestedInMatch() {
$x = match ($y) {
/* testMatchArrowNestedMatchOuter */
default, => match ($z) {
/* testMatchArrowNestedMatchInner */
1 => 1
},
};
}

function matchNestedInLongArrayValue() {
$array = array(
/* testLongArrayArrowWithNestedMatchValue1 */
'a' => match ($test) {
/* testMatchArrowInLongArrayValue1 */
1 => 'a',
/* testMatchArrowInLongArrayValue2 */
2 => 'b'
},
/* testLongArrayArrowWithNestedMatchValue2 */
$i => match ($test) {
/* testMatchArrowInLongArrayValue3 */
1 => 'a',
},
);
}

function matchNestedInShortArrayValue() {
$array = [
/* testShortArrayArrowWithNestedMatchValue1 */
'a' => match ($test) {
/* testMatchArrowInShortArrayValue1 */
1 => 'a',
/* testMatchArrowInShortArrayValue2 */
2 => 'b'
},
/* testShortArrayArrowWithNestedMatchValue2 */
$i => match ($test) {
/* testMatchArrowInShortArrayValue3 */
1 => 'a',
},
];
}

function matchNestedInLongArrayKey() {
$array = array(
match ($test) { /* testMatchArrowInLongArrayKey1 */ 1 => 'a', /* testMatchArrowInLongArrayKey2 */ 2 => 'b' }
/* testLongArrayArrowWithMatchKey */
=> 'dynamic keys, woho!',
);
}

function matchNestedInShortArrayKey() {
$array = [
match ($test) { /* testMatchArrowInShortArrayKey1 */ 1 => 'a', /* testMatchArrowInShortArrayKey2 */ 2 => 'b' }
/* testShortArrayArrowWithMatchKey */
=> 'dynamic keys, woho!',
];
}

function arraysNestedInMatch() {
$matcher = match ($x) {
/* testMatchArrowWithLongArrayBodyWithKeys */
0 => array(
/* testLongArrayArrowInMatchBody1 */
0 => 1,
/* testLongArrayArrowInMatchBody2 */
'a' => 2,
/* testLongArrayArrowInMatchBody3 */
'b' => 3
),
/* testMatchArrowWithShortArrayBodyWithoutKeys */
1 => [1, 2, 3],
/* testMatchArrowWithLongArrayBodyWithoutKeys */
2 => array( 1, [1, 2, 3], 2, 3),
/* testMatchArrowWithShortArrayBodyWithKeys */
3 => [
/* testShortArrayArrowInMatchBody1 */
0 => 1,
/* testShortArrayArrowInMatchBody2 */
'a' => array(1, 2, 3),
/* testShortArrayArrowInMatchBody3 */
'b' => 2,
3
],
/* testShortArrayArrowinMatchCase1 */
[4 => 'a', /* testShortArrayArrowinMatchCase2 */ 5 => 6]
/* testMatchArrowWithShortArrayWithKeysAsCase */
=> 'match with array as case value',
/* testShortArrayArrowinMatchCase3 */
[4 => 'a'], /* testLongArrayArrowinMatchCase4 */ array(5 => 6),
/* testMatchArrowWithMultipleArraysWithKeysAsCase */
=> 'match with multiple arrays as case value',
};
}

function matchNestedInArrowFunction($x) {
/* testFnArrowWithMatchInValue */
$fn = fn($x) => match(true) {
/* testMatchArrowInFnBody1 */
1, 2, 3, 4, 5 => 'foo',
/* testMatchArrowInFnBody2 */
default => 'bar',
};
}

function arrowFunctionsNestedInMatch($x) {
return match ($x) {
/* testMatchArrowWithFnBody1 */
1 => /* testFnArrowInMatchBody1 */ fn($y) => callMe($y),
/* testMatchArrowWithFnBody2 */
default => /* testFnArrowInMatchBody2 */ fn($y) => callThem($y)
};
}

function matchShortArrayMismash() {
$array = [
match ($test) {
/* testMatchArrowInComplexShortArrayKey1 */
1 => [ /* testShortArrayArrowInComplexMatchValueinShortArrayKey */ 1 => 'a'],
/* testMatchArrowInComplexShortArrayKey2 */
2 => 'b'
/* testShortArrayArrowInComplexMatchArrayMismash */
} => match ($test) {
/* testMatchArrowInComplexShortArrayValue1 */
1 => [ /* testShortArrayArrowInComplexMatchValueinShortArrayValue */ 1 => 'a'],
/* testMatchArrowInComplexShortArrayValue1 */
2 => /* testFnArrowInComplexMatchValueInShortArrayValue */ fn($y) => callMe($y)
},
];
}


function longListInMatch($x, $y) {
return match($x) {
/* testMatchArrowWithLongListBody */
1 => list('a' => $a, /* testLongListArrowInMatchBody */ 'b' => $b, 'c' => list('d' => $c)) = $y,
/* testLongListArrowInMatchCase */
list('a' => $a, 'b' => $b) = $y /* testMatchArrowWithLongListInCase */ => 'something'
};
}

function shortListInMatch($x, $y) {
return match($x) {
/* testMatchArrowWithShortListBody */
1 => ['a' => $a, 'b' => $b, 'c' => /* testShortListArrowInMatchBody */ ['d' => $c]] = $y,
/* testShortListArrowInMatchCase */
['a' => $a, 'b' => $b] = $y /* testMatchArrowWithShortListInCase */ => 'something'
};
}

function matchInLongList() {
/* testMatchArrowInLongListKey */
list(match($x) {1 => 1, 2 => 2} /* testLongListArrowWithMatchInKey */ => $a) = $array;
}

function matchInShortList() {
/* testMatchArrowInShortListKey */
[match($x) {1 => 1, 2 => 2} /* testShortListArrowWithMatchInKey */ => $a] = $array;
}
Loading

0 comments on commit 8353436

Please sign in to comment.