Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I18n: Add check to detect string wrapped in HTML #1856

Merged
merged 11 commits into from
Feb 25, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 34 additions & 2 deletions WordPress/Sniffs/WP/I18nSniff.php
Original file line number Diff line number Diff line change
Expand Up @@ -630,11 +630,43 @@ protected function check_text( $context ) {
*
* Strip placeholders and surrounding quotes.
*/
$non_placeholder_content = trim( $this->strip_quotes( $content ) );
$non_placeholder_content = preg_replace( self::SPRINTF_PLACEHOLDER_REGEX, '', $non_placeholder_content );
$content_without_surrounding_quotes = trim( $this->strip_quotes( $content ) );
$non_placeholder_content = preg_replace( self::SPRINTF_PLACEHOLDER_REGEX, '', $content_without_surrounding_quotes );

if ( '' === $non_placeholder_content ) {
$this->phpcsFile->addError( 'Strings should have translatable content', $stack_ptr, 'NoEmptyStrings' );
ockham marked this conversation as resolved.
Show resolved Hide resolved
return;
}

/*
* NoHtmlWrappedStrings
*
* Strip surrounding quotes.
*/
$reader = new \XMLReader();
$reader->XML( $content_without_surrounding_quotes, 'UTF-8', LIBXML_NOERROR | LIBXML_ERR_NONE | LIBXML_NOWARNING );

// Is the first node an HTML element?
if ( ! $reader->read() || \XMLReader::ELEMENT !== $reader->nodeType ) {
return;
}

// If the opening HTML element includes placeholders in its attributes, we don't warn.
// E.g. '<option id="%1$s" value="%2$s">Translatable option name</option>'.
for ( $i = 0; $attr = $reader->getAttributeNo( $i ); $i++ ) {
if ( preg_match( self::SPRINTF_PLACEHOLDER_REGEX, $attr ) ) {
return;
}
}

// We don't flag strings wrapped in `<a href="...">...</a>`, as the link target might actually need localization.
if ( 'a' === $reader->name && $reader->getAttribute( 'href' ) ) {
return;
}

// Does the entire string only consist of this HTML node?
if ( $reader->readOuterXml() === $content_without_surrounding_quotes ) {
$this->phpcsFile->addWarning( 'Strings should not be wrapped in HTML', $stack_ptr, 'NoHtmlWrappedStrings' );
}
}

Expand Down
32 changes: 32 additions & 0 deletions WordPress/Tests/WP/I18nUnitTest.1.inc
Original file line number Diff line number Diff line change
Expand Up @@ -183,5 +183,37 @@ _n_noop($singular); // Bad x 3.
// This test is needed to verify that the missing plural argument above does not cause an internal error, stopping the run.
_n_noop( 'I have %d cat.', 'I have %d cats.' ); // Bad.

// HTML wrapped strings.
__( '<address>123 Fake Street</address>', 'my-slug' ); // Bad, string shouldn't be wrapped in HTML.
__( 'I live at <address>123 Fake Street</address>', 'my-slug' ); // Okay, no consensus on HTML tags within strings.
__( '<address>123 Fake Street</address> is my home', 'my-slug' ); // Okay, no consensus on HTML tags within strings.
__( '<span>Text</span> More text <span>Text</span>', 'my-slug' ); // Good, we're not wrapping
__( '<div class="my-class">Translatable content</div>', 'my-slug' ); // Bad
__( '<option id="%1$s" value="%2$s">Translatable option name</option>', 'my-slug' ); // Wrapping is okay, since there are placeholders
__( '<b><i>Foo</i></b>', 'my-slug' ); // Bad
__( '<foo>Foo</foo>', 'my-slug' ); // Good

// Strings wrapped in `<a href="...">...</a>` aren't flagged, since the link target might require localization.
__( '<a href="http://wordpress.org">WordPress</a>', 'my-slug' ); // Good
__( '<a id="anchor">translatable text</a>', 'my-slug' ) // Bad, since no href
__( '<a>translatable text</a>', 'my-slug' ) // Bad, since no href
__( '<a href="%s">translatable text</a>', 'my-slug' ) // Good

// Strings containing malformed XML, to make sure we're able to parse them. No warnings should be thrown.
__( '<div>text to translate', 'my-slug' );
__( 'text to <div>translate', 'my-slug' );
__( 'text < to translate', 'my-slug' );
__( 'text to > translate', 'my-slug' );
__( '<div data-type=">foo">my address</div>', 'my-slug' );
__( '<div>text to translate<div>', 'my-slug' );
__( '<>text to translate</>', 'my-slug' );
__( '<>text to translate<>', 'my-slug' );
__( '<div><div>', 'my-slug' );
__( '<<>>text to translate<<>', 'my-slug' );
__( '<<>>123</<>', 'my-slug' );
__( '<b><i>Foo</b></i>', 'my-slug' );
__( '<b><i>Foo</b></i></b>', 'my-slug' );
__( '<foo>Foo</bar>', 'my-slug' );

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like to see some additional tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added your example and another variation of an existing one, but seem to struggle to come up with meaningful ones overall. Could you give me some pointers as to what you're thinking of?

Copy link
Member

@jrfnl jrfnl Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, try this as a good starting point for finding more tests: try running the sniff over WP core and see which false positives it throws up for this new error code. Those cases will all need to be accounted for and the patterns found which caused the false positives should be covered by unit tests.

Also see the link in the original issue to Core trac where there are also quite some examples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, try this as a good starting point for finding more tests: try running the sniff over WP core and see which false positives it throws up for this new error code. Those cases will all need to be accounted for and the patterns found which caused the false positives should be covered by unit tests.

Ah yeah, makes sense, thanks. wp-includes doesn't yield any occurrences, wp-content has the following:

https://github.com/WordPress/WordPress/blob/c289bb59acdd3217cdb81e15b580dc3d7cd6a007/wp-content/themes/twentythirteen/image.php#L30

https://github.com/WordPress/WordPress/blob/c289bb59acdd3217cdb81e15b580dc3d7cd6a007/wp-content/themes/twentyeleven/inc/theme-options.php#L121-L122

Those seem legit, but I think we've covered that class of issue with our current unit tests, no?

wp-admin has quite a few, e.g.

https://github.com/WordPress/WordPress/blob/c289bb59acdd3217cdb81e15b580dc3d7cd6a007/wp-admin/comment.php#L56-L57

https://github.com/WordPress/WordPress/blob/c289bb59acdd3217cdb81e15b580dc3d7cd6a007/wp-admin/edit-comments.php#L194-L197

https://github.com/WordPress/WordPress/blob/c289bb59acdd3217cdb81e15b580dc3d7cd6a007/wp-admin/edit-form-advanced.php#L313-L314

and so on.

A lot of these are <a /> links where only the link text needs translating, and the actual <a /> could be moved out of the __() call.

Also see the link in the original issue to Core trac where there are also quite some examples.

Uh, not sure which issue you mean 😅 Maybe this list (which #1419 (comment) refers to)? I checked a few of those.

<span>Previewing:</span> %s
Plugin <strong>resumed</strong>.
<span class="screen-reader-text">link to view ratings opens in a new tab</span>

These also seem pretty much covered I'd say 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these are links where only the link text needs translating, and the actual could be moved out of the __() call.

Sorry, but that's not true. Those are legit translation calls including the tag as there may be a localized page available for that URL and part of translating is changing the URL to the localized version.

I think it would be a good idea to allow for those, i.e. if the tag is an a AND has a href with a hard-coded URL, there should be no warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these are links where only the link text needs translating, and the actual could be moved out of the __() call.

Sorry, but that's not true. Those are legit translation calls including the tag as there may be a localized page available for that URL and part of translating is changing the URL to the localized version.

Right, I didn't think of that.

I think it would be a good idea to allow for those, i.e. if the tag is an a AND has a href with a hard-coded URL, there should be no warning.

I'm not totally sure it should be generalized that much though? This seems like a fairly 'semantic' addition to the sniff that implies quite some background information (there can be localized link targets), which might not be true for all codebases alike. One could argue that the localized link targets are more the exception than the rule, that the sniff doesn't necessarily need to know about them, and that it'd be fine to require a sniff ignore to be added by the author in case it's needed 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the warning would be kept for link tags with hard-coded URLs, I'd like to see a separate error code for those.
That would at least allow codebases - including WP Core - which do have localized URLs and therefore do this with good reason to exclude the warning for those particular text strings from their custom ruleset.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added per 1294413

// phpcs:set WordPress.WP.I18n text_domain[]
// phpcs:set WordPress.WP.I18n check_translator_comments true
32 changes: 32 additions & 0 deletions WordPress/Tests/WP/I18nUnitTest.1.inc.fixed
Original file line number Diff line number Diff line change
Expand Up @@ -183,5 +183,37 @@ _n_noop($singular); // Bad x 3.
// This test is needed to verify that the missing plural argument above does not cause an internal error, stopping the run.
_n_noop( 'I have %d cat.', 'I have %d cats.' ); // Bad.

// HTML wrapped strings.
__( '<address>123 Fake Street</address>', 'my-slug' ); // Bad, string shouldn't be wrapped in HTML.
__( 'I live at <address>123 Fake Street</address>', 'my-slug' ); // Okay, no consensus on HTML tags within strings.
__( '<address>123 Fake Street</address> is my home', 'my-slug' ); // Okay, no consensus on HTML tags within strings.
__( '<span>Text</span> More text <span>Text</span>', 'my-slug' ); // Good, we're not wrapping
__( '<div class="my-class">Translatable content</div>', 'my-slug' ); // Bad
__( '<option id="%1$s" value="%2$s">Translatable option name</option>', 'my-slug' ); // Wrapping is okay, since there are placeholders
__( '<b><i>Foo</i></b>', 'my-slug' ); // Bad
__( '<foo>Foo</foo>', 'my-slug' ); // Good

// Strings wrapped in `<a href="...">...</a>` aren't flagged, since the link target might require localization.
__( '<a href="http://wordpress.org">WordPress</a>', 'my-slug' ); // Good
__( '<a id="anchor">translatable text</a>', 'my-slug' ) // Bad, since no href
__( '<a>translatable text</a>', 'my-slug' ) // Bad, since no href
__( '<a href="%s">translatable text</a>', 'my-slug' ) // Good

// Strings containing malformed XML, to make sure we're able to parse them. No warnings should be thrown.
__( '<div>text to translate', 'my-slug' );
__( 'text to <div>translate', 'my-slug' );
__( 'text < to translate', 'my-slug' );
__( 'text to > translate', 'my-slug' );
__( '<div data-type=">foo">my address</div>', 'my-slug' );
__( '<div>text to translate<div>', 'my-slug' );
__( '<>text to translate</>', 'my-slug' );
__( '<>text to translate<>', 'my-slug' );
__( '<div><div>', 'my-slug' );
__( '<<>>text to translate<<>', 'my-slug' );
__( '<<>>123</<>', 'my-slug' );
__( '<b><i>Foo</b></i>', 'my-slug' );
__( '<b><i>Foo</b></i></b>', 'my-slug' );
__( '<foo>Foo</bar>', 'my-slug' );

// phpcs:set WordPress.WP.I18n text_domain[]
// phpcs:set WordPress.WP.I18n check_translator_comments true
9 changes: 9 additions & 0 deletions WordPress/Tests/WP/I18nUnitTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,9 @@ public function setCliValues( $testFile, $config ) {
// Test overruling the text domain from the command line for one test file.
if ( 'I18nUnitTest.3.inc' === $testFile ) {
$config->setConfigData( 'text_domain', 'something', true );
} else {
// Delete the text domain option so it doesn't persist for subsequent test files.
$config->setConfigData( 'text_domain', null, true );
}
}

Expand Down Expand Up @@ -161,6 +164,12 @@ public function getWarningList( $testFile = '' ) {
154 => 1,
158 => 1,
159 => 1,
187 => 1,
191 => 1,
193 => 1,
194 => 1,
198 => 1,
199 => 1,
);

case 'I18nUnitTest.2.inc':
Expand Down