Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser: Hide the core namespace in serialised data #2950

Merged
merged 7 commits into from
Oct 12, 2017

Conversation

pento
Copy link
Member

@pento pento commented Oct 10, 2017

Description

To make the serialised data format a little easier to read, this PR removes the core/ namespace from serialised core blocks.

Implementation Notes

This is a sizeable PR for a relatively small change. We don't allow block registration without a namespace, so serializer.js can simply remove the core/ namespace from any blocks that have it.

Similarly, to maintain backwards compatibility with blocks stored prior to this change, post.pegjs can simply assume that any block name without a namespace should have the core/ namespace.

Everything else is updated tests, and the freshly generated parser.php.

@pento pento added the [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f label Oct 10, 2017
@pento pento self-assigned this Oct 10, 2017
@codecov
Copy link

codecov bot commented Oct 10, 2017

Codecov Report

Merging #2950 into master will increase coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2950      +/-   ##
==========================================
+ Coverage   34.42%   34.47%   +0.04%     
==========================================
  Files         196      196              
  Lines        5783     5787       +4     
  Branches     1021     1023       +2     
==========================================
+ Hits         1991     1995       +4     
  Misses       3206     3206              
  Partials      586      586
Impacted Files Coverage Δ
blocks/api/registration.js 100% <100%> (ø) ⬆️
blocks/api/serializer.js 98.24% <100%> (+0.03%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5fab1d7...0183c18. Read the comment docs.

Copy link
Contributor

@youknowriad youknowriad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like this change, but will also defer to @mtias and @aduth here

@@ -223,7 +223,7 @@ describe( 'block parser', () => {
);
expect( block.name ).toBe( 'core/unknown-block' );
expect( block.attributes.fruit ).toBe( 'Bananas' );
expect( block.attributes.content ).toContain( 'core/test-block' );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is kind of weird, I don't understand why we were receiving core/test-block in the content attributes and why we're receiving wp:test-block right now. (Both seem odd if we look at the definition of the unknown block)

Copy link
Member Author

@pento pento Oct 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaviour was added in #2513.

@mtias
Copy link
Member

mtias commented Oct 10, 2017

I am in favor—proposed this to both @aduth and @pento. Let's discuss the approach a bit more, though. The goal is to keep the markup as clean as possible, and core/ is only adding noise there. A <!-- wp:gallery --> is much nicer.

In terms of registering a block, it still makes sense to keep core/ as a way for running our validation and warnings against the namespace.

Some thoughts on implementation:

  • Register with core/, keep warnings in place.
  • Do not serialize core/.
  • Add core/ if needed internally during initial parsing to blocks missing a namespace.
  • Try to avoid touching the grammar.

If this makes sense, where do we do this on the server? Handling it on the grammar increases its complexity, but also prevents us from having to handle it on the client and the server.

@mtias
Copy link
Member

mtias commented Oct 10, 2017

Also worth keeping the className implementation in mind, as it does the same thing (drops "core" from the name).

@pento
Copy link
Member Author

pento commented Oct 10, 2017

Re-adding core/ on the server seems necessary, so that plugin authors will see the same name in the JS and the PHP. Here's a small example of how the block name may be relied on in the PHP.

gutenberg_parse_blocks() and do_blocks() both seem like the wrong place to implement this special case. The former feels like a weird hack in a simple function, the latter feels too high level - it only applies to plugins that are operating inside the block loop.

I'm wondering if gutenberg_parse_blocks() should take the array that Gutenberg_PEG_Parser::parse() returns, and transform each element into a theoretical WP_Block object. WP_Block would be a good place to recognise the difference between a namespaced and non-namespaced block. It also potentially changes the object model around a little bit - for example, in do_blocks(), the loop would likely call WP_Block::render(), instead of WP_Block_Type::render().

@@ -226,7 +226,7 @@ describe( 'block serializer', () => {
''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to add tests for the non-core/ namespaced case now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

@aduth
Copy link
Member

aduth commented Oct 10, 2017

I like the idea of cleaner markup, but the introduced inconsistency is a bit worrying, at least with the challenge it poses the implementation. The fact that #2950 (comment) is mentioned at all serves as a reminder that intentional inconsistencies can be difficult to maintain.

Try to avoid touching the grammar.

What would be the reason to avoid introducing this in the grammar? It seems appealing to normalize this as early as possible. @pento's suggestion of interacting with blocks as a class object seems like the next best thing for normalizing.

@mtias
Copy link
Member

mtias commented Oct 10, 2017

It seems like knowledge the parser shouldn't have to care about, but I'm ok with trying it there if we can retain clarity.

blockName = 'core/' + blockName;
}
return blockName;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're going to encode this here we might want to consider making these separate tokens in the grammar instead of adding more actual custom code in the parser

WP_Block_Name
  = WP_Core_Block_Name
  / WP_Extra_Block_Name

WP_Core_Block_Name
  = name:$(ASCII_Letter ASCII_AlphaNumeric*)
  { /** <?php return "core/$name" ?> */ return 'core/' + name }

WP_Extra_Block_Name
  = $(ASCII_Letter (ASCII_AlphaNumeric / "/" ASCII_AlphaNumeric)*)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I shuffled it around a bit in b925d43. This also explicitly parses zero or one /s.

@dmsnell
Copy link
Member

dmsnell commented Oct 10, 2017

Is this core/ prefix adding much noise? Especially in comparison to the comments themselves? To me it seems like this PR is just adding complexity and inconsistency into the project for very small gains.

Did we have people complain about the core/ prefix?

If someone were to look at the demo block and see all core blocks, what impression would they get that there is a necessary namespace prefix for their own blocks?

@mtias
Copy link
Member

mtias commented Oct 10, 2017

I see it as a QoL improvement to what we save. It is of no semantic value to the user and adds confusion—"what is this core thing?". These are also valid terms in database searches, so keeping it as minimal as possible is good. I'd even be ok with dropping it entirely from the picture, but I think it holds some value in the context of registration.

I like the idea of a different token: WP_Namespaced_Block_Name.

@mtias
Copy link
Member

mtias commented Oct 10, 2017

Incidentally, I think we can look at demoting <!-- more --> from the parser entirely and saving as <!-- wp:more -->. We can then use the legacy-pasting mechanism to transform if needed.

@dmsnell
Copy link
Member

dmsnell commented Oct 10, 2017

These are also valid terms in database searches, so keeping it as minimal as possible is good.

On that note, seems like core/paragraph and core/list would be infinitely more searchable than paragraph and list etc…

@pento pento added the [Feature] Block API API that allows to express the block paradigm. label Oct 11, 2017
@pento pento added this to the Beta 1.5 milestone Oct 11, 2017
@pento
Copy link
Member Author

pento commented Oct 11, 2017

<!--more--> handling tracked in #2973.

namespace = 'core/';
}
return namespace + name;
}
Copy link
Member

@dmsnell dmsnell Oct 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're still missing some of the value of the parser: a separation of code from grammar. The more we can eliminate processing from inside of here the more declarative it is and the easier it will be to recreate in other contexts.

WP_Block_Name
  = $(WP_Block_Namespace "/" WP_Namespaced_Block_name)
  / name:$(WP_Namespaced_Block_Name)
  { /** <?php return "core/$name" ?> */ return 'core/' + name }

Although here I miss some of the semantic focusing we had with a separate token for core vs. extra blocks. In this case we're making some imperative logical decisions based on the presence of a / instead of stating two branches in a parse tree based on the identity of the tokens.

WP_Block_Type
  = WP_Namespaced_Block_Type
  / WP_Core_Block_type

WP_Namespaced_Block_Type
  = $(ASCII_Letter ASCII_AlphaNumeric* "/" ASCII_Letter ASCII_AlphaNumeric*)

WP_Core_Block_Type
  = type:$(ASCII_Letter ASCII_AlphaNumeric*)
  { /** <?php return "core/$type" ?> */ return 'core/' + type }

In this construction we still do have some basic processing but it's nothing more than adding a core/ prefix - no conditionals are happening - instead they end up in the generated parser. We can talk about optimizations if they arise, but on the other hand, this is generating a parser which memoizes failed branches so we have a very low penalty for backing up and trying the second branch.

The ultimate goal is generating a grammar with zero code in it - of course we aren't there now, but it's good to highly vet new code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what you're saying. That's a much nicer way of doing it, added in 5ebaf95.

Also, add a test that the parser parses non-namespaced block names.
@@ -178,6 +178,10 @@ export function getCommentDelimitedContent( blockName, attributes, content ) {
? serializeAttributes( attributes ) + ' '
: '';

if ( blockName.startsWith( 'core/' ) ) {
blockName = blockName.substr( 5 );
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you be happy using a style that doesn't mutate our variables? that is, instead of changing mid-function we start with a parameter and acknowledge that it needs processing via an appropriate name, then binding once the final value for that name?

getCommentDelimitedContent( rawBlockName, attributes, content ) {
	
	// strip core blocks of their namespace prefix
	const blockName = rawBlockName.startsWith( 'core/' )
		? rawBlockName.substr( 5 )
		: rawBlockName;
	
}

also, there's nothing wrong with it, though it seems like I'm not as used to seeing String.prototype.substr() as frequently as String.prototype.slice() - and I think the behaviors are the same when only one parameter is supplied. thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, I'm cool with these changes.

I probably defaulted to substr() because of PHP, but slice() is clearly the default option in the codebase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pento - I'm a bit intimidated asking for a style change, so thank for being so gracious with it!

@dmsnell
Copy link
Member

dmsnell commented Oct 11, 2017

@mtias did we have people experience QoL issues when working with the core/ prefix? I'm still struggling to see how stripping it out is a clear win. We lose searchability of the block types and we introduce a major inconsistency in the save format: my first assumption when writing my own block would be to leave out my own namespace and create what would be treated by the parser as a core block…

@mtias
Copy link
Member

mtias commented Oct 11, 2017

We lose searchability of the block types and we introduce a major inconsistency in the save format.

How so? You can easily search for wp:gallery.

Regarding the developer experience, that is a separate aspect. We are already treating core blocks differently in the className generation. For creating blocks, we have warnings if you don't use a namespace. If we have a tool to generate blocks from something like the command line, we'd handle it there as well.

Basically, core/ namespace is good from a developer point of view (and if you look at the code for our blocks, it'd be present) but it's noise for the user content. And we should optimize for that, specially if it's only at the expense of some more complexity on our side.

@dmsnell
Copy link
Member

dmsnell commented Oct 11, 2017

How so? You can easily search for wp:gallery.

oh yeah, good point!

For creating blocks, we have warnings if you don't use a namespace.

If this message is discoverable and clearly written then that alleviates my concern. I would want to make that a priority though so when someone starts playing around they don't have to go looking around for why their thing is broken.

on that note, and marginally related to this point: I recently realized that it was confusing to me what "category"s were available and what "icon"s - would be cool to have more interactive/helpful warnings "from the system" while developing - a live linter for stuff like this and more

@pento
Copy link
Member Author

pento commented Oct 12, 2017

Minor inconsistency I noticed - the parser allows for _ to be in the block name, but both the regex in both registerBlockType() and WP_Block_Type_Registry::register() do not. Is there any particular reason for this difference?

@dmsnell
Copy link
Member

dmsnell commented Oct 12, 2017

the parser allows for _ to be in the block name

there wasn't a decision about this; it's there because we didn't have a norm. let's follow-up in a separate PR so we can establish what actually ought to be there. let's find some way to document that rule somewhere other than the code itself, keep from more "the implementation is the specification"

good catch @pento

Copy link
Member

@dmsnell dmsnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me from reviewing the grammar - I love having more people working on it - thanks a bunch for doing this @pento!

@pento pento merged commit d715d61 into master Oct 12, 2017
@pento pento deleted the try/hiding-core-namespace branch October 12, 2017 04:06
DevinWalker added a commit to DevinWalker/gutenberg that referenced this pull request Oct 12, 2017
* master: (45 commits)
  Parser: Hide the core namespace in serialised data (WordPress#2950)
  fix: Undefined index warning (WordPress#2982)
  navigation via keydown
  PHPCS improvements (WordPress#2914)
  Update/html block description (WordPress#2917)
  Framework: Merge EditorSettingsProvider into EditorProvider
  Framework: Move the store initialization to a dedicated component
  Rotate header ellipsis to match block ellipsis
  add/459: Added support for ins, sub and sup
  Refresh nonce with heartbeat (WordPress#2790)
  Paste: add table support
  Bump version to 1.4.0. (WordPress#2954)
  Blocks: The custom classname should be persisted properly in the block's comment
  Editor: Fix the scroll position when reordering blocks
  Polish spacing in block ellipsis menu (WordPress#2955)
  Editor: HTML Editor per block (WordPress#2797)
  Show most frequently used blocks next to inserter (WordPress#2877)
  add/459: Added light grey background to selected inline boundaries
  Extend inline boundaries to other elements
  Move markdown fix to own plugin-compatibility file
  ...
DevinWalker added a commit to DevinWalker/gutenberg that referenced this pull request Oct 12, 2017
* master: (45 commits)
  Parser: Hide the core namespace in serialised data (WordPress#2950)
  fix: Undefined index warning (WordPress#2982)
  navigation via keydown
  PHPCS improvements (WordPress#2914)
  Update/html block description (WordPress#2917)
  Framework: Merge EditorSettingsProvider into EditorProvider
  Framework: Move the store initialization to a dedicated component
  Rotate header ellipsis to match block ellipsis
  add/459: Added support for ins, sub and sup
  Refresh nonce with heartbeat (WordPress#2790)
  Paste: add table support
  Bump version to 1.4.0. (WordPress#2954)
  Blocks: The custom classname should be persisted properly in the block's comment
  Editor: Fix the scroll position when reordering blocks
  Polish spacing in block ellipsis menu (WordPress#2955)
  Editor: HTML Editor per block (WordPress#2797)
  Show most frequently used blocks next to inserter (WordPress#2877)
  add/459: Added light grey background to selected inline boundaries
  Extend inline boundaries to other elements
  Move markdown fix to own plugin-compatibility file
  ...
DevinWalker added a commit to DevinWalker/gutenberg that referenced this pull request Oct 12, 2017
* master: (45 commits)
  Parser: Hide the core namespace in serialised data (WordPress#2950)
  fix: Undefined index warning (WordPress#2982)
  navigation via keydown
  PHPCS improvements (WordPress#2914)
  Update/html block description (WordPress#2917)
  Framework: Merge EditorSettingsProvider into EditorProvider
  Framework: Move the store initialization to a dedicated component
  Rotate header ellipsis to match block ellipsis
  add/459: Added support for ins, sub and sup
  Refresh nonce with heartbeat (WordPress#2790)
  Paste: add table support
  Bump version to 1.4.0. (WordPress#2954)
  Blocks: The custom classname should be persisted properly in the block's comment
  Editor: Fix the scroll position when reordering blocks
  Polish spacing in block ellipsis menu (WordPress#2955)
  Editor: HTML Editor per block (WordPress#2797)
  Show most frequently used blocks next to inserter (WordPress#2877)
  add/459: Added light grey background to selected inline boundaries
  Extend inline boundaries to other elements
  Move markdown fix to own plugin-compatibility file
  ...
aduth added a commit that referenced this pull request Oct 26, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Block API API that allows to express the block paradigm. [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants