RichText State Structure #771

Closed
ellatrix opened this issue May 11, 2017 · 16 comments
Labels
[Feature] Block API API that allows to express the block paradigm. [Feature] Extensibility The ability to extend blocks or the editing experience [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f
Comments

@ellatrix
Member

ellatrix commented May 11, 2017

This ticket is just a proposal and looks for more opinions on the matter.

In order to have easier logic for splitting, merging, formatting and selection control, we could benefit from normalised content and selection state in the Editable component.

Here's what I have in mind:

Multi-line:

{
	range: {
		start: [ 0, 8 ],
		end: [ 0, 12 ]
	},
	value: [
		[ 'p', {
			formats: [
				{ type: 'em', start: 8, end: 12 },
				{ type: 'a', href: 'http://w.org', start: 13, end: 17 }
			],
			text: 'This is some text.'
		} ],
		[ 'p', {
			formats: [
				// ...
			],
			text: 'More text.'
		} ]
	]
}

Inline:

{
	range: {
		start: 8,
		end: 12
	},
	value: {
		formats: [
			{ type: 'em', start: 8, end: 12 },
			{ type: 'a', href: 'http://w.org', start: 13, end: 17 }
		],
		text: 'This is some text.'
	}
}

Note that these are just two examples; the structure would be flexible enough to allow deeper nesting.

There would be converters for DOM => state, state => DOM and state => HTML.
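
To illustrate the splitting and merging mentioned above, here is a minimal sketch against the inline shape. The helper names `split` and `merge` are hypothetical, not an existing API, and the sketch assumes `end` is exclusive, which matches the example offsets ("some" spans 8 to 12).

```javascript
// Hypothetical helpers; `split` and `merge` are illustrative names only.
// A value is { formats: [ { type, start, end, ...attributes } ], text },
// with `end` assumed to be exclusive.

function split( value, index ) {
	const before = { formats: [], text: value.text.slice( 0, index ) };
	const after = { formats: [], text: value.text.slice( index ) };

	for ( const format of value.formats ) {
		// Keep the part of the format that lies before the split point.
		if ( format.start < index ) {
			before.formats.push( { ...format, end: Math.min( format.end, index ) } );
		}
		// Keep the part after it, re-based so the new text starts at 0.
		if ( format.end > index ) {
			after.formats.push( {
				...format,
				start: Math.max( format.start, index ) - index,
				end: format.end - index,
			} );
		}
	}

	return [ before, after ];
}

function merge( a, b ) {
	const offset = a.text.length;
	return {
		formats: [
			...a.formats,
			// Formats of the second value move right by the length of the first.
			...b.formats.map( ( format ) => ( {
				...format,
				start: format.start + offset,
				end: format.end + offset,
			} ) ),
		],
		text: a.text + b.text,
	};
}

const value = {
	formats: [
		{ type: 'em', start: 8, end: 12 },
		{ type: 'a', href: 'http://w.org', start: 13, end: 17 },
	],
	text: 'This is some text.',
};

// Split in the middle of the emphasised word "some".
const [ head, tail ] = split( value, 10 );
const rejoined = merge( head, tail );
```

Note that `merge` does not coalesce adjacent ranges of the same type, so `rejoined` carries two `em` ranges (8 to 10 and 10 to 12); a real implementation would probably want a normalisation pass.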

@ellatrix
Member Author

ellatrix commented May 11, 2017

Suggestion by @aduth for multi-line:

{
    range: {
        start: [ 0, 8 ],
        end: [ 0, 12 ]
    },
    value: [
        {
            formats: [
                { type: 'em', start: 8, end: 12 },
                { type: 'a', href: 'http://w.org', start: 13, end: 17 }
            ],
            text: 'This is some text.'
        },
        {
            formats: [
                // ...
            ],
            text: 'More text.'
        }
    ]
}

@aduth
Member

aduth commented May 11, 2017

I got spun in this direction after some frustrations at #689 (comment) in how we have to be very accommodating with content shapes, accounting for children of types string, type object, and type array, and even knowing that a text block is one or more paragraphs, pulling the desired text by accessing props of the element directly.

From a purely data perspective it doesn’t seem to follow to me that "paragraph" needs to be embedded in a Text block's state at all, but instead delegated to the implementations of its edit and save functions in mapping the raw values.

This could very easily turn into a rabbit hole, but I'm thinking we could do with applying some constraints to how we think of content; not as a tree of nodes, but as a string carrying inline and block-level (not that block, this block) formatting metadata. Doing so would dramatically simplify how we work with content, particularly in transforming, merging, and splitting blocks.

@aduth
Member

aduth commented May 11, 2017

Doing so would dramatically simplify how we work with content, particularly in transforming, merging, and splitting blocks.

To expand on this a bit too, originally we were content with opaque tree shapes returned by children under the assumption that the value and Editable behaved as a black box from the perspective of the block implementer. This only works so long as the developer never needs to inspect or manipulate the value itself, but with block transformations, merging, and splitting, it has become obvious that this manipulation must occur.

@aduth
Member

aduth commented May 11, 2017

Ideally we should be able to represent a Text block's state as an array, where each member corresponds to an individual paragraph within the block. Challenges with this were surfaced in the original text of #689, notably preserving the text alignment of each paragraph. With the flattened inline formatting proposal in #771 (comment), this detail is not included. We could provide additional metadata about the paragraph root within each object ({ formatting: { inline: [], block: { textAlign: 'right' } } }) or as a separate attribute of the Text block ({ alignments: [ 'left', 'left', 'right' ] }), or this could tie into the nested blocks consideration (#428). It's not entirely clear to me how the alignment values are extracted. This decision should be made with a mind toward efforts started in #624 (#608) to discourage treating the block's content as a DOM object.

@ellatrix
Member Author

Mapping a DOM tree into this structure (or any similar structure) is fairly easy. I'm not so sure about mapping it back into a DOM tree or HTML. It seems easier if the structure is something like:

formats: {
  /* indices as keys */
  4: {
    start: { /* formats that start here */ },
    end: { /* formats that end here */ }
  }
}

@ellatrix
Member Author

In other words, it seems difficult to process formats format-by-format, but easier to do it index-by-index.
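
A rough sketch of that index-by-index idea, for illustration only: the range list is first turned into a boundary map keyed by index, then the text is walked once, closing and opening tags at each position. The names `toBoundaries`, `openTag` and `toHTML` are hypothetical. The sketch stores the formats at each boundary as arrays rather than objects, and it assumes formats nest properly (no partial overlaps) and that `end` is exclusive.

```javascript
// Hypothetical sketch, not an existing API.
// Assumes formats nest properly and that `end` is exclusive.

function toBoundaries( formats ) {
	const boundaries = {};
	const at = ( index ) =>
		( boundaries[ index ] = boundaries[ index ] || { start: [], end: [] } );

	// Outer formats first: earlier start, then later end.
	const sorted = [ ...formats ].sort(
		( a, b ) => a.start - b.start || b.end - a.end
	);

	for ( const { start, end, ...format } of sorted ) {
		at( start ).start.push( format );
		// Prepend so that the most recently opened format closes first.
		at( end ).end.unshift( format );
	}

	return boundaries;
}

function escapeHTML( text ) {
	return text
		.replace( /&/g, '&amp;' )
		.replace( /</g, '&lt;' )
		.replace( />/g, '&gt;' );
}

function openTag( { type, ...attributes } ) {
	const attrs = Object.entries( attributes )
		.map( ( [ name, val ] ) => {
			const escaped = escapeHTML( String( val ) ).replace( /"/g, '&quot;' );
			return ` ${ name }="${ escaped }"`;
		} )
		.join( '' );
	return `<${ type }${ attrs }>`;
}

function toHTML( { formats, text } ) {
	const boundaries = toBoundaries( formats );
	let html = '';

	// Walk one position past the last character so trailing formats close.
	for ( let index = 0; index <= text.length; index++ ) {
		const { start = [], end = [] } = boundaries[ index ] || {};
		for ( const format of end ) {
			html += `</${ format.type }>`;
		}
		for ( const format of start ) {
			html += openTag( format );
		}
		if ( index < text.length ) {
			html += escapeHTML( text[ index ] );
		}
	}

	return html;
}

const html = toHTML( {
	formats: [
		{ type: 'em', start: 8, end: 12 },
		{ type: 'a', href: 'http://w.org', start: 13, end: 17 },
	],
	text: 'This is some text.',
} );
// html === 'This is <em>some</em> <a href="http://w.org">text</a>.'
```

Partially overlapping formats would need tags to be closed and reopened at the overlap, which this sketch does not attempt.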

@aduth
Member

aduth commented May 15, 2017

I'd been toying with some ideas last week and I'm not feeling quite as strongly against having a tree of nodes. As it relates to my specific needs for transforming Text to Heading and structuring Text as an array of blocks, the most important thing seemed to be changing children to encompass details about the root node. Ideally we'd still not need direct access to props in transforms as long as it's consistent on children, ideally always returning an array. The grammar could handle these needs quite well with alignment attributes available on attrs and children always returning an array, but I'm unsure yet if we'd want to impose familiarity with this structure on the block implementer (could still be better than imposing traversal of React's structure).

@ellatrix
Member Author

ellatrix commented May 19, 2017

Maybe this looks like some crazy structure with lots of duplication, but it also seems a lot easier to manage and access.

{
    formatsByID: {
        1: { type: 'em' },
        2: { type: 'a', href: 'http://w.org' }
    },
    range: {
        start: [ 0, 8 ],
        end: [ 0, 12 ]
    },
    value: [
        {
            formats: {
                8: [ 1 ],
                9: [ 1 ],
                10: [ 1 ],
                11: [ 1 ],
                13: [ 2 ],
                14: [ 2 ],
                15: [ 2 ],
                16: [ 2 ]
            },
            text: 'This is some text.',
        },
        {
            formats: {},
            text: 'More text.'
        }
    ]
}

Or just:

{
    range: {
        start: [ 0, 8 ],
        end: [ 0, 12 ]
    },
    value: [
        {
            formats: {
                8: [ { type: 'em' } ],
                9: [ { type: 'em' } ],
                10: [ { type: 'em' } ],
                11: [ { type: 'em' } ],
                13: [ { type: 'a', href: 'http://w.org' } ],
                14: [ { type: 'a', href: 'http://w.org' } ],
                15: [ { type: 'a', href: 'http://w.org' } ],
                16: [ { type: 'a', href: 'http://w.org' } ],
            },
            text: 'This is some text.',
        },
        {
            formats: {},
            text: 'More text.'
        }
    ]
}
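
For comparison, a sketch of converting between the earlier range-list shape and this per-index shape. `expand` and `collapse` are hypothetical names. The sketch treats two formats as the same when their JSON serialisations match, which is an assumption for illustration and depends on consistent key order.

```javascript
// Hypothetical converters between the two shapes discussed in this thread.

// Range list => per-index map. `end` is assumed to be exclusive.
function expand( { formats, text } ) {
	const byIndex = {};
	for ( const { start, end, ...format } of formats ) {
		for ( let i = start; i < end; i++ ) {
			( byIndex[ i ] = byIndex[ i ] || [] ).push( format );
		}
	}
	return { formats: byIndex, text };
}

// Per-index map => range list.
function collapse( { formats, text } ) {
	const ranges = [];
	// Serialised format => range that is currently being extended.
	const open = new Map();

	for ( let i = 0; i <= text.length; i++ ) {
		const here = new Set(
			( formats[ i ] || [] ).map( ( format ) => JSON.stringify( format ) )
		);

		// Close ranges whose format is absent at this index.
		for ( const [ key, range ] of open ) {
			if ( ! here.has( key ) ) {
				range.end = i;
				open.delete( key );
			}
		}

		// Open ranges for formats that were not present at the previous index.
		for ( const key of here ) {
			if ( ! open.has( key ) ) {
				const range = { ...JSON.parse( key ), start: i, end: i };
				open.set( key, range );
				ranges.push( range );
			}
		}
	}

	return { formats: ranges, text };
}

const ranged = {
	formats: [
		{ type: 'em', start: 8, end: 12 },
		{ type: 'a', href: 'http://w.org', start: 13, end: 17 },
	],
	text: 'This is some text.',
};

const perIndex = expand( ranged );
// Object.keys( perIndex.formats ) lists 8, 9, 10, 11, 13, 14, 15, 16.
const roundTripped = collapse( perIndex );
```

Collapsing also merges adjacent identical formats into one range, so the per-index shape acts as a normalised form.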

@nylen
Member

nylen commented May 23, 2017

What is the advantage of storing formatting as ranges rather than DOM-like trees? It seems like this leads to more frequent and more extensive updates. Imagine a long paragraph with lots of formatting; every time a letter near the beginning of the paragraph is changed, each formatting range after that position needs to be updated.
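
The bookkeeping described here can be sketched as follows, assuming the range-list shape with exclusive `end`. `insertText` is a hypothetical helper, and the rule that text typed exactly at a format boundary stays unformatted is an arbitrary choice made for the example.

```javascript
// Hypothetical helper showing the range updates needed after an insertion.
function insertText( { formats, text }, index, inserted ) {
	const shift = inserted.length;
	return {
		text: text.slice( 0, index ) + inserted + text.slice( index ),
		// Every format is visited, so the cost grows with the number of formats.
		formats: formats.map( ( format ) => ( {
			...format,
			// A start at or after the insertion point moves right.
			start: format.start >= index ? format.start + shift : format.start,
			// An end strictly after the insertion point moves right.
			end: format.end > index ? format.end + shift : format.end,
		} ) ),
	};
}

const before = {
	formats: [
		{ type: 'em', start: 8, end: 12 },
		{ type: 'a', href: 'http://w.org', start: 13, end: 17 },
	],
	text: 'This is some text.',
};

// Typing at the very beginning shifts both ranges by four.
const after = insertText( before, 0, 'Hi! ' );
```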

@ellatrix
Member Author

The advantage could be that we have a state that is in sync at all times that is easier to reason about than a tree of text, formatting and ranges. Same goes for selection. It would be something that might be more beneficial in the future though, if we need more interaction with the state. If we have a state => react mapping, we can use that instead so the editable content automatically reflects the state. And then formatting can be applied on the state instead of the DOM, it can be much more complex formatting (other information that needs to be inline) if we need it, and we could apply text transformations/"shortcuts" on this state as well, which can otherwise be quite buggy and complex (you've worked on that too right? 🙂).

@ellatrix
Member Author

Something else could be extensibility. E.g. I've always wanted to create a plugin that would make typographical suggestions and corrections. I know some other plugins that also want to make other kinds of suggestions like spell checking, tone checking, accessibility... There is also interest in being able to insert more complex inline objects like mathematical or phonetic expressions, footnotes...

I believe all this will be easier with a state structure like this.

@nylen
Member

nylen commented May 23, 2017

that is easier to reason about than a tree of text, formatting and ranges. Same goes for selection.

I believe all this [suggestions, corrections, complex inline objects] will be easier with a state structure like this.

I know other projects have explored a similar approach, like the Medium editor for example. But it's not clear to me how this is easier to work with than a tree. Can you explain a bit more about that?

Would this state structure be something the parser would create and generate from (basically) arbitrary HTML, or would this be done after the parsing step?

@nylen
Member

nylen commented May 23, 2017

Also, this sounds similar to how Internet Explorer used to work. Some quotes from that article, it's an interesting read:

As a result of its text-centric design, the principle structure of the DOM was the text backing store, a complex system of text arrays that could be efficiently split and joined with minimal or no memory allocations. The backing store represented both text and tags as a linear progression, addressable by a global index or Character Position (CP). Inserting text at a given CP was highly efficient and copy/pasting a range of text was centrally handled by an efficient “splice” operation.

The foundation of CPs caused much of the complexity of the old DOM. For the whole system to work properly, CPs had to be up-to-date. Thus, CPs were updated after every DOM manipulation (e.g. entering text, copy/paste, DOM API manipulations, even clicking on the page—which set an insertion point in the DOM). Initially, DOM manipulations were driven primarily by the HTML parser, or by user actions, and the CPs-always-up-to-date model was perfectly rational. But with rise of JavaScript and DHTML, these operations became much more common and frequent.

To compensate, new structures were added to make these updates efficient, and the splay tree was born, adding an overlapping series of tree connections onto TreePos objects. The added complexity helped with performance—at first; global CP updates could be achieved with O(log n) speed. Yet, a splay tree is really only optimized for repeated local searches (e.g., for changes centered around one place in the DOM tree), and did not prove to be a consistent benefit for JavaScript and its more random-access patterns.

Due to related complexity and bugginess, ultimately the Edge team ended up refactoring to a tree structure.

Is our case different? How complicated do we expect the editor code to become?

@ellatrix
Member Author

But it's not clear to me how this is easier to work with than a tree. Can you explain a bit more about that?

Example: if you get a command to apply or remove bold formatting on a tree, you'll have to start digging into all the text nodes to see what formatting is already applied, start merging and splitting nodes... It's a nightmare to handle. With this state structure, you have all the info character by character, and you don't have to worry about how it will render. So to remove formatting for a range of indices, you just remove that formatting for the indices.
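
As a sketch of what that could look like on the per-index shape ({ formats: { [index]: [ format, ... ] }, text }): `applyFormat` and `removeFormat` are hypothetical names, ranges are assumed to be end-exclusive, and formats are matched by `type` only.

```javascript
// Hypothetical helpers over the per-index shape; neither mutates its input.

function applyFormat( value, format, start, end ) {
	const formats = { ...value.formats };
	for ( let i = start; i < end; i++ ) {
		// Replace any existing format of the same type at this index.
		const others = ( formats[ i ] || [] ).filter(
			( existing ) => existing.type !== format.type
		);
		formats[ i ] = [ ...others, format ];
	}
	return { ...value, formats };
}

function removeFormat( value, type, start, end ) {
	const formats = { ...value.formats };
	for ( let i = start; i < end; i++ ) {
		const remaining = ( formats[ i ] || [] ).filter(
			( existing ) => existing.type !== type
		);
		if ( remaining.length ) {
			formats[ i ] = remaining;
		} else {
			// Drop the key entirely when nothing is left at this index.
			delete formats[ i ];
		}
	}
	return { ...value, formats };
}

const emphasised = {
	formats: {
		8: [ { type: 'em' } ],
		9: [ { type: 'em' } ],
		10: [ { type: 'em' } ],
		11: [ { type: 'em' } ],
	},
	text: 'This is some text.',
};

// Remove emphasis from "me", then make "is so" strong.
const trimmed = removeFormat( emphasised, 'em', 10, 12 );
const bolded = applyFormat( trimmed, { type: 'strong' }, 5, 10 );
```

No nodes are split or merged here; how the overlapping `em` and `strong` end up nested is left entirely to whatever renders the state.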

Would this state structure be something the parser would create and generate from (basically) arbitrary HTML, or would this be done after the parsing step?

I don't know. Could be either.

Is our case different? How complicated do we expect the editor code to become?

I don't know how this is comparable. I'm only suggesting to keep inline content as a string with meta data attached to it, not to store the whole editor as a string. The block levels are still a tree.

@dmsnell
Member

dmsnell commented Jun 14, 2017

I would like to toss out a reference to the Rope data structure specifically designed to handle text editing and operations on long "arrays" (in this case, arrays of characters or blocks).

The basic idea is that we're taking a linear list and turning it into a binary tree to make splitting and joining fast and immutable. We don't want to have to clone or iterate over a long array every time we make a small change, so the tree preserves the locality of edit operations and can be a big boon for aggregate statistics like word count and friends.

The author of google/xi-editor posted some good conceptual documents explaining ropes. I highly recommend the read.

For our case I could easily imagine such a tree that both holds an array of blocks and for each block an array of inner content and children. Operations like splitting with formatting can remain trivial because the tree is generalized on a pair of idempotent split/merge functions. Indexing and access remain fast as the post grows because it's O(log N) with number of blocks and length of content.
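
A deliberately small rope sketch, to make the idea concrete. This is an illustration only, not taken from xi-editor or any library; all names are hypothetical, it holds plain text only (no formatting or blocks), and it does no rebalancing, which a real rope needs in order to keep operations at O(log N).

```javascript
// Minimal illustrative rope: leaves hold text, inner nodes cache total length.
const leaf = ( text ) => ( { text, length: text.length } );
const concat = ( left, right ) => ( {
	left,
	right,
	length: left.length + right.length,
} );
const isLeaf = ( node ) => node.text !== undefined;

// Descend by comparing the index with the cached length of the left subtree.
function ropeCharAt( node, index ) {
	while ( ! isLeaf( node ) ) {
		if ( index < node.left.length ) {
			node = node.left;
		} else {
			index -= node.left.length;
			node = node.right;
		}
	}
	return node.text[ index ];
}

// Split into two ropes without modifying the original nodes.
function ropeSplit( node, index ) {
	if ( isLeaf( node ) ) {
		return [ leaf( node.text.slice( 0, index ) ), leaf( node.text.slice( index ) ) ];
	}
	if ( index <= node.left.length ) {
		const [ a, b ] = ropeSplit( node.left, index );
		return [ a, concat( b, node.right ) ];
	}
	const [ a, b ] = ropeSplit( node.right, index - node.left.length );
	return [ concat( node.left, a ), b ];
}

// Insertion is a split followed by two concatenations.
function ropeInsert( node, index, text ) {
	const [ before, after ] = ropeSplit( node, index );
	return concat( concat( before, leaf( text ) ), after );
}

function ropeToString( node ) {
	return isLeaf( node )
		? node.text
		: ropeToString( node.left ) + ropeToString( node.right );
}

const rope = concat( leaf( 'This is ' ), leaf( 'some text.' ) );
const edited = ropeInsert( rope, 8, 'not ' );
// ropeToString( edited ) === 'This is not some text.'
// ropeToString( rope ) === 'This is some text.' (the original is untouched)
```

Because untouched subtrees are shared between `rope` and `edited`, an edit only allocates nodes along the path to the change.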

I'm going to circle around for a bit and think about what this could look like discretely. @iseulde and I spoke in person about it and toyed with some use-cases of splitting, merging, and editing blocks and their content but it's easy to let the rich-text editor trump every other block.

In our discussion we raised a significant question: what about global needs for a post? Something simple like a footnote challenges our data structures and the way we finally render and interact with the content.

@jeffpaul jeffpaul added this to the Merge Proposal milestone Feb 8, 2018
@ellatrix ellatrix self-assigned this Jun 14, 2018
@ellatrix ellatrix changed the title from "Editable state structure" to "RichText State Structure" Jun 14, 2018
@ellatrix ellatrix added the [Feature] Extensibility The ability to extend blocks or the editing experience label Jun 14, 2018
@ellatrix
Member Author

ellatrix commented Oct 3, 2018

Addressed in #7890.


6 participants