Tag type to prevent explosion of formats #21

Open
streamich opened this issue Dec 10, 2023 · 1 comment

Comments

@streamich

streamich commented Dec 10, 2023

Currently the RESP3 specification allows the same data type to be encoded in several alternative ways, each carrying a different semantic meaning. For example:

  • A map can be encoded as: (1) a Map type; (2) a Push type; or (3) an Attributes type.
  • An array can be encoded as: (1) an Array type; or (2) a Set type.
  • A string can be encoded as one of the several String types or as an Error type.
    • Simple string and Simple error, and likewise Bulk string and Bulk error, are the same data types with different semantic meanings.

On top of that, each type also has a streamed version. This leads to an explosion of types.

In CBOR this problem is solved by the tag type: a tag can be wrapped around any other data item to give it additional semantics.


Proposal

Add the ability to tag any RESP type. A tag attaches a semantic meaning to the node it wraps, but does not change that node's data type.

Syntax of a tag:

)<tag>\r\n
<another-resp-node>

For example, below is a tag with value 123 wrapped around a string abc:

)123\r\n
+abc\r\n
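
To illustrate how a decoder might handle this syntax, here is a minimal Python sketch. It is not taken from any existing RESP library: the names `Tagged` and `parse_node` are hypothetical, only simple strings (`+`) and numbers (`:`) are supported as inner nodes, and the tag value is assumed to be a decimal integer.

```python
from dataclasses import dataclass
from typing import Any, Tuple

CRLF = b"\r\n"


@dataclass
class Tagged:
    """A node wrapped in the proposed `)` tag (hypothetical representation)."""
    tag: int
    value: Any


def parse_node(buf: bytes, pos: int = 0) -> Tuple[Any, int]:
    """Parse one node starting at `pos`; return (value, next position)."""
    prefix = buf[pos:pos + 1]
    end = buf.index(CRLF, pos)               # end of the header line
    line, nxt = buf[pos + 1:end], end + len(CRLF)
    if prefix == b")":
        # Proposed tag: the header line holds the tag number and the
        # wrapped node follows immediately, so recurse to read it.
        inner, nxt = parse_node(buf, nxt)
        return Tagged(int(line), inner), nxt
    if prefix == b"+":                       # simple string
        return line.decode("utf-8"), nxt
    if prefix == b":":                       # number
        return int(line), nxt
    raise ValueError(f"unsupported prefix {prefix!r}")
```

Because the tag branch simply recurses, the wrapped node keeps its own data type, and tags could be nested without extra code.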

The value of a tag, like 123, can be any number. Some numbers would have a reserved meaning, for example:

  • 1 - push tag
  • 2 - attributes tag
  • 3 - set tag
  • 4 - error
  • 5 - UTF-8 text
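
A client could keep the reserved values in a small lookup so that unknown tags pass through untouched. The sketch below is only an illustration of that idea; `ReservedTag` and `describe_tag` are made-up names, and the numbers are the example values from the list above, not an agreed registry.

```python
from enum import IntEnum


class ReservedTag(IntEnum):
    """Example reserved tag numbers from the proposal (not a final registry)."""
    PUSH = 1
    ATTRIBUTES = 2
    SET = 3
    ERROR = 4
    UTF8_TEXT = 5


def describe_tag(tag: int) -> str:
    """Return a readable name for a tag; unreserved tags stay opaque."""
    try:
        return ReservedTag(tag).name.lower()
    except ValueError:
        return f"unreserved({tag})"
```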

This way an attributes message can be encoded as a tagged Map, instead of adding a new Attributes type:

)2\r\n
%0\r\n

Instead of:

|0\r\n

Alternatively, to save two bytes per tag, a tag could be specified without the \r\n separator:

)2%0\r\n

Omitting \r\n for tags could be justified because the tag is not a separate data type. It is part of the node it is attached to and only specifies the semantic meaning of that node.
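
The compact form stays unambiguous as long as tag values are non-negative decimal integers, because no RESP3 type prefix is a digit: the tag ends at the first non-digit byte. A rough Python sketch of that split follows; `split_compact_tag` is a hypothetical helper, and the non-negative-integer restriction is an assumption, not something the proposal states.

```python
from typing import Tuple


def split_compact_tag(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Read `)<digits>` at `pos`; return (tag, position of the wrapped node)."""
    if buf[pos:pos + 1] != b")":
        raise ValueError("not a tag")
    start = end = pos + 1
    # Consume ASCII digits; the first non-digit byte is the type prefix
    # of the node the tag is attached to.
    while buf[end:end + 1].isdigit():
        end += 1
    if end == start:
        raise ValueError("tag has no digits")
    return int(buf[start:end]), end
```

For `)2%0\r\n` this yields tag `2` and position `2`, where the `%` of the wrapped Map begins.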


Push:

)1%0\r\n

instead of:

>0\r\n

Set:

)3*0\r\n

instead of:

~0\r\n

Error:

)4+ERR\r\n

instead of:

-ERR\r\n

Bulk error:

)4$3\r\n
ERR\r\n

instead of:

!3\r\n
ERR\r\n

Verbatim error:

)4=7\r\n
txt:ERR\r\n

Streaming error:

)4$?\r\n
;3\r\n
ERR\r\n
;0\r\n

Text guaranteed to be UTF-8:

)5+Hello 👋\r\n
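
Several of the pairs above differ only in the first byte, so translating an existing frame into its compact tagged form can be sketched as a prefix rewrite. This is an illustration under assumptions: `LEGACY_TO_TAGGED` and `to_tagged` are hypothetical names, only the top-level prefix is rewritten, and Push is left out because RESP3 lays Push frames out like an Array, so its base type would need to be settled by the proposal.

```python
# Legacy prefix -> (example tag number, base type prefix), per the pairs above.
LEGACY_TO_TAGGED = {
    b"|": (2, b"%"),  # Attributes -> tagged Map
    b"~": (3, b"*"),  # Set        -> tagged Array
    b"-": (4, b"+"),  # Simple error -> tagged simple string
    b"!": (4, b"$"),  # Bulk error   -> tagged bulk string
}


def to_tagged(frame: bytes) -> bytes:
    """Rewrite the first byte of `frame` into the compact tagged form."""
    prefix = frame[:1]
    if prefix not in LEGACY_TO_TAGGED:
        return frame  # already a base type, nothing to rewrite
    tag, base = LEGACY_TO_TAGGED[prefix]
    return b")" + str(tag).encode("ascii") + base + frame[1:]
```
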
@414owen

414owen commented Jan 21, 2024

When you say an explosion of types, I take it you mean an explosion of forms to be parsed, rather than types in the sense of datatypes that can be returned as results of a parse?

FWIW, I don't seem to have this issue. You can share a lot of the code between the blob string, blob error, and verbatim string parsers (although these parsers are extremely small anyway).
Same goes with the Array and Set parsers, and presumably the map ones too.

I would prefer to see the specification include guidance on when blob strings can be considered utf8-safe (à la your other issue), rather than see this new form, which can change the meaning of other forms, introduced.
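
For reference, the kind of sharing described above might look like the following Python sketch, where one routine handles blob strings, blob errors, and verbatim strings and only the returned label differs. `parse_blob` is a hypothetical helper, not code from any particular parser, and streamed and null forms are not handled.

```python
from typing import Tuple

BLOB_KINDS = {b"$": "string", b"!": "error", b"=": "verbatim"}


def parse_blob(buf: bytes, pos: int = 0) -> Tuple[str, bytes, int]:
    """Parse a length-prefixed blob; return (kind, payload, next position)."""
    kind = BLOB_KINDS[buf[pos:pos + 1]]
    header_end = buf.index(b"\r\n", pos)
    length = int(buf[pos + 1:header_end])
    start = header_end + 2
    stop = start + length
    if len(buf) < stop + 2 or buf[stop:stop + 2] != b"\r\n":
        raise ValueError("blob is truncated or missing its trailing CRLF")
    # For verbatim strings the payload still includes the `txt:` style prefix.
    return kind, buf[start:stop], stop + 2
```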
