[Question] Unescaped dump #1479

Closed
p-groarke opened this issue Feb 12, 2019 · 9 comments
Labels
kind: question · solution: proposed fix (a fix for the issue has been proposed and waits for confirmation)

Comments

@p-groarke

I'm using jsonformoderncpp as a simple intermediary to traverse json and store it in another hierarchical format. All I need is strings for every key/value pair, regardless of their type. I use dump, which works as expected, but it adds escaped quotes to values.

Is there a way to get values "as is", without escaped quotes? get<std::string> can't work, as it checks that the json type is actually a string.

@jaredgrubb
Contributor

If I understand correctly, you want the JSON string representation of each value. You could just call value.dump() on each one, right?

@p-groarke
Author

Yes, that's what I'm doing, but dump() keeps the quotes. What I'm looking for is serialization of the value.

Let me try to explain what I mean. Let's say I have the following:

{ "name" : "bob" }

dump() will output "bob". I'm looking to get bob without the quotes, as if the value were deserialized to a string. Does that make any sense?

I can do the parsing myself, I just want to make sure the capability to do this isn't already available so I don't take the perf hit of cleaning all the strings.

@jaredgrubb
Contributor

jaredgrubb commented Feb 12, 2019

Let's be more concrete and include something where the difference matters.

Suppose you have the JSON string {"country": "\u65e5\u672c"}:

value.dump() will give the string "\u65e5\u672c" (14 bytes, all ASCII)
value.get<string>() will give you the string 日本 (6 bytes, UTF-8)

Are you saying you want \u65e5\u672c (12 bytes, ASCII), i.e. dropping the quotes but keeping the JSON-style escaping?
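
The byte counts above can be checked with a small standard-library-only sketch. It does not use the JSON library at all; encode_utf8_bmp, decoded_example and escaped_example are hypothetical helper names introduced here for illustration, and the encoder only covers code points up to U+FFFF (no surrogate pairs).

```cpp
#include <cstdint>
#include <string>

// Encode one code point from the Basic Multilingual Plane (U+0000..U+FFFF)
// as UTF-8. Surrogates and code points above U+FFFF are not handled.
std::string encode_utf8_bmp(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// The decoded text from the example: U+65E5 followed by U+672C.
std::string decoded_example() {
    return encode_utf8_bmp(0x65E5) + encode_utf8_bmp(0x672C);
}

// The JSON-escaped spelling of the same text, without surrounding quotes.
std::string escaped_example() {
    return "\\u65e5\\u672c";
}
```

decoded_example().size() is 6 and escaped_example().size() is 12; adding the two surrounding quotes to the escaped form gives the 14 bytes mentioned for the quoted string.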

@p-groarke
Author

No, I would want the string "as-if deserialized and reserialized". So 日本, without quotes. However, I need this for every basic type as well (ints, floats etc).

So the setup:
db -> jsonformoderncpp, decomposes everything as string -> templating lib -> output

@p-groarke
Author

Just a note though: I would be OK with \u65e5\u672c without quotes, as I can always parse the UTF-8 myself or use another lib.

@gregmarr
Contributor

I remember another issue like this where the recommendation was to walk through the elements in the array/object, and if the type is string, use get<std::string>, and for anything else, use dump(). I was looking for that issue, but wasn't able to find it.

@gregmarr
Contributor

I think this is the one: #1181

@jaredgrubb
Contributor

If you need this for basic types too, then how are you going to distinguish between the number 42 and the string "42"? For example, given the JSON object { "a": 42, "b": "42" }, the values are very different things, and it sounds like you want the same two-byte string 42 for both values? Going even further, I am trying to imagine what you are hoping to get for an array of strings or, worse, an array that has strings and numbers. :)

But, yes, you would need to do this at a higher level by switching on the type; there is nothing in the library that does this (because it removes information that cannot be recovered).

@p-groarke
Author

p-groarke commented Feb 12, 2019

I remember another issue like this where the recommendation was to walk through the elements in the array/object, and if the type is string, use get<std::string>, and for anything else, use dump(). I was looking for that issue, but wasn't able to find it.

Of course! That's perfect thx :)

To answer your question, the output is HTML. So 42 and "42" are going to be rendered the same way: 42. As far as I can tell there is no problem (I may be hallucinating, tbd ;) ). Note there is no processing happening, just rendering the values as text to HTML.


4 participants