-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core-util] Add reshape helper #28095
Conversation
API change check APIView has identified API level changes in this PR and created following API reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is super cool, this will help us simplify the generated code a lot!
needs to be exported in index.ts |
expect(() => reshape(input, "a.b[]", "oops")).toThrowError("cannot rename array indices"); | ||
}); | ||
|
||
it("can rename properties of elements of an array", function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to flatten the properties by using this util?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we can support add nodes into it, then the flatten problem should also be solved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could write something like this:
reshape({ properties: { a: 1, b: 2 } }, "", (input: any) => input.properties);
(This helped me find a bug where I wasn't handling if you returned something different when selecting the root object)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not do something like
reshape({ properties: { a: 1, b: 2 } }, "properties.a", "a")
reshape({ properties: { a: 1, b: 2 } }, "properties.b", "b")
want to sell jsonpath a little :)
if we support jsonpath, we could just use something like reshape({ properties: { a: 1, b: 2 } }, "$.properties", "$.")
@xirzec what I was referring to adding nodes was something during serialization in flatten's case
reshape({ a: 1, b: 2 }, "a", "properties.a")
reshape({ a: 1, b: 2 }, "b", "properties.b")
Which I am not sure if it's working?
Back to the flatten, it looks a little weird to me, that where we put the flatten reshape could have impact on the reshape of the properties inside it.
if we have something like body = {properties: { a: {c: 3}, b: 2 } }
,
when we do flatten first, we will get
body = reshape(body, "properties.a", "a")
body = reshape(body, "a", "d")
when we do flatten last, we will get
body = reshape(body, "properties.a", "properties.d")
body = reshape(body, "properties.d", "d")
which is why I prefer to support reshape in codegen as models based, instead of properties based. see Azure/autorest.typescript#2180 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't mark it as resolved now, I still have concerns about this part. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, does TypeSpec have a way to move properties around on models?
I know we've been talking about flatten, but in general my understanding is that when we map "friendly names" they exist at the same level. That's why the "rename" signature for reshape
takes only the new name of the final property pointing to the selected object rather than a full path again.
For example, reshape({a: { b: { c: 1 }}}, "a.b.c", "d")
produces {a: {b: { d: 1}}}
not {a: { b: {} }, d: 1}
To be clear, I'm generally opposed to flattening per our other discussions, so I wasn't really trying to design for it. You are correct though that we'd have to be careful to do deeper modifications first if we were going to flatten a parent property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does TypeSpec have a way to move properties around on models?
No, TypeSpec doesn't and they don't intend to provide this ability if I understand correctly. Though @bterlson could comment to confirm.
If our intention for reshape is just for rename properties or changing the type of values, then reshape would be a little weird name to me ?
Per the other discussion, I think our point for flatten is to only support flatten for those RPs with high usage. which means we still need to support flatten, let's say we have other way to support flatten in codegen, the flatten order still have influence on the reshape.
I still prefer to have a reshape helper that can more generally do the convert on models basis instead of properties basis, maybe not for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another point is if the purpose of reshape is limited to rename the properties and change the properties value, I am actually not very convinced that to how much extent it could simply the generated code, the current generated code for serialize and deserialize has a lot of lines because most of the lines are just copy values, which is a problem that can somehow resolved a little if we can have better modularization issue Azure/autorest.typescript#1971 resolved first ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to both better names and alternative codegen approaches. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can work on a poc of how I think the generated code should look like ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments, but I do feel like this is the right thing to do about the serialize and deserialize :)
|
||
type SelectorPart = SelectorArrayIterator | SelectorPropertyName; | ||
|
||
function* selectorParts(selector: string): Iterable<SelectorPart> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if it's possible, but could we leverage jsonpath to do the selection ? I feel like that's more generally used ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are considering jsonpath to do the selection, could it be used in the customization? like in the original autorest modelerfour, they use directive for customized transformation, and the selector is also using jsonpath.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was looking at this a bit, and it seems that jsonpath never quite made it into an RFC: https://datatracker.ietf.org/wg/jsonpath/about/
The original description appears to be hosted here: https://goessner.net/articles/JsonPath/index.html#e2
I think my main concern with trying to support it is that it is more verbose and has additional capabilities that I'm not sure we would want, like the array slice operator and traversal expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I ask a silly question here, I am not sure I get the point of why jsonpath being made into an RFC matters to us ? 😊
My concerns of supporting selector by ourself is that, it could possibly be buggy as I am not sure if we have all cases considered ? I am also not sure if it's a good idea that service team who wants to write their own customization serialize and deserialize function and would have to learn the grammer we invented here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I would prefer to have jsonpath to locate the objects with less learning curve. Is it possible our selectors are just selected/supported subsets of jsonpath?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose there are two different considerations here, one is taking a dependency on some third party jsonpath library, the other is supporting the full jsonpath syntax.
The former I'm not that interested in, since it feels unnecessary to take a dependency for the problem we have (provide an efficient and debuggable primitive for renaming properties and mapping values) and we wouldn't be able to easily optimize any code in the dependency itself.
As to the latter, if we're saying "hey this is a syntax people know, so we should use it", that's great except since it's not a standard, there could be different implementations with different behaviors/syntax, so it's not really buying us as much familiarity with an existing syntax. In addition, since this isn't a consumer-facing function, I'm not sure I buy the learning curve being difficult or there being a high need for familiarity.
To address @MaryGao's point, if we wanted to align the syntax we totally could, but it would simply change something like a.b.c[]d
into $.a.b.c[*].d
and all we've really done is:
- add a questionably useful
$.
prefix to every single selector - forced you to write
*
in between the array braces - put an extra
.
after the brace syntax to access properties inside an array element.
I'm not convinced that adding these extra characters greatly improves readability.
I'm curious about this statement:
service team who wants to write their own customization serialize and deserialize function
Is this a scenario we support today? And if so, why would they necessarily have to use this helper? Presumably if they are going through the work of writing a custom TS function that converts one shape to another, they could do so efficiently without any helpers, much like we did in past track 2 dataplane packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder a case like { a: {b: {c: 1}}, b: {c: 2} }
, if we pass b.c
, would it be confusing which part we are referring to ?
Is this a scenario we support today? And if so, why would they necessarily have to use this helper?
Can't argue with that, customer can definitely write customization helper as they like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder a case like { a: {b: {c: 1}}, b: {c: 2} }, if we pass b.c, would it be confusing which part we are referring to ?
I mean if a model from a service has multiple sections with repeated names it could be a little confusing to look at, but is it more confusing than reading parsedBody.a.b.c
vs parsedBody.b.c
?
Would it be useful to take a list of mappings? I thought about chaining calls too: deserializedResult = new Transformer(result.body)
.reshape("created_at", "createAt")
.reshape("fine_tuned_model", "fineTunedModel")
// ...
.result(); But neither unlikely matters much for generated code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied to comments and pushed some updates. Thanks for the feedback!
|
||
type SelectorPart = SelectorArrayIterator | SelectorPropertyName; | ||
|
||
function* selectorParts(selector: string): Iterable<SelectorPart> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was looking at this a bit, and it seems that jsonpath never quite made it into an RFC: https://datatracker.ietf.org/wg/jsonpath/about/
The original description appears to be hosted here: https://goessner.net/articles/JsonPath/index.html#e2
I think my main concern with trying to support it is that it is more verbose and has additional capabilities that I'm not sure we would want, like the array slice operator and traversal expressions.
expect(() => reshape(input, "a.b[]", "oops")).toThrowError("cannot rename array indices"); | ||
}); | ||
|
||
it("can rename properties of elements of an array", function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could write something like this:
reshape({ properties: { a: 1, b: 2 } }, "", (input: any) => input.properties);
(This helped me find a bug where I wasn't handling if you returned something different when selecting the root object)
My main consideration for not taking an array or doing a fluent pattern was I wanted to make it really easy to set a breakpoint at each step and log before/after so you could quickly narrow down where the shape wasn't what you expected. |
import { describe, it, assert, expect } from "vitest"; | ||
import { reshape } from "../../src/reshape"; | ||
|
||
describe("reshape", function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that input
could be an array?
What if the input is not an object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to work:
it("can handle arrays as the initial input", function () {
const input = [1, 2, 3];
const actual = reshape(input, "[]", String);
const expected = ["1", "2", "3"];
assert.deepEqual(actual, expected);
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have another question, should we add this to the non branded core ?
expect(() => reshape(input, "a.b[]", "oops")).toThrowError("cannot rename array indices"); | ||
}); | ||
|
||
it("can rename properties of elements of an array", function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does TypeSpec have a way to move properties around on models?
No, TypeSpec doesn't and they don't intend to provide this ability if I understand correctly. Though @bterlson could comment to confirm.
If our intention for reshape is just for rename properties or changing the type of values, then reshape would be a little weird name to me ?
Per the other discussion, I think our point for flatten is to only support flatten for those RPs with high usage. which means we still need to support flatten, let's say we have other way to support flatten in codegen, the flatten order still have influence on the reshape.
I still prefer to have a reshape helper that can more generally do the convert on models basis instead of properties basis, maybe not for this PR.
|
||
type SelectorPart = SelectorArrayIterator | SelectorPropertyName; | ||
|
||
function* selectorParts(selector: string): Iterable<SelectorPart> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder a case like { a: {b: {c: 1}}, b: {c: 2} }
, if we pass b.c
, would it be confusing which part we are referring to ?
Is this a scenario we support today? And if so, why would they necessarily have to use this helper?
Can't argue with that, customer can definitely write customization helper as they like.
expect(() => reshape(input, "a.b[]", "oops")).toThrowError("cannot rename array indices"); | ||
}); | ||
|
||
it("can rename properties of elements of an array", function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another point is if the purpose of reshape is limited to rename the properties and change the properties value, I am actually not very convinced that to how much extent it could simply the generated code, the current generated code for serialize and deserialize has a lot of lines because most of the lines are just copy values, which is a problem that can somehow resolved a little if we can have better modularization issue Azure/autorest.typescript#1971 resolved first ?
Abandoning this helper in favor of improving the codegen logic directly. The complexity of specifying the property to change being the most undesirable feature. |
Packages impacted by this PR
@azure/core-util
Describe the problem that is addressed by this PR
When trying to serialize/deserialize object payloads, we often have to recurse into them in order to rename properties and stringify/destringify values. This helper allows for applying these kinds of operations to an object where there is a known path to the property being modified. In the event the property does not exist, no mutation occurs.
The only thing I don't currently handle is repeated selection, e.g. if you have a potential cycle in the path to a property and want to try recursing as deeply as you can.
Are there test cases added in this PR? (If not, why?)
Yes!
Checklists