[PROPOSAL] Replace Smithy with a native OpenAPI spec #189

nhtruong · 2024-02-08T19:04:51Z

In the past discussions on Smithy vs OpenAPI [1] [2], we decided to have OpenAPI as the official spec of OpenSearch but use Smithy to describe the API as it's easier to organize:

We won’t use Smithy features that are not translatable to OpenAPI.

Any update to the Smithy spec must always be reflected in the OpenAPI spec.

Smithy can be dropped in the future if necessary, allowing contributors to update to the OpenAPI spec directly rather than through the Smithy spec.

After a year of going with this strategy, as we've learned a lot more about the quirks of OpenSearch API that are not compatible with Smithy's many guardrails, writing the spec in Smithy is becoming more challenging:

1. Smithy does not support certain data types for path and query-string parameters

Many OpenSearch operations have URL parameters that accepts either a single value as a string, an array of values separated by commas. That is, these parameters have string | string[] as data type. This can be described in Smithy as a union. However, Smithy doesn't allow unions in its URL parameters (It also doesn't allow arrays/lists in path parameters specifically).

WORKAROUND: represent the data type as a string in Smithy but mark it with xDataType('string | string[]'). If the member of the list is not generic, we must also include xArrayItems('MemberName'), and xEnum([...]) when applied.

smithy-lang/smithy#2131

2. Smithy does not support untagged unions.

Only tagged unions are supported, which enforce a wrapping structure, while we have fields that are for example int | boolean. This would map to a oneOf in OpenAPI.

smithy-lang/smithy#2143

3. OpenSearch operations cause URI Conflict errors in Smithy

For example, since operations regarding an index or a list of indices have URI starting with /{index}/..., other operations, say GET _cat/indices, cause URL Conflict for Smithy as it sees _cat as an {index} parameter.

smithy-lang/smithy#1539

4. Unable to customise inlining and `$ref` naming strategy in OpenAPI conversion

If we are able to customise this strategy we can stop certain “simple” types from always being inlined in the OpenAPI conversion, and we can implement a custom naming scheme for the $refs such that we can name them to match the ElasticSearch spec we inherit. (e.g. #/components/schemas/_type:Duration)

smithy-lang/smithy#2144

5. Smithy mixins are always flattened/erased.

We’re unable to easily represent an inheritance hierarchy within Smithy that is translated to OpenAPI. Potential solution by introducing flag in translator that somehow retains mixins and puts them into allOf in OpenAPI.

smithy-lang/smithy#2145

In short, Smithy, with its many guardrails, is unfit to describe OpenSearch API:

We can't correctly describe OpenSearch API with Smithy
Translation from Smithy to OpenAPI is not lossless
There are too many duck-tape workarounds to circumvent Smithy guardrails

The many gotchas, and conventions we have to establish (as seen in the DEVELOPER_GUIDE) also deter people from contributing to this project. People maintaining this repo, on top of understanding OpenAPI, must also learn Smithy and the workarounds to make it work with OpenSearch API.

Given the difficulties in mapping OpenSearch's API to Smithy, I propose we instead maintain our API definitions directly in OpenAPI. Smithy a great tool for creating new APIs thanks to its many guardrails, but OpenSearch API has many unconventional traits that are simply incompatible with it.

The text was updated successfully, but these errors were encountered:

dblock · 2024-02-08T20:00:26Z

Are there things we lose by switching to writing OpenAPI directly? For example, we will likely lose some of the nice features of Smithy such as traits. What about the folder structure? Will developers be editing a giant file?
Are there related issues for the ones described. I believe URI Pattern Validation and Conflict Resolution design doc smithy-lang/smithy#1539 is (3). Maybe we should open issues for others and see what the smithy folks say about it?

nhtruong · 2024-02-08T20:28:39Z

We will definitely lose all the features that are unique to Smithy. While its many guardrails are not compatible with existing OpenSearch API, they are actually great for new endpoints and new applications. Many of Smithy traits do have counterparts in OpenAPI and we can leverage OpenAPI extensions to cover the missing ones.
That might solve Bullet point number 2. But looks like it's still a draft for a design from Dec 2022? Though I see that restriction in Smithy as a necessary guardrail to prevent conflicting endpoints, so I didn't think of asking Smithy to "correct" it as it's not a bug to me. It's just not compatible with OpenSearch existing endpoints.

reta · 2024-02-09T21:28:27Z

👍 to go OpenAPI full speed, Smithy gets its credits but the "recent" updates to OpenAPI spec (3.1) are buying us much more (specifically, JSON Schema support)

Tokesh · 2024-02-10T14:30:06Z

Its sounds logical that we simply and write directly in OpenAPI.

Not all Smithy models are translated to named OpenAPI schema

OpenSearch operations cause URI Conflict errors in Smithy

I have encountered the two problems described above. It would be interesting to try writing directly into OpenAPI without the extra layer of Smithy.

VachaShah · 2024-02-13T23:06:07Z

Are we planning to write plain json for API specs? Or is there a more developer friendly way we are thinking to adopt to write OpenAPI effectively?
If we go the direct OpenAPI route, are there any other issues we would face when writing OpenSearch APIs in OpenAPI?

Just trying to understand that the issues we are facing currently with Smithy are entirely related to how Smithy works or are they because OpenSearch APIs are written in a certain way and we end up writing workarounds with OpenAPI as well?

nhtruong · 2024-02-14T00:12:28Z

@VachaShah

We definitely want to avoid editing a giant JSON/YAML file by hand. OpenAPI uses JSON Pointer in its reference objects that supports reusing and defining components in different files. We can leverage this and break our api into smaller files by namespaces, for example. And write a script to merge those into a single file like we have today.
OpenAPI is very flexible and forgiving (it's both its strength and weakness comparing to Smithy), which is suitable for a peculiar API like OpenSearch.

Of course we will lose out on the biggest perk that comes out of the box with Smithy: Organization of files. Like mentioned in the first bullet, OpenAPI does support that but we will have to be smart about it.

I'm working on the Proof of Concept on how an editable pure OpenSearch API Spec would look like right now.

nhtruong · 2024-02-20T18:43:42Z

Here's the POC Branch: https://github.com/opensearch-project/opensearch-api-specification/tree/feature/native_openapi
Check out the DEVELOPER_GUIDE to have a better understanding of the new layout.

Note that the POC spec are not the 1:1 translation from the current spec:

The POC spec also includes the missing request and response bodies
Many shemas and references have be renamed and/or moved to heed the new guidelines stated in the Dev Guide, and to keep things tidy and consistent.

dblock · 2024-02-27T16:23:14Z

Looks good. Lowercase tools ;) @Xtansia thoughts?

Xtansia · 2024-02-27T20:37:20Z

It looks good, I've had good success updating my Java codegen POC to operate off the split spec: opensearch-project/opensearch-java#366

Only issue to note is that at least in the case of the multi-file spec, Java/swagger-parser is unhappy with the $refs that include extra #s like below, because it attempts to parse the $ref as a URI (because it's a relative reference calculated from the current document base URI) and # is a reserved character and should be percent encoded unless it's the fragment delimiter:

 - $ref: '#/components/parameters/cluster.health#query:expand_wildcards'

dblock · 2024-02-28T16:26:38Z

Ideas.

As part of this change I'd like to propose we stop publishing the combined spec as a PR back into the repo. Can we have a "build" for that, and publish something versioned that can be a "release" that ends up matching the version of the product? I think it will become important when we have caught up with the spec, or have major versions.
Can/should OpenSearch.openapi.yaml be broken up where all the API-related contents are in subfolders?

nhtruong · 2024-02-28T22:50:17Z

@Xtansia RE $ref in split spec. Java Swagger-Parser doesn't have the same complain in single-file spec? It matters because if we can limit programmatic consumption to single-file spec, this is no longer an issue. If it's still an issue, do you think replacing # with : would work?

@dblock

I'm thinking having the single-file updated as the contributor lint their changes (The linter will need to perform OpenAPI validation on the single-file version as part of its process anyway). It's nice to see how changes to the component files affect the resulted single-file spec. It gives a different angle to the changes when reviewing the PR. That is, the PR must also include the update to the single-file spec, and a workflow will check if the changes to the split spec match that of the single-file spec.
We would still have a build process like you proposed of course.
Aren't all contents in the spec API related contents? Can you elaborate on this idea?

Xtansia · 2024-02-29T00:12:05Z

@nhtruong for the $refs it's not blocking and I wasn't relying on the libraries built in resolving anyway. However it is something to keep in mind even in the single file spec if we want to be fully correct according to the OpenAPI specification. Valid characters for a URI fragment are: A-Za-z, 0-9, -._~!$&'()*+,;=:@/? or otherwise percent-encoded characters

nhtruong · 2024-02-29T03:57:00Z

@Xtansia I see. I'll convert all # to @ in $ref

dblock · 2024-03-01T21:14:06Z

I'm thinking having the single-file updated as the contributor lint their changes

I understand. I think the question is really whether we check in the combined result or not. My initial thought is not because it's a generated file, but I don't feel that strongly about it.

We would still have a build process like you proposed of course.

I'm thinking a release process is more important, something we don't have today in this repo. For example, creating a semver-versioned tag would trigger a proper release like here.

Aren't all contents in the spec API related contents? Can you elaborate on this idea?

My issue is simpler, https://github.com/opensearch-project/opensearch-api-specification/blob/feature/native_openapi/OpenSearch.openapi.yaml is huge and difficult to edit. Maybe it can be broken up into a file per path.

nhtruong · 2024-03-03T19:02:51Z

That file you linked is not meant to be edited. The spec is split into smaller files in the spec folder: https://github.com/opensearch-project/opensearch-api-specification/tree/feature/native_openapi/spec

The OpenSearch.openapi.yaml in the root folder is generated from the files in the folder above. It's meant to be consumed programmatically and compare the diff of different versions of the spec.

dblock · 2024-03-04T18:16:32Z

That file you linked is not meant to be edited. The spec is split into smaller files in the spec folder: https://github.com/opensearch-project/opensearch-api-specification/tree/feature/native_openapi/spec

The OpenSearch.openapi.yaml in the root folder is generated from the files in the folder above. It's meant to be consumed programmatically and compare the diff of different versions of the spec.

I see. That goes back to my other comment about checking in generated files. We should either remove those from the repo, or at least have a build folder or something like that for them and add markers/comments into the file(s) to say they should not be edited.

nhtruong · 2024-03-26T21:57:53Z

Here's the PR for the switch from Smithy to OpenAPI: #208. It's easier to review this PR like reviewing a new repo by inspecting the feature branch.

We'll need to disable the coverage workflow to get this merged. I'm working on 2 follow-up PRs:

Make coverage workflow work for yaml specs
Add Validator to make sure that new changes into the spec folder follow guidelines in the developer guide.

nhtruong · 2024-04-02T16:15:59Z

Closing this since #208 has already been merged.

github-actions bot added the untriaged label Feb 8, 2024

nhtruong removed the untriaged label Feb 8, 2024

dblock added the enhancement New feature or request label Feb 8, 2024

dblock assigned nhtruong Feb 13, 2024

nhtruong mentioned this issue Feb 26, 2024

Locked OpenSearch CI cluster to version 2.11.1 #193

Merged

dblock changed the title ~~[PROPOSAL] Dropping Smithy in favor of Native OpenAPI~~ [PROPOSAL] Replace Smithy with a native OpenAPI spec Mar 4, 2024

dblock pinned this issue Mar 14, 2024

dblock mentioned this issue Mar 14, 2024

[draft] Use a git submodule to include the OpenSearch.openapi.json file opensearch-project/opensearch-py#627

Closed

vamsi-amazon mentioned this issue Mar 26, 2024

[RFC] Integrating SQL/PPL query languages into DSL via the _search API opensearch-project/OpenSearch#12434

Closed

This was referenced Mar 28, 2024

[FEATURE] Re-add spec coverage #209

Closed

Added Tests for Search Pipeline and Notifications Plugin opensearch-project/opensearch-py#668

Merged

nhtruong closed this as completed Apr 2, 2024

dblock mentioned this issue Apr 2, 2024

Creating multiple matrix of versions for testing CI framework. #43

Closed

dblock unpinned this issue Apr 2, 2024

dbwiddis mentioned this issue Apr 5, 2024

[FEATURE] Document MLClient API using spec files opensearch-project/ml-commons#2297

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PROPOSAL] Replace Smithy with a native OpenAPI spec #189

[PROPOSAL] Replace Smithy with a native OpenAPI spec #189

nhtruong commented Feb 8, 2024 •

edited by dblock

Loading

dblock commented Feb 8, 2024 •

edited

Loading

nhtruong commented Feb 8, 2024

reta commented Feb 9, 2024

Tokesh commented Feb 10, 2024 •

edited

Loading

VachaShah commented Feb 13, 2024

nhtruong commented Feb 14, 2024 •

edited

Loading

nhtruong commented Feb 20, 2024 •

edited

Loading

dblock commented Feb 27, 2024

Xtansia commented Feb 27, 2024

dblock commented Feb 28, 2024 •

edited

Loading

nhtruong commented Feb 28, 2024 •

edited

Loading

Xtansia commented Feb 29, 2024

nhtruong commented Feb 29, 2024

dblock commented Mar 1, 2024

nhtruong commented Mar 3, 2024

dblock commented Mar 4, 2024

nhtruong commented Mar 26, 2024 •

edited

Loading

nhtruong commented Apr 2, 2024

[PROPOSAL] Replace Smithy with a native OpenAPI spec #189

[PROPOSAL] Replace Smithy with a native OpenAPI spec #189

Comments

nhtruong commented Feb 8, 2024 • edited by dblock Loading

1. Smithy does not support certain data types for path and query-string parameters

2. Smithy does not support untagged unions.

3. OpenSearch operations cause URI Conflict errors in Smithy

4. Unable to customise inlining and $ref naming strategy in OpenAPI conversion

5. Smithy mixins are always flattened/erased.

dblock commented Feb 8, 2024 • edited Loading

nhtruong commented Feb 8, 2024

reta commented Feb 9, 2024

Tokesh commented Feb 10, 2024 • edited Loading

VachaShah commented Feb 13, 2024

nhtruong commented Feb 14, 2024 • edited Loading

nhtruong commented Feb 20, 2024 • edited Loading

dblock commented Feb 27, 2024

Xtansia commented Feb 27, 2024

dblock commented Feb 28, 2024 • edited Loading

nhtruong commented Feb 28, 2024 • edited Loading

Xtansia commented Feb 29, 2024

nhtruong commented Feb 29, 2024

dblock commented Mar 1, 2024

nhtruong commented Mar 3, 2024

dblock commented Mar 4, 2024

nhtruong commented Mar 26, 2024 • edited Loading

nhtruong commented Apr 2, 2024

nhtruong commented Feb 8, 2024 •

edited by dblock

Loading

4. Unable to customise inlining and `$ref` naming strategy in OpenAPI conversion

dblock commented Feb 8, 2024 •

edited

Loading

Tokesh commented Feb 10, 2024 •

edited

Loading

nhtruong commented Feb 14, 2024 •

edited

Loading

nhtruong commented Feb 20, 2024 •

edited

Loading

dblock commented Feb 28, 2024 •

edited

Loading

nhtruong commented Feb 28, 2024 •

edited

Loading

nhtruong commented Mar 26, 2024 •

edited

Loading