Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Get rid of the dependency on OpenSearch core #262

Open
dblock opened this issue Nov 9, 2022 · 19 comments
Open

[FEATURE] Get rid of the dependency on OpenSearch core #262

dblock opened this issue Nov 9, 2022 · 19 comments
Assignees
Labels
enhancement New feature or request performance Make it fast!

Comments

@dblock
Copy link
Member

dblock commented Nov 9, 2022

Is your feature request related to a problem?

The opensearch-java library uses RestClient from OpenSearch core, bringing in the kitchen sink and causing problems such as #156.

What solution would you like?

Get rid of the dependency.

@dblock dblock added enhancement New feature or request untriaged labels Nov 9, 2022
@dblock
Copy link
Member Author

dblock commented Nov 9, 2022

@reta and @dlvenable - I'd like your thoughts on the problem space, the downsides, and potential solutions.

@dlvenable
Copy link
Member

The core problem I see is that this dependency couples Java clients to the server implementation. The Java version does not impact Python or JS clients for example. But, the update the JDK 11 in core forced an update on clients. OpenSearch core should be able to update JDK versions based on the needs of system administrators and developers, not based on the needs of clients. Similarly, client developers should be able to update their JDK versions based on their needs and the needs of more complicated projects.

The low-level REST client from OpenSearch core is mostly a wrapper around Apache HTTP. It adds some logic that is relevant for communicating with an OpenSearch cluster in general, such as load balancing, error/warning handling, node selection controls.

I see a few options:

  1. Remove the dependency from opensearch-java entirely.
  2. Invert the dependencies. That is, make OpenSearch depend on opensearch-java. I think this approach requires that opensearch-java continue to export the low-level REST client as its own library so that core does not need the high-level APIs.

Option 1)
Upsides:

  • Fixes the core issue.
  • Does not create a dependency between core and the client.

Some downsides:

  • We duplicate some of the same logic in both projects.
  • Client developers must start a migration path

Option 2)
Upsides:

  • Fixes the core issue.
  • This could produce the same library in Maven Central that client developers use. Thus, client developers can continue to use the same setup they have been using.
  • No code duplication.

Some downsides:

  • OpenSearch core will not depend on opensearch-java

@reta
Copy link
Collaborator

reta commented Nov 10, 2022

I agree with @dlvenable statement that

The low-level REST client from OpenSearch core is mostly a wrapper around Apache HTTP. It adds some logic that is relevant for communicating with an OpenSearch cluster in general, such as load balancing, error/warning handling, node selection controls.

It indeed does not depend on anything from the core. I would like to suggest a third option:
3) Move off RestClient to separate repo (+ test module as well), for the reasons it is not dependent nor on core nor on opensearch-java
Pros:

  • not attached to core or opensearch-java
  • changed very rarely
  • own release cycle
  • no lockstep update for core and opensearch-java (each can do that at own pace)

Cons:

  • may need core and opensearch-java releases in case of bugfixes

The option 1) is also on the table but I would certainly be against of option 2): the core should have no deps on opensearch-java at any time or circumstances (my opinion).

@dlvenable
Copy link
Member

I'm good with Option 3 as well.

To clarify, my option 2 was really the same from a module perspective. I was proposing that opensearch-java would produce two jars. They would be developed in the same Git repo and released together. From a module perspective core would not really be affected since it would have the same dependency chain.

@dlvenable
Copy link
Member

I have a few other thoughts on just removing the dependency.

Clarity - We can make a clearer story for client developers. Right now, there are three Java clients. In my opinion, this is a confusing aspect of OpenSearch for end-users, especially client developers. I also experienced something similar when I was using spring-data-elasticsearch and ES 7. I was unsure what client I should be using. If we just remove this dependency, then we have a single client to offer to client developers. This should reduce friction for getting started.

Client Configuration - Using the opensearch-java client has one main transport - RestClientTransport (there is a really nice AWS one, but that is only for some situations). By dropping the RestClient, opensearch-java could instead offer options to use either Apache HTTP Client or Netty. This can help reduce dependencies for clients. Say the client is already using Netty, they can remove an unnecessary Apache dependency and vice-versa.

@reta
Copy link
Collaborator

reta commented Nov 10, 2022

Thanks @dlvenable, I think there is an agreement to recommend opensearch-java as the preferred option [1]. The RestClient is used in many plugins internally (in core) and very likely outside as well (fe security), so I think it still makes sense to reuse it (but that is arguable for sure).

By dropping the RestClient, opensearch-java could instead offer options to use either Apache HTTP Client or Netty.

You can do that now with OpenSearchTransport, one of them happens to be RestClientTransport based on RestClient as you mentioned, so nothing is preventing to offer other options (if it makes sense).

[1] opensearch-project/OpenSearch#4256 (comment)

@dlvenable
Copy link
Member

I think there is an agreement to recommend opensearch-java as the preferred option

This is what I understand as well. I'm suggesting that we go one step further and not even offer these other options externally. These clients could potentially be viewed as internal components. Dropping this dependency helps with that.

You can do that now with OpenSearchTransport, one of them happens to be RestClientTransport based on RestClient as you mentioned, so nothing is preventing to offer other options (if it makes sense).

This is true as well. But we can tighten up the transport concept some by dropping the low-level client. As a client developer I don't know what I'm getting if I choose RestClientTransport versus NettyClientTransport. Specifically, what client does RestClientTransport provide? It would be clearer to provide either a NettyClientTransport or an ApacheClientTransport.

Also, what about the basic cluster management that RestClient provides? Do we drop that in a possible NettyClientTransport.

@reta
Copy link
Collaborator

reta commented Nov 10, 2022

I think we derailed a bit from the subject (but in a good sense), just to sum up the on the RestClientTransport, since it is relevant: the RestClient is at 95% wrapper around Apache HttpClient 5 (or 4 for 2.x) but there are quite a few things that needed to be implemented to make Apache HttpClient 5 work (there is no way around it). This part I would like to avoid duplicating and we will have to even if we rebrand RestClientTransport and ApacheClientTransport and drop RestClient dependency.

@dblock
Copy link
Member Author

dblock commented Nov 15, 2022

Why should the java client be any different from JS or Python client? It shouldn't. If code duplication is to be had, so what? Either way, the long term plan to avoid code duplication is opensearch-project/opensearch-clients#19 which clarifies the relationship between clients and server, all while avoiding duplication for all clients.

I would merge a PR that removes the dependency by duplicating whatever code needs to be duplicated and bumping the major version and creating an UPGRADING.md that explains what to do. It could also restore support for JDK 1.8. Hint ... hint :)

@reta
Copy link
Collaborator

reta commented Nov 16, 2022

Why should the java client be any different from JS or Python client? It shouldn't. If code duplication is to be had, so what?

I think the main difference is that the core uses java client internally (in plugins / modules / tests fe).

I would merge a PR that removes the dependency by duplicating whatever code needs to be duplicated and bumping the major version and creating an UPGRADING.md that explains what to do. It could also restore support for JDK 1.8. Hint ... hint :)

I will prototype that

@dlvenable
Copy link
Member

Why should the java client be any different from JS or Python client? It shouldn't.

Agreed

If code duplication is to be had, so what?

It would be nice to avoid it. But, I think clarity to the end-users is more valuable. So I think code duplication is a good trade-off here.

I would merge a PR that removes the dependency by duplicating whatever code needs to be duplicated and bumping the major version and creating an UPGRADING.md that explains what to do. It could also restore support for JDK 1.8. Hint ... hint :)

I think a migration path could be helpful there. That is opensearch-java 2.x adds the duplicated code as a new transport. But it retains the old dependency as an optional transport. Then in the major version 3.x there is no option to use the old dependency.

@wbeckler
Copy link

@dlvenable what is a scenario in which someone benefits from keeping the old dependency as an optional transport?

@dlvenable
Copy link
Member

@wbeckler , I only think it is useful for a smoother migration.

Minor release (say 2.2) - Adds the new transport and deprecates the old
Major release (say 3.0) - Removes the old transport.

A client developer who uses the library has an opportunity in 2.2 to make these changes before they are removed. They see the deprecation and they have some time to migrate. The alternative is that they want to update to 3.0 and suddenly they have to make the change then.

@wbeckler
Copy link

Why does removing the dependency affect a user of opensearch-java when we're duplicating the old transport? Is there a use of the client that depends on the source of the transport dependency? I'm asking because I'd like to find a way to remove the core dependency and avoid the major version bump.

@dblock
Copy link
Member Author

dblock commented Nov 30, 2022

+1 @wbeckler if there's no end-user change we can avoid the major version increment, it looks like we're not even changing the namespace

@reta
Copy link
Collaborator

reta commented Nov 30, 2022

Why does removing the dependency affect a user of opensearch-java when we're duplicating the old transport?

@dblock @wbeckler this is a tricky question: we could duplicate old transport as-is, which basically means we would have at least 2 RestClient implementations, but we technically do not need to do that (see fe #281) and probably should not do that.

Is there a use of the client that depends on the source of the transport dependency?

We could keep the possibility to use RestClient from core (and not breaking the existing users) by making the rest-client dependency optional in 3.0.0 and still supporting it but not requiring.

@dblock
Copy link
Member Author

dblock commented Nov 30, 2022

I would pick restoring JDK 8 support over not doing a major version increment. I say @reta you should decide what's best and let us contribute with our invaluable thinking (TM). I think I'd merge any ready version of #281, including with breaking changes (and a major version increment).

@dblock
Copy link
Member Author

dblock commented Dec 1, 2022

I also opened opensearch-project/OpenSearch#5424. Would love your comments @reta and @dlvenable!

@reta
Copy link
Collaborator

reta commented Dec 1, 2022

I also opened opensearch-project/OpenSearch#5424. Would love your comments @reta and @dlvenable!

It makes sense to me: if we drop dependency on core, the plugins may migrate to opensearch-java as well, as such RestClient could be dropped at some point. Regarding the suggested plan, it might be a bit aggressive (added a note opensearch-project/OpenSearch#5424 (comment))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance Make it fast!
Projects
None yet
Development

No branches or pull requests

4 participants