Remove current delete-by-query implementation #10288

mikemccand · 2015-03-26T22:42:48Z

This is master only, and will close #10067.

I removed DBQ everywhere except Engine/Translog/IndexShard so that on upgrade a DBQ in the translog will still apply (the back-compat tests now check this as of #10266).

Once we do #7052 for 2.0 we'll add back delete-by-query, but implemented as a sugar API to run a scan query and scroll through all hits, doing bulk delete for them.

s1monw · 2015-03-27T10:10:26Z

man the only concern I have is, if we add this back later we have to re-add all the API here. I really hate to say it but it might make sense to implement the entire thing as a pure client side simple method for now that really just uses scan/scroll in the java client that way we can get rid of all the internal code, keep the tests and have the API still?

mikemccand · 2015-03-27T18:08:06Z

implement the entire thing as a pure client side simple method for now

I agree this should really be "client side", somehow.

The problem is, every client would need to implement it (I am partial to the Python client so I would naturally do that one first 😏 )? Maybe that's OK?

Or can we do it as server side sugar, so that whichever node receives this request, would (somehow) open a client (or maybe rest/http) connection and issue the request? Does every ES node already have a [Java] client node instance it can use to talk to other nodes? (I am way out of my comfort zone here....).

Maybe there is an example where we already do something similar to this? I could use that for inspiration/starting point ... e.g. when we cast attachments to plain text using Tika (the mapper-attachment plugin) do we do this once per document and then index that result on primary + replicas as if we were a client...?

s1monw · 2015-03-27T20:37:55Z

we could implement this inside the Java client API as an implementation detail of AbstractClient.java and make those methods final? That way we can keep all the api and only trash the transport part and the engine internals. We might even be able to port this to 1.6 which would be great too.

TwP · 2015-03-29T05:35:05Z

Is it not possible to keep the current ES API and implement delete-by-query in the server itself as an internal scan/scroll + bulk delete? That seems like the best solution here as it would avoid implementing this logic in all language clients.

kimchy · 2015-03-29T07:36:42Z

I am leaning towards @TwP idea, I think its common enough API that ES should implement internally. I am using similar logic here into why, for example, reindex is better implemented in ES itself, even though the client libs implement it currently.

kimchy · 2015-03-29T07:39:58Z

@s1monw yea, if we do scan search + bulk delete we can trash the engine part for sure. If we keep the transport action part, we can broadcast and do the scan search collocated with the primary. If we remove it, then we will need to scan search across, which might not be the end of the world.

s1monw · 2015-03-30T08:31:25Z

@mikemccand yeah let's do a simple scroll/scan in the transport action so we can do it per shard?

mikemccand · 2015-03-30T08:36:08Z

we can broadcast and do the scan search collocated with the primary

I agree this would be more efficient. It would also mean the DBQ runs concurrently on each shard I think...

So this means the primary receives the DBQ, runs the scan/scroll search (locally), then issues (replicated) bulk delete-by-id ops within its shard.

I wonder if I can somehow do this in the existing TransportShardDeleteByQueryAction class... seems like it's not quite flexible enough (assumes the op is done on primary then done on replicas). Maybe we need specialized code in TransportShardReplicationOperationAction for DBQ? Or am I looking in entirely the wrong place :)

Separately, I noticed we currently fail to trigger a refresh if version map RAM is too much due to only deletions ... I'll open a separate PR.

mikemccand · 2015-03-30T08:50:39Z

I'm closing this PR as "won't fix": let's have further discussions about the approach on #7052.

However I will say I'm doing so under protest: I think if we have suitably dangerous features in ES, we should be free to simply remove them outright, even if we don't have an alternative implementation immediately in mind / at hand. Coupling removal with "you must also build a better replacement" puts up a big barrier to correcting past mistakes...

I will work towards the new implementation under #7052.

s1monw · 2015-03-30T08:51:47Z

@mikemccand I am more than with you. IMO you cam push it as it is and we add the relevant stuff back later!

mikemccand added 2 commits March 26, 2015 18:09

removed

15b8eac

merge master

782faf3

mikemccand added >breaking v2.0.0-beta1 labels Mar 26, 2015

mikemccand self-assigned this Mar 26, 2015

mikemccand mentioned this pull request Mar 26, 2015

Core: remove delete-by-query #10067

Closed

mikemccand closed this Mar 30, 2015

mikemccand mentioned this pull request Apr 6, 2015

Revert delete-by-query deprecations #10443

Closed

clintongormley removed the v2.0.0-beta1 label May 29, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove current delete-by-query implementation #10288

Remove current delete-by-query implementation #10288

mikemccand commented Mar 26, 2015

s1monw commented Mar 27, 2015

mikemccand commented Mar 27, 2015

s1monw commented Mar 27, 2015

TwP commented Mar 29, 2015

kimchy commented Mar 29, 2015

kimchy commented Mar 29, 2015

s1monw commented Mar 30, 2015

mikemccand commented Mar 30, 2015

mikemccand commented Mar 30, 2015

s1monw commented Mar 30, 2015

Remove current delete-by-query implementation #10288

Remove current delete-by-query implementation #10288

Conversation

mikemccand commented Mar 26, 2015

s1monw commented Mar 27, 2015

mikemccand commented Mar 27, 2015

s1monw commented Mar 27, 2015

TwP commented Mar 29, 2015

kimchy commented Mar 29, 2015

kimchy commented Mar 29, 2015

s1monw commented Mar 30, 2015

mikemccand commented Mar 30, 2015

mikemccand commented Mar 30, 2015

s1monw commented Mar 30, 2015