Create action to migrate the contents of one index to a new index #20024

nik9000 · 2016-08-17T13:00:02Z

The standard way to change an index's mapping is to create a new index with the
new mapping, _reindex the documents into the new index, flip the alias from
the old index to the new index, and then remove the old index. Traditionally
this sort of thing has been left as an exercise for those implementing an
application against Elasticsearch but I think now is the time to implement this
in Elasticsearch because:

- Kibana, Watcher and Security need to run this process as part of upgrading to 5.0.
- Elasticsearch 5.0 now has the .tasks index for storing the results of
  long running tasks. While we were fairly careful in designing its mappings,
  I'm under no illusion that we got it right the first try. That just isn't the
  way software works. We're going to want to run this on .tasks one day.
- Logstash is considering storing configuration in an Elasticsearch index and
  handling upgrades to the format of the data is a concern for Logstash's
  engineers.

In all of these cases the indexes are implementation details of their
application so we'd like to automatically upgrade them on startup rather than
provide upgrade scripts. That means that the application will want to migrate
its data every time it starts up so a user only has to get involved if the data
migration fails.

Three of the five applications that will need to do this migration live inside
Elasticsearch (Watcher and Security are plugins, .tasks is in core
Elasticsearch). So it looks like the right place to implement this is in core
Elasticsearch. The other advantage of implementing it there is that it can be
used by the widest range of users.

This PR intends to build an action into core Elasticsearch that:

- Responds quickly with 200 OK when the index is in the desired state
  already.
- Waits on concurrent invocations of the same request. This is especially
  important in "masterless" systems like Logstash so they can invoke this API on
  startup and not have to worry about one node "winning". They all get the same
  response.
- Notices if previous executions of this request didn't complete properly and
  responds with that information rather than some cryptic failure message.
- Performs the create index, migrate documents, flip alias, delete source
  index steps.
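
The "waits on concurrent invocations" behavior could be sketched roughly as below. This is not the PR's code, only a minimal single-process illustration in Python; `MigrationCoalescer` and `run_migration` are hypothetical names, and comparing requests with plain equality is an assumption made for the sketch.

```python
import threading
from concurrent.futures import Future


class MigrationCoalescer:
    """Runs at most one migration per source index; later callers share its result."""

    def __init__(self, run_migration):
        # Hypothetical callable (source, request) -> response that does the real work.
        self._run_migration = run_migration
        self._lock = threading.Lock()
        self._in_flight = {}  # source index -> (request, Future)

    def migrate(self, source, request):
        with self._lock:
            entry = self._in_flight.get(source)
            if entry is None:
                # First caller for this source index becomes the leader.
                future = Future()
                self._in_flight[source] = (request, future)
                leader = True
            else:
                running_request, future = entry
                # Assumption: requests that differ must not be coalesced.
                if running_request != request:
                    raise ValueError(
                        f"a different migration of [{source}] is already running")
                leader = False
        if leader:
            try:
                future.set_result(self._run_migration(source, request))
            except Exception as e:
                future.set_exception(e)
            finally:
                with self._lock:
                    del self._in_flight[source]
        # Leader and followers all return the same response or raise the same failure.
        return future.result()
```

Every caller that arrives while a migration is in flight blocks on the same `Future`, so no caller has to care which one "won".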

The action is exposed through an HTTP request that looks like:

```
POST /index_1/_migrate/index_2
{
  "settings": {...},
  "mapping": {...},
  "aliases": ["index"],
  "script": {
    "lang": "painless",
    "inline": "ctx._source.thing = 2"
  }
}
```

In this example index_1 is the source index and index_2 is the destination
index. Unlike a normal create index command the aliases section is required.
This is how _migrate knows that the process is complete and it is a good
practice anyway. The alias is added to the destination index after all the docs
in the source index are migrated to the destination index and the destination
index has been _refreshed so they are visible.
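
To make the ordering of the steps concrete, here is a toy sketch against an in-memory stand-in for a cluster. `ToyCluster`, `migrate`, and the `"noop"`/`"migrated"` return values are all hypothetical and only illustrate the sequence described above; in particular, the "already done" check (destination exists and every alias points at it) is an assumption about how completion might be detected.

```python
class ToyCluster:
    """In-memory stand-in: index name -> {doc id -> doc}, alias -> index name."""

    def __init__(self):
        self.indices = {}
        self.aliases = {}


def migrate(cluster, source, dest, aliases, transform=lambda doc: doc):
    if not aliases:
        # Unlike a normal create index request, aliases are mandatory here.
        raise ValueError("aliases is required")
    # Assumed completion check: the aliases already point at the destination.
    if dest in cluster.indices and all(cluster.aliases.get(a) == dest for a in aliases):
        return "noop"
    if source not in cluster.indices:
        raise LookupError(f"no such index [{source}]")
    # 1. Create the destination index.
    cluster.indices[dest] = {}
    # 2. Copy every document, applying the (optional) script to each one.
    for doc_id, doc in cluster.indices[source].items():
        cluster.indices[dest][doc_id] = transform(dict(doc))
    # A real cluster would refresh the destination here so the docs are visible.
    # 3. Flip the aliases only after all documents have been copied.
    for alias in aliases:
        cluster.aliases[alias] = dest
    # 4. Remove the source index.
    del cluster.indices[source]
    return "migrated"
```

Calling it twice with the same arguments performs the work once and then reports `"noop"`, which mirrors the "responds quickly when the index is already in the desired state" goal.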

Like _reindex, _delete_by_query, and _update_by_query, these requests
are "big" in that they do many things and we expect them to take a long time if
they operate on a large number of documents. This can't be helped so we want to
make sure that this request integrates well with the task management API. That
means that it should be "cancellable": true and its status should be super
expressive, returning the phase of the operation currently being performed and,
if that phase is reindex, the details of the reindex's
status.
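
As a rough illustration of what an "expressive" status might look like, the sketch below models a status that always reports the phase and nests the reindex details only while reindexing. The class names, the phase names, and the choice of fields are assumptions for illustration, not the PR's actual status format.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ReindexStatus:
    # Illustrative subset of the counters a reindex reports.
    total: int
    created: int
    updated: int = 0


@dataclass
class MigrateStatus:
    # Hypothetical phase names: "create_index", "reindex", "flip_alias", "delete_source".
    phase: str
    reindex: Optional[ReindexStatus] = None

    def to_dict(self):
        body = {"phase": self.phase}
        # Reindex details are only meaningful during the reindex phase.
        if self.phase == "reindex" and self.reindex is not None:
            body["reindex"] = asdict(self.reindex)
        return body
```

For example, `MigrateStatus("reindex", ReindexStatus(total=1000, created=250)).to_dict()` yields the phase plus the nested counters, while any other phase yields only `{"phase": ...}`.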

We try to limit the number of "big" operations in core Elasticsearch because
every one of them feels like a new trap we are setting for unsuspecting users.
We will need to warn users that this can take some time and put some load on
the cluster. For the applications listed at the top of this description we don't
expect this to be a problem though. A Security index with a million documents
is huge but not a ton of work for reindex. We just have to make very
sure that it is obvious to users that doing this against an index with a
hundred million documents is going to take a long time.

nik9000 · 2016-08-17T13:02:19Z

This is currently a very rough WIP. I'd mostly like to get feedback on the general direction before I go too deep down a rabbit hole.

nik9000 · 2016-09-13T12:45:08Z

Sorry for leaving this open for so long. A few of us talked verbally and, while this operation would be useful for some folks, it really wouldn't be useful for upgrading indexes on startup. The reasoning is that upgrading an index requires that the cluster be stable for the duration of the upgrade and cluster startup is the time when the cluster is at its most unstable.

nik9000 added the discuss, WIP, :Data Management/Indices APIs, and v5.0.0-beta1 labels Aug 17, 2016

nik9000 changed the title ~~Index migrate~~ Create action to migrate the contents of one index to a new index Aug 17, 2016

nik9000 added 15 commits August 17, 2016 12:09

Coalescing

03f79d6

Add preflight

641b78b

Handle filters

8953bd8

Properly handle filters.....

b5031f0

Basics

5d5eb32

Basic REST

d55cd14

Start moving around migrate's guts so it is simpler

fd8634f

Remove much mutable state

3e5cecc

More tests, more sane (hopefully)

d327e23

Fix MigrateIT

332f99d

Add concurrent update tests

a662b97

Fix concurrent migrate tests

9e6f301

I was using a CountDownLatch like a CyclicBarrier....

Add round trip tests for request and response

970a183

Add validation

3a46c3d

Fail to coalesce requests that differ

065e3ce

Throws an exception on current requests to the same index that differ in some way.

nik9000 force-pushed the index_migrate branch from b318d8a to 065e3ce Compare August 17, 2016 16:10

nik9000 added 3 commits August 17, 2016 15:24

Fix coalescing checks

b76d1f8

`#equals` isn't quite right, so we make something better. And this time we test it.

Fix transient weird failures

2f18c61

You can't reuse requests in different threads or they'll be modified by different threads without any proper synchronization. And we check that the request isn't modified in unexpected ways.

Start setting up status

35827ca

This was referenced Aug 18, 2016

do not merge me nik9000/elasticsearch#2

Closed

Add alias action that deletes an index #20064

Closed

nik9000 closed this Sep 13, 2016

clintongormley removed the v5.0.0-beta1 label Sep 14, 2016
