diff --git a/docs/reference/index-modules/translog.asciidoc b/docs/reference/index-modules/translog.asciidoc
index 31d529b6c4436..b1eb36e346d9f 100644
--- a/docs/reference/index-modules/translog.asciidoc
+++ b/docs/reference/index-modules/translog.asciidoc
@@ -1,41 +1,44 @@
 [[index-modules-translog]]
 == Translog
 
-Changes to Lucene are only persisted to disk during a Lucene commit,
-which is a relatively heavy operation and so cannot be performed after every
-index or delete operation. Changes that happen after one commit and before another
-will be lost in the event of process exit or HW failure.
-
-To prevent this data loss, each shard has a _transaction log_ or write ahead
-log associated with it. Any index or delete operation is written to the
-translog after being processed by the internal Lucene index.
-
-In the event of a crash, recent transactions can be replayed from the
-transaction log when the shard recovers.
+Changes to Lucene are only persisted to disk during a Lucene commit, which is a
+relatively expensive operation and so cannot be performed after every index or
+delete operation. Changes that happen after one commit and before another will
+be removed from the index by Lucene in the event of process exit or hardware
+failure.
+
+Because Lucene commits are too expensive to perform on every individual change,
+each shard copy also has a _transaction log_ known as its _translog_ associated
+with it. All index and delete operations are written to the translog after
+being processed by the internal Lucene index but before they are acknowledged.
+In the event of a crash, recent transactions that have been acknowledged but
+not yet included in the last Lucene commit can instead be recovered from the
+translog when the shard recovers.
 
 An Elasticsearch flush is the process of performing a Lucene commit and
-starting a new translog. It is done automatically in the background in order
-to make sure the transaction log doesn't grow too large, which would make
+starting a new translog. Flushes are performed automatically in the background
+in order to make sure the translog doesn't grow too large, which would make
 replaying its operations take a considerable amount of time during recovery.
-It is also exposed through an API, though its rarely needed to be performed
-manually.
+The ability to perform a flush manually is also exposed through an API,
+although this is rarely needed.
 
 [float]
 === Translog settings
 
-The data in the transaction log is only persisted to disk when the translog is
+The data in the translog is only persisted to disk when the translog is
 ++fsync++ed and committed. In the event of hardware failure, any data written
 since the previous translog commit will be lost.
 
-By default, Elasticsearch ++fsync++s and commits the translog every 5 seconds if `index.translog.durability` is set
-to `async` or if set to `request` (default) at the end of every <>, <>,
-<>, or <> request. In fact, Elasticsearch
-will only report success of an index, delete, update, or bulk request to the
-client after the transaction log has been successfully ++fsync++ed and committed
-on the primary and on every allocated replica.
+By default, Elasticsearch ++fsync++s and commits the translog every 5 seconds
+if `index.translog.durability` is set to `async` or if set to `request`
+(default) at the end of every <>, <>,
+<>, or <> request. More precisely, if set
+to `request`, Elasticsearch will only report success of an index, delete,
+update, or bulk request to the client after the translog has been successfully
+++fsync++ed and committed on the primary and on every allocated replica.
 
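As a sketch of how the durability trade-off described above is chosen in practice (the index name `my-index` is an assumption for illustration, not part of this change), the dynamic `index.translog.durability` setting could be relaxed from the default `request` to `async` via the update index settings API. Note that with `async`, operations acknowledged since the last periodic ++fsync++ may be lost on failure:

```
PUT /my-index/_settings
{
  "index.translog.durability": "async"
}
```

Setting the value back to `request` restores the default behaviour of ++fsync++ing and committing before acknowledging each request.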
-The following <> per-index settings
-control the behaviour of the transaction log:
+The following <> per-index
+settings control the behaviour of the translog:
 
 `index.translog.sync_interval`::
 
@@ -64,17 +67,20 @@ update, or bulk request. This setting accepts the following parameters:
 
 `index.translog.flush_threshold_size`::
 
-The translog stores all operations that are not yet safely persisted in Lucene (i.e., are
-not part of a lucene commit point). Although these operations are available for reads, they will
-need to be reindexed if the shard was to shutdown and has to be recovered. This settings controls
-the maximum total size of these operations, to prevent recoveries from taking too long. Once the
-maximum size has been reached a flush will happen, generating a new Lucene commit. Defaults to `512mb`.
+The translog stores all operations that are not yet safely persisted in Lucene
+(i.e., are not part of a Lucene commit point). Although these operations are
+available for reads, they will need to be reindexed if the shard is shut down
+and has to be recovered. This setting controls the maximum total size of these
+operations, to prevent recoveries from taking too long. Once the maximum size
+has been reached a flush will happen, generating a new Lucene commit point.
+Defaults to `512mb`.
 
 `index.translog.retention.size`::
 
-The total size of translog files to keep. Keeping more translog files increases the chance of performing
-an operation based sync when recovering replicas. If the translog files are not sufficient, replica recovery
-will fall back to a file based sync. Defaults to `512mb`
+The total size of translog files to keep. Keeping more translog files increases
+the chance of performing an operation-based sync when recovering replicas. If
+the translog files are not sufficient, replica recovery will fall back to a
+file-based sync. Defaults to `512mb`.
 
 `index.translog.retention.age`::
 
@@ -86,10 +92,14 @@ The maximum duration for which translog files will be kept. Defaults to `12h`.
 
 [float]
 [[corrupt-translog-truncation]]
 === What to do if the translog becomes corrupted?
 
-In some cases (a bad drive, user error) the translog can become corrupted. When
-this corruption is detected by Elasticsearch due to mismatching checksums,
-Elasticsearch will fail the shard and refuse to allocate that copy of the data
-to the node, recovering from a replica if available.
+In some cases (a bad drive, user error) the translog on a shard copy can become
+corrupted. When this corruption is detected by Elasticsearch due to mismatching
+checksums, Elasticsearch will fail that shard copy and refuse to use that copy
+of the data. If there are other copies of the shard available then
+Elasticsearch will automatically recover from one of them using the normal
+shard allocation and recovery mechanism. In particular, if the corrupt shard
+copy was the primary when the corruption was detected then one of its replicas
+will be promoted in its place.
 
 If there is no copy of the data from which Elasticsearch can recover
 successfully, a user may want to recover the data that is part of the shard at