Improve documentation around system feature snapshot and restore #79675

Closed
williamrandolph opened this issue Oct 22, 2021 · 7 comments
Labels
:Core/Infra/Core Core issues without another label >docs General docs changes Team:Core/Infra Meta label for core/infra team Team:Docs Meta label for docs team v7.16.0 v8.0.0-rc1

Comments

@williamrandolph
Contributor

Our documentation around snapshots, system indices, and system features is unclear. The technically correct information exists in the API documentation, but mostly under the information about the feature_states parameter. I don't believe there is a high-level summary about what a system feature is, what it means to take a snapshot of one, and how to include or exclude it from a snapshot restoration.

For example, here is what we currently have in the "Create a snapshot" page:

Besides creating a copy of each data stream and index, the snapshot process can also store global cluster metadata, which includes persistent cluster settings, templates, and data stored in system indices, such as Watches and task records, regardless of whether those system indices are named in the indices section of the request. You can also use the create snapshot API’s feature_states parameter to include only a subset of system indices in the snapshot. Snapshots do not store transient settings or registered snapshot repositories.

User stories to document

First, a few definitions from the developer's perspective:

  • In the codebase, a "system feature" is a component that defines one or more system indices, associated indices, or system data streams, alongside code for various management operations for those indices and data streams.
  • "Feature state" means the system indices, associated indices, and system data streams of a feature at a given time, for example, all of the backing data that Kibana stores in Elasticsearch.
  • A "system index" is an index that is meant to be hidden from users. In 7.x, this means that you get deprecation warnings when you access one. In 8.0, you need special permissions to access one directly. Eventually, we don't want there to be any direct access. (Due to a quirk of development, there is no direct access to the GeoIP system index in 7.x.)
  • A "system data stream" is a data stream that is hidden from users but used by a system component. Currently, only Fleet has a system data stream.
  • An "associated index" is an index that the feature uses, but that doesn't need the protections provided to system indices. Often this is because it contains information that we want end-users to be able to see and search. Such indices are also included in snapshots of "feature state."

We hope that "feature" and "feature state" are the main concepts with which users need to concern themselves.

Snapshot

  1. A user wants to make a snapshot that includes the system configuration and data for one or more system components, for example, Elasticsearch security or Kibana.
  2. The user calls the GET features API to see which system features are present in the system. (Under the hood, a "system feature" is usually defined in an x-pack plugin, but that is an implementation detail which already has one exception and is subject to further change.)
  3. The user decides to include a feature state in their snapshot request using the feature_states parameter.
  4. Once the snapshot is taken, the user can see the included system indices using the GET snapshot API.
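
The snapshot steps above can be sketched as request construction in Python. This is a minimal illustration, not the project's client code: the repository name `my_repository`, the snapshot name `my_snapshot`, the helper names, and the sample response are all hypothetical, and the response shape for the get features API (a `features` list of objects with a `name` key) is an assumption. No request is actually sent.

```python
import json


def feature_names(get_features_response):
    """Pull feature names out of a GET /_features style response (shape assumed)."""
    return [feature["name"] for feature in get_features_response.get("features", [])]


def build_create_snapshot_request(repository, snapshot, feature_states,
                                  include_global_state=False):
    """Return (method, path, body) for a create snapshot request.

    Only the feature states listed in `feature_states` are named in the body.
    """
    body = {
        "include_global_state": include_global_state,
        "feature_states": list(feature_states),
    }
    return "PUT", f"/_snapshot/{repository}/{snapshot}", body


# Hypothetical response; a real cluster reports its own feature list.
sample_features = {
    "features": [{"name": "kibana"}, {"name": "security"}, {"name": "tasks"}]
}

available = feature_names(sample_features)
wanted = [name for name in available if name in ("kibana", "security")]
method, path, body = build_create_snapshot_request("my_repository", "my_snapshot", wanted)

print(method, path)
print(json.dumps(body, indent=2))
```

Building the body separately from sending it keeps the choice of feature states easy to review before anything touches the cluster.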

When the snapshot has include_global_state set to true, all feature states (meaning, all system indices, associated indices, and system data streams) are included in the snapshot. Snapshots with global state have already proved tricky to restore into new clusters, often because a system index in the global state clashes with something already named in the cluster.

Restore

  1. A user wants to reset a feature to a previous state, or restore a feature's settings into a new cluster.
  2. The user calls the GET snapshot API to see which features are included in the snapshot of interest.
  3. The user issues a restore request with the names of those features in the feature_states request parameter.
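
The restore steps above might look like the following sketch. The helper names, the repository and snapshot names, and the sample response are hypothetical; the assumed shape of the get snapshot response (a `snapshots` list whose entries carry a `feature_states` list with `feature_name` and `indices` keys) should be checked against the API reference for your version. Nothing is sent to a cluster here.

```python
def features_in_snapshot(get_snapshot_response, snapshot_name):
    """List the feature names recorded for one snapshot (response shape assumed)."""
    for snap in get_snapshot_response.get("snapshots", []):
        if snap.get("snapshot") == snapshot_name:
            return [state["feature_name"] for state in snap.get("feature_states", [])]
    return []


def build_restore_request(repository, snapshot, feature_states, indices=None):
    """Return (method, path, body) for a restore request naming feature states.

    `indices` is left out of the body unless given, so the API default applies.
    """
    body = {"feature_states": list(feature_states)}
    if indices is not None:
        body["indices"] = indices
    return "POST", f"/_snapshot/{repository}/{snapshot}/_restore", body


# Hypothetical get snapshot response for illustration only.
sample_snapshot_info = {
    "snapshots": [
        {
            "snapshot": "my_snapshot",
            "feature_states": [
                {"feature_name": "kibana", "indices": [".kibana_1"]},
                {"feature_name": "security", "indices": [".security-7"]},
            ],
        }
    ]
}

present = features_in_snapshot(sample_snapshot_info, "my_snapshot")
restore_method, restore_path, restore_body = build_restore_request(
    "my_repository", "my_snapshot", present
)
print(restore_method, restore_path, restore_body)
```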

Where users are running into trouble at the moment is when system indices they don't really care about clash with an index that is already in the cluster. Many users want to know how to exclude a particular system index from the restore operation. Our desired solution is that the user could exclude the feature that owns the system index. Unfortunately, we don't have an excluded_feature_states request parameter; the only way to exclude a feature state right now is to put all the feature states except that one in the feature_states parameter.
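
Since there is no excluded_feature_states parameter, the workaround described above amounts to computing "every feature state except this one" on the client side. A minimal sketch, with a hypothetical helper name and hypothetical feature names:

```python
def feature_states_excluding(all_features, excluded):
    """Return every feature name except the excluded ones, preserving order."""
    excluded = set(excluded)
    return [name for name in all_features if name not in excluded]


# Hypothetical feature names taken from a snapshot of interest.
snapshot_features = ["geoip", "kibana", "security"]
feature_states = feature_states_excluding(snapshot_features, ["geoip"])
print(feature_states)
```

This only builds the value for the feature_states parameter. As discussed later in this thread, whether the restore then avoids a clash also depends on what the indices parameter matches.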

Tricky cases

  • In 7.x, snapshotting and restoring all indices includes system indices for backwards compatibility. Users might not expect this. The documented behavior is as follows:
      ◦ Request snapshots or restores all indices, with no feature_states or include_global_state parameter: all feature states included
      ◦ Request snapshots or restores one specific index, with include_global_state set to true: all feature states included
      ◦ Request snapshots or restores all indices, with include_global_state set to false: no feature states included
      ◦ Request snapshots or restores all indices, with feature_states: []: no feature states included
      ◦ Request snapshots or restores all indices, with feature_states: ["none"]: no feature states included
  • I honestly can't remember what happens if a snapshot has a feature state that the cluster we are restoring to lacks. I don't think this can happen in practice, but it could in the future if users have custom plugins that define system indices.
  • For normal indices, a user must explicitly close the index before it can be restored from a snapshot. System indices are different; they are automatically closed and overwritten during the restore operation. In hindsight, I can see that this might have unintended consequences.
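
The five documented 7.x cases in the first bullet above can be written down as a small lookup, which may be easier to scan than prose. This is only a restatement of that list, not Elasticsearch's actual resolution logic; the function name is hypothetical, and any combination the list does not mention returns `None` rather than a guess.

```python
def documented_feature_states(all_indices, include_global_state=None, feature_states=None):
    """Map a request shape to "all", "none", or None (not covered by the list).

    all_indices: True if the request targets all indices, False for one specific index.
    include_global_state / feature_states: None means the parameter was omitted.
    """
    if feature_states is not None:
        # Listed cases: all indices with feature_states [] or ["none"].
        if all_indices and include_global_state is None and feature_states in ([], ["none"]):
            return "none"
        return None
    if all_indices and include_global_state is None:
        return "all"
    if not all_indices and include_global_state is True:
        return "all"
    if all_indices and include_global_state is False:
        return "none"
    return None


print(documented_feature_states(True))
print(documented_feature_states(True, feature_states=["none"]))
```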

cc: @gwbrown, @lockewritesdocs, @debadair

@williamrandolph williamrandolph added >docs General docs changes :Core/Infra/Core Core issues without another label v8.0.0 v7.16.0 v7.16.1 labels Oct 22, 2021
@elasticmachine elasticmachine added Team:Core/Infra Meta label for core/infra team Team:Docs Meta label for docs team labels Oct 22, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@jrodewig
Contributor

jrodewig commented Oct 22, 2021

Hi @williamrandolph,

I'm currently working on an overhaul of the snapshot/restore documentation with #79081. That work doesn't touch on feature states in depth but does mention system indices.

I also recently updated the restore a snapshot tutorial with #76929. Again, that work doesn't go into feature states in depth but does cover some methods for excluding system indices.

I think these updates will work best as a follow-on to those efforts, but let me know what you think. I see that you assigned @lockewritesdocs. If you've already discussed these changes with him, let me know.

@jrodewig
Contributor

jrodewig commented Oct 22, 2021

> Where users are running into trouble at the moment is when system indices they don't really care about clash with an index that is already in the cluster. Many users want to know how to exclude a particular system index from the restore operation.

I've opened #79683 to specifically address this user pain point. I tentatively plan to incorporate the other changes into #79081.

> Our desired solution is that the user could exclude the feature that owns the system index. Unfortunately, we don't have an excluded_feature_states request parameter; the only way to exclude a feature state right now is to put all the feature states except that one in the feature_states parameter.

I may be missing something, but I don't think this works. Even if you exclude the feature state, the request will still return an error if the indices parameter is permissive enough to try to restore an existing system index. Unfortunately, it's that permissive by default.

The only solutions I found were:

  1. Explicitly excluding dot indices from indices using the -.* wildcard pattern.
  2. Restoring the system index via the feature_states parameter. However, this only works if the snapshot contains the feature state for the system index. It may also not actually be desired by the user.

Since option 1 works in all cases, I documented that in #79683. However, I could be wrong. Let me know!
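
Option 1 could be sketched as a restore request body like the one built below. The helper name and the repository and snapshot names are hypothetical, and combining `*` with the `-.*` exclusion in a single comma-separated `indices` value is an assumption about how the pattern would be passed; check it against the restore snapshot API reference before relying on it.

```python
def build_restore_excluding_dot_indices(repository, snapshot, feature_states=None):
    """Return (method, path, body) for a restore that skips dot-prefixed indices.

    The "indices" value asks for everything, then subtracts names starting with ".".
    """
    body = {"indices": "*,-.*"}
    if feature_states is not None:
        # Option 2 from the list above: restore chosen system indices via their feature.
        body["feature_states"] = list(feature_states)
    return "POST", f"/_snapshot/{repository}/{snapshot}/_restore", body


dot_method, dot_path, dot_body = build_restore_excluding_dot_indices(
    "my_repository", "my_snapshot"
)
print(dot_method, dot_path, dot_body)
```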

@williamrandolph
Contributor Author

Thanks, James!

When it comes to the terminology around "system indices" and "feature states," our thinking is that eventually we want "system indices" to be an implementation detail, and "system features" to be the user-facing catch-all term for system components that use Elasticsearch to store configuration data, metadata, or any other "state."

I think you are right about the unfortunate permissiveness of the indices parameter. @gwbrown is working on changing the behavior for 8.0 to something that makes more sense to everyone, and he might be able to add some useful context here.

If you and @lockewritesdocs think that this issue is already covered by what you are doing in your overhaul, it is okay with me if you close this issue.

@jrodewig
Contributor

jrodewig commented Oct 25, 2021

Thanks @williamrandolph. I'll leave this open for now.

Here's how I've spread out the related docs work:

Both of those reflect how things work as of this writing. We'll need to do another update if we merge #79670 or make other changes that affect the snapshot/restore defaults.

jrodewig added a commit that referenced this issue Oct 26, 2021
When restoring a snapshot to a new cluster, users may expect the cluster
to not contain any conflicting indices or data streams. However, some
features, such as the GeoIP processor, automatically create indices at
startup.

This adds and updates related procedures in the restore a snapshot tutorial.
I plan to improve other documentation related to feature states in snapshots
in a separate PR(s).

This PR also updates the restore snapshot API's example to include
the `indices` and `feature_states` parameters.

Relates to #79675
elasticsearchmachine pushed a commit that referenced this issue Oct 26, 2021
@jrodewig jrodewig removed their assignment Nov 15, 2021
@jrodewig
Contributor

jrodewig commented Nov 15, 2021

With #79683 and #79081 merged, I'm going to close out this issue.

We'll need to update the docs when #79670 or a similar change merges, but I think we can handle that with our usual docs update process for new development. Feel free to re-open, etc. if that makes it easier for tracking.

jrodewig added a commit to elastic/kibana that referenced this issue Nov 16, 2021
With elastic/elasticsearch#79081, we now cover Kibana's **Snapshot and Restore** feature in the Elasticsearch docs. This removes and redirects the related docs from Kibana.

It also updates some references to the `.kibana` system indices to include the `kibana` feature state. The `kibana` feature state is now the preferred way to back up and restore system indices and other configuration data for Kibana.

Relates to elastic/elasticsearch#79675