Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Saved object migrations to allow pre-migration state collection #34996

Open
mikecote opened this issue Apr 12, 2019 · 12 comments
Open

Saved object migrations to allow pre-migration state collection #34996

mikecote opened this issue Apr 12, 2019 · 12 comments
Labels
Feature:Saved Objects Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@mikecote
Copy link
Contributor

mikecote commented Apr 12, 2019

Problem:

SavedObject migrations are synchronous functions making it impossible to write migrations that require looking up some state from the old .kibana index being migrated.

Describe a specific use case for the feature:

I want to convert graph saved objects to use index pattern ids instead of names but realized there isn't a method to lookup existing index patterns and the document migration can only be executed synchronously.

Once completed, this will unblock #34989. cc @flash1293

Proposed solution (updated after discussion below):

Allow migrations to provide an asynchronous pre-migration state collection function. The result of this function will be passed as an argument to each of the synchronous migration functions.

@mikecote mikecote added Team:Operations Team label for Operations Team Feature:Saved Objects labels Apr 12, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-operations

@epixa
Copy link
Contributor

epixa commented Apr 12, 2019

We should be really careful in enabling something like this. The problem with asynchronous migrations is that Kibana is completely unavailable while migrations run, so if even one plugin performs a slow asynchronous migration that doesn't scale, it could cause significant downtime for Kibana. This could also make it pretty trivial to clobber the underlying ES cluster by slamming it with queries.

If we support something like this, which I'm not convinced we should even bother, then we might want to introduce some sort of pre-flight async migration hook that would execute once per plugin per version and then result would be passed to all of the synchronous migration functions, and then something like graph could query for its workspaces in bulk and then return a map of names to ids that its own migrations could then rely on.

@tylersmalley
Copy link
Contributor

I absolutely agree with an async setup or preflight request to capture the data to be used by the migrations instead of providing async migrations.

@mikecote
Copy link
Contributor Author

mikecote commented Apr 12, 2019

That is a good idea, it will solve the scenario and be more efficient at running. 👍

@rudolf
Copy link
Contributor

rudolf commented Jul 4, 2019

Updated the description based on the discussion.

@flash1293
Copy link
Contributor

Discussed the idea with @rudolf and there are a few things to consider.

  • If the async setup function has access to the saved object client, we possibly enter a circular dependency here, because the saved object client is doing migrations on the fly which might require another async setup function and so on - this case would have to be handled somehow.
  • If we avoid that and only expose a way to send a raw ES query in the async setup function, we have to deal with the possibility to get old un-migrated results back.
  • We are also doing migrations if an object is imported which would require us the re-run the async setup function there which might get costly in some scenarios - maybe some caching of the results can help
  • In the scenario of migrations-on-import it's probably necessary for the async setup to also have access to the other objects of the same batch (e.g. if some properties are moved between dashboard and visualization saved objects and the visualization migration has to look at the old dashboard object)

@wylieconlon
Copy link
Contributor

Because we don't currently have any migrations for the config object, can it safely be loaded before migrations are run? This could unblock our work on #53972

@rudolf
Copy link
Contributor

rudolf commented Jan 31, 2020

Discussed this with @pgayvallet and @wylieconlon and we came up with three potential solutions to #53972:

  1. Centralize all index-pattern loading to the data plugin and transform any index-pattern when it's loaded. Having all plugins consume index-patterns through the data plugin has the benefit that in the future we could make small index-pattern tweaks without requiring a migration.
  2. Add the ability to register in-memory-only migrations which have access to CoreStart API's and can collect state asynchronously. These migrations will only run when an object is loaded so it lowers the performance impact but not stalling all of kibana during load time. Changes won't be persisted to ES unless the plugin loading the Saved Object decides to save it explicitly.
  3. Add the async collector as suggested in this issue. Performance problems and exceptions in the async collector could delay or prevent all of kibana from starting up.

The Platform team favours (1) since it will improve the overall architecture of Kibana and has the lowest risk. This also buys us time to improve the saved object migration framework to be more resilient #52202 which we're targeting for 8.0.0

@tylersmalley tylersmalley added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc and removed Team:Operations Team label for Operations Team labels Mar 26, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-platform (Team:Platform)

@rudolf rudolf changed the title Saved object migrations to allow asynchronous operations and lookups Saved object migrations to allow pre-migration state collection May 25, 2020
@rudolf
Copy link
Contributor

rudolf commented Nov 5, 2020

I'm closing this as I'm not aware of any plugins requiring this functionality. Let me know if we should re-open it.

@flash1293 I believe we've decided that the best way to implement the graph migration is to do that in the graph plugin itself during the start lifecycle (that is outside of saved object migrations).

@wylieconlon I'm not sure if we still want to implement #53972 but as far as I'm aware there's nothing preventing us from adopting the proposed solution to "Centralize all index-pattern loading to the data plugin and transform any index-pattern when it's loaded."

@flash1293
Copy link
Contributor

flash1293 commented Sep 20, 2021

I' bringing this back up because the discussion was closed the last time with these words:

I'm closing this as I'm not aware of any plugins requiring this functionality. Let me know if we should re-open it.

Since then two other cases of the same problem emerged:

I'm not sure whether this is enough traction to justify further work, but in all of these cases imperfect workarounds had to be used - it seems unlikely this won't ever come up again plus there's now 3 places where legacy-handling code directly in the app outside of SO migration will have to be kept around indefinitely (or removed in a breaking change).

cc @lukeelmers @stratoula

@flash1293 flash1293 reopened this Sep 20, 2021
@lukeelmers
Copy link
Member

@pgayvallet and I were talking about it this morning, and we tend to agree that this need will keep coming up. We will put this on our list of items to consider as possible 8.x projects, but it will require further discussion on our end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Saved Objects Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Projects
None yet
Development

No branches or pull requests

8 participants