Skip to content

Commit

Permalink
[7.x] Resilient saved object migration algorithm (#78413) (#86284)
Browse files Browse the repository at this point in the history
* Resilient saved object migration algorithm (#78413)

* Initial structure of migration state-action machine

* Fix type import

* Retries with exponential back off

* Use discriminated union for state type

* Either type for actions

* Test exponential retries

* TaskEither types for actions

* Fetch indices instead of aliases so we can collect all index state in one request

* Log document id if transform fails

* WIP: Legacy pre-migrations

* UPDATE_TARGET_MAPPINGS

* WIP OUTDATED_DOCUMENTS_TRANSFORM

* Narrow res types depending on control state

* OUTDATED_DOCUMENTS_TRANSFORM

* Use .kibana instead of .kibana_current

* rename control states TARGET_DOCUMENTS* -> OUTDATED_DOCUMENTS*

* WIP MARK_VERSION_INDEX_READY

* Fix and expand INIT -> * transition tests

* Add alias/index name helper functions

* Add feature flag for enabling v2 migrations

* split state_action_machine, reindex legacy indices

* Don't use a scroll search for migrating outdated documents

* model: test control state progressions

* Action integration tests

* Fix existing tests and type errors

* snapshot_in_progress_exception can only happen when closing/deleting an index

* Retry steps up to 10 times

* Update api.md documentation files

* Further actions integration tests

* Action unit tests

* Fix actions integration tests

* Rename actions to be more domain-specific

* Apply suggestions from code review

Co-authored-by: Josh Dover <me@joshdover.com>

* Review feedback: polish and flesh out inline comments

* Fix unhandled rejections in actions unit tests

* model: only delay retryable_es_client_error, reset for other left responses

* Actions unit tests

* More inline comments

* Actions: Group index settings under 'index' key

* bulkIndex -> bulkOverwriteTransformedDocuments to be more domain specific

* state_action_machine tests, fix and add additional tests

* Action integration tests: updateAndPickupMappings, searchForOutdatedDocuments

* oops: uncomment commented out code

* actions integration tests: rejection for createIndex

* update state properties: clearer names, mark all as readonly

* add state properties currentAlias, versionAlias, legacyIndex and test for invalid version scheme in index names

* Use CONSTANTS for constants :D

* Actions: Clarify behaviour and impact of acknowledged: false responses

* Use consistent vocabulary for action responses

* KibanaMigrator test for migrationsV2

* KibanaMigrator test for FATAL state and action exceptions in v2 migrations

* Fix ts error in test

* Refactor: split index file up into a file per model, next, types

* next: use partial application so we don't generate a nextActionMap on every call

* move logic from index.ts to migrations_state_action_machine.ts and test

* add test

* use `Root` to allow specifying oss mode

* Add fix and todo tests for reindexing with preMigrationScript

* Dump execution log of state transitions and responses if we hit FATAL

* add 7.3 xpack tests

* add 100k test data

* Reindex instead of cloning for migrations

* Skip 100k x-pack integration test

* MARK_VERSION_INDEX_READY_CONFLICT for dealing with different versions migrating in parallel

* Track elapsed time

* Fix tests

* Model: make exhaustiveness checks more explicit

* actions integration tests: add additional tests from CR

* migrations_state_action_machine fix flaky test

* Fix flaky integration test

* Reserve FATAL termination only for situations which we never can recover from such as later version already migrated the index

* Handle incompatible_mapping_exception caused by another instance

* Cleanup logging

* Fix/stabilize integration tests

* Add REINDEX_SOURCE_TO_TARGET_VERIFY step

* Strip tests archives of */.DS_Store and __MAC_OSX

* Task manager migrations: remove invalid kibana property when converting legacy indices

* Add disabled mappings for removed field in map saved object type

* verifyReindex action: use count API

* REINDEX_BLOCK_* to prevent lost deletes (needs tests)

* Split out 100k docs integration test so that it has it's own kibana process

* REINDEX_BLOCK_* action tests

* REINDEX_BLOCK_* model tests

* Include original error message when migration_state_machine throws

* Address some CR nits

* Fix TS errors

* Fix bugs

* Reindex then clone to prevent lost deletes

* Fix tests

Co-authored-by: Josh Dover <me@joshdover.com>
Co-authored-by: pgayvallet <pierre.gayvallet@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
# Conflicts:
#	rfcs/text/0013_saved_object_migrations.md

* createRootWithCorePlugins support for cliArgs which wasn't backported to 7.x

* Attempt to stabilize cloneIndex integration tests (#86123)

* Attempt to stabilize cloneIndex integration tests

* Unskip test

* return resolves/rejects and add assertions counts to each test

* Await don't return expect promises

* Await don't return expect promises for other tests too

* Remove 8.0.0 integration test
  • Loading branch information
rudolf authored Dec 18, 2020
1 parent d20bb3d commit 6a8726b
Show file tree
Hide file tree
Showing 44 changed files with 6,367 additions and 124 deletions.

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -20,5 +20,4 @@ export interface SavedObjectsRawDoc
| [\_primary\_term](./kibana-plugin-core-server.savedobjectsrawdoc._primary_term.md) | <code>number</code> | |
| [\_seq\_no](./kibana-plugin-core-server.savedobjectsrawdoc._seq_no.md) | <code>number</code> | |
| [\_source](./kibana-plugin-core-server.savedobjectsrawdoc._source.md) | <code>SavedObjectsRawDocSource</code> | |
| [\_type](./kibana-plugin-core-server.savedobjectsrawdoc._type.md) | <code>string</code> | |

2 changes: 1 addition & 1 deletion src/core/public/public.api.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
import { Action } from 'history';
import { ApiResponse } from '@elastic/elasticsearch/lib/Transport';
import Boom from '@hapi/boom';
import { ConfigDeprecationProvider } from '@kbn/config';
import { ConfigPath } from '@kbn/config';
import { EnvironmentMode } from '@kbn/config';
import { EuiBreadcrumb } from '@elastic/eui';
Expand All @@ -18,7 +19,6 @@ import { History } from 'history';
import { Href } from 'history';
import { IconType } from '@elastic/eui';
import { KibanaClient } from '@elastic/elasticsearch/api/kibana';
import { KibanaConfigType } from 'src/core/server/kibana_config';
import { Location } from 'history';
import { LocationDescriptorObject } from 'history';
import { Logger } from '@kbn/logging';
Expand Down
10 changes: 6 additions & 4 deletions src/core/server/elasticsearch/client/mocks.ts
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,9 @@ import type { DeeplyMockedKeys } from '@kbn/utility-types/jest';
import { ElasticsearchClient } from './types';
import { ICustomClusterClient } from './cluster_client';

const createInternalClientMock = (): DeeplyMockedKeys<Client> => {
const createInternalClientMock = (
res?: MockedTransportRequestPromise<unknown>
): DeeplyMockedKeys<Client> => {
// we mimic 'reflection' on a concrete instance of the client to generate the mocked functions.
const client = new Client({
node: 'http://localhost',
Expand Down Expand Up @@ -59,7 +61,7 @@ const createInternalClientMock = (): DeeplyMockedKeys<Client> => {
.filter(([key]) => !omitted.includes(key))
.forEach(([key, descriptor]) => {
if (typeof descriptor.value === 'function') {
obj[key] = jest.fn(() => createSuccessTransportRequestPromise({}));
obj[key] = jest.fn(() => res ?? createSuccessTransportRequestPromise({}));
} else if (typeof obj[key] === 'object' && obj[key] != null) {
mockify(obj[key], omitted);
}
Expand Down Expand Up @@ -95,8 +97,8 @@ const createInternalClientMock = (): DeeplyMockedKeys<Client> => {

export type ElasticsearchClientMock = DeeplyMockedKeys<ElasticsearchClient>;

const createClientMock = (): ElasticsearchClientMock =>
(createInternalClientMock() as unknown) as ElasticsearchClientMock;
const createClientMock = (res?: MockedTransportRequestPromise<unknown>): ElasticsearchClientMock =>
(createInternalClientMock(res) as unknown) as ElasticsearchClientMock;

export interface ScopedClusterClientMock {
asInternalUser: ElasticsearchClientMock;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -312,7 +312,7 @@ function wrapWithTry(
const failedTransform = `${type}:${version}`;
const failedDoc = JSON.stringify(doc);
log.warn(
`Failed to transform document ${doc}. Transform: ${failedTransform}\nDoc: ${failedDoc}`
`Failed to transform document ${doc?.id}. Transform: ${failedTransform}\nDoc: ${failedDoc}`
);
throw error;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@
* serves as a central blueprint for what migrations will end up doing.
*/

import { Logger } from 'src/core/server/logging';
import { Logger } from '../../../logging';
import { MigrationEsClient } from './migration_es_client';
import { SavedObjectsSerializer } from '../../serialization';
import {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,9 +16,7 @@
* specific language governing permissions and limitations
* under the License.
*/
import type { PublicMethodsOf } from '@kbn/utility-types';

import { KibanaMigrator, KibanaMigratorStatus } from './kibana_migrator';
import { IKibanaMigrator, KibanaMigratorStatus } from './kibana_migrator';
import { buildActiveMappings } from '../core';
const { mergeTypes } = jest.requireActual('./kibana_migrator');
import { SavedObjectsType } from '../../types';
Expand All @@ -45,7 +43,16 @@ const createMigrator = (
types: SavedObjectsType[];
} = { types: defaultSavedObjectTypes }
) => {
const mockMigrator: jest.Mocked<PublicMethodsOf<KibanaMigrator>> = {
const mockMigrator: jest.Mocked<IKibanaMigrator> = {
kibanaVersion: '8.0.0-testing',
savedObjectsConfig: {
batchSize: 100,
scrollDuration: '15m',
pollInterval: 1500,
skip: false,
// TODO migrationsV2: remove/deprecate once we release migrations v2
enableV2: false,
},
runMigrations: jest.fn(),
getActiveMappings: jest.fn(),
migrateDocument: jest.fn(),
Expand Down
247 changes: 216 additions & 31 deletions src/core/server/saved_objects/migrations/kibana/kibana_migrator.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import { KibanaMigratorOptions, KibanaMigrator } from './kibana_migrator';
import { loggingSystemMock } from '../../../logging/logging_system.mock';
import { SavedObjectTypeRegistry } from '../../saved_objects_type_registry';
import { SavedObjectsType } from '../../types';
import { errors as esErrors } from '@elastic/elasticsearch';

const createRegistry = (types: Array<Partial<SavedObjectsType>>) => {
const registry = new SavedObjectTypeRegistry();
Expand Down Expand Up @@ -89,38 +90,188 @@ describe('KibanaMigrator', () => {
expect(options.client.cat.templates).toHaveBeenCalledTimes(1);
});

it('emits results on getMigratorResult$()', async () => {
const options = mockOptions();
describe('when enableV2 = false', () => {
it('when enableV2 = false creates an IndexMigrator which retries NoLivingConnectionsError errors from ES client', async () => {
const options = mockOptions();

options.client.cat.templates.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise(
{ templates: [] },
{ statusCode: 404 }
)
);
options.client.indices.get.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({}, { statusCode: 404 })
);
options.client.indices.getAlias.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({}, { statusCode: 404 })
);
options.client.cat.templates.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise(
{ templates: [] },
{ statusCode: 404 }
)
);
options.client.indices.get.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({}, { statusCode: 404 })
);
options.client.indices.getAlias.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({}, { statusCode: 404 })
);

const migrator = new KibanaMigrator(options);
const migratorStatus = migrator.getStatus$().pipe(take(3)).toPromise();
await migrator.runMigrations();
const { status, result } = await migratorStatus;
expect(status).toEqual('completed');
expect(result![0]).toMatchObject({
destIndex: '.my-index_1',
elapsedMs: expect.any(Number),
sourceIndex: '.my-index',
status: 'migrated',
options.client.indices.create = jest
.fn()
.mockReturnValueOnce(
elasticsearchClientMock.createErrorTransportRequestPromise(
new esErrors.NoLivingConnectionsError('reason', {} as any)
)
)
.mockImplementationOnce(() =>
elasticsearchClientMock.createSuccessTransportRequestPromise('success')
);

const migrator = new KibanaMigrator(options);
const migratorStatus = migrator.getStatus$().pipe(take(3)).toPromise();
await migrator.runMigrations();

expect(options.client.indices.create).toHaveBeenCalledTimes(3);
const { status } = await migratorStatus;
return expect(status).toEqual('completed');
});

it('emits results on getMigratorResult$()', async () => {
const options = mockOptions();

options.client.cat.templates.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise(
{ templates: [] },
{ statusCode: 404 }
)
);
options.client.indices.get.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({}, { statusCode: 404 })
);
options.client.indices.getAlias.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({}, { statusCode: 404 })
);

const migrator = new KibanaMigrator(options);
const migratorStatus = migrator.getStatus$().pipe(take(3)).toPromise();
await migrator.runMigrations();
const { status, result } = await migratorStatus;
expect(status).toEqual('completed');
expect(result![0]).toMatchObject({
destIndex: '.my-index_1',
elapsedMs: expect.any(Number),
sourceIndex: '.my-index',
status: 'migrated',
});
expect(result![1]).toMatchObject({
destIndex: 'other-index_1',
elapsedMs: expect.any(Number),
sourceIndex: 'other-index',
status: 'migrated',
});
});
});
describe('when enableV2 = true', () => {
beforeEach(() => {
jest.clearAllMocks();
});
expect(result![1]).toMatchObject({
destIndex: 'other-index_1',
elapsedMs: expect.any(Number),
sourceIndex: 'other-index',
status: 'migrated',

it('creates a V2 migrator that initializes a new index and migrates an existing index', async () => {
const options = mockV2MigrationOptions();
const migrator = new KibanaMigrator(options);
const migratorStatus = migrator.getStatus$().pipe(take(3)).toPromise();
await migrator.runMigrations();

// Basic assertions that we're creating and reindexing the expected indices
expect(options.client.indices.create).toHaveBeenCalledTimes(3);
expect(options.client.indices.create.mock.calls).toEqual(
expect.arrayContaining([
// LEGACY_CREATE_REINDEX_TARGET
expect.arrayContaining([expect.objectContaining({ index: '.my-index_pre8.2.3_001' })]),
// CREATE_REINDEX_TEMP
expect.arrayContaining([
expect.objectContaining({ index: '.my-index_8.2.3_reindex_temp' }),
]),
// CREATE_NEW_TARGET
expect.arrayContaining([expect.objectContaining({ index: 'other-index_8.2.3_001' })]),
])
);
// LEGACY_REINDEX
expect(options.client.reindex.mock.calls[0][0]).toEqual(
expect.objectContaining({
body: expect.objectContaining({
source: expect.objectContaining({ index: '.my-index' }),
dest: expect.objectContaining({ index: '.my-index_pre8.2.3_001' }),
}),
})
);
// REINDEX_SOURCE_TO_TEMP
expect(options.client.reindex.mock.calls[1][0]).toEqual(
expect.objectContaining({
body: expect.objectContaining({
source: expect.objectContaining({ index: '.my-index_pre8.2.3_001' }),
dest: expect.objectContaining({ index: '.my-index_8.2.3_reindex_temp' }),
}),
})
);
const { status } = await migratorStatus;
return expect(status).toEqual('completed');
});
it('emits results on getMigratorResult$()', async () => {
const options = mockV2MigrationOptions();
const migrator = new KibanaMigrator(options);
const migratorStatus = migrator.getStatus$().pipe(take(3)).toPromise();
await migrator.runMigrations();

const { status, result } = await migratorStatus;
expect(status).toEqual('completed');
expect(result![0]).toMatchObject({
destIndex: '.my-index_8.2.3_001',
sourceIndex: '.my-index_pre8.2.3_001',
elapsedMs: expect.any(Number),
status: 'migrated',
});
expect(result![1]).toMatchObject({
destIndex: 'other-index_8.2.3_001',
elapsedMs: expect.any(Number),
status: 'patched',
});
});
it('rejects when the migration state machine terminates in a FATAL state', () => {
const options = mockV2MigrationOptions();
options.client.indices.get.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise(
{
'.my-index_8.2.4_001': {
aliases: {
'.my-index': {},
'.my-index_8.2.4': {},
},
mappings: { properties: {}, _meta: { migrationMappingPropertyHashes: {} } },
settings: {},
},
},
{ statusCode: 200 }
)
);

const migrator = new KibanaMigrator(options);
return expect(migrator.runMigrations()).rejects.toMatchInlineSnapshot(
`[Error: Unable to complete saved object migrations for the [.my-index] index: The .my-index alias is pointing to a newer version of Kibana: v8.2.4]`
);
});
it('rejects when an unexpected exception occurs in an action', async () => {
const options = mockV2MigrationOptions();
options.client.tasks.get.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({
completed: true,
error: { type: 'elatsicsearch_exception', reason: 'task failed with an error' },
failures: [],
task: { description: 'task description' },
})
);

const migrator = new KibanaMigrator(options);

await expect(migrator.runMigrations()).rejects.toMatchInlineSnapshot(`
[Error: Unable to complete saved object migrations for the [.my-index] index. Please check the health of your Elasticsearch cluster and try again. Error: Reindex failed with the following error:
{"_tag":"Some","value":{"type":"elatsicsearch_exception","reason":"task failed with an error"}}]
`);
expect(loggingSystemMock.collect(options.logger).error[0][0]).toMatchInlineSnapshot(`
[Error: Reindex failed with the following error:
{"_tag":"Some","value":{"type":"elatsicsearch_exception","reason":"task failed with an error"}}]
`);
});
});
});
Expand All @@ -130,7 +281,40 @@ type MockedOptions = KibanaMigratorOptions & {
client: ReturnType<typeof elasticsearchClientMock.createElasticsearchClient>;
};

const mockOptions = () => {
const mockV2MigrationOptions = () => {
const options = mockOptions({ enableV2: true });

options.client.indices.get.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise(
{
'.my-index': {
aliases: { '.kibana': {} },
mappings: { properties: {} },
settings: {},
},
},
{ statusCode: 200 }
)
);
options.client.indices.addBlock.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({ acknowledged: true })
);
options.client.reindex.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({ taskId: 'reindex_task_id' })
);
options.client.tasks.get.mockReturnValue(
elasticsearchClientMock.createSuccessTransportRequestPromise({
completed: true,
error: undefined,
failures: [],
task: { description: 'task description' },
})
);

return options;
};

const mockOptions = ({ enableV2 }: { enableV2: boolean } = { enableV2: false }) => {
const options: MockedOptions = {
logger: loggingSystemMock.create().get(),
kibanaVersion: '8.2.3',
Expand All @@ -144,7 +328,7 @@ const mockOptions = () => {
name: { type: 'keyword' },
},
},
migrations: {},
migrations: { '8.2.3': jest.fn().mockImplementation((doc) => doc) },
},
{
name: 'testtype2',
Expand All @@ -168,6 +352,7 @@ const mockOptions = () => {
pollInterval: 20000,
scrollDuration: '10m',
skip: false,
enableV2,
},
client: elasticsearchClientMock.createElasticsearchClient(),
};
Expand Down
Loading

0 comments on commit 6a8726b

Please sign in to comment.