Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Endpoint] Policy creation callback fixes + Improved error handling in user manifest loop #71269

Merged
merged 10 commits into from
Jul 10, 2020

Conversation

madirey
Copy link
Contributor

@madirey madirey commented Jul 9, 2020

Summary

  • Fixes manifest dispatch on policy creation
  • Refactors ManifestManager code to handle errors and promises properly.
  • Improves types for matcher functions.
  • Ensures that callback returns new policy even in the event of errors
  • Ensures that callback returns manifest even if new data cannot be retrieved

image

Checklist

  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios

For maintainers

@madirey madirey added v8.0.0 release_note:skip Skip the PR/issue when compiling release notes Team:Endpoint Response Endpoint Response Team Feature:Endpoint Elastic Endpoint feature v7.9.0 labels Jul 9, 2020
@madirey madirey requested review from a team as code owners July 9, 2020 15:52
@elasticmachine
Copy link
Contributor

Pinging @elastic/endpoint-app-team (Feature:Endpoint)

@elasticmachine
Copy link
Contributor

Pinging @elastic/endpoint-response (Team:Endpoint Response)

try {
if (snapshot && snapshot.diffs.length > 0) {
// create new artifacts
const errors = await manifestManager.syncArtifacts(snapshot, 'add');
Copy link
Contributor

@nnamdifrankie nnamdifrankie Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does syncArtifacts relates getPackageConfigCreateCallback. That appears to update something, perhaps a side effect?

Sorry I see the method is handlePackageConfigCreate. There is a few condition blocks that may need to be tested to make sure all branches functions as expected. Is it possible to break this up so it may be more testable and probably maintainable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nnamdifrankie If new exception list entries were created, we may need to create new artifacts before sending the updated manifest. That's what syncArtifacts does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XavierM and I have been talking about how to improve the way this callback happens... I think we will need some changes to make this more robust. Happy to discuss it @nnamdifrankie

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a test around what branches are supposed to exist together will suffice for now. Not sure what is the relationship between the snapshot and new package config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nnamdifrankie Just pushed some of the tests, working on more.

Copy link
Contributor Author

@madirey madirey Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nnamdifrankie @paul-tavares I've added several unit tests and have run through some manual tests. I think this fix will ensure that policy creation is not interrupted by any of the manifest processing code.

@madirey madirey requested a review from a team July 9, 2020 20:41
@botelastic botelastic bot added the Team:Fleet Team label for Observability Data Collection Fleet team label Jul 9, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/ingest-management (Team:Ingest Management)

@madirey madirey changed the title [Security Solution][Endpoint] Improved error handling in user manifest loop [Security Solution][Endpoint] Ingest callback fixes + Improved error handling in user manifest loop Jul 9, 2020
@madirey
Copy link
Contributor Author

madirey commented Jul 10, 2020

@elasticmachine merge upstream

@madirey madirey changed the title [Security Solution][Endpoint] Ingest callback fixes + Improved error handling in user manifest loop [Security Solution][Endpoint] Policy creation callback fixes + Improved error handling in user manifest loop Jul 10, 2020
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Contributor

@paul-tavares paul-tavares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍
All of my comments are minor in nature and optional. Also, I assume you (during Dev.) added a few throw's in the code for testing purposes and that he callback always returned successfully 😁

My only concern overall is that we seem to be doing lots of sync call (await) and that can extend overall time it takes for a Package config to be created in Ingest (as it is, it already take a little bit of time due to the work it does with Packages).

FYI:

One of the things I thought about implementing in the Ingest code around these callbacks for Create Package Config was a way to "timeout" the returned Promise after x amount of time. so that no one callback can hold up the entire operation. (cc/ @jfsiii , @jen-huang , @nchaulet )

version: '0.9.0',
},
inputs: [],
} as NewPackageConfig;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: add the return type to the function definition instead of casting. Same below

return updatedPackageConfig;
// Until we get the Default Policy Configuration in the Endpoint package,
// we will add it here manually at creation time.
if (newPackageConfig.inputs.length === 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally - I would remove this if() and always insert our input into the Package Config.

History:

The if() is only here because at one point we thought that we would get data included in the Package Config at creation time by defining it in the Endpoint Package, thus why we were doing a conditional check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paul-tavares Will inputs ever have length > 0 here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@madirey - For endpoint Package Configs, no. At the moment we only expect 1 input at most that includes all of our data.

let snapshot: ManifestSnapshot | null = null;
let success = true;
try {
// Try to get most up-to-date manifest data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is what's returned here different from what you retrieved above in manifestManager.getLastDispatchedManifest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paul-tavares Possibly. Perhaps could look at renaming this function. The above builds a Manifest from the SO record of the last manifest dispatched. This function builds a manifest dynamically from the latest exception lists... there could be new items, so we try to get the most up-to-date data possible to avoid having to immediately re-dispatch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@madirey thanks for explaining that. I'm not too familiar with the design around Lists, thus the questions 😃 .

I think there is a process in place that periodically (on a timer) goes through and picks up new versions of lists and re-generates the Manifest (which pushes updates to the Package configs), right? If thats the case, I would suggest that we avoid the additional delays here in explicitly triggering that process, so that the overall Package Config creation is not incur a longer delay.


try {
return updatedPackageConfig;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may be just be a personal choice 😄
I find this return here as well as the one inside of catch() confusing because one (a future "us" ) add another return to finally {} and that would be the value that is returned out of the function.

I would propose (for clarify) that the return be consolidated to one under the finally {} block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paul-tavares It was an attempt to be sure that we return the data before doing other things... but we still don't have a good way to verify that the policy was actually created. We probably need to work together to figure out a better way to do this. Can we address in a follow-up to this PR? There are some important fixes here that will help us in the meantime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed @madirey
My point really was that the finally {} block could actually change what the function returns (ex. it could overwrite the return from either the catch() or the regular try {}.

Also - I'm tempted to suggest that if we refactor this, that we split up the concerns into 2 callbacks: one for handling only adding the policy data and a second to handle only adding the manifest information, an we register both callbacks with Ingest. We can work on this post 7.9

Copy link
Contributor Author

@madirey madirey Jul 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paul-tavares As written, that should not happen, but I agree with the concerns. It's ugly.

I can remove the section that pulls the snapshot and tries to get the most up-to-date information. That comes with a few caveats though:

  1. There's no guarantee that we will have already created the manifest when the callback runs, so we may have to send an empty artifact list... the policy returned will not be deterministic on the first policy creation (artifacts may or may not exist), so that will have unit testing implications as well,
  2. If there have been new exception items added within the last minute, we won't pick it up.

In both of the cases above, we'll have to re-send the policy a minute later when the updates are detected. This is probably fine for now, but it may not be ideal when we have large deployments (could result in 2 back-to-back large rollouts if I'm not mistaken).

@@ -6,6 +6,8 @@

import { ILegacyScopedClusterClient, SavedObjectsClientContract } from 'kibana/server';
import { loggingSystemMock, savedObjectsServiceMock } from 'src/core/server/mocks';
// eslint-disable-next-line @kbn/eslint/no-restricted-paths
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to ask the Kibana Platform team to include this in src/core/server/mocks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paul-tavares I see you approved, thank you! But let's follow up and figure out a better way to do this. +++

madirey added a commit that referenced this pull request Jul 11, 2020
…ed error handling in user manifest loop (#71269) (#71376)

* Clean up matcher types

* Rework promise and error-handling in ManifestManager

* Write tests for ingest callback and ensure policy is returned when errors occur

* More tests for ingest callback

* Update tests

* Fix tests

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 13, 2020
* master: (78 commits)
  Bump lodash package version (elastic#71392)
  refactor: 💡 use allow-list in AppArch codebase (elastic#71400)
  improve bugfix 7198 test stability (elastic#71250)
  [Security Solution][Ingest Manager][Endpoint] Optional ingest manager (elastic#71198)
  [Metrics UI] Round metric threshold time buckets to nearest unit (elastic#71172)
  [Security Solution][Endpoint] Policy creation callback fixes + Improved error handling in user manifest loop (elastic#71269)
  [Security Solution] Allow to configure Event Renderers settings (elastic#69693)
  Fix a11y keyboard overlay (elastic#71214)
  [APM] UI text updates (elastic#71333)
  [Logs UI] Limit `extendDatemath` to valid ranges (elastic#71113)
  [SIEM] fix tooltip of notes (elastic#71342)
  address index templates feedback (elastic#71353)
  Upgrade EUI to v26.3.1 (elastic#70243)
  [build] Creates Linux aarch64 archive (elastic#69165)
  [SIEM][Detection Engine] Fixes skipped tests (elastic#71347)
  [SIEM][Detection Engine][Lists] Adds read_privileges route for lists and list items
  [kbn/optimizer] implement "requiredBundles" property of KP plugins (elastic#70911)
  [Security Solution][Exceptions] - Exceptions modal pt 2 (elastic#70886)
  [ML] DF Analytics:  stop status polling when job stopped (elastic#71159)
  [SIEM][CASE] IBM Resilient Connector (elastic#66385)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Endpoint Elastic Endpoint feature release_note:skip Skip the PR/issue when compiling release notes Team:Endpoint Response Endpoint Response Team Team:Fleet Team label for Observability Data Collection Fleet team v7.9.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants