[ML] Job in index: Get datafeed and job stats from index #34591

Closed
wants to merge 1 commit into from

davidkyle
Member

@davidkyle davidkyle commented Oct 18, 2018

For both actions, the datafeed and job ID expressions must be expanded from the index documents.

Jindex feature branch

@davidkyle davidkyle added >feature :ml Machine learning labels Oct 18, 2018
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@davidkyle davidkyle changed the title [ML] Get datafeed and job stats from index [ML] Job in index: Get datafeed and job stats from index Oct 18, 2018
Set<String> excludeJobIds = stats.stream().map(GetJobsStatsAction.Response.JobStats::getJobId).collect(Collectors.toSet());
return requestedJobIds.stream().filter(jobId -> !excludeJobIds.contains(jobId) &&
!mlMetadata.isJobDeleted(jobId)).collect(Collectors.toList());
Member Author

This is the tricky bit: it is no longer easy to find out whether a job is being deleted without making an async call for each job. Before this change, if a job was being deleted then the calls to gatherForecastStats and gatherDataCountsAndModelSizeStats would not be made. It is still safe to make those calls if the documents have been deleted: the response accepts null for forecast stats and model size stats, and gathering data counts returns a default-constructed object if the document is not found. Some jobs take a long time to delete, and the job config document is the last thing to be removed, so forecast stats may be in the process of deletion, meaning that during a long-running delete subsequent calls may return different stats. It is reasonable to argue that the API is doing what it says it is doing in this case. Compare with GET jobs, which returns all jobs, including those being deleted.
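
To illustrate why the gather calls stay safe when documents are missing, here is a minimal sketch of the two fallbacks described above. It is not the code from this PR: the DataCounts class, the map-backed "index", and the dataCountsOrDefault and modelBytesOrNull helpers are all hypothetical stand-ins, and the real calls are asynchronous.

```java
import java.util.HashMap;
import java.util.Map;

public class MissingStatsSketch {

    // Hypothetical stand-in for a data counts document.
    static final class DataCounts {
        final String jobId;
        final long processedRecordCount;

        // Default-constructed counts: everything is zero.
        DataCounts(String jobId) {
            this(jobId, 0L);
        }

        DataCounts(String jobId, long processedRecordCount) {
            this.jobId = jobId;
            this.processedRecordCount = processedRecordCount;
        }
    }

    // Returns the stored counts, or a default object if the document is gone.
    static DataCounts dataCountsOrDefault(Map<String, DataCounts> docs, String jobId) {
        DataCounts found = docs.get(jobId);
        return found != null ? found : new DataCounts(jobId);
    }

    // Returns the stored model size in bytes, or null if the document is gone;
    // the caller is expected to accept null.
    static Long modelBytesOrNull(Map<String, Long> docs, String jobId) {
        return docs.get(jobId);
    }

    public static void main(String[] args) {
        Map<String, DataCounts> counts = new HashMap<>();
        counts.put("job-a", new DataCounts("job-a", 42L));
        Map<String, Long> modelSizes = new HashMap<>();

        System.out.println(dataCountsOrDefault(counts, "job-a").processedRecordCount);
        System.out.println(dataCountsOrDefault(counts, "job-b").processedRecordCount);
        System.out.println(modelBytesOrNull(modelSizes, "job-b"));
    }
}
```

Under these assumptions, a job whose documents were already removed by a long-running delete still yields a response, just with zeroed counts and null stats.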

One helpful change would be to add an excludeDeleting parameter to jobConfigProvider.expandJobsIds when it is called in doExecute, which would make the behaviour closer to the current behaviour, albeit with a race.
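
The excludeDeleting idea could look roughly like the sketch below. This is an illustration under assumptions, not the actual JobConfigProvider code: the JobConfig class, its deleting flag, and the in-memory list are hypothetical, and the real expandJobsIds is asynchronous and resolves ID expressions against the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExpandJobIdsSketch {

    // Hypothetical stand-in for a job config document stored in the index.
    static final class JobConfig {
        final String jobId;
        final boolean deleting;

        JobConfig(String jobId, boolean deleting) {
            this.jobId = jobId;
            this.deleting = deleting;
        }
    }

    // Returns the IDs of the given configs, optionally skipping jobs flagged as deleting.
    static List<String> expandJobIds(List<JobConfig> configs, boolean excludeDeleting) {
        List<String> ids = new ArrayList<>();
        for (JobConfig config : configs) {
            if (excludeDeleting && config.deleting) {
                continue;
            }
            ids.add(config.jobId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<JobConfig> configs = Arrays.asList(
                new JobConfig("job-a", false),
                new JobConfig("job-b", true));
        System.out.println(expandJobIds(configs, true));
        System.out.println(expandJobIds(configs, false));
    }
}
```

The race mentioned above remains in any such filter: a job can be marked as deleting after the IDs are expanded but before its stats are gathered.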

@davidkyle davidkyle mentioned this pull request Oct 18, 2018
43 tasks
@davidkyle
Member Author

Closing due to git shenanigans. #34645 has been raised instead.

@davidkyle davidkyle closed this Oct 19, 2018
Labels
>feature :ml Machine learning
2 participants