Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(kubernetes): manage log streaming at pod-level #303

Closed
wants to merge 34 commits into from

Conversation

cognifloyd
Copy link
Member

@cognifloyd cognifloyd commented Apr 9, 2022

Part of go-vela/community#519
Builds on #302 which should be merged first.

This is a bug fix and an enhancement. The goal is to improve log streaming with the kubernetes runtime
to prevent missing logs, logs that can be observed when tailing with something like stern.

Only these commits are part of this PR (everything after #302)

  • enhance(k8s): expose executor.max_log_size to kubernetes runtime
  • enhance(k8s): add k8s clientset to podTracker
  • enhance(k8s): add k8s namespace to containerTracker
  • refactor(k8s): start streaming logs before steps start
  • enhance: add LOGS TRUNCATED at end of logs if truncated

TODO:

Alternative fixes that didn't work out:
Closes #277
Closes #286

Replaces manual Pods Watch API call with a "container is terminated" signal
in a containerTracker. That signal is controlled by the PodTracker based
on Add/Update/Delete pod events from the PodInformer (which does its own
watch/list API calls internally).
the containerTracker has locks in it, so we don't
want to copy it. We want to get references to it.
@cognifloyd cognifloyd added bug Indicates a bug enhancement Indicates an improvement to a feature labels Apr 9, 2022
@cognifloyd cognifloyd self-assigned this Apr 9, 2022
@codecov
Copy link

codecov bot commented Apr 9, 2022

Codecov Report

Merging #303 (4103334) into master (c091a57) will decrease coverage by 0.41%.
The diff coverage is 72.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #303      +/-   ##
==========================================
- Coverage   79.77%   79.35%   -0.42%     
==========================================
  Files          67       68       +1     
  Lines        4958     5154     +196     
==========================================
+ Hits         3955     4090     +135     
- Misses        859      911      +52     
- Partials      144      153       +9     
Impacted Files Coverage Δ
executor/linux/step.go 64.73% <0.00%> (-0.38%) ⬇️
runtime/kubernetes/opts.go 88.75% <0.00%> (-11.25%) ⬇️
runtime/kubernetes/build.go 85.71% <62.50%> (-2.42%) ⬇️
runtime/kubernetes/container.go 73.68% <62.50%> (-6.25%) ⬇️
runtime/kubernetes/pod_tracker.go 84.78% <84.78%> (ø)
runtime/kubernetes/kubernetes.go 89.91% <86.36%> (-1.00%) ⬇️
runtime/setup.go 100.00% <100.00%> (ø)

Instead of figuring out how to mock a podInformer resync
(which the Informer was not designed to allow),
we just call the HandlePodUpdate method and trust that the
informer works with the API correctly (so no need to test that).
The k8s runtime needs to cache the logs before the executor requests them.
So, expose the maxLogSize so that the cached logs can be truncated near that size.
We need to make sure to capture all of the logs.
@cognifloyd
Copy link
Member Author

This approach did not pan out. Closing.

@cognifloyd cognifloyd closed this Apr 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates a bug enhancement Indicates an improvement to a feature
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

1 participant