Add id and status labels to pipeline and job metrics #455

Open
wants to merge 3 commits into
base: main

ErezArbell

See details in issue 453.
This can be helpful to better filter the queries and also to present more than the last pipeline/job in the dashboard.

@maciej-gol

Why is joining on gitlab_ci_pipeline_id not sufficient? You can look up how it works in the example dashboards.

@ErezArbell
Author

> Why is joining on gitlab_ci_pipeline_id not sufficient? You can look up how it works in the example dashboards.

@maciej-gol
In the example dashboards you can see only the latest pipeline/job; you cannot see historic data.
Here is an example of something I would like to have: show all runs of a specific job name during the last week, so you can see when it started to fail.
Such things cannot be done without the extra labels this PR adds.

If you have a way to create a dashboard with historical data using the current implementation, I would like to hear it. We need such a dashboard, and I did not find any way to get a list of historic pipelines/jobs with an option to filter.

@maciej-gol

maciej-gol commented May 28, 2022 via email

@ErezArbell
Author

Thank you @maciej-gol for the insightful comments.

> I might be mistaken, but your MR only tackles the labels, not the crawling.

You are correct. However, this MR does improve data collection. The exporter always publishes only the latest job (for example) for a given set of label values. So in the current implementation, if a new pipeline starts on the same ref before the old one ends, only the job from the new pipeline is published. This MR adds the pipeline_id and job_id labels, which are unique, so the jobs from the older pipelines will still be published.
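The overwrite behavior described above can be sketched as follows. This is only an illustration, assuming the exporter's metric store behaves like a map keyed by label values; the `publish` helper, the `demo` project, and the sample jobs are hypothetical and are not taken from the exporter's code.

```python
# Sketch only: models a metric store as a dict keyed by label values.
# The `publish` helper and the sample data are hypothetical.

def publish(store, labels, value, key_labels):
    """Store one sample; an identical label set overwrites the previous sample."""
    key = tuple(labels[name] for name in key_labels)
    store[key] = value

# Two pipelines on the same ref, each running a job with the same name.
jobs = [
    {"project": "demo", "ref": "main", "job_name": "build",
     "pipeline_id": "100", "job_id": "1", "status": "running"},
    {"project": "demo", "ref": "main", "job_name": "build",
     "pipeline_id": "101", "job_id": "2", "status": "running"},
]

without_ids = {}
with_ids = {}
for job in jobs:
    # Without id labels, both jobs map to the same key.
    publish(without_ids, job, job["status"], ("project", "ref", "job_name"))
    # With pipeline_id and job_id, every job gets its own key.
    publish(with_ids, job, job["status"],
            ("project", "ref", "job_name", "pipeline_id", "job_id"))

print(len(without_ids))  # 1: the older job was overwritten
print(len(with_ids))     # 2: both jobs remain visible
```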

> in infinity, the exporter will present Prometheus with ALL the jobs ever seen, on every scrape

You have a good point here. Now that I think about it, it is indeed what is expected to happen, but it is not what I see when I look at the '/metrics' endpoint. Anyway, it is a good point.
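The growth concern can be made concrete with a small count. This is a sketch under the assumption that series are never evicted, which may not match what the exporter actually does (as noted, the '/metrics' endpoint did not show this behavior); the `series_after` function and its inputs are hypothetical.

```python
# Sketch only: counts distinct label sets, assuming nothing is ever evicted.

def series_after(pipelines, job_names, with_ids):
    """Return how many distinct series exist after `pipelines` runs on one ref."""
    series = set()
    job_id = 0
    for pipeline_id in range(pipelines):
        for name in job_names:
            job_id += 1
            labels = ("demo", "main", name)
            if with_ids:
                # Unique ids make every job a brand-new series.
                labels += (pipeline_id, job_id)
            series.add(labels)
    return len(series)

bounded = series_after(1000, ["build", "test"], with_ids=False)
unbounded = series_after(1000, ["build", "test"], with_ids=True)
print(bounded)    # 2
print(unbounded)  # 2000
```

Without the id labels the series count depends only on the number of job names; with them it grows with every pipeline run.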

> Having said all of this, I believe this exporter is not suitable to monitor the health of your GCI system when you allow more than one pipeline per ref
> ...
> I share your need of tracking ALL running pipeline, but I'm worried this exporter would need architectural changes to work to address this need.

I agree. This is not the suitable tool. This was the closest I found so I thought to use it.
I understand that this PR will not be pulled. I will, however, leave this PR open since I would like to get a response from the repo owner, maybe he will have a suggestion.

It is strange that no such tool is available for GitLab, which is a popular commercial product.

BTW, what is "GCI system"?

@maciej-gol

maciej-gol commented May 28, 2022 via email

@maciej-gol

@ErezArbell since your use case is monitoring the general success ratio of your jobs (per ref, perhaps), I believe implementing job hooks that simply store success/failure counters would be enough, without opening yourself to the growing metrics problem.

You could export job status counters and expose them via gitlab_ci_pipeline_job_status_counter{job_name, ref, project, status}. The fail rate would be increase(_counter{status='failed'}) / (increase(_counter{status='failed'}) + increase(_counter{status='success'})). Tracking should also be easy if we start with hooks only.
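A minimal sketch of the counter idea, using a plain in-memory `Counter` rather than a real metrics library. The event shape (`job_name`, `ref`, `project`, `status`) and the `on_job_finished` and `fail_rate` helpers are assumptions for illustration; they are not the exporter's API or the GitLab webhook payload format.

```python
from collections import Counter

# One counter per (job_name, ref, project, status), mirroring the proposed labels.
counters = Counter()

def on_job_finished(event):
    """Hypothetical hook handler: bump the counter for the job's final status."""
    key = (event["job_name"], event["ref"], event["project"], event["status"])
    counters[key] += 1

def fail_rate(before, after, job_name, ref, project):
    """Fail rate over a window, computed from counter increases like increase()."""
    def delta(status):
        key = (job_name, ref, project, status)
        return after[key] - before[key]  # a missing key counts as 0

    failed = delta("failed")
    success = delta("success")
    total = failed + success
    return failed / total if total else None

# Snapshot at the window start, then four finished jobs arrive via hooks.
before = counters.copy()
for status in ["success", "failed", "success", "success"]:
    on_job_finished({"job_name": "build", "ref": "main",
                     "project": "demo", "status": status})

rate = fail_rate(before, counters, "build", "main", "demo")
print(rate)  # 0.25
```

The number of counters stays bounded by the number of distinct job name, ref, project, and status combinations, regardless of how many pipelines run.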

It might solve my problem (tracking all pending jobs), but I would need to give it more thought.

What do you think?

@tinchoram

👋 Hi everyone! This issue is very interesting. We have the same problem: we cannot track the final status of all the jobs and how they evolve, since, as @ErezArbell comments, the exporter only reports the status of the last job.

I'm going to try running the app with the changes incorporated by @ErezArbell and see if it fixes our problem.

I look forward to the resolution of this issue 🦊

@ErezArbell
Author

@tinchoram, I added two dashboards to the "quickstart" example that use those changes:

  • Pipelines History
  • Jobs History

These dashboards present the full history and let you filter what is shown by various parameters.
As @maciej-gol wrote, this is not production ready. But the dashboards let you try those changes and see both their benefits and the problems we have, like this issue.

@ErezArbell
Author

@maciej-gol, I do not need the ratios. I need to see the history of pipelines and jobs in a table, with options to filter.

3 participants