
remove duplicate exported scrape job name in file name #525

Closed

Conversation

anarcat commented Feb 22, 2021

Between #435 and #450, the file path for the exported job names was
changed to use underscores instead of colons. That's fine, except
that, in the end, the actual filename had a superfluous
`$scrape_job_name` prefix added. Taking the Apache exporter as an
example, I used to have:

    /etc/prometheus/file_sd_config.d/apache_perdulce.torproject.org:9117.yaml

I now have:

    /etc/prometheus/file_sd_config.d/apache_apache_eugeni.torproject.org_9117.yaml

That still works, of course: the old file is removed and the new one is
picked up, but it does look silly and needlessly redundant.
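
For context, the duplication appears to come from the collecting side:
the exported resource title already begins with the job name (it is
declared as `"${scrape_job_name}_${scrape_host}_${scrape_port}"`, see
the code later in this thread), and the defined type then prefixes
`$job_name` again when building the file path. A minimal sketch of that
pattern, simplified rather than the module's exact code:

    # Inside prometheus::scrape_job (sketch): $name already starts with
    # the job name, so prefixing $job_name again yields file names like
    # "apache_apache_eugeni.torproject.org_9117.yaml".
    file { "${collect_dir}/${job_name}_${name}.yaml":
      ensure => file,
      # content rendered from $targets and $labels ...
    }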

I should note that this might mean clashes between jobs if the same
host/port combination is scraped by multiple jobs on the Prometheus
server. I have no idea whether that's a valid use case, but I should
point out that, if it is, then the `job_name` parameter should just be
removed and the resource name should be required to include the job
name, instead of enforcing this weird duplication.

In other words, maybe this is also an API-breaking change that warrants
a changelog entry, but I'll let my co-maintainers be the judges of that.

@treydock

Forcing a name to conform to some pattern, rather than using parameters like `job_name`, sounds like a recipe for easily making mistakes and misconfiguring things. I think an ugly file name is a fair trade-off to guarantee no collisions. If you are deploying Prometheus with Puppet, the file name used shouldn't matter; the purpose of this module is to abstract those details away and ensure things "just work", which they do.

anarcat commented Feb 22, 2021

hum. maybe i'm being overly obsessive with details here.

do other maintainers have any thoughts? i'm happy to drop this one... i thought the change was accidental.

anarcat commented Feb 23, 2021

@treydock maybe we just got this backwards... why don't we pass the job name, host and port directly to scrape_job?

in other words, instead of:

    define prometheus::scrape_job (
      String[1] $job_name,
      Array[String[1]] $targets,
      Hash[String[1], String[1]] $labels          = {},
      Optional[Stdlib::Absolutepath] $collect_dir = undef,
    ) {

which gets called as:

    @@prometheus::scrape_job { "${scrape_job_name}_${scrape_host}_${scrape_port}":
      job_name => $scrape_job_name,
      targets  => ["${scrape_host}:${scrape_port}"],
      labels   => $scrape_job_labels,
    }

we could have:

    define prometheus::scrape_job (
      String[1] $job_name,
      String[1] $scrape_host,
      String[1] $scrape_port,
      Hash[String[1], String[1]] $labels          = {},
      Optional[Stdlib::Absolutepath] $collect_dir = undef,
    ) {

and then call it with:

    @@prometheus::scrape_job { "${scrape_job_name}_${scrape_host}_${scrape_port}":
      job_name    => $scrape_job_name,
      scrape_host => $scrape_host,
      scrape_port => $scrape_port,
      labels      => $scrape_job_labels,
    }

and that way the scrape_job resource can construct a unique filename without redundancy... it does limit scrape_job to a single target, unfortunately, but that's effectively how it's being used now anyway...
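
under that proposal, the define body could build the path itself; a hypothetical sketch (not the module's actual code, assuming stdlib's `to_yaml` is available):

    # Hypothetical body for the proposed define: host and port are passed
    # in explicitly, so the file name carries the job name exactly once.
    file { "${collect_dir}/${job_name}_${scrape_host}_${scrape_port}.yaml":
      ensure  => file,
      content => to_yaml([{
        'targets' => ["${scrape_host}:${scrape_port}"],
        'labels'  => $labels,
      }]),
    }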

otherwise i'm happy to just drop this PR as well.

@treydock

One major issue I see with that proposal is the loss of targets in prometheus::scrape_job, which would severely limit the usefulness of that defined type, especially with exporters where there can be more than one target per scraped host, such as the blackbox exporter, or exporters where you might be querying something like websites hosted on a server. I make heavy use of multiple targets, as well as of scrape jobs where the scrape host does not match the targets, as with hosted websites. I could obviously redo everything to work with this change, but from a personal standpoint the trade-off of cleaner file names isn't worth the effort of reworking all my profile classes.

I also often don't need to set a port at all: for the blackbox exporter I will have a target such as https://some-website.domain, where the scrape port is set on the central deployment of the blackbox exporter. The host exporting the scrape job only needs to define the target; the rest is handled by the scrape config, so defining it in the scrape job would not be useful.
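
To illustrate, here is a hypothetical export using the current API (the names are made up): several URL targets from one host, with no port set on the exporting side.

    # Hypothetical blackbox-style usage of the current defined type:
    # multiple website targets, no scrape port set by the exporting host.
    @@prometheus::scrape_job { "blackbox_${facts['networking']['fqdn']}":
      job_name => 'blackbox',
      targets  => [
        'https://site-a.example.org',
        'https://site-b.example.org',
      ],
      labels   => { 'team' => 'web' },
    }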

The current parameters of prometheus::scrape_job also closely mirror how the Prometheus documentation defines a scrape target: essentially just a set of targets and labels associated with a scrape job.
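
For reference, a target in Prometheus's file_sd format is a list of entries like this (generic example):

    - targets:
        - 'https://site-a.example.org'
        - 'https://site-b.example.org'
      labels:
        team: 'web'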

anarcat commented Feb 23, 2021

makes sense. i'll just drop this already. :)

i assume you have ways of avoiding clashes when you define multiple hosts in a job?

anarcat closed this Feb 23, 2021