BOSH Health Monitor JSON Plugin Not Working #2528

Closed
slcardinal opened this issue Jun 7, 2024 · 2 comments · Fixed by #2530

@slcardinal

Describe the bug
We are using the JSON Plugin of the BOSH Health Monitor to send BOSH health metrics to Splunk via syslog. This had been working up to version 280.0.20 of BOSH. We are now using version 280.0.25 and we are no longer seeing our BOSH health metrics in Splunk.

To Reproduce
On a running BOSH deployment, do the following:

  1. Create a bash script in the /var/vcap/jobs/*/bin/bosh-monitor/ directory
  2. Change the ownership of the file to vcap:vcap - chown vcap:vcap <name of script file>
  3. Make the file executable by owner and group - chmod 750 <name of script file>
  4. Add the following to the script file
    #!/bin/bash

    cat | logger -n <name of remote syslog server> -P 514 --rfc5424 --size 4096 --socket-errors=on --tag bosh-hm
  5. Restart the health_monitor job - monit restart health_monitor
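
The script in step 4 reads events from its standard input, and the stack trace in the logs fails at `@stdin.write(data)`, so the contract appears to be "one process writes JSON to the script's stdin". A minimal Ruby sketch of that contract follows. It is not the Health Monitor's implementation: `pipe_events` is a hypothetical helper, `cat` stands in for the forwarding script, and the event fields are made up for illustration.

```ruby
require 'json'
require 'open3'

# Hypothetical helper: spawn `command`, write each event as one JSON line to
# its stdin, then close stdin and return what the child printed plus its status.
def pipe_events(command, events)
  Open3.popen2(*command) do |stdin, stdout, wait_thread|
    events.each { |event| stdin.write("#{JSON.generate(event)}\n") }
    stdin.close                      # EOF lets `cat` (or the real script) exit
    [stdout.read, wait_thread.value] # captured output and Process::Status
  end
end

# `cat` stands in for the forwarding script; the event content is invented.
output, status = pipe_events(['cat'], [{ 'kind' => 'heartbeat', 'id' => 'example-1' }])
puts output
puts "exit status: #{status.exitstatus}"
```

Running a candidate script through a harness like this, outside the Health Monitor, is one way to confirm that the script itself starts and consumes stdin before looking for the fault elsewhere.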

Expected behavior
When this is working correctly, BOSH health metrics should be written to syslog.

Logs
Snippets from the BOSH health_monitor logs

/var/vcap/sys/log/health_monitor/health_monitor.log

I, [2024-06-07T13:56:06.034678 #7]  INFO : connection.tsdb-reconnecting (5)...
I, [2024-06-07T13:56:06.034997 #7]  INFO : connection.tsdb-failed-to-reconnect, will try again in 31 seconds...
I, [2024-06-07T13:56:06.052889 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:07.053505 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:08.054290 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:09.054937 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:10.055766 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:11.056412 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:12.057079 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:13.057906 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:14.058564 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:15.059357 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:16.061797 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:17.063336 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:18.066054 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:19.066918 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:20.068124 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:21.068891 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh
I, [2024-06-07T13:56:22.069719 #7]  INFO : JSON Plugin: Restarted process /var/vcap/jobs/bosh-hm-to-splunk/bin/bosh-monitor/bosh-hm-to-splunk.sh

/var/vcap/sys/log/health_monitor/health_monitor.stderr.log (This log is easier to read when piped to jq)

{"time":"2024-06-07T14:01:19+00:00","severity":"warn","oid":183400,"pid":7,"subject":"Async::Task","message":"Task may have ended with unhandled exception.","event":{"type":"failure","root":"/var/vcap/data/jobs/health_monitor/f1436fb7889a7c1e9a56f84b77d104141a6b762a","class":"NoMethodError","message":"\u001b[1mundefined method `write' for nil:NilClass (\u001b[1;4mNoMethodError\u001b[m\u001b[1m)\u001b[m\n\n\u001b[1m      @stdin.write(data)\u001b[m\n\u001b[1m            ^^^^^^\u001b[m","backtrace":["/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:143:in `send_data'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:47:in `block (2 levels) in send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:46:in `each'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:46:in `block in send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:45:in `synchronize'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:45:in `send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:15:in `process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/event_processor.rb:106:in `plugin_process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/event_processor.rb:51:in `block (2 levels) in process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/async-2.11.0/lib/async/task.rb:164:in `block in run'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/async-2.11.0/lib/async/task.rb:377:in `block in schedule'"]}}
{"time":"2024-06-07T14:01:19+00:00","severity":"warn","oid":183420,"pid":7,"subject":"Async::Task","message":"Task may have ended with unhandled exception.","event":{"type":"failure","root":"/var/vcap/data/jobs/health_monitor/f1436fb7889a7c1e9a56f84b77d104141a6b762a","class":"NoMethodError","message":"\u001b[1mundefined method `write' for nil:NilClass (\u001b[1;4mNoMethodError\u001b[m\u001b[1m)\u001b[m\n\n\u001b[1m      @stdin.write(data)\u001b[m\n\u001b[1m            ^^^^^^\u001b[m","backtrace":["/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:143:in `send_data'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:47:in `block (2 levels) in send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:46:in `each'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:46:in `block in send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:45:in `synchronize'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:45:in `send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:15:in `process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/event_processor.rb:106:in `plugin_process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/event_processor.rb:51:in `block (2 levels) in process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/async-2.11.0/lib/async/task.rb:164:in `block in run'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/async-2.11.0/lib/async/task.rb:377:in `block in schedule'"]}}
{"time":"2024-06-07T14:01:19+00:00","severity":"warn","oid":183440,"pid":7,"subject":"Async::Task","message":"Task may have ended with unhandled exception.","event":{"type":"failure","root":"/var/vcap/data/jobs/health_monitor/f1436fb7889a7c1e9a56f84b77d104141a6b762a","class":"NoMethodError","message":"\u001b[1mundefined method `write' for nil:NilClass (\u001b[1;4mNoMethodError\u001b[m\u001b[1m)\u001b[m\n\n\u001b[1m      @stdin.write(data)\u001b[m\n\u001b[1m            ^^^^^^\u001b[m","backtrace":["/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:143:in `send_data'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:47:in `block (2 levels) in send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:46:in `each'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:46:in `block in send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:45:in `synchronize'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:45:in `send_event'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/plugins/json.rb:15:in `process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/event_processor.rb:106:in `plugin_process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/bosh-monitor-0.0.0/lib/bosh/monitor/event_processor.rb:51:in `block (2 levels) in process'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/async-2.11.0/lib/async/task.rb:164:in `block in run'","/var/vcap/data/packages/health_monitor/fe62bcff5ff327f16625197ce862622cb5c1f8d4/gem_home/ruby/3.2.0/gems/async-2.11.0/lib/async/task.rb:377:in `block in schedule'"]}}
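
For anyone without jq at hand, the same entries can be condensed with a few lines of Ruby. This is only a sketch: `summarize` and the `ANSI_ESCAPE` pattern are names chosen here, the sample entry is a shortened copy of the log above, and it assumes each log entry sits on a single line.

```ruby
require 'json'

# Matches SGR colour/bold sequences such as "\e[1m" and "\e[1;4m".
ANSI_ESCAPE = /\e\[[0-9;]*m/

# Condense one JSON log line into a small hash; returns nil for lines that
# are not JSON objects or that carry no "event" section.
def summarize(line)
  entry = JSON.parse(line)
  return nil unless entry.is_a?(Hash) && entry['event'].is_a?(Hash)

  event = entry['event']
  message = event['message'].to_s.gsub(ANSI_ESCAPE, '')
  {
    time: entry['time'],
    error_class: event['class'],
    message: message.lines.first.to_s.strip, # first line of the error text
    origin: Array(event['backtrace']).first  # innermost stack frame
  }
rescue JSON::ParserError
  nil
end

# Shortened sample built from the entry above (backtrace path abbreviated).
sample = JSON.generate(
  'time' => '2024-06-07T14:01:19+00:00',
  'event' => {
    'class' => 'NoMethodError',
    'message' => "\e[1mundefined method `write' for nil:NilClass (\e[1;4mNoMethodError\e[m\e[1m)\e[m\n\n\e[1m      @stdin.write(data)\e[m",
    'backtrace' => ["lib/bosh/monitor/plugins/json.rb:143:in `send_data'"]
  }
)
p summarize(sample)
```

Against the real file, something like `File.foreach(path).filter_map { |line| summarize(line) }` would give one summary per failure, where `path` points at health_monitor.stderr.log.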

Versions (please complete the following information):

  • Infrastructure: vSphere
  • BOSH version 280.0.25
  • BOSH CLI version 6.4.11-e5579de9-2022-01-05T23:48:09Z
  • Stemcell version bosh-vsphere-esxi-jammy/1.445
  • backup-and-restore-sdk/1.18.70
  • bosh-dns/1.36.11
  • bosh-vsphere-cpi/97.0.12
  • bpm/1.2.19
  • credhub/2.12.76
  • syslog/12.2.5
  • uaa/77.10.0

Deployment info:
I have attached a redacted copy of our deployment manifest to this issue.

We use the bosh-deployment release as a submodule to our BOSH deployment project.

Additional context
bosh-us-west-manifest.txt

@slcardinal
Author

I forgot to mention, this issue coincides with the removal of EventMachine from the Health Monitor and the switch to the async-io gem.

@ystros ystros self-assigned this Jun 13, 2024
@ystros
Contributor

ystros commented Jun 13, 2024

Thanks for the detailed reproduction steps, @slcardinal! The underlying error appears to be that the Async::IO::Stream class is no longer being included automatically for us. This is similar to what was fixed in #2526, but for a different class.

I'm going to add the missing require and see if I can add error logging so something like this is more obvious in the future.
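
As an illustration of the kind of error logging mentioned above, here is a small self-contained Ruby sketch. It is not the change that went into the fix: `ForwardingSketch`, its `opener` argument, and `UndefinedStreamClass` are hypothetical, and it assumes (without having checked the plugin source) that a class lookup failing during process start is what left `@stdin` as nil.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for a plugin that forwards events to a child process.
class ForwardingSketch
  def initialize(logger:, opener:)
    @logger = logger
    @opener = opener # callable returning an IO-like object that responds to #write
    @stdin = nil
  end

  # Open the pipe; report failures instead of silently leaving @stdin nil.
  def start
    @stdin = @opener.call
  rescue NameError, IOError, SystemCallError => e
    @logger.error("failed to start process: #{e.class}: #{e.message}")
    @stdin = nil
  end

  # Returns true when the data was written, false (with a log line) otherwise.
  def send_data(data)
    if @stdin.nil?
      @logger.error('dropping event: process stdin is not available')
      return false
    end
    @stdin.write(data)
    true
  end
end

# Simulate a class that was never required: the constant lookup raises NameError.
log = StringIO.new
plugin = ForwardingSketch.new(logger: Logger.new(log), opener: -> { UndefinedStreamClass.new })
plugin.start
plugin.send_data("{}\n")
puts log.string
```

With a guard like this, the log would name the missing class at start-up rather than reporting only a `NoMethodError` on nil for every event.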
