Add process summary metrics
As we're enabling `include_top_n` by default, a few visualizations from the
Metricbeat-processes list were no longer correct (as they aggregate only a
sample of the data).

This commit adds a new `process_summary` metricset that reports these metrics. The fields
are namespaced under `process.summary`.

This PR adds summary metrics for the total number of processes and their state, as an
extra document created by the `system.process` metricset.
Tudor Golubenco committed May 12, 2017
1 parent e48b5d4 commit fbac667
Showing 22 changed files with 345 additions and 31 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.asciidoc
@@ -55,6 +55,7 @@ https://github.com/elastic/beats/compare/v6.0.0-alpha1...master[Check the HEAD d
*Metricbeat*

- Add macOS implementation of the system diskio metricset. {issue}4144[4144]
- Add process_summary metricset that records high level metrics about processes. {pull}4231[4231]

*Packetbeat*

63 changes: 63 additions & 0 deletions metricbeat/docs/fields.asciidoc
@@ -10512,6 +10512,69 @@ type: long
Total number of I/O operations performed on all devices by processes in the cgroup as seen by the throttling policy.
[float]
== process.summary Fields
Summary metrics for the processes running on the host.
[float]
=== system.process.summary.total
type: long
Total number of processes on this host.
[float]
=== system.process.summary.running
type: long
Number of running processes on this host.
[float]
=== system.process.summary.idle
type: long
Number of idle processes on this host.
[float]
=== system.process.summary.sleeping
type: long
Number of sleeping processes on this host.
[float]
=== system.process.summary.stopped
type: long
Number of stopped processes on this host.
[float]
=== system.process.summary.zombie
type: long
Number of zombie processes on this host.
[float]
=== system.process.summary.unknown
type: long
Number of processes for which the state couldn't be retrieved or is unknown.
[float]
== socket Fields
7 changes: 7 additions & 0 deletions metricbeat/docs/modules/system.asciidoc
@@ -111,6 +111,9 @@ metricbeat.modules:
# Network stats
- network
# Processes summary
- process_summary
# Per process stats
- process
@@ -147,6 +150,8 @@ The following metricsets are available:

* <<metricbeat-metricset-system-process,process>>

* <<metricbeat-metricset-system-process_summary,process_summary>>

* <<metricbeat-metricset-system-socket,socket>>

include::system/core.asciidoc[]
@@ -167,5 +172,7 @@ include::system/network.asciidoc[]

include::system/process.asciidoc[]

include::system/process_summary.asciidoc[]

include::system/socket.asciidoc[]

19 changes: 19 additions & 0 deletions metricbeat/docs/modules/system/process_summary.asciidoc
@@ -0,0 +1,19 @@
////
This file is generated! See scripts/docs_collector.py
////

[[metricbeat-metricset-system-process_summary]]
include::../../../module/system/process_summary/_meta/docs.asciidoc[]


==== Fields

For a description of each field in the metricset, see the
<<exported-fields-system,exported fields>> section.

Here is an example document generated by this metricset:

[source,json]
----
include::../../../module/system/process_summary/_meta/data.json[]
----
1 change: 1 addition & 0 deletions metricbeat/include/list.go
@@ -87,6 +87,7 @@ import (
_ "github.com/elastic/beats/metricbeat/module/system/memory"
_ "github.com/elastic/beats/metricbeat/module/system/network"
_ "github.com/elastic/beats/metricbeat/module/system/process"
_ "github.com/elastic/beats/metricbeat/module/system/process_summary"
_ "github.com/elastic/beats/metricbeat/module/system/socket"
_ "github.com/elastic/beats/metricbeat/module/vsphere"
_ "github.com/elastic/beats/metricbeat/module/vsphere/datastore"
3 changes: 3 additions & 0 deletions metricbeat/metricbeat.full.yml
@@ -52,6 +52,9 @@ metricbeat.modules:
# Network stats
- network

# Processes summary
- process_summary

# Per process stats
- process

3 changes: 3 additions & 0 deletions metricbeat/metricbeat.yml
@@ -37,6 +37,9 @@ metricbeat.modules:
# Network stats
- network

# Processes summary
- process_summary

# Per process stats
- process

3 changes: 3 additions & 0 deletions metricbeat/module/system/_meta/config.full.yml
@@ -24,6 +24,9 @@
# Network stats
- network

# Processes summary
- process_summary

# Per process stats
- process

3 changes: 3 additions & 0 deletions metricbeat/module/system/_meta/config.yml
@@ -24,6 +24,9 @@
# Network stats
- network

# Processes summary
- process_summary

# Per process stats
- process

@@ -4,7 +4,7 @@
"description": "",
"title": "Metricbeat-processes",
"uiStateJSON": "{\"P-1\":{\"vis\":{\"params\":{\"sort\":{\"columnIndex\":null,\"direction\":null}}}},\"P-4\":{\"vis\":{\"params\":{\"sort\":{\"columnIndex\":null,\"direction\":null}}}}}",
"panelsJSON": "[{\"col\":1,\"id\":\"System-Navigation\",\"panelIndex\":5,\"row\":1,\"size_x\":12,\"size_y\":1,\"type\":\"visualization\"},{\"col\":1,\"id\":\"Number-of-processes\",\"panelIndex\":7,\"row\":2,\"size_x\":3,\"size_y\":3,\"type\":\"visualization\"},{\"col\":4,\"id\":\"Process-state-by-host\",\"panelIndex\":9,\"row\":2,\"size_x\":5,\"size_y\":3,\"type\":\"visualization\"},{\"col\":9,\"id\":\"Number-of-processes-by-host\",\"panelIndex\":8,\"row\":2,\"size_x\":4,\"size_y\":3,\"type\":\"visualization\"},{\"col\":1,\"id\":\"CPU-usage-per-process\",\"panelIndex\":2,\"row\":8,\"size_x\":6,\"size_y\":8,\"type\":\"visualization\"},{\"col\":7,\"id\":\"Memory-usage-per-process\",\"panelIndex\":3,\"row\":8,\"size_x\":6,\"size_y\":8,\"type\":\"visualization\"},{\"col\":1,\"id\":\"Top-processes-by-memory-usage\",\"panelIndex\":1,\"row\":16,\"size_x\":6,\"size_y\":11,\"type\":\"visualization\"},{\"col\":7,\"id\":\"Top-processes-by-CPU-usage\",\"panelIndex\":4,\"row\":16,\"size_x\":6,\"size_y\":11,\"type\":\"visualization\"},{\"id\":\"Number-of-processes-over-time\",\"type\":\"visualization\",\"panelIndex\":10,\"size_x\":12,\"size_y\":3,\"col\":1,\"row\":5}]",
"panelsJSON": "[{\"col\":1,\"id\":\"System-Navigation\",\"panelIndex\":5,\"row\":1,\"size_x\":12,\"size_y\":1,\"type\":\"visualization\"},{\"col\":1,\"id\":\"Number-of-processes\",\"panelIndex\":7,\"row\":2,\"size_x\":3,\"size_y\":3,\"type\":\"visualization\"},{\"col\":1,\"id\":\"CPU-usage-per-process\",\"panelIndex\":2,\"row\":5,\"size_x\":6,\"size_y\":8,\"type\":\"visualization\"},{\"col\":7,\"id\":\"Memory-usage-per-process\",\"panelIndex\":3,\"row\":5,\"size_x\":6,\"size_y\":8,\"type\":\"visualization\"},{\"col\":1,\"id\":\"Top-processes-by-memory-usage\",\"panelIndex\":1,\"row\":13,\"size_x\":6,\"size_y\":6,\"type\":\"visualization\"},{\"col\":7,\"id\":\"Top-processes-by-CPU-usage\",\"panelIndex\":4,\"row\":13,\"size_x\":6,\"size_y\":6,\"type\":\"visualization\"},{\"col\":4,\"id\":\"Number-of-processes-over-time\",\"panelIndex\":10,\"row\":2,\"size_x\":9,\"size_y\":3,\"type\":\"visualization\"}]",
"optionsJSON": "{\"darkTheme\":false}",
"version": 1,
"kibanaSavedObjectMeta": {

This file was deleted.

@@ -1,11 +1,10 @@
{
"visState": "{\"title\":\"Number of processes over time\",\"type\":\"line\",\"params\":{\"shareYAxis\":true,\"addTooltip\":true,\"addLegend\":true,\"legendPosition\":\"right\",\"showCircles\":true,\"smoothLines\":false,\"interpolate\":\"linear\",\"scale\":\"linear\",\"drawLinesBetweenPoints\":true,\"radiusRatio\":9,\"times\":[],\"addTimeMarker\":false,\"defaultYExtents\":false,\"setYExtents\":false,\"yAxis\":{}},\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"cardinality\",\"schema\":\"metric\",\"params\":{\"field\":\"system.process.pid\",\"customLabel\":\"Number of processes\"}},{\"id\":\"2\",\"enabled\":true,\"type\":\"date_histogram\",\"schema\":\"segment\",\"params\":{\"field\":\"@timestamp\",\"interval\":\"auto\",\"customInterval\":\"2h\",\"min_doc_count\":1,\"extended_bounds\":{}}}],\"listeners\":{}}",
"visState": "{\"title\":\"Number of processes over time\",\"type\":\"line\",\"params\":{\"shareYAxis\":true,\"addTooltip\":true,\"addLegend\":true,\"legendPosition\":\"bottom\",\"showCircles\":true,\"smoothLines\":false,\"interpolate\":\"linear\",\"scale\":\"linear\",\"drawLinesBetweenPoints\":true,\"radiusRatio\":9,\"times\":[],\"addTimeMarker\":false,\"defaultYExtents\":false,\"setYExtents\":false,\"yAxis\":{},\"grid\":{\"categoryLines\":false,\"style\":{\"color\":\"#eee\"}},\"categoryAxes\":[{\"id\":\"CategoryAxis-1\",\"type\":\"category\",\"position\":\"bottom\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\"},\"labels\":{\"show\":true,\"truncate\":100},\"title\":{\"text\":\"@timestamp per 30 seconds\"}}],\"valueAxes\":[{\"id\":\"ValueAxis-1\",\"name\":\"LeftAxis-1\",\"type\":\"value\",\"position\":\"left\",\"show\":true,\"style\":{},\"scale\":{\"type\":\"linear\",\"mode\":\"normal\"},\"labels\":{\"show\":true,\"rotate\":0,\"filter\":false,\"truncate\":100},\"title\":{\"text\":\"Average system.process.summary.total\"}}],\"seriesParams\":[{\"show\":true,\"mode\":\"normal\",\"type\":\"line\",\"drawLinesBetweenPoints\":true,\"showCircles\":true,\"interpolate\":\"linear\",\"lineWidth\":2,\"data\":{\"id\":\"1\",\"label\":\"Average system.process.summary.total\"},\"valueAxis\":\"ValueAxis-1\"}]},\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"avg\",\"schema\":\"metric\",\"params\":{\"field\":\"system.process.summary.total\"}},{\"id\":\"2\",\"enabled\":true,\"type\":\"date_histogram\",\"schema\":\"segment\",\"params\":{\"field\":\"@timestamp\",\"interval\":\"auto\",\"customInterval\":\"2h\",\"min_doc_count\":1,\"extended_bounds\":{}}}],\"listeners\":{}}",
"description": "",
"title": "Number of processes over time",
"uiStateJSON": "{}",
"version": 1,
"savedSearchId": "Process-stats",
"kibanaSavedObjectMeta": {
"searchSourceJSON": "{\"filter\":[]}"
"searchSourceJSON": "{\"filter\":[],\"index\":\"metricbeat-*\",\"query\":{\"query_string\":{\"query\":\"metricset.name: process_summary\",\"analyze_wildcard\":true}},\"highlight\":{\"pre_tags\":[\"@kibana-highlighted-field@\"],\"post_tags\":[\"@/kibana-highlighted-field@\"],\"fields\":{\"*\":{}},\"require_field_match\":false,\"fragment_size\":2147483647}}"
}
}
@@ -1,11 +1,10 @@
{
"visState": "{\"title\":\"Number of processes\",\"type\":\"metric\",\"params\":{\"handleNoResults\":true,\"fontSize\":\"40\"},\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"cardinality\",\"schema\":\"metric\",\"params\":{\"field\":\"system.process.name\",\"customLabel\":\"Number of Processes\"}}],\"listeners\":{}}",
"visState": "{\"title\":\"Number of processes\",\"type\":\"metric\",\"params\":{\"fontSize\":\"40\",\"handleNoResults\":true},\"aggs\":[{\"id\":\"1\",\"enabled\":true,\"type\":\"max\",\"schema\":\"metric\",\"params\":{\"field\":\"system.process.summary.total\",\"customLabel\":\"Max number of processes\"}}],\"listeners\":{}}",
"description": "",
"title": "Number of processes",
"uiStateJSON": "{}",
"version": 1,
"savedSearchId": "Process-stats",
"kibanaSavedObjectMeta": {
"searchSourceJSON": "{\"filter\":[]}"
"searchSourceJSON": "{\"filter\":[],\"index\":\"metricbeat-*\",\"query\":{\"query_string\":{\"query\":\"metricset.name:process_summary\",\"analyze_wildcard\":true}},\"highlight\":{\"pre_tags\":[\"@kibana-highlighted-field@\"],\"post_tags\":[\"@/kibana-highlighted-field@\"],\"fields\":{\"*\":{}},\"require_field_match\":false,\"fragment_size\":2147483647}}"
}
}

This file was deleted.

4 changes: 2 additions & 2 deletions metricbeat/module/system/process/helper.go
@@ -16,6 +16,7 @@ import (
"github.com/elastic/beats/metricbeat/module/system"
"github.com/elastic/beats/metricbeat/module/system/memory"
sigar "github.com/elastic/gosigar"
"github.com/pkg/errors"
)

type ProcsMap map[int]*Process
@@ -335,8 +336,7 @@ func (procStats *ProcStats) GetProcStats() ([]common.MapStr, error) {

pids, err := Pids()
if err != nil {
-logp.Warn("Getting the list of pids: %v", err)
-return nil, err
+return nil, errors.Wrap(err, "failed to fetch the list of PIDs")
}

var processes []Process
24 changes: 24 additions & 0 deletions metricbeat/module/system/process_summary/_meta/data.json
@@ -0,0 +1,24 @@
{
"@timestamp": "2016-05-23T08:05:34.853Z",
"beat": {
"hostname": "host.example.com",
"name": "host.example.com"
},
"metricset": {
"module": "system",
"name": "process_summary",
"rtt": 115
},
"system": {
"process_summary": {
"idle": 0,
"running": 225,
"sleeping": 0,
"stopped": 0,
"total": 355,
"unknown": 130,
"zombie": 0
}
},
"type": "metricsets"
}
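
In the sample document above, the per-state counters add up to `total` (225 running + 130 unknown = 355). The sketch below decodes a trimmed copy of that sample and performs the same check. The `counts` and `event` types and the `parseCounts` helper are hypothetical names introduced here, and treating `total` as the sum of the state counters is an assumption drawn from this one sample rather than a documented guarantee.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sample is a trimmed copy of the example document above.
const sample = `{
  "system": {
    "process_summary": {
      "idle": 0, "running": 225, "sleeping": 0, "stopped": 0,
      "total": 355, "unknown": 130, "zombie": 0
    }
  }
}`

// counts mirrors the counters in the sample document (hypothetical type).
type counts struct {
	Idle     int `json:"idle"`
	Running  int `json:"running"`
	Sleeping int `json:"sleeping"`
	Stopped  int `json:"stopped"`
	Total    int `json:"total"`
	Unknown  int `json:"unknown"`
	Zombie   int `json:"zombie"`
}

// event follows the nesting used in the sample (hypothetical type).
type event struct {
	System struct {
		ProcessSummary counts `json:"process_summary"`
	} `json:"system"`
}

// parseCounts extracts the summary counters from a JSON document.
func parseCounts(raw string) (counts, error) {
	var ev event
	if err := json.Unmarshal([]byte(raw), &ev); err != nil {
		return counts{}, err
	}
	return ev.System.ProcessSummary, nil
}

// stateSum adds up every per-state counter, excluding total.
func (c counts) stateSum() int {
	return c.Idle + c.Running + c.Sleeping + c.Stopped + c.Unknown + c.Zombie
}

func main() {
	c, err := parseCounts(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("states sum to %d, total is %d\n", c.stateSum(), c.Total)
}
```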
4 changes: 4 additions & 0 deletions metricbeat/module/system/process_summary/_meta/docs.asciidoc
@@ -0,0 +1,4 @@
=== System Process Summary MetricSet

This is the `process_summary` metricset of the module system. It collects high
level statistics about the running processes.
34 changes: 34 additions & 0 deletions metricbeat/module/system/process_summary/_meta/fields.yml
@@ -0,0 +1,34 @@
- name: process.summary
title: Process Summary
type: group
description: >
Summary metrics for the processes running on the host.
fields:
- name: total
type: long
description: >
Total number of processes on this host.
- name: running
type: long
description: >
Number of running processes on this host.
- name: idle
type: long
description: >
Number of idle processes on this host.
- name: sleeping
type: long
description: >
Number of sleeping processes on this host.
- name: stopped
type: long
description: >
Number of stopped processes on this host.
- name: zombie
type: long
description: >
Number of zombie processes on this host.
- name: unknown
type: long
description: >
Number of processes for which the state couldn't be retrieved or is unknown.
5 changes: 5 additions & 0 deletions metricbeat/module/system/process_summary/doc.go
@@ -0,0 +1,5 @@
/*
Package process_summary collects high level summary metrics about the running processes.
It is implemented on darwin, freebsd, linux, and windows.
*/
package process_summary
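
The metricset's implementation is not shown on this page. As a rough sketch of the idea in the package comment, the code below tallies a list of process states into the seven counters defined in `fields.yml`. The `State` type, its constants, and the `Summarize` function are hypothetical names chosen for illustration; counting unrecognized states as `unknown` is likewise an assumption based on that field's description.

```go
package main

import "fmt"

// State is a hypothetical process state label. The names mirror the summary
// fields in this commit, not any type from gosigar.
type State string

const (
	Running  State = "running"
	Idle     State = "idle"
	Sleeping State = "sleeping"
	Stopped  State = "stopped"
	Zombie   State = "zombie"
	Unknown  State = "unknown"
)

// Summary holds one counter per field under process.summary.
type Summary struct {
	Total, Running, Idle, Sleeping, Stopped, Zombie, Unknown int
}

// Summarize counts every process once in Total and once in its state bucket.
// Any state it does not recognize is counted as unknown (an assumption).
func Summarize(states []State) Summary {
	var s Summary
	for _, st := range states {
		s.Total++
		switch st {
		case Running:
			s.Running++
		case Idle:
			s.Idle++
		case Sleeping:
			s.Sleeping++
		case Stopped:
			s.Stopped++
		case Zombie:
			s.Zombie++
		default:
			s.Unknown++
		}
	}
	return s
}

func main() {
	s := Summarize([]State{Running, Sleeping, Sleeping, Zombie, "?"})
	fmt.Printf("%+v\n", s)
}
```

Because each process lands in exactly one bucket, the state counters in this sketch always add up to `Total`, which matches the sample document in `_meta/data.json`.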