Add helper functions for metric conversion [awsecscontainermetricsreceiver] #1089

Merged
merged 6 commits into from
Sep 25, 2020
Changes from 5 commits
@@ -0,0 +1,91 @@
// Copyright 2020, OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package awsecscontainermetrics

import (
	"time"

	metricspb "github.com/census-instrumentation/opencensus-proto/gen-go/metrics/v1"
	resourcepb "github.com/census-instrumentation/opencensus-proto/gen-go/resource/v1"
Contributor

@asuresh4 @bogdandrutu Are metrics receivers still using opencensus proto?

Member

No, we changed most of the core to use OTLP and the internal structs. I strongly recommend that new components avoid OpenCensus (oc).

Member

@hossain-rayhan you need to start using pdata.Metrics

Contributor Author

Hi @bogdandrutu, before sending the data to the next consumer I am using internaldata.OCToMetrics(md) to convert our metrics to pdata.Metrics. Isn't that enough, as with other receivers in the repo, or should we strictly get rid of it now?

Member

That is a temporary solution to make progress without having to change all components at once. We decided to use it for some old components that we did not have time to change.

Contributor

@hossain-rayhan Yeah, so basically you should only convert at the last moment, when passing the data down. In receiver-specific logic like this we want to use pdata, the OTel format; otherwise we will just have to rewrite it right away. We're also having data-model issues because of the old format (the Resource type, for example), and we want to make sure the model is right.

Contributor Author

@hossain-rayhan hossain-rayhan Sep 24, 2020
Thanks @bogdandrutu and @anuraaga. I understand we need to use pdata to convert everything to OTel format eventually. I was planning to move forward with this to meet our internal deadline (9/30/2020). We can send a different PR after October 15th I guess. How do you guys feel about it?

Member

As long as you create an issue and assign it to yourself and @anuraaga, I am fine. I trust that you will fix it. I will let @anuraaga make the final call here.

Contributor Author

Issue created: #1122

	"go.opentelemetry.io/collector/consumer/consumerdata"
)

// metricDataAccumulator defines the accumulator
type metricDataAccumulator struct {
	md []*consumerdata.MetricsData
}

// getMetricsData generates OT Metrics data from task metadata and docker stats
func (acc *metricDataAccumulator) getMetricsData(containerStatsMap map[string]ContainerStats, metadata TaskMetadata) {

	taskMetrics := ECSMetrics{}
	timestamp := timestampProto(time.Now())
	taskResources := taskResources(metadata)
Member

nit: taskResource would be more accurate. Same with containerResources (-> containerResource)

Contributor Author

Updated.


	for _, containerMetadata := range metadata.Containers {
		stats := containerStatsMap[containerMetadata.DockerID]
		containerMetrics := getContainerMetrics(stats)
		// The limit fields are pointers and may be nil (the task-level code
		// below checks for this too), so guard before dereferencing.
		if containerMetadata.Limits.Memory != nil {
			containerMetrics.MemoryReserved = *containerMetadata.Limits.Memory
		}
		if containerMetadata.Limits.CPU != nil {
			containerMetrics.CPUReserved = *containerMetadata.Limits.CPU
		}

		containerResources := containerResources(containerMetadata)
		for k, v := range taskResources.Labels {
			containerResources.Labels[k] = v
		}

		acc.accumulate(
			containerResources,
			convertToOCMetrics(ContainerPrefix, containerMetrics, nil, nil, timestamp),
		)

		aggregateTaskMetrics(&taskMetrics, containerMetrics)
	}

	// Overwrite Memory limit with task level limit
	if metadata.Limits.Memory != nil {
		taskMetrics.MemoryReserved = *metadata.Limits.Memory
	}

	taskMetrics.CPUReserved = taskMetrics.CPUReserved / CPUsInVCpu

	// Overwrite CPU limit with task level limit
	if metadata.Limits.CPU != nil {
		taskMetrics.CPUReserved = *metadata.Limits.CPU
	}

	acc.accumulate(
		taskResources,
		convertToOCMetrics(TaskPrefix, taskMetrics, nil, nil, timestamp),
Member

If the 3rd and 4th parameters to this method are always nil, I would remove those parameters.

Contributor Author

I also thought about it while writing this piece of code. I kept the skeleton ready so that the same method can be used to set metric labels. In our next PRs, we can just pass the LabelKeys and LabelValues and we are done. If we end up not using them, I will remove them.

Member

👍 SGTM

	)
}
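
To make the task-level CPU arithmetic in getMetricsData easier to follow, here is a minimal, self-contained sketch. It assumes that aggregateTaskMetrics (not shown in this diff) sums the container-level CPU reservations, and that those reservations are expressed in CPU units with 1024 units per vCPU, as the CPUsInVCpu constant suggests. The helper name taskCPUReserved is hypothetical and is not part of the PR.

```go
package main

import "fmt"

// cpusInVCPU mirrors the CPUsInVCpu constant from the diff (1024 CPU units per vCPU).
const cpusInVCPU = 1024

// taskCPUReserved is a hypothetical helper used only for illustration.
// It sums container reservations (assumed to be in CPU units), converts the
// sum to vCPUs, and lets a non-nil task-level limit override the result.
func taskCPUReserved(containerCPUUnits []float64, taskLimitVCPU *float64) float64 {
	var sum float64
	for _, units := range containerCPUUnits {
		sum += units
	}
	reserved := sum / cpusInVCPU
	if taskLimitVCPU != nil {
		reserved = *taskLimitVCPU
	}
	return reserved
}

func main() {
	// Two containers reserving 256 and 768 CPU units, no task-level limit.
	fmt.Println(taskCPUReserved([]float64{256, 768}, nil))

	// A task-level limit of 0.5 vCPU overrides the summed value.
	limit := 0.5
	fmt.Println(taskCPUReserved([]float64{256, 768}, &limit))
}
```

The override runs after the division, matching the order in the function above: the summed value is a fallback that applies only when the task defines no CPU limit.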

func (acc *metricDataAccumulator) accumulate(
	r *resourcepb.Resource,
	m ...[]*metricspb.Metric,
) {
	var resourceMetrics []*metricspb.Metric
	for _, metrics := range m {
		for _, metric := range metrics {
			if metric != nil {
				resourceMetrics = append(resourceMetrics, metric)
			}
		}
	}

	acc.md = append(acc.md, &consumerdata.MetricsData{
		Metrics:  resourceMetrics,
		Resource: r,
	})
}
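
The behaviour of accumulate can be seen in isolation with a small sketch. The metric, resource, and metricsData types below are simplified stand-ins for metricspb.Metric, resourcepb.Resource, and consumerdata.MetricsData, introduced here only so the example runs without the collector dependencies. It shows that several metric slices are flattened into one entry per resource and that nil metrics are dropped.

```go
package main

import "fmt"

// Stand-in types (assumptions for illustration, not the real protobuf types).
type metric struct{ name string }
type resource struct{ labels map[string]string }

type metricsData struct {
	resource *resource
	metrics  []*metric
}

type accumulator struct{ md []*metricsData }

// accumulate flattens the variadic metric slices, skips nil entries, and
// appends a single metricsData entry for the given resource.
func (acc *accumulator) accumulate(r *resource, m ...[]*metric) {
	var flat []*metric
	for _, group := range m {
		for _, mt := range group {
			if mt != nil {
				flat = append(flat, mt)
			}
		}
	}
	acc.md = append(acc.md, &metricsData{resource: r, metrics: flat})
}

func main() {
	acc := &accumulator{}
	cpu := []*metric{{name: "cpu.usage.total"}, nil} // the nil entry is skipped
	mem := []*metric{{name: "memory.usage"}}
	acc.accumulate(&resource{}, cpu, mem)
	fmt.Println(len(acc.md), len(acc.md[0].metrics))
}
```

Each call adds exactly one entry to the accumulator, so in getMetricsData the result holds one entry per container plus one for the task.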
@@ -0,0 +1,111 @@
// Copyright 2020, OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package awsecscontainermetrics

import (
	"testing"

	"github.com/stretchr/testify/require"
	"go.opentelemetry.io/collector/consumer/consumerdata"
)

func TestGetMetricsData(t *testing.T) {
	v := uint64(1)
	f := float64(1.0)

	memStats := make(map[string]uint64)
	memStats["cache"] = v

	mem := MemoryStats{
		Usage:          &v,
		MaxUsage:       &v,
		Limit:          &v,
		MemoryReserved: &v,
		MemoryUtilized: &v,
		Stats:          memStats,
	}

	disk := DiskStats{
		IoServiceBytesRecursives: []IoServiceBytesRecursive{
			{Op: "Read", Value: &v},
			{Op: "Write", Value: &v},
			{Op: "Total", Value: &v},
		},
	}

	net := make(map[string]NetworkStats)
	net["eth0"] = NetworkStats{
		RxBytes:   &v,
		RxPackets: &v,
		RxErrors:  &v,
		RxDropped: &v,
		TxBytes:   &v,
		TxPackets: &v,
		TxErrors:  &v,
		TxDropped: &v,
	}

	netRate := NetworkRateStats{
		RxBytesPerSecond: &f,
		TxBytesPerSecond: &f,
	}

	percpu := []*uint64{&v, &v}
	cpuUsage := CPUUsage{
		TotalUsage:        &v,
		UsageInKernelmode: &v,
		UsageInUserMode:   &v,
		PerCPUUsage:       percpu,
	}

	cpuStats := CPUStats{
		CPUUsage:       cpuUsage,
		OnlineCpus:     &v,
		SystemCPUUsage: &v,
		CPUUtilized:    &v,
		CPUReserved:    &v,
	}
	containerStats := ContainerStats{
		Name:        "test",
		ID:          "001",
		Memory:      mem,
		Disk:        disk,
		Network:     net,
		NetworkRate: netRate,
		CPU:         cpuStats,
	}

	tm := TaskMetadata{
		Cluster:  "cluster-1",
		TaskARN:  "arn:aws:some-value/001",
		Family:   "task-def-family-1",
		Revision: "task-def-version",
		Containers: []ContainerMetadata{
			{ContainerName: "container-1", DockerID: "001", DockerName: "docker-container-1", Limits: Limit{CPU: &f, Memory: &v}},
		},
		Limits: Limit{CPU: &f, Memory: &v},
	}

	cstats := make(map[string]ContainerStats)
	cstats["001"] = containerStats

	var mds []*consumerdata.MetricsData
	acc := metricDataAccumulator{
		md: mds,
	}

	acc.getMetricsData(cstats, tm)
	require.Less(t, 0, len(acc.md))
}
@@ -23,6 +23,47 @@ const (
	AttributeECSTaskRevesion = "ecs.task-definition-version"
	AttributeECSServiceName  = "ecs.service"

	ContainerMetricsLabelLen = 3
	TaskMetricsLabelLen      = 6
	CPUsInVCpu               = 1024
	BytesInMiB               = 1024 * 1024

	TaskPrefix         = "ecs.task."
	ContainerPrefix    = "container."
	MetricResourceType = "aoc.ecs"

	AttributeMemoryUsage    = "memory.usage"
	AttributeMemoryMaxUsage = "memory.usage.max"
	AttributeMemoryLimit    = "memory.usage.limit"
	AttributeMemoryReserved = "memory.reserved"
	AttributeMemoryUtilized = "memory.utilized"

	AttributeCPUTotalUsage      = "cpu.usage.total"
	AttributeCPUKernelModeUsage = "cpu.usage.kernelmode"
	AttributeCPUUserModeUsage   = "cpu.usage.usermode"
	AttributeCPUSystemUsage     = "cpu.usage.system"
	AttributeCPUCores           = "cpu.cores"
	AttributeCPUOnlines         = "cpu.onlines"
	AttributeCPUReserved        = "cpu.reserved"
	AttributeCPUUtilized        = "cpu.utilized"

	AttributeNetworkRateRx = "network.rate.rx"
	AttributeNetworkRateTx = "network.rate.tx"

	AttributeNetworkRxBytes   = "network.io.usage.rx_bytes"
	AttributeNetworkRxPackets = "network.io.usage.rx_packets"
	AttributeNetworkRxErrors  = "network.io.usage.rx_errors"
	AttributeNetworkRxDropped = "network.io.usage.rx_dropped"
	AttributeNetworkTxBytes   = "network.io.usage.tx_bytes"
	AttributeNetworkTxPackets = "network.io.usage.tx_packets"
	AttributeNetworkTxErrors  = "network.io.usage.tx_errors"
	AttributeNetworkTxDropped = "network.io.usage.tx_dropped"

	AttributeStorageRead  = "storage.read_bytes"
	AttributeStorageWrite = "storage.write_bytes"

	UnitBytes       = "Bytes"
	UnitMegaBytes   = "MB"
	UnitNanoSecond  = "NS"
	UnitBytesPerSec = "Bytes/Sec"
	UnitCount       = "Count"
	UnitVCpu        = "vCPU"
)
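
As a rough illustration of how these constants might combine, the sketch below builds metric names from a prefix and an attribute and converts a byte count to MiB. This is an assumption: convertToOCMetrics is not shown in this diff, so the plain concatenation and the integer division are guesses at the intended use. The helpers metricName and bytesToMiB are hypothetical names.

```go
package main

import "fmt"

// Local copies of a few constants from the diff, lower-cased for the example.
const (
	taskPrefix           = "ecs.task."
	containerPrefix      = "container."
	attributeMemoryUsage = "memory.usage"
	bytesInMiB           = 1024 * 1024
)

// metricName is a hypothetical helper: it assumes the full metric name is
// simply the prefix followed by the attribute name.
func metricName(prefix, attribute string) string {
	return prefix + attribute
}

// bytesToMiB is a hypothetical helper that converts bytes to whole MiB
// using integer division (any remainder is truncated).
func bytesToMiB(b uint64) uint64 {
	return b / bytesInMiB
}

func main() {
	fmt.Println(metricName(taskPrefix, attributeMemoryUsage))
	fmt.Println(metricName(containerPrefix, attributeMemoryUsage))
	fmt.Println(bytesToMiB(512 * 1024 * 1024))
}
```

Under that assumption, the same attribute yields distinct task-level and container-level metric names, which would let one conversion routine serve both calls in getMetricsData.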

This file was deleted.
