Traces & Metrics
KubeSense OpenTelemetry Collector
This guide covers collecting traces and metrics from your ECS Serverless (Fargate) applications using an OpenTelemetry Collector sidecar.
Installation
Step 1: Store Collector Configuration in Parameter Store
Store the following configuration in AWS Systems Manager Parameter Store under the name /ecs/kubesense/otelcol-sidecar.yaml:
extensions:
  health_check:
receivers:
  # ECS task / container metrics
  awsecscontainermetrics:
    collection_interval: 30s
  # App → Collector
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  # Prevent OOM in Fargate
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
  # Reduce payload size
  batch:
    timeout: 10s
    send_batch_size: 1024
  # Keep only useful ECS metrics
  filter:
    metrics:
      include:
        match_type: strict
        metric_names:
          - ecs.task.cpu.reserved
          - ecs.task.cpu.utilized
          - ecs.task.memory.reserved
          - ecs.task.memory.utilized
          - ecs.task.network.rate.rx
          - ecs.task.network.rate.tx
          - ecs.task.storage.read_bytes
          - ecs.task.storage.write_bytes
          - container.duration
  # Add your platform labels
  resource:
    attributes:
      - key: kubesense.cluster
        value: <YOUR_CLUSTER_NAME>
        action: insert
      - key: kubesense.env_type
        value: <YOUR_ENV_TYPE>
        action: insert
exporters:
  # OTLP over HTTP
  otlphttp/kubesense-traces:
    endpoint: http://<KUBESENSE_ENDPOINT>:33443
    tls:
      insecure: true
    timeout: 30s
  # Metrics → VictoriaMetrics
  prometheusremotewrite:
    endpoint: http://<KUBESENSE_ENDPOINT>:30060/api/v1/write
    timeout: 30s
    resource_to_telemetry_conversion:
      enabled: true
    send_metadata: true
service:
  extensions: [health_check]
  pipelines:
    # Traces
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [otlphttp/kubesense-traces]
    # App metrics (if any)
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [prometheusremotewrite]
    # ECS task/container metrics
    metrics/aws:
      receivers: [awsecscontainermetrics]
      processors: [memory_limiter, filter, resource, batch]
      exporters: [prometheusremotewrite]
Placeholder Values
Replace the following placeholders in the configuration:
- <KUBESENSE_ENDPOINT> - KubeSense ingestion endpoint hostname (provided by KubeSense platform)
- <YOUR_CLUSTER_NAME> - Your ECS cluster identifier (provided by KubeSense platform)
- <YOUR_ENV_TYPE> - Environment designation like production or staging (provided by KubeSense platform)
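Before uploading the configuration, it can help to confirm that every placeholder was replaced and that the result fits in a Parameter Store value (4 KB for the Standard tier, 8 KB for Advanced). The following is a minimal Python sketch of that check, not part of the KubeSense tooling: the function names, the sample hostname, and the sample cluster name are hypothetical, and measuring size in UTF-8 bytes is an assumption.

```python
import re

# Matches tokens such as <YOUR_CLUSTER_NAME>; the token names come from the guide.
PLACEHOLDER = re.compile(r"<[A-Z][A-Z0-9_]*>")

# Parameter Store value size limits (Standard: 4 KB, Advanced: 8 KB).
STANDARD_LIMIT = 4 * 1024
ADVANCED_LIMIT = 8 * 1024


def render_config(template: str, values: dict) -> str:
    """Replace each <NAME> token with values["NAME"]; fail if any token is left."""
    rendered = template
    for name, value in values.items():
        rendered = rendered.replace(f"<{name}>", value)
    leftover = sorted(set(PLACEHOLDER.findall(rendered)))
    if leftover:
        raise ValueError("unresolved placeholders: " + ", ".join(leftover))
    return rendered


def parameter_tier(config: str) -> str:
    """Return the smallest Parameter Store tier whose size limit fits the config."""
    size = len(config.encode("utf-8"))  # assumption: limit measured in bytes
    if size <= STANDARD_LIMIT:
        return "Standard"
    if size <= ADVANCED_LIMIT:
        return "Advanced"
    raise ValueError(f"config is {size} bytes, above the 8 KB Advanced limit")


# A fragment of the collector configuration, used here only as sample input.
SNIPPET = """\
resource:
  attributes:
    - key: kubesense.cluster
      value: <YOUR_CLUSTER_NAME>
      action: insert
exporters:
  otlphttp/kubesense-traces:
    endpoint: http://<KUBESENSE_ENDPOINT>:33443
"""

# Made-up values for illustration; use the ones provided by the KubeSense platform.
VALUES = {
    "YOUR_CLUSTER_NAME": "demo-cluster",
    "KUBESENSE_ENDPOINT": "kubesense.example.com",
}

rendered = render_config(SNIPPET, VALUES)
print(rendered)
print("tier:", parameter_tier(rendered))
```

Failing loudly on a leftover token is a deliberate choice here: a literal `<KUBESENSE_ENDPOINT>` in an exporter URL would otherwise only surface later as an export error in the collector logs.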
Configuration
Step 2: Add Collector Container to Task Definition
In your ECS task definition, add the OpenTelemetry Collector as a sidecar container:
{
  "name": "otel-collector",
  "image": "otel/opentelemetry-collector-contrib:0.142.0",
  "cpu": 256,
  "memory": 512,
  "essential": true,
  "command": [
    "--config=env:OTEL_CONFIG"
  ],
  "secrets": [
    {
      "name": "OTEL_CONFIG",
      "valueFrom": "arn:aws:ssm:<AWS_REGION>:<ACCOUNT_ID>:parameter/ecs/kubesense/otelcol-sidecar.yaml"
    }
  ],
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "/ecs/otel-collector",
      "awslogs-create-group": "true",
      "awslogs-region": "<AWS_REGION>",
      "awslogs-stream-prefix": "ecs"
    }
  },
  "portMappings": [
    {
      "containerPort": 4317,
      "protocol": "tcp"
    },
    {
      "containerPort": 4318,
      "protocol": "tcp"
    }
  ]
}
Placeholder Values:
- <AWS_REGION> - Your AWS region
- <ACCOUNT_ID> - Your AWS account ID
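One relationship that may be worth checking is how the memory_limiter values from Step 1 (limit_mib: 512, spike_limit_mib: 128) compare with the container's hard memory limit ("memory": 512). The sketch below does that arithmetic in Python. The helper names are hypothetical, and the 50 MiB overhead is an assumption based on the ballpark figure in the memory_limiter documentation, not a measured value for this workload.

```python
# Assumption: the process uses roughly 50 MiB beyond its heap; tune this for
# your own workload rather than treating it as a fixed number.
ASSUMED_OVERHEAD_MIB = 50


def soft_limit_mib(limit_mib: int, spike_limit_mib: int) -> int:
    """Heap size at which memory_limiter starts refusing incoming data."""
    return limit_mib - spike_limit_mib


def headroom_mib(container_memory_mib: int, heap_mib: int,
                 overhead_mib: int = ASSUMED_OVERHEAD_MIB) -> int:
    """MiB left under the container limit at a given heap size (negative: none)."""
    return container_memory_mib - (heap_mib + overhead_mib)


# Values taken from the guide's collector config and container definition.
CONTAINER_MEMORY_MIB = 512
LIMIT_MIB = 512
SPIKE_LIMIT_MIB = 128

soft = soft_limit_mib(LIMIT_MIB, SPIKE_LIMIT_MIB)
print("soft limit (MiB):", soft)
print("headroom at soft limit (MiB):", headroom_mib(CONTAINER_MEMORY_MIB, soft))
print("headroom at hard limit (MiB):", headroom_mib(CONTAINER_MEMORY_MIB, LIMIT_MIB))
```

Under these assumptions the soft limit leaves some room, but the hard limit sits at the container's own ceiling, so ECS could stop the container before the limiter's hard limit is reached. If that matters for your workload, one option is to raise the container memory or lower limit_mib so both limits stay below it.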
Step 3: Configure Application Container
Update your application container to send traces and metrics to the collector:
{
  "name": "your-application",
  "image": "your-image:latest",
  "essential": true,
  "dependsOn": [
    {
      "containerName": "otel-collector",
      "condition": "START"
    }
  ],
  "environment": [
    {
      "name": "OTEL_SERVICE_NAME",
      "value": "<SERVICE_NAME>"
    },
    {
      "name": "OTEL_EXPORTER_OTLP_PROTOCOL",
      "value": "http/protobuf"
    },
    {
      "name": "OTEL_EXPORTER_OTLP_ENDPOINT",
      "value": "http://localhost:4318"
    }
  ],
  "portMappings": [
    {
      "containerPort": 3000,
      "protocol": "tcp"
    }
  ]
}
Placeholder Values:
- <SERVICE_NAME> - Your application service name
Key Points:
- dependsOn with condition START ensures the collector container is started before your application
- Application sends telemetry to http://localhost:4318 (OTLP HTTP endpoint); containers in the same Fargate task share a network namespace, so localhost reaches the sidecar
- Update OTEL_SERVICE_NAME with your service identifier
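With the http/protobuf protocol, OpenTelemetry SDKs treat OTEL_EXPORTER_OTLP_ENDPOINT as a base URL and append a per-signal path to it. The Python sketch below illustrates that rule from the OTLP exporter specification so you can see which URLs the collector's OTLP HTTP receiver should expect. The function name is hypothetical, and real SDKs perform this step internally.

```python
# Per-signal paths that SDKs append to the base OTLP/HTTP endpoint.
SIGNAL_PATHS = {
    "traces": "v1/traces",
    "metrics": "v1/metrics",
    "logs": "v1/logs",
}


def signal_url(base_endpoint: str, signal: str) -> str:
    """Build the URL an SDK posts to for one signal, given the base endpoint."""
    base = base_endpoint if base_endpoint.endswith("/") else base_endpoint + "/"
    return base + SIGNAL_PATHS[signal]


# The base endpoint configured for the application container in this guide.
BASE = "http://localhost:4318"

for name in ("traces", "metrics"):
    print(name, "->", signal_url(BASE, name))
```

If requests to these URLs fail from inside the application container, the usual suspects are a wrong port (4317 is gRPC, 4318 is HTTP) or a protocol setting that does not match the endpoint.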
Step 4: Update IAM Task Execution Role
Your ECS Task Execution Role needs permission to read from SSM Parameter Store and write to CloudWatch Logs.
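Both permissions can be scoped to two resource ARNs built from your region and account ID. The following Python sketch generates such a policy document. The function names are hypothetical and the account ID is AWS's documentation example. It uses ssm:GetParameters because, per the ECS documentation, that is the action ECS calls when a task definition references a Parameter Store value under secrets.

```python
import json

# Names taken from the guide.
PARAMETER_NAME = "/ecs/kubesense/otelcol-sidecar.yaml"
LOG_GROUP = "/ecs/otel-collector"


def ssm_parameter_arn(region: str, account_id: str, name: str = PARAMETER_NAME) -> str:
    """ARN of a parameter; the leading "/" of the name is not doubled in the ARN."""
    return f"arn:aws:ssm:{region}:{account_id}:parameter/{name.lstrip('/')}"


def log_group_arn(region: str, account_id: str, group: str = LOG_GROUP) -> str:
    """ARN pattern covering the log group's streams, as used in the guide."""
    return f"arn:aws:logs:{region}:{account_id}:log-group:{group}:*"


def execution_role_policy(region: str, account_id: str) -> dict:
    """Minimal policy: read one parameter, write to one log group."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ssm:GetParameters",
                "Resource": ssm_parameter_arn(region, account_id),
            },
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": log_group_arn(region, account_id),
            },
        ],
    }


# Example region and AWS's placeholder account ID; substitute your own values.
policy = execution_role_policy("us-east-1", "123456789012")
print(json.dumps(policy, indent=2))
```

Generating the document from one place keeps the parameter ARN in the policy identical to the valueFrom ARN in the task definition, which is an easy pair to let drift apart.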
Option 1: Attach Managed Policies
Attach the following AWS managed policies to your task execution role:
- AmazonSSMReadOnlyAccess
- CloudWatchLogsFullAccess
Option 2: Add Inline Policy
Alternatively, add an inline policy that allows specific actions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:GetParameters",
      "Resource": "arn:aws:ssm:<AWS_REGION>:<ACCOUNT_ID>:parameter/ecs/kubesense/otelcol-sidecar.yaml"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:<AWS_REGION>:<ACCOUNT_ID>:log-group:/ecs/otel-collector:*"
    }
  ]
}
Step 5: Update ECS Task Role
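The ECS Task Role is separate from the execution role: it is the role that code running inside your containers assumes. Grant it access to SSM Parameter Store and CloudWatch Logs only if your application or the sidecar calls those services at runtime.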
Option 1: Attach Managed Policies
Attach the same managed policies as above:
- AmazonSSMReadOnlyAccess
- CloudWatchLogsFullAccess
Option 2: Use Minimal Inline Policy
For tighter security, use a minimal inline policy for just the required resources:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:GetParameter",
      "Resource": "arn:aws:ssm:<AWS_REGION>:<ACCOUNT_ID>:parameter/ecs/kubesense/otelcol-sidecar.yaml"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:<AWS_REGION>:<ACCOUNT_ID>:log-group:/ecs/otel-collector:*"
    }
  ]
}
Step 6: Deploy Task Definition
Deploy your updated ECS task definition as follows:
- Update your ECS service with the modified task definition
- Restart the tasks to apply changes
- Monitor CloudWatch Logs at /ecs/otel-collector to verify the collector is running and receiving telemetry
Verify Setup
Check CloudWatch Logs for the collector container (/ecs/otel-collector) to confirm:
- Collector starts successfully
- Receives traces and metrics from your application
- Successfully exports to KubeSense
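The checks above can be partly automated once you have the log lines in hand (for example, exported from the /ecs/otel-collector log group). The Python sketch below assumes the collector's default tab-separated console log format and its usual startup message; both may differ across collector versions or logging settings. The function name is hypothetical, and the sample lines are illustrative stand-ins, not real collector output.

```python
# Startup message the collector normally logs once all pipelines are running.
READY_MARKER = "Everything is ready. Begin running and processing data."


def summarize(lines):
    """Report whether the collector became ready and collect error-level lines."""
    ready = any(READY_MARKER in line for line in lines)
    errors = []
    for line in lines:
        fields = line.split("\t")
        # Assumed console layout: timestamp <TAB> level <TAB> rest of the line.
        if len(fields) > 1 and fields[1].strip().lower() == "error":
            errors.append(line)
    return {"ready": ready, "errors": errors}


# Illustrative lines shaped like console output; replace with your real logs.
SAMPLE = [
    "2025-01-01T00:00:00.000Z\tinfo\t" + READY_MARKER,
    "2025-01-01T00:00:05.000Z\terror\tillustrative export failure line",
]

summary = summarize(SAMPLE)
print("collector ready:", summary["ready"])
print("error lines:", len(summary["errors"]))
```

A ready marker with no error lines suggests the collector started and its exporters are not failing. It does not by itself prove that your application is sending data, so checking the KubeSense side for incoming traces and metrics is still worthwhile.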
Troubleshooting Installation
Common Issues
Task Not Starting:
- Check ECS cluster has available capacity
- Verify the container image can be pulled from the registry
- Review CloudWatch logs for the failed tasks
Parameter Store Access Issues:
- Ensure the task execution role allows ssm:GetParameters on the parameter (ECS uses this action to resolve secrets entries in a task definition)
- Verify the parameter name matches exactly: /ecs/kubesense/otelcol-sidecar.yaml
- Check the parameter is in the same region as your ECS cluster
Container Health Check Failures:
- Verify the health check endpoint is accessible
- Check container logs for any startup errors
- Ensure proper port mappings are configured (4317 and 4318)
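To check the health endpoint from inside the task, note that the health_check extension enabled in Step 1 serves HTTP on port 13133 by default, and that other containers in the same task can reach it over localhost. The following is a minimal Python sketch of such a probe using only the standard library. The function name is hypothetical, and the URL assumes the extension's default port and path, since the guide's configuration does not override them.

```python
import urllib.request


def collector_healthy(url: str = "http://localhost:13133/", timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and non-2xx responses
        # (urllib's URLError and HTTPError are both OSError subclasses).
        return False


if __name__ == "__main__":
    print("collector healthy:", collector_healthy())
```

A False result with the collector otherwise running usually points at the extension not being listed under service.extensions or at a non-default endpoint setting. The collector's startup logs show which address the extension bound to.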