KubeSense

S3 Cold Storage Integration

Overview

KubeSense supports configuring retention periods for traces and metrics, with the ability to move data to cold storage (S3) after a specified duration. In AWS EKS environments, you can integrate S3 cold storage in two ways:

  1. Using an AWS IAM role with a Kubernetes service account (recommended)
  2. Using AWS access and secret keys

info: S3 integration enables efficient long-term storage of historical observability data while keeping your active storage optimized.

Prerequisites

Before setting up S3 integration, ensure you have:

  • KubeSense deployed on AWS EKS
  • An AWS S3 bucket created for cold storage
  • Appropriate AWS IAM permissions for S3 access
  • Access to modify the KubeSense Helm values

Method 1: Using AWS IAM Role with Service Account (Recommended)

Using AWS IAM roles with Kubernetes service accounts is the recommended approach: it removes the need to manage long-lived credentials and relies on IAM for access control.

Step 1: Get EKS OIDC Provider URL

  1. Go to your EKS cluster in AWS Console
  2. Copy the OpenID Connect (OIDC) provider URL (found in cluster details)

warning: If you don't see an OIDC provider, you'll need to create one. This is required for IAM role-based authentication.

Step 2: Create IAM OIDC Provider (if not exists)

If there's no IAM OIDC provider for your cluster:

  1. Go to IAM → Identity providers → Add provider
  2. Choose OpenID Connect as provider type
  3. Enter the EKS OIDC URL into Provider URL
  4. Click Get Thumbprint
  5. Enter sts.amazonaws.com as Audience
  6. Click Add provider
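
Steps 1 and 2 can also be done from a terminal. The sketch below assumes the AWS CLI and eksctl are installed and authenticated; the function names and the cluster name my-cluster are placeholders for illustration, not part of KubeSense.

```shell
# Print the cluster's OIDC issuer URL (Step 1). Requires the AWS CLI.
# $1 = cluster name, $2 = region
get_oidc_issuer() {
  aws eks describe-cluster --name "$1" --region "$2" \
    --query "cluster.identity.oidc.issuer" --output text
}

# Strip the scheme from an issuer URL. IAM refers to the provider
# by this host/path form in ARNs and trust policy conditions.
oidc_provider_path() {
  printf '%s\n' "${1#https://}"
}

# Associate an IAM OIDC provider with the cluster (Step 2). Requires eksctl.
# $1 = cluster name, $2 = region
ensure_oidc_provider() {
  eksctl utils associate-iam-oidc-provider --cluster "$1" --region "$2" --approve
}

# Example (uncomment to run against a real cluster):
# issuer=$(get_oidc_issuer my-cluster us-east-2)
# ensure_oidc_provider my-cluster us-east-2
# oidc_provider_path "$issuer"
```

The host/path form printed by oidc_provider_path is what the role's trust policy refers to in Step 5.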

Step 3: Create S3 Bucket

Create an S3 bucket for KubeSense cold storage:

  1. Go to S3 Console → Create bucket
  2. Choose a bucket name (e.g., kubesense-cold-storage)
  3. Select your preferred region
  4. Configure bucket settings and permissions as needed
  5. Create the bucket
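
As a command-line alternative to the console steps, the bucket can be created with the AWS CLI. This is a minimal sketch assuming the CLI is configured with permission to create buckets; the function name is illustrative. The branch exists because us-east-1 is the default location and is not accepted as a LocationConstraint.

```shell
# Create an S3 bucket in the given region. Requires the AWS CLI.
# $1 = bucket name, $2 = region
create_bucket() {
  if [ "$2" = "us-east-1" ]; then
    # us-east-1 must be created without a LocationConstraint.
    aws s3api create-bucket --bucket "$1" --region "$2"
  else
    aws s3api create-bucket --bucket "$1" --region "$2" \
      --create-bucket-configuration "LocationConstraint=$2"
  fi
}

# Example:
# create_bucket kubesense-cold-storage us-east-2
```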

Step 4: Create IAM Policy for S3 Access

Create an IAM policy that allows access to your S3 bucket. Replace kubesense-cold-storage with your actual bucket name:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::kubesense-cold-storage",
        "arn:aws:s3:::kubesense-cold-storage/*"
      ]
    }
  ]
}
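
If you prefer the CLI to the console, the policy document above can be saved to a file and registered in IAM. The sketch assumes the JSON was saved as s3-policy.json; the policy name kubesense-s3-policy and the function name are placeholders.

```shell
# Register a policy document in IAM and print only the new policy's ARN.
# Requires the AWS CLI.
# $1 = policy name, $2 = path to the policy JSON file
create_s3_policy() {
  aws iam create-policy --policy-name "$1" \
    --policy-document "file://$2" \
    --query Policy.Arn --output text
}

# Example:
# policy_arn=$(create_s3_policy kubesense-s3-policy s3-policy.json)
```

Keeping the printed ARN in a variable is convenient because the role in Step 5 needs it when the policy is attached.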

Step 5: Create IAM Role for KubeSense

  1. Go to IAM → Roles → Create role
  2. Select Web identity as trusted entity type
  3. Choose the OIDC provider you created earlier
  4. Select sts.amazonaws.com as audience
  5. Attach the S3 policy created in Step 4
  6. Name the role (e.g., kubesense-s3-role)
  7. Create the role
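
A role created this way may trust any service account in the cluster that presents the sts.amazonaws.com audience. A common hardening step is to add a sub condition to the trust policy so that only one service account can assume the role. The sketch below renders such a trust policy and creates the role with the AWS CLI. It assumes the kubesense namespace and clickhouse-instance service account used elsewhere in this guide; the function names and the file name trust-policy.json are illustrative.

```shell
# Print a trust policy that lets a single service account assume the role.
# $1 = AWS account ID
# $2 = OIDC provider host/path (the issuer URL without "https://")
# $3 = namespace, $4 = service account name
render_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Federated": "arn:aws:iam::$1:oidc-provider/$2" },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "$2:aud": "sts.amazonaws.com",
          "$2:sub": "system:serviceaccount:$3:$4"
        }
      }
    }
  ]
}
EOF
}

# Create the role, then attach the Step 4 policy. Requires the AWS CLI.
# $1 = role name, $2 = trust policy file, $3 = policy ARN
create_role_with_policy() {
  aws iam create-role --role-name "$1" --assume-role-policy-document "file://$2"
  aws iam attach-role-policy --role-name "$1" --policy-arn "$3"
}

# Example (all values are placeholders):
# render_trust_policy 111122223333 oidc.eks.us-east-2.amazonaws.com/id/EXAMPLE \
#   kubesense clickhouse-instance > trust-policy.json
# create_role_with_policy kubesense-s3-role trust-policy.json "$policy_arn"
```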

Step 6: Update KubeSense Configuration

Add the cold storage configuration to your KubeSense Helm values:

coldStorageConfig:
  enabled: true
  endpoint: https://kubesense-cold-storage.s3.us-east-2.amazonaws.com/data/
  enableServiceAccountAuth: true
  cloudProvider: "aws"

note: Replace kubesense-cold-storage and us-east-2 with your actual bucket name and region. The endpoint format should be: https://<bucket-name>.s3.<region>.amazonaws.com/<folder>/
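
To avoid typos when assembling the endpoint, the format in the note can be generated from its three parts. This is a small illustrative helper, not a KubeSense tool.

```shell
# Build the cold storage endpoint URL.
# $1 = bucket name, $2 = region, $3 = folder (no leading or trailing slash)
build_endpoint() {
  printf 'https://%s.s3.%s.amazonaws.com/%s/\n' "$1" "$2" "$3"
}

build_endpoint kubesense-cold-storage us-east-2 data
# prints https://kubesense-cold-storage.s3.us-east-2.amazonaws.com/data/
```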

Step 7: Annotate KubeSense Service Account

You need to associate the IAM role with KubeSense's ClickHouse service account. Create or update the service account with the IAM role annotation:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: clickhouse-instance
  namespace: kubesense
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::YOUR_ACCOUNT_ID:role/kubesense-s3-role

Replace YOUR_ACCOUNT_ID with your actual AWS account ID.
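
If the service account already exists, the annotation can also be applied in place with kubectl instead of editing a manifest. The sketch assumes the kubesense namespace and the kubesense-s3-role role name used in this guide; the function names are illustrative. If the Helm chart manages this service account, a later upgrade might overwrite a manually added annotation, so check how your chart handles it.

```shell
# Build a role ARN from its parts.
# $1 = AWS account ID, $2 = role name
role_arn() {
  printf 'arn:aws:iam::%s:role/%s\n' "$1" "$2"
}

# Annotate an existing service account with the role ARN. Requires kubectl.
# $1 = namespace, $2 = service account, $3 = role ARN
annotate_service_account() {
  kubectl annotate serviceaccount "$2" -n "$1" \
    "eks.amazonaws.com/role-arn=$3" --overwrite
}

# Example (looks up the account ID with the AWS CLI):
# account_id=$(aws sts get-caller-identity --query Account --output text)
# annotate_service_account kubesense clickhouse-instance \
#   "$(role_arn "$account_id" kubesense-s3-role)"
```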

Step 8: Upgrade KubeSense

Apply the configuration by upgrading your KubeSense Helm deployment:

helm upgrade kubesense ./kubesense-chart \
  -f values.yaml \
  --namespace kubesense

Method 2: Using AWS Access and Secret Keys

For environments where IAM roles aren't available or preferred, you can use AWS access keys.

Step 1: Create IAM User with S3 Access

  1. Go to IAM → Users → Create user
  2. Name the user (e.g., kubesense-s3-user)
  3. Attach a policy with S3 permissions (use the same policy from Method 1, Step 4)
  4. Create the user

Step 2: Generate Access Keys

  1. Click on the created user
  2. Go to Security credentials tab
  3. Click Create access key
  4. Choose Application running outside AWS as use case
  5. Download or copy the access key ID and secret access key

warning: Store these credentials securely and never commit them to version control.

Step 3: Update KubeSense Configuration

Add the cold storage configuration with access keys:

coldStorageConfig:
  enabled: true
  endpoint: https://kubesense-cold-storage.s3.us-east-2.amazonaws.com/data/
  accessKeyID: YOUR_ACCESS_KEY_ID
  secretAccessKey: YOUR_SECRET_ACCESS_KEY
  cloudProvider: "aws"

Replace:

  • YOUR_ACCESS_KEY_ID with your AWS access key ID
  • YOUR_SECRET_ACCESS_KEY with your AWS secret access key
  • kubesense-cold-storage and us-east-2 with your bucket name and region

Step 4: Upgrade KubeSense

Apply the configuration:

helm upgrade kubesense ./kubesense-chart \
  -f values.yaml \
  --namespace kubesense
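
Writing the keys into values.yaml makes it easy to commit them by accident. One alternative is to leave accessKeyID and secretAccessKey out of the file and pass them from environment variables at upgrade time. This sketch assumes the chart reads the same coldStorageConfig values shown in Step 3; the function name is illustrative.

```shell
# Upgrade KubeSense, taking the S3 keys from the environment instead of values.yaml.
# Stops with an error if either variable is unset or empty.
upgrade_with_keys() {
  : "${AWS_ACCESS_KEY_ID:?set AWS_ACCESS_KEY_ID first}"
  : "${AWS_SECRET_ACCESS_KEY:?set AWS_SECRET_ACCESS_KEY first}"
  helm upgrade kubesense ./kubesense-chart \
    -f values.yaml \
    --namespace kubesense \
    --set-string "coldStorageConfig.accessKeyID=$AWS_ACCESS_KEY_ID" \
    --set-string "coldStorageConfig.secretAccessKey=$AWS_SECRET_ACCESS_KEY"
}

# Example:
# export AWS_ACCESS_KEY_ID=...      # from Step 2
# export AWS_SECRET_ACCESS_KEY=...  # from Step 2
# upgrade_with_keys
```

Values passed with --set-string are still stored with the Helm release, so this keeps the keys out of version control but does not remove the need to restrict access to the cluster.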

Configuration Parameters

Here's a detailed breakdown of the cold storage configuration:

Parameter                 Type     Required      Description
enabled                   boolean  Yes           Enable or disable cold storage
endpoint                  string   Yes           S3 bucket endpoint URL
cloudProvider             string   Yes           Cloud provider (e.g., "aws")
enableServiceAccountAuth  boolean  No            Use IAM role authentication (Method 1)
accessKeyID               string   Conditional*  AWS access key ID
secretAccessKey           string   Conditional*  AWS secret access key

*Required only when enableServiceAccountAuth is false or omitted

Verifying Integration

After configuring S3 integration:

  1. Check KubeSense logs to ensure S3 connection is successful:

    kubectl logs -n kubesense deployment/clickhouse-instance -f
  2. Verify data is being written to S3:

    • Check your S3 bucket for backup files
    • Monitor the configured retention period
  3. Test cold storage functionality:

    • Query historical data that should be in cold storage
    • Verify data is accessible through KubeSense UI
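
The bucket check in step 2 can be done from a terminal. The sketch below converts the configured endpoint into an s3:// URI and lists what is stored under it. It assumes the AWS CLI is configured with read access to the bucket and that the endpoint follows the format described earlier; the function names are illustrative.

```shell
# Convert an endpoint URL into an s3:// URI.
# https://<bucket>.s3.<region>.amazonaws.com/<folder>/ -> s3://<bucket>/<folder>/
endpoint_to_s3_uri() {
  rest=${1#https://}       # <bucket>.s3.<region>.amazonaws.com/<folder>/
  host=${rest%%/*}         # <bucket>.s3.<region>.amazonaws.com
  path=${rest#*/}          # <folder>/
  bucket=${host%%.s3.*}    # <bucket>
  printf 's3://%s/%s\n' "$bucket" "$path"
}

# List objects under the cold storage prefix with a size summary.
# Requires the AWS CLI.
# $1 = endpoint URL from coldStorageConfig
list_cold_storage() {
  aws s3 ls "$(endpoint_to_s3_uri "$1")" --recursive --human-readable --summarize
}

# Example:
# list_cold_storage https://kubesense-cold-storage.s3.us-east-2.amazonaws.com/data/
```

An empty listing right after setup is not necessarily a failure: data moves to cold storage only after the configured retention duration has passed.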

Best Practices

  • Security: Prefer IAM roles (Method 1) over access keys for better security
  • Network: Ensure KubeSense pods have proper network access to S3 endpoints
  • Monitoring: Set up S3 bucket lifecycle policies to manage old backups
  • Testing: Test data retrieval from cold storage regularly
  • Documentation: Document your bucket names and IAM roles for your team

Troubleshooting

Issue: Cold storage not working

Solution: Check service account annotations and IAM role ARN format:

kubectl describe serviceaccount clickhouse-instance -n kubesense
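
To check the ARN format without reading the describe output by eye, the annotation can be extracted and matched against the expected pattern. This is a sketch: the pattern covers only the standard aws partition (not aws-cn or aws-us-gov), and the function names are illustrative.

```shell
# Succeed when the argument looks like an IAM role ARN in the "aws" partition:
# arn:aws:iam::<12-digit account ID>:role/<role name or path>
is_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws:iam::[0-9]{12}:role/[A-Za-z0-9+=,.@_/-]+$'
}

# Print the role ARN annotation of a service account. Requires kubectl.
# $1 = namespace, $2 = service account
get_role_annotation() {
  kubectl get serviceaccount "$2" -n "$1" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Example:
# arn=$(get_role_annotation kubesense clickhouse-instance)
# if is_role_arn "$arn"; then echo "ARN format OK"; else echo "ARN missing or malformed: '$arn'"; fi
```

A leftover placeholder such as YOUR_ACCOUNT_ID fails this check because the account ID must be 12 digits.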

Issue: Permission denied errors

Solution: Verify the IAM policy includes all required S3 permissions and the role is correctly attached to the service account.
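
With Method 1, one way to confirm the role actually reached the pod is to look for the environment variables that EKS injects for IAM roles for service accounts. These are set when a pod is created, so pods that were started before the service account was annotated need to be recreated. The sketch assumes kubectl access; the function name is illustrative and the pod name is a placeholder you supply.

```shell
# Show the IAM role variables injected into a pod (Method 1 only).
# Prints nothing and returns non-zero if they are absent. Requires kubectl.
# $1 = namespace, $2 = pod name
show_irsa_env() {
  kubectl exec -n "$1" "$2" -- env \
    | grep -E '^AWS_(ROLE_ARN|WEB_IDENTITY_TOKEN_FILE)='
}

# Example (replace <clickhouse-pod> with a pod name from "kubectl get pods -n kubesense"):
# show_irsa_env kubesense <clickhouse-pod>
```

If both variables are present and AWS_ROLE_ARN matches the role from Step 5, the remaining suspects are the IAM policy itself and the role's trust policy.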

Issue: Unable to access S3 endpoint

Solution: Verify network connectivity and ensure the S3 endpoint URL is correctly formatted.

Conclusion

S3 integration provides a cost-effective solution for storing historical observability data while maintaining performance for recent data queries. Choose the authentication method that best fits your security requirements and infrastructure setup.