Deploy A Scalable Loki Instance To Kubernetes Via Helm

Grafana Loki Architecture, image credit - geekflare

Loki is a log aggregation system designed to scale horizontally. It was inspired by Prometheus, but it focuses on logs rather than metrics, and logs are pushed to it rather than pulled. Loki indexes only the metadata of the logs, which becomes the set of labels identifying each log stream.
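To make the label model concrete, here is a minimal sketch of the JSON body that Loki's push endpoint (/loki/api/v1/push) accepts: one stream, identified by its labels, carrying one log line. The function name loki_push_payload, the job label and the sample values are assumptions for illustration only.

```shell
# Build a push payload: one stream, identified by its labels, with one log line.
# Arguments: <job label value> <timestamp in Unix nanoseconds> <log line>
# Note: this sketch does not JSON-escape its arguments, so keep them free of quotes.
loki_push_payload() {
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' "$1" "$2" "$3"
}

loki_push_payload "demo" "1700000000000000000" "hello loki"
# prints: {"streams":[{"stream":{"job":"demo"},"values":[["1700000000000000000","hello loki"]]}]}
```

Once a Loki instance is reachable, a payload like this could be sent with curl as a POST request with a Content-Type: application/json header.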

In this article, I will show how to deploy a Loki instance to a Kubernetes (EKS) cluster with the Helm package manager, using S3 as storage.

Requirements

  • Helm
  • Terragrunt/Terraform
  • Loki
  • Kubernetes Cluster
  • AWS S3
  • AWS IRSA

Create AWS S3 Bucket

You need to create an S3 bucket to hold all the logs. Here is a sample Terragrunt config to get this done:

...
dependency "kms" {
  config_path = "../<path to kms>/kms"
}

terraform {
  source = <path or url to terraform s3 bucket module>
}

inputs = {
  bucket = "loki-storage"
  acl    = "private"

  server_side_encryption_configuration = {
    rule = {
      apply_server_side_encryption_by_default = {
        kms_master_key_id = <kms key_arn> # if you are encrypting the S3 content, don't forget to include this key in the IRSA policy
        sse_algorithm     = "aws:kms"
      }
      bucket_key_enabled = true
    }
  }

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true

  versioning = {
    enabled = true
  }

  tags = <optional tags>
}
...
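After applying the config, a quick check along these lines can confirm that the bucket exists and that default encryption is set. This is a sketch, assuming the AWS CLI is installed and has credentials with access to the bucket; the function name check_loki_bucket is hypothetical.

```shell
# Check that the bucket exists, then report its default encryption settings.
# Argument: <bucket name>
check_loki_bucket() {
  aws s3api head-bucket --bucket "$1" &&
    aws s3api get-bucket-encryption --bucket "$1"
}

# Usage:
#   check_loki_bucket loki-storage
```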

Create AWS IRSA

IAM Roles for Service Accounts (IRSA) gives Kubernetes workloads the authentication and minimum authorization they need to access AWS resources: in this case, the S3 bucket and the KMS key (if you are encrypting your S3 bucket). We can create an IRSA role with the following config:

...
dependency "eks" {
  config_path = "<path to eks cluster terragrunt file>/eks/"
}

terraform {
  source = "<path or link to aws irsa module>/irsa///"
}

inputs = {
  cluster_id = dependency.eks.outputs.eks.cluster_id

  cluster_oidc_issuer_url = dependency.eks.outputs.eks.cluster_oidc_issuer_url

  iam_assumable_roles_with_oidc_and_policies = {
    grafana-loki = {
      role_name = "grafana-loki-${dependency.eks.outputs.eks.cluster_id}"
      
      # where sa-name is the name of the service account, e.g. grafana-loki-sa
      oidc_fully_qualified_subjects = ["system:serviceaccount:<namespace>:<sa-name>"]

      iam_policy = {
        name   = "grafana-loki-${dependency.eks.outputs.eks.cluster_id}"
        policy = file("policies/policy.json")
      }
    }
  }

  tags = <optional tags>
}
...
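The subject in oidc_fully_qualified_subjects has to match the namespace and service account name that Loki actually runs with, otherwise the role cannot be assumed. A small sketch of how that string is composed follows; the helper name irsa_subject and the logging namespace are assumptions.

```shell
# Compose the OIDC subject that the IAM role's trust policy must match.
# Arguments: <namespace> <service account name>
irsa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

irsa_subject "logging" "grafana-loki-sa"
# prints: system:serviceaccount:logging:grafana-loki-sa
```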

Don’t forget to create the policy file policy.json (referenced in the config above as policies/policy.json) to allow access to the S3 bucket and KMS key:

{
    "Version": "2012-10-17",
    "Statement" : [
      {
        "Effect": "Allow",
        "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:DeleteObject"
        ],
        "Resource": "arn:aws:s3:::loki-storage/*"
      },
      {
        "Effect": "Allow",
        "Action": "s3:ListBucket",
        "Resource": "arn:aws:s3:::loki-storage"
      },
      {
        "Effect": "Allow",
        "Action": [
            "kms:GenerateDataKey",
            "kms:Decrypt",
            "kms:Encrypt",
            "kms:ReEncryptTo",
            "kms:GenerateDataKeyWithoutPlaintext",
            "kms:DescribeKey",
            "kms:GenerateDataKeyPairWithoutPlaintext",
            "kms:GenerateDataKeyPair",
            "kms:ReEncryptFrom"
        ],
        "Resource": "arn:aws:kms:<region>:<aws-account>:key/<kms key>"
      }
    ]
}
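Before applying the IRSA config, it may be worth validating the policy document. The sketch below uses the validate-policy command of IAM Access Analyzer in the AWS CLI. The wrapper name validate_policy is hypothetical, and the call assumes credentials that are allowed to use Access Analyzer.

```shell
# Ask IAM Access Analyzer to check the policy document for errors and warnings.
# Argument: <path to policy file>
validate_policy() {
  aws accessanalyzer validate-policy \
    --policy-type IDENTITY_POLICY \
    --policy-document "file://$1"
}

# Usage:
#   validate_policy policies/policy.json
```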

Configure Loki Via Helm Chart

Depending on the volume of logs you intend to handle, use the sizing guide in the Grafana Loki documentation to generate a suitably sized values file for your Helm chart. Next, we can define our values file, values-<env>.yaml, as follows:

loki:
  auth_enabled: false 
  commonConfig:
    path_prefix: /var/loki
    replication_factor: 1
  storage_config:
    boltdb_shipper:
      active_index_directory: /var/loki/boltdb-shipper-active
      cache_location: /var/loki/boltdb-shipper-cache
      cache_ttl: 24h
      shared_store: s3
    aws:
      bucketnames: loki-storage
      region: eu-central-1
      s3forcepathstyle: false
  storage:
    bucketNames:
      chunks: loki-storage
      ruler: loki-storage
      admin: loki-storage
    type: s3
    s3:
      s3: s3://loki-storage
      endpoint: null
      region: eu-central-1
      secretAccessKey: null
      accessKeyId: null
      s3ForcePathStyle: false
      insecure: false
      http_config: {}
  schemaConfig:
    configs:
      - from: 2020-07-01
        store: boltdb-shipper
        object_store: s3
        schema: v11
        index:
          prefix: index_
          period: 24h
  compactor:
    retention_enabled: true
    working_directory: /var/loki/compactor
    compaction_interval: 10m
    shared_store: s3
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
  limits_config:
    enforce_metric_name: false
    max_cache_freshness_per_query: 10m
    reject_old_samples: true
    reject_old_samples_max_age: 168h
    split_queries_by_interval: 15m
    per_stream_rate_limit: 512M
    per_stream_rate_limit_burst: 1024M
    cardinality_limit: 200000
    ingestion_burst_size_mb: 1000
    ingestion_rate_mb: 10000
    max_entries_limit_per_query: 1000000
    max_label_value_length: 20480
    max_label_name_length: 10240
    max_label_names_per_series: 300
tableManager:
  retention_deletes_enabled: true
  retention_period: 1488h
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::<aws-account>:role/<loki-role>"
  name: grafana-loki-sa
write:
  replicas: 2
read:
  replicas: 2
monitoring:
  serviceMonitor:
    enabled: true
  selfMonitoring:
    grafanaAgent:
      installOperator: false # we have an existing installation of grafana agent
gateway:
  basicAuth:
    enabled: true
    username: admin
    password: <nice-password>
ingress:
  enabled: true
  ingressClassName: alb-class
  annotations:
    <ingress annotations here>: <>
  hosts:
  - loki.com

Important Notes:

  1. We set auth_enabled: false because we do not want to enable multi-tenant mode.
  2. AWS S3 storage config items such as endpoint, secretAccessKey and accessKeyId are set to null because Loki authenticates through the IRSA-annotated service account, so no AWS credentials are exposed in the values file.

Follow the config closely to ensure a successful install.

Refer to Artifact Hub for the Loki Helm chart that will be deployed to the Kubernetes cluster.

Deploy Loki

Using the values file defined above, run the following commands against the cluster:

helm repo add grafana https://grafana.github.io/helm-charts

helm repo update

helm install -f loki/values-<env>.yaml loki grafana/loki -n=<namespace>
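If you would rather inspect the rendered manifests first, or need to re-run the deployment later, the two wrappers below are one way to do it. This is a sketch: the function names loki_dry_run and loki_deploy are hypothetical, and the release and chart names are taken from the install command above.

```shell
# Render the chart with the values file without installing anything,
# so template or values errors surface before the real install.
# Arguments: <values file> <namespace>
loki_dry_run() {
  helm install loki grafana/loki -f "$1" -n "$2" --dry-run
}

# Install the release, or upgrade it if it already exists,
# creating the namespace when it is missing.
# Arguments: <values file> <namespace>
loki_deploy() {
  helm upgrade --install loki grafana/loki -f "$1" -n "$2" --create-namespace
}

# Usage:
#   loki_dry_run loki/values-<env>.yaml <namespace>
#   loki_deploy  loki/values-<env>.yaml <namespace>
```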

Congratulations! At this point, you should have a fully working Grafana Loki instance deployed to your Kubernetes cluster. Run the following command to view the resources created:

kubectl get pods -n=<namespace> | grep ^loki
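Since S3 access depends on the IRSA annotation, it can also help to confirm that the service account carries the expected role ARN. The sketch below assumes the service account name grafana-loki-sa from the values file; the helper name sa_role_arn is hypothetical.

```shell
# Print the IAM role ARN annotated on the service account.
# Arguments: <namespace> <service account name>
sa_role_arn() {
  kubectl get serviceaccount "$2" -n "$1" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Usage:
#   sa_role_arn <namespace> grafana-loki-sa
```

An empty result would suggest that the serviceAccount annotations in the values file were not applied.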
