K8s install application

Once the cloud infrastructure is in place and access to the Kubernetes cluster is verified, you can set up the required dependencies and the Penfield application on the cluster. There are three ways to install it:

Install using ArgoCD (recommended)
Install using Helmfile
Install using Helm

Make sure you have the following information handy before you proceed:

Name	Description
CLOUD_PROVIDER	Your cloud provider. e.g. azure or aws or custom
DOCKER_USERNAME	You should have received this username from Penfield.
DOCKER_PASSWORD	You should have received this password from Penfield.
AZURE_OPENAI_API_KEY	Your Azure OpenAI Key from your Azure OpenAI deployment.
AZURE_OPENAI_ENDPOINT	Your Azure OpenAI endpoint URL.
FQDN	Fully Qualified Domain Name that you are using for your Penfield application.
SSL_CERTIFICATE	Only required if the cloud provider is Azure. Create files named ssl.key and ssl.crt in the /certs directory, containing the SSL key and certificate for your FQDN.
RELEASE_TAG	You should have received this from Penfield. The release tag that you want to use for the Penfield application. e.g. 1.0.10
DEPLOYMENT_TIER	Deployment tier that you want to use. e.g. pathfinder or standard. `pathfinder` tier is used for `Pathfinder` only deployment and `standard` tier is used for `Pathfinder + ITSM` deployment.
INTEGRATION_SOURCE	The type of ITSM integration you are using. e.g. helix or sentinel or jira etc.

Once you have the above information, proceed with next steps.
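
As a quick pre-flight check, the values above can be validated before you run any installer. The following is a minimal sketch, assuming you export each value as an environment variable of the same name. That convention and the helper name `missing_vars` are illustrative assumptions, not part of the Penfield tooling. SSL_CERTIFICATE is left out because it refers to files rather than a single value.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: print the name of every listed variable
# that is unset or empty. Exporting the values as environment variables
# is an assumption made for this sketch.
missing_vars() {
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Names taken from the table above.
REQUIRED_VARS="CLOUD_PROVIDER DOCKER_USERNAME DOCKER_PASSWORD AZURE_OPENAI_API_KEY AZURE_OPENAI_ENDPOINT FQDN RELEASE_TAG DEPLOYMENT_TIER INTEGRATION_SOURCE"
```

Running `missing_vars $REQUIRED_VARS` prints nothing when every value is set, and otherwise lists the names that still need a value.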

Install using ArgoCD (recommended)

Clone this GitHub repo, which contains the code you need, and follow the steps below:

warning

Never commit actual secret values to version control such as Git.
The secrets folder contains the secrets; make sure you never commit this folder.

Go to the folder where you cloned the GitHub repo. Run the bash script in your terminal with the following command and follow the instructions:

./run.sh

For further information, please refer to the ArgoCD README.

Install using Helmfile

Clone this GitHub repo, which contains the code you need, and follow the steps below:

warning

Never commit actual secret values to version control such as Git.
The secrets folder contains the secrets; make sure you never commit this folder.

Go to the folder where you cloned the GitHub repo. Run the bash script in your terminal with the following command and follow the instructions:

./run.sh

Once done please come back here for the next steps.

Uninstall deployment

To delete the deployments from the Kubernetes cluster:

./run.sh -d

Grafana

Grafana is deployed as part of kube-prometheus-stack. You should be able to access Grafana at https://FQDN/grafana. Grafana sets a random password at creation. You can find the Grafana password using this command:

kubectl get secret \
--namespace monitoring grafana-secrets \
-o jsonpath="{.data.admin-password}" | base64 --decode ; echo

info

If you deployed the dependencies using Helmfile, skip the next section and go to the next page.

Install using Helm

The following services need to be installed on the Kubernetes cluster:

Nginx Ingress Controller

The Nginx Ingress Controller routes ingress traffic to the cluster. You can use Helm to install the Nginx Ingress Controller with custom values. Run the commands below:

Add helm repo on local: helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
Update helm repository: helm repo update
Find nginx ingress controller: helm search repo ingress-nginx
Install nginx ingress controller:

helm upgrade \
--install internal-ingress-nginx ingress-nginx/ingress-nginx \
--namespace internal-ingress-nginx \
--create-namespace \
--version=4.10.1 \
--values internal_ingress_nginx_values.yaml

The values file internal_ingress_nginx_values.yaml looks like this:

caution

Depending on your cloud provider, you can uncomment either NodePort Implementation (For AWS) or Internal LB Implementation (For Azure):

internal_ingress_nginx_values.yaml
controller:
  config:
    server-tokens: "false"
    use-forwarded-headers: "true"
  extraArgs:
    metrics-per-host: "false"
  ingressClassResource:
    name: internal-ingress-nginx
  kind: DaemonSet
  lifecycle:
    preStop:
      exec:
        command:
        - /bin/sh
        - -c
        - sleep 60; /sbin/nginx -c /etc/nginx/nginx.conf -s quit; while pgrep -x nginx; do sleep 1; done
  metrics:
    enabled: true

  ##### NodePort Implementation (For AWS)
  # service:
  #   enableHttps: false
  #   externalTrafficPolicy: Local
  #   type: NodePort
  #   nodePorts:
  #     http: "31080"


  ##### Internal LB Implementation (For Azure)

  # service:
  #   enabled: true
  #   annotations: {
  #     service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  #     }
  #   type: LoadBalancer

  #####


defaultBackend:
  enabled: true
  resources:
    requests:
      memory: 16Mi

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##

Deploy Kafka controller

Kafka is needed to run the PenfieldAI application. Setting up Kafka involves two parts: the Kafka controller (operator) and the Kafka cluster. You will set up the Kafka controller first and the Kafka cluster later in this section.

You can deploy Kafka with the Strimzi Helm chart. Run the following commands:

Add helm repo on local: helm repo add strimzi https://strimzi.io/charts/
Update helm repository: helm repo update
Find kafka repository: helm search repo strimzi
Create a kafka operator:
```
helm upgrade \
--install strimzi-kafka strimzi/strimzi-kafka-operator \
--namespace strimzi-kafka \
--create-namespace \
--version 0.41.0 \
--values strimzi-kafka-operator-values.yaml
```
strimzi-kafka-operator-values.yaml file looks like:
strimzi-kafka-operator-values.yaml
```
watchNamespaces: []
watchAnyNamespace: true

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
```
The Kafka operator takes a few minutes to start, and it must be running before you can deploy the Kafka cluster.

Verify: To check the operator status, run the following command:

kubectl get pods -n strimzi-kafka

If you see output like the following, the Kafka operator is up and running:
```
NAME                                          READY   STATUS    RESTARTS   AGE
strimzi-cluster-operator-c8ddd59d-rwffr       1/1     Running   0          4m8s
```
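
If you want to script this check rather than read the table by eye, one option is to parse the `kubectl get pods` output. The helper below is a minimal sketch; the function name `all_pods_ready` is hypothetical, and it assumes the default column layout shown above (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Reads `kubectl get pods` output on stdin and succeeds only if at least
# one pod is listed and every pod is Running with all containers ready.
all_pods_ready() {
  awk '
    NR > 1 {                      # skip the header row
      split($2, ready, "/")       # READY column, e.g. "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
      count++
    }
    END { exit (bad || count == 0) ? 1 : 0 }
  '
}
```

For example, `kubectl get pods -n strimzi-kafka | all_pods_ready && echo "operator ready"` prints the message only once the operator pod is fully up. The built-in `kubectl wait --for=condition=Ready pods --all -n strimzi-kafka --timeout=300s` is an alternative that blocks until the pods are ready.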

Deploy Postgres DB

Create new credentials for the Postgres DB that will be deployed in the cluster. The same secret will be modified when you deploy penfield-app later on.

Deploy penfield-app secrets

Create a penfield-secrets.yaml file with the following content:

penfield-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: penfield-secrets
type: Opaque
data:
    # All values must be base64 encoded.
    POSTGRES_USER: <base 64 encoded value>
    POSTGRES_PASSWORD: <base 64 encoded value>

info

Use this command to base64-encode a plain-text string:
echo -n 'plainTextStringToEncode' | base64
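
Because a mistyped or wrapped base64 value only shows up later as a failed login, it can help to encode and decode through small helpers. This is a minimal sketch; the function names `b64_encode` and `b64_decode` are hypothetical and not part of any Penfield script.

```shell
#!/bin/sh
# Encode a string as single-line base64. `printf '%s'` avoids the trailing
# newline that a plain `echo` would add, and `tr -d '\n'` removes the line
# wraps that some base64 implementations insert into long output.
b64_encode() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Decode a base64 string back to plain text, to double-check a value.
b64_decode() {
  printf '%s' "$1" | base64 --decode
}
```

For example, `b64_encode 'admin'` prints `YWRtaW4=`, and passing that back to `b64_decode` returns `admin`.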

To deploy secrets, run the following command:

kubectl create namespace penfield-app
kubectl apply -f penfield-secrets.yaml -n penfield-app

To deploy Postgres DB in the Kubernetes cluster, follow the steps below:

Add helm repo: helm repo add bitnami https://charts.bitnami.com/bitnami
Update helm repository: helm repo update
Search repo: helm search repo bitnami/postgresql

Deploy postgres using helm:

helm upgrade \
    --install analytics bitnami/postgresql \
    --namespace penfield-app \
    --create-namespace \
    --version 11.9.13 \
    --values penfield-app/postgres-values.yaml

For reference postgres-values.yaml file looks like:

postgres-values.yaml
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
global:
  ## Name of the storage class you want to use.
  ## eg: gp3-encrypted-pv or managed-premium-pv
  storageClass: ""
  postgresql:
    auth:
      existingSecret: "penfield-secrets"
      secretKeys:
        adminPasswordKey: "POSTGRES_PASSWORD"

primary:
  persistence:
      enabled: true
      ## Name of the storage class you want to use.
      ## eg: gp3-encrypted-pv or managed-premium-pv
      storageClass: ""
      accessModes:
        - ReadWriteOnce
      size: 50Gi

caution

The Postgres deployment depends on the penfield-app secrets, which are created when you deploy the Penfield app. If the pod is not in the Running state, deploy the secrets as described in the next section and then verify the Postgres deployment.

info

Once the pod is running, export the Postgres password: export POSTGRES_PASSWORD=$(kubectl get secret --namespace penfield-app analytics-postgresql -o jsonpath="{.data.postgres-password}" | base64 --decode)

To log in to Postgres from your local machine:

kubectl run analytics-postgresql-client --rm --tty -i --restart='Never' \
--namespace penfield-app --image docker.io/bitnami/postgresql:14.5.0-debian-11-r21 \
--env="PGPASSWORD=$POSTGRES_PASSWORD" \
--command -- psql --host analytics-postgresql -U postgres -d postgres -p 5432

tip

Do not forget to add POSTGRES_USER and POSTGRES_PASSWORD to penfield-secrets when you create the secrets in the upcoming steps.

Deploy Kratos (authentication backend)

Kratos is needed for dashboard authentication and requires a Postgres database. Kratos is deployed through Helm charts. Run the following commands:

Add helm repo on local: helm repo add ory https://k8s.ory.sh/helm/charts
Update helm repository: helm repo update
Find kratos repository: helm search repo ory/kratos
Create the namespace: kubectl create namespace kratos

Create the secrets for sensitive info: kubectl apply -f kratos-secrets.yaml -n kratos

kratos-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: kratos-secrets
  annotations:
    helm.sh/hook-weight: "0"
    helm.sh/hook: "pre-install, pre-upgrade"
    helm.sh/hook-delete-policy: "before-hook-creation"
    helm.sh/resource-policy: "keep"
type: Opaque
data:
    dsn: <base 64 encoded value> # e.g: postgres://<username>:<password>@analytics-postgresql.penfield-app.svc.cluster.local:5432/postgres
    oidcConfig: <base 64 encoded value> # Only needed if you enable SSO, Update the values in this string: [{"id": "microsoft","provider": "microsoft","client_id": "<Update_client_id_value>","client_secret": "<Update_client_secret_value>","microsoft_tenant": "<Update_microsoft_tenant_value>","mapper_url": "base64://bG9jYWwgY2xhaW1zID0gc3RkLmV4dFZhcignY2xhaW1zJyk7CnsKICBpZGVudGl0eTogewogICAgdHJhaXRzOiB7CiAgICAgIFtpZiAnZW1haWwnIGluIGNsYWltcyB0aGVuICdlbWFpbCcgZWxzZSBudWxsXTogY2xhaW1zLmVtYWlsLAogICAgfSwKICB9LAp9Cg==","scope": ["email"]}]
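
The dsn value has to be assembled from the Postgres credentials and then base64-encoded as a whole. The sketch below shows one way to do that, following the DSN format given in the comment above. The function name `kratos_dsn_b64` is hypothetical, and the sketch assumes the username and password contain no characters that would need URL-escaping.

```shell
#!/bin/sh
# Hypothetical helper: assemble the Kratos DSN in the format shown in the
# comment of kratos-secrets.yaml and print it base64-encoded on one line.
# Assumes the username and password contain no characters that need
# URL-escaping (such as "@", ":" or "/").
kratos_dsn_b64() {
  user="$1"
  password="$2"
  dsn="postgres://${user}:${password}@analytics-postgresql.penfield-app.svc.cluster.local:5432/postgres"
  printf '%s' "$dsn" | base64 | tr -d '\n'
}
```

Calling `kratos_dsn_b64 "<username>" "<password>"` prints the value to paste after `dsn:` in kratos-secrets.yaml.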

Deploy kratos via helm chart:

helm upgrade \
--install kratos ory/kratos \
--namespace kratos \
--create-namespace \
--version=0.43.1 \
--values kratos-values.yaml

tip

Make sure the Postgres DB can be reached from the Kubernetes cluster.
Make sure the kratos-values.yaml file is configured correctly:
- Replace penfield.example.com with your FQDN everywhere.
- Make sure the Postgres DSN value is correct.
- Use a unique value for cookie under secrets.
- (Optional) Configure a proper SMTP mail server. This is not required.

For reference kratos-values.yaml file looks like:

kratos-values.yaml
ingress:
  public:
    enabled: true
    className: internal-ingress-nginx
    annotations:
      nginx.ingress.kubernetes.io/rewrite-target: /$2
    hosts:
      - host: "penfield.example.com" # Replace it with your FQDN
        paths:
          - path: /public(/|$)(.*)
            pathType: ImplementationSpecific

secret:
  enabled: true
  nameOverride: "kratos-secrets"

kratos:
  automigration:
    enabled: true
  config:
    serve:
      public:
        base_url: https://penfield.example.com/public/
        cors:
          enabled: true
      admin:
        base_url: https://penfield.example.com/admin/

    selfservice:
      default_browser_return_url: https://penfield.example.com/web/
      allowed_return_urls:
        - https://penfield.example.com/web/

      methods:
        password:
          enabled: true
        link:
          enabled: true
        # Enable OIDC only if you use SSO
        oidc:
          enabled: false
          config:
            providers: []

      flows:
        error:
          ui_url: https://penfield.example.com/web/error

        settings:
          ui_url: https://penfield.example.com/web/settings
          privileged_session_max_age: 15m

        recovery:
          enabled: true
          ui_url: https://penfield.example.com/web/recovery

        verification:
          enabled: true
          ui_url: https://penfield.example.com/web/verify
          after:
            default_browser_return_url: https://penfield.example.com/web/

        logout:
          after:
            default_browser_return_url: https://penfield.example.com/web/welcome

        login:
          ui_url: https://penfield.example.com/web/auth/login
          lifespan: 10m

        registration:
          lifespan: 10m
          ui_url: https://penfield.example.com/web/auth/registration
          after:
            password:
              hooks:
                - hook: session

    log:
      level: trace
      format: json
      leak_sensitive_values: true

    hashers:
      argon2:
        parallelism: 1
        memory: 128MB
        iterations: 2
        salt_length: 16
        key_length: 16

    identity:
      default_schema_id: admin
      schemas:
        - id: admin
          url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBZG1pbiIsCiAgInR5cGUiOiAib2JqZWN0IiwKICAicHJvcGVydGllcyI6IHsKICAgICJ0cmFpdHMiOiB7CiAgICAgICJ0eXBlIjogIm9iamVjdCIsCiAgICAgICJwcm9wZXJ0aWVzIjogewogICAgICAgICJlbWFpbCI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAiZm9ybWF0IjogImVtYWlsIiwKICAgICAgICAgICJ0aXRsZSI6ICJFLU1haWwiLAogICAgICAgICAgIm1pbkxlbmd0aCI6IDMsCiAgICAgICAgICAib3J5LnNoL2tyYXRvcyI6IHsKICAgICAgICAgICAgImNyZWRlbnRpYWxzIjogewogICAgICAgICAgICAgICJwYXNzd29yZCI6IHsKICAgICAgICAgICAgICAgICJpZGVudGlmaWVyIjogdHJ1ZQogICAgICAgICAgICAgIH0KICAgICAgICAgICAgfSwKICAgICAgICAgICAgInZlcmlmaWNhdGlvbiI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9LAogICAgICAgICAgICAicmVjb3ZlcnkiOiB7CiAgICAgICAgICAgICAgInZpYSI6ICJlbWFpbCIKICAgICAgICAgICAgfQogICAgICAgICAgfQogICAgICAgIH0sCiAgICAgICAgImZpcnN0TmFtZSI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAidGl0bGUiOiAiRmlyc3QgTmFtZSIKICAgICAgICB9LAogICAgICAgICJsYXN0TmFtZSI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAidGl0bGUiOiAiTGFzdCBOYW1lIgogICAgICAgIH0KICAgICAgICB9LAogICAgICAgICJyZXF1aXJlZCI6IFsiZW1haWwiXSwKICAgICAgICAiYWRkaXRpb25hbFByb3BlcnRpZXMiOiBmYWxzZQogICAgICB9CiAgICB9CiAgfQo=
        - id: analyst
          url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBbmFseXN0IiwKICAidHlwZSI6ICJvYmplY3QiLAogICJwcm9wZXJ0aWVzIjogewogICAgInRyYWl0cyI6IHsKICAgICAgInR5cGUiOiAib2JqZWN0IiwKICAgICAgInByb3BlcnRpZXMiOiB7CiAgICAgICAgImVtYWlsIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJmb3JtYXQiOiAiZW1haWwiLAogICAgICAgICAgInRpdGxlIjogIkUtTWFpbCIsCiAgICAgICAgICAibWluTGVuZ3RoIjogMywKICAgICAgICAgICJvcnkuc2gva3JhdG9zIjogewogICAgICAgICAgICAiY3JlZGVudGlhbHMiOiB7CiAgICAgICAgICAgICAgInBhc3N3b3JkIjogewogICAgICAgICAgICAgICAgImlkZW50aWZpZXIiOiB0cnVlCiAgICAgICAgICAgICAgfQogICAgICAgICAgICB9LAogICAgICAgICAgICAidmVyaWZpY2F0aW9uIjogewogICAgICAgICAgICAgICJ2aWEiOiAiZW1haWwiCiAgICAgICAgICAgIH0sCiAgICAgICAgICAgICJyZWNvdmVyeSI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9CiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICAiZmlyc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJGaXJzdCBOYW1lIgogICAgICAgIH0sCiAgICAgICAgImxhc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJMYXN0IE5hbWUiCiAgICAgICAgfQogICAgICAgIH0sCiAgICAgICAgInJlcXVpcmVkIjogWyJlbWFpbCJdLAogICAgICAgICJhZGRpdGlvbmFsUHJvcGVydGllcyI6IGZhbHNlCiAgICAgIH0KICAgIH0KICB9Cg==
        - id: automationengineer
          url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBdXRvbWF0aW9uIEVuZ2luZWVyIiwKICAidHlwZSI6ICJvYmplY3QiLAogICJwcm9wZXJ0aWVzIjogewogICAgInRyYWl0cyI6IHsKICAgICAgInR5cGUiOiAib2JqZWN0IiwKICAgICAgInByb3BlcnRpZXMiOiB7CiAgICAgICAgImVtYWlsIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJmb3JtYXQiOiAiZW1haWwiLAogICAgICAgICAgInRpdGxlIjogIkUtTWFpbCIsCiAgICAgICAgICAibWluTGVuZ3RoIjogMywKICAgICAgICAgICJvcnkuc2gva3JhdG9zIjogewogICAgICAgICAgICAiY3JlZGVudGlhbHMiOiB7CiAgICAgICAgICAgICAgInBhc3N3b3JkIjogewogICAgICAgICAgICAgICAgImlkZW50aWZpZXIiOiB0cnVlCiAgICAgICAgICAgICAgfQogICAgICAgICAgICB9LAogICAgICAgICAgICAidmVyaWZpY2F0aW9uIjogewogICAgICAgICAgICAgICJ2aWEiOiAiZW1haWwiCiAgICAgICAgICAgIH0sCiAgICAgICAgICAgICJyZWNvdmVyeSI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9CiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICAiZmlyc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJGaXJzdCBOYW1lIgogICAgICAgIH0sCiAgICAgICAgImxhc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJMYXN0IE5hbWUiCiAgICAgICAgfQogICAgICB9LAogICAgICAicmVxdWlyZWQiOiBbCiAgICAgICAgImVtYWlsIgogICAgICBdLAogICAgICAiYWRkaXRpb25hbFByb3BlcnRpZXMiOiBmYWxzZQogICAgfQogIH0KfQo=

    courier:
      smtp:
        connection_uri: smtpConnectionURI

## Enable below config if you enable SSO
# deployment:
#   extraEnv:
#     - name: SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS
#       valueFrom:
#         secretKeyRef:
#           name: kratos-secrets
#           key: oidcConfig
cleanup:
  enabled: true
  batchSize: 20000

Deploy Milvus

Milvus is used as the vector database. You can deploy Milvus using Helm. Run the following commands:

Add helm repo on local: helm repo add milvus https://zilliztech.github.io/milvus-helm/
Update helm repository: helm repo update
Find milvus repository: helm search repo milvus/milvus

Deploy milvus using helm:

helm upgrade \
--install milvus milvus/milvus \
--namespace milvus \
--create-namespace \
--version=4.1.32 \
--values milvus-values.yaml

For reference milvus-values.yaml file looks like:

milvus-values.yaml
## Enable or disable Milvus Cluster mode
cluster:
  enabled: false

etcd:
  replicaCount: 1
  persistence:
    storageClass: managed-premium-pv
    size: 10Gi

minio:
  mode: standalone
  persistence:
    storageClass: managed-premium-pv
    size: 50Gi

pulsar:
  enabled: false

standalone:
  persistence:
    persistentVolumeClaim:
      storageClass: managed-premium-pv
      size: 50Gi

Deploy Minio

To deploy Minio, you can use helm. Run the following commands:

Add helm repo on local: helm repo add minio https://charts.min.io
Update helm repository: helm repo update
Find minio repository: helm search repo minio/minio
Create namespace for minio: kubectl create namespace minio

Create secrets for minio: kubectl apply -f minio-secrets.yaml -n minio

minio-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: minio-secrets # Must match existingSecret in minio-values.yaml
type: Opaque
data:
  rootUser: <base 64 encoded value> # Set a random 32 alpha numeric value.
  rootPassword: <base 64 encoded value> # Set a random 32 alpha numeric value.
  userPassword: <base 64 encoded value> # Set a random 32 alpha numeric value.
  secretKey: <base 64 encoded value> # Set a random 40 or more alpha numeric value.
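
The random values above can be generated on the command line. The following is a minimal sketch that reads from /dev/urandom and emits hexadecimal characters (which are alphanumeric). The helper name `random_hex` and the choice of hex are assumptions for illustration; any random alphanumeric string of the stated length should serve equally well.

```shell
#!/bin/sh
# Print 2*N random hexadecimal characters, read from /dev/urandom.
# `tr -dc` keeps only the hex digits and drops the spacing od adds.
random_hex() {
  od -An -N "$1" -tx1 /dev/urandom | tr -dc '0-9a-f'
}

root_user="$(random_hex 16)"       # 32 characters
root_password="$(random_hex 16)"   # 32 characters
user_password="$(random_hex 16)"   # 32 characters
secret_key="$(random_hex 20)"      # 40 characters

# Print each value base64-encoded, ready for the data section above.
printf 'rootUser: %s\n' "$(printf '%s' "$root_user" | base64 | tr -d '\n')"
printf 'rootPassword: %s\n' "$(printf '%s' "$root_password" | base64 | tr -d '\n')"
printf 'userPassword: %s\n' "$(printf '%s' "$user_password" | base64 | tr -d '\n')"
printf 'secretKey: %s\n' "$(printf '%s' "$secret_key" | base64 | tr -d '\n')"
```

Keep the plain-text values somewhere safe as well, since the encoded output is only what goes into the Secret manifest.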

Deploy minio using helm:

helm upgrade \
--install minio minio/minio \
--namespace minio \
--create-namespace \
--version=5.2.0 \
--values minio-values.yaml

For reference minio-values.yaml file looks like:

minio-values.yaml
## Set default rootUser, rootPassword
## .data.rootUser and .data.rootPassword are mandatory,
existingSecret: minio-secrets

# Number of MinIO containers running
replicas: 2

## List of users to be created after minio install
users:
  - accessKey: bucketuser
    existingSecret: minio-secrets
    existingSecretKey: userPassword
    policy: readwrite

## List of service accounts to be created after minio install
svcaccts:
  - accessKey: random-32-alphanumeric-characters-accessKey
    existingSecret: minio-secrets
    existingSecretKey: secretKey
    user: bucketuser

## List of buckets to be created after minio install
buckets:
  - name: general
    policy: none
    purge: false
    versioning: true
    objectlocking: false

## Enable persistence using Persistent Volume Claims
persistence:
  enabled: true
  storageClass: managed-premium-pv
  accessMode: ReadWriteOnce
  size: 10Gi

Deploy Airflow

To deploy Airflow, you can use helm. Run the following commands:

Add helm repo on local: helm repo add apache-airflow https://airflow.apache.org
Update helm repository: helm repo update
Find airflow repository: helm search repo apache-airflow/airflow
Create namespace for airflow: kubectl create namespace airflow

Create secrets for Airflow. Airflow needs multiple secrets. You can create them using the commands below:

kubectl apply -f airflow-secrets.yaml -n airflow

airflow-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: airflow-secrets
type: Opaque
data:
  fernet-key: <base 64 encoded value> # Set a random value.
  webserver-secret-key: <base 64 encoded value> # Set a random value.
  postgres-password: <base 64 encoded value> # Set a random password value.
  connection: <base 64 encoded value> # e.g: postgresql://postgres:<password>@airflow-postgresql:5432/postgres
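
For the fernet-key entry, note that Fernet keys have a fixed format: 32 random bytes encoded as URL-safe base64. An arbitrary random string may therefore be rejected by Airflow. The sketch below generates a key in that format; the function name `generate_fernet_key` is hypothetical.

```shell
#!/bin/sh
# Generate a Fernet-style key: 32 random bytes, URL-safe base64 encoded
# (44 characters including padding). "/" and "+" are mapped to "_" and "-".
generate_fernet_key() {
  head -c 32 /dev/urandom | base64 | tr '/+' '_-' | tr -d '\n'
}

fernet_key="$(generate_fernet_key)"

# Kubernetes Secret data must itself be base64-encoded, so the key is
# encoded a second time before it goes into airflow-secrets.yaml.
printf 'fernet-key: %s\n' "$(printf '%s' "$fernet_key" | base64 | tr -d '\n')"
```

The printed line can be pasted into the data section of airflow-secrets.yaml. As the values file notes, the Fernet key can only be set during install, so store the generated key safely.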

kubectl apply -f penfield-secrets.yaml -n airflow

penfield-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: penfield-secrets
type: Opaque
data:
  AZURE_OPENAI_API_KEY: <base 64 encoded value> # Use the AZURE OPENAI KEY

Configmap for airflow: This is used to pass environment variables to the executors. kubectl apply -f airflow-configmap.yaml -n airflow

airflow-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: penfield-configmap
data:
  MONGODB_URI: mongodb://mongo-db.penfield-app.svc.cluster.local:27017
  MONGODB_NAME: penfield_appdb
  NLP_ENDPOINT: http://nlp-service.penfield-app.svc.cluster.local:3000/api/v1/ner
  OPENAI_API_TYPE: azure
  AZURE_OPENAI_API_VERSION: 2023-05-15
  AZURE_OPENAI_ENDPOINT: <URL of the Azure Open AI Endpoint>
  AZURE_OPENAI_DEPLOYMENT: <Deployment under the Azure OpenAI resource> #e.g. gpt-35-turbo-ca
  AZURE_OPENAI_MODEL: <The model name of the corresponding deployment> # e.g. gpt-35-turbo

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##

Deploy airflow using helm:

helm upgrade \
--install airflow apache-airflow/airflow \
--namespace airflow \
--create-namespace \
--version=1.13.1 \
--values airflow-values.yaml

For reference airflow-values.yaml file looks like:

airflow-values.yaml
# Default airflow tag to deploy
defaultAirflowTag: "2.9.0"

# Default airflow digest. If specified, it takes precedence over tag
defaultAirflowDigest: ~

# Airflow version (Used to make some decisions based on Airflow Version being deployed)
airflowVersion: "2.9.0"

# Images
images:
  airflow:
    repository: registry.gitlab.com/penfieldai/data-pipeline/airflow-server
    tag: release-1.0.5
    # pullPolicy: Always
  pod_template:
    repository: registry.gitlab.com/penfieldai/data-pipeline/airflow-worker
    tag: release-1.0.5
    # pullPolicy: Always

registry:
  secretName: regcred

config:
  webserver:
    base_url: http://penfield.example.com/airflow
  kubernetes_executor: 
    worker_container_repository: '{{ .Values.images.pod_template.repository | default .Values.defaultAirflowRepository }}'
    worker_container_tag: '{{ .Values.images.pod_template.tag | default .Values.defaultAirflowTag }}'

env:
  - name: "AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG"
    value: "1"
  - name: "AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT"
    value: "false"
extraEnvFrom: |
  - configMapRef:
      name: penfield-configmap
  - secretRef:
      name: penfield-secrets
    
# Airflow executor: One of: LocalExecutor, LocalKubernetesExecutor, CeleryExecutor, KubernetesExecutor, CeleryKubernetesExecutor
executor: KubernetesExecutor

# Ingress configuration
ingress:
  # Configs for the Ingress of the web Service
  web:
    enabled: false
    annotations: {}
    path: "/airflow"
    pathType: "Prefix"
    hosts:
      - name: penfield.example.com
    ingressClassName: "internal-ingress-nginx"
    precedingPaths: []
    succeedingPaths: []

# Fernet key settings
# Note: fernetKey can only be set during install, not upgrade
# Airflow uses Fernet to encrypt passwords in the connection configuration and the variable configuration. 
fernetKey: fernet-key
fernetKeySecretName: airflow-secrets


# Flask secret key for Airflow Webserver: `[webserver] secret_key` in airflow.cfg
webserverSecretKey: webserver-secret-key
webserverSecretKeySecretName: airflow-secrets

# Airflow Worker Config
workers:
  # Number of airflow celery workers in StatefulSet
  replicas: 1

  persistence:
    # Enable persistent volumes
    enabled: true
    # Volume size for worker StatefulSet
    size: 10Gi
    storageClassName: managed-premium-pv
    # Execute init container to chown log directory.
    # This is currently only needed in kind, due to usage
    # of local-path provisioner.
    fixPermissions: false
    # Annotations to add to worker volumes
    annotations: {}
    # Detailed default security context for persistence for container level
    securityContexts:
      container: {}
    # container level lifecycle hooks
    containerLifecycleHooks: {}

# Configuration for postgresql subchart
# Not recommended for production
postgresql:
  enabled: true
  auth:
    enablePostgresUser: true
    existingSecret: airflow-secrets
  primary:
    ## PostgreSQL Primary persistence configuration
    persistence:
      enabled: true
      storageClass: "managed-premium-pv"
      size: 10Gi

# Configuration for the redis provisioned by the chart
redis:
  enabled: false

# Airflow database & redis config
data:
  metadataSecretName: airflow-secrets
  
# Airflow Triggerer Config
triggerer:
  enabled: true
  # Number of airflow triggerers in the deployment
  replicas: 1
  # Max number of old replicasets to retain
  revisionHistoryLimit: ~
  persistence:
    enabled: true
    size: 10Gi
    storageClassName: managed-premium-pv

# StatsD settings
statsd:
  enabled: false

# This runs as a CronJob to cleanup old pods.
cleanup:
  enabled: true
  schedule: "@daily"
  
# Log persistence (top-level key in the Airflow chart)
logs:
  persistence:
    enabled: false

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##

Deploy Monitoring stack

Deploy Prometheus and Grafana:

Prometheus is used to collect metrics and Grafana to visualize them. This is helpful for troubleshooting. To deploy Prometheus + Grafana, run the following commands:

Add helm repo: helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
Update helm repository: helm repo update
Find kube-prometheus-stack chart: helm search repo prometheus-community/kube-prometheus-stack
Deploy configmap for custom dashboard:

kubectl create ns monitoring
kubectl apply -f loki-logs-dashboard.yaml -n monitoring

For reference loki-logs-dashboard.yaml file looks like:

loki-logs-dashboard.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-logs-grafana-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  loki-dashboard.json: |-
    {
      "annotations": {
        "list": [
          {
            "builtIn": 1,
            "datasource": "-- Grafana --",
            "enable": true,
            "hide": true,
            "iconColor": "rgba(0, 211, 255, 1)",
            "name": "Annotations & Alerts",
            "target": {
              "limit": 100,
              "matchAny": false,
              "tags": [],
              "type": "dashboard"
            },
            "type": "dashboard"
          }
        ]
      },
      "description": "Loki logs panel dashboard with Prometheus variables",
      "editable": true,
      "gnetId": null,
      "graphTooltip": 0,
      "id": 29,
      "iteration": 1633543882969,
      "links": [],
      "panels": [
        {
          "aliasColors": {},
          "bars": true,
          "dashLength": 10,
          "dashes": false,
          "datasource": "Loki",
          "description": "Time frame",
          "fieldConfig": {
            "defaults": {
              "unit": "dateTimeFromNow"
            },
            "overrides": []
          },
          "fill": 1,
          "fillGradient": 0,
          "gridPos": {
            "h": 4,
            "w": 24,
            "x": 0,
            "y": 0
          },
          "hiddenSeries": false,
          "id": 4,
          "legend": {
            "avg": false,
            "current": false,
            "max": false,
            "min": false,
            "show": true,
            "total": false,
            "values": false
          },
          "lines": false,
          "linewidth": 1,
          "nullPointMode": "null",
          "options": {
            "alertThreshold": true
          },
          "percentage": false,
          "pluginVersion": "8.1.5",
          "pointradius": 2,
          "points": false,
          "renderer": "flot",
          "seriesOverrides": [],
          "spaceLength": 10,
          "stack": false,
          "steppedLine": false,
          "targets": [
            {
              "expr": "sum(count_over_time({namespace=\"$namespace\", instance=~\"$pod\"} |~ \"$search\"[$__interval]))",
              "refId": "A"
            }
          ],
          "thresholds": [],
          "timeFrom": null,
          "timeRegions": [],
          "timeShift": null,
          "title": "Time Frame",
          "tooltip": {
            "shared": true,
            "sort": 0,
            "value_type": "individual"
          },
          "type": "graph",
          "xaxis": {
            "buckets": null,
            "mode": "time",
            "name": null,
            "show": true,
            "values": []
          },
          "yaxes": [
            {
              "$$hashKey": "object:37",
              "format": "dateTimeFromNow",
              "label": null,
              "logBase": 1,
              "max": null,
              "min": null,
              "show": false
            },
            {
              "$$hashKey": "object:38",
              "format": "short",
              "label": null,
              "logBase": 1,
              "max": null,
              "min": null,
              "show": true
            }
          ],
          "yaxis": {
            "align": false,
            "alignLevel": null
          }
        },
        {
          "datasource": "Loki",
          "description": "Live logs from Kubernetes pod with some delay.",
          "gridPos": {
            "h": 21,
            "w": 24,
            "x": 0,
            "y": 4
          },
          "id": 2,
          "options": {
            "dedupStrategy": "none",
            "enableLogDetails": true,
            "prettifyLogMessage": false,
            "showCommonLabels": false,
            "showLabels": false,
            "showTime": true,
            "sortOrder": "Descending",
            "wrapLogMessage": false
          },
          "targets": [
            {
              "expr": "{namespace=\"$namespace\", pod=~\"$pod\"} |~ \"$search\"",
              "refId": "A"
            }
          ],
          "title": "Logs Panel",
          "type": "logs"
        }
      ],
      "refresh": false,
      "schemaVersion": 30,
      "style": "dark",
      "tags": [],
      "templating": {
        "list": [
          {
            "allValue": null,
            "current": {
              "selected": false,
              "text": "penfield-app",
              "value": "penfield-app"
            },
            "datasource": null,
            "definition": "label_values(kube_pod_info, namespace)",
            "description": "K8 namespace",
            "error": null,
            "hide": 0,
            "includeAll": false,
            "label": "",
            "multi": false,
            "name": "namespace",
            "options": [],
            "query": {
              "query": "label_values(kube_pod_info, namespace)",
              "refId": "StandardVariableQuery"
            },
            "refresh": 1,
            "regex": "",
            "skipUrlSync": false,
            "sort": 0,
            "type": "query"
          },
          {
            "allValue": ".*",
            "current": {
              "selected": true,
              "text": [
                "All"
              ],
              "value": [
                "$__all"
              ]
            },
            "datasource": null,
            "definition": "label_values(container_network_receive_bytes_total{namespace=~\"$namespace\"},pod)",
            "description": "Pods inside namespace",
            "error": null,
            "hide": 0,
            "includeAll": true,
            "label": null,
            "multi": true,
            "name": "pod",
            "options": [],
            "query": {
              "query": "label_values(container_network_receive_bytes_total{namespace=~\"$namespace\"},pod)",
              "refId": "StandardVariableQuery"
            },
            "refresh": 1,
            "regex": "",
            "skipUrlSync": false,
            "sort": 0,
            "type": "query"
          },
          {
            "current": {
              "selected": true,
              "text": "",
              "value": ""
            },
            "description": "Search keywords",
            "error": null,
            "hide": 0,
            "label": null,
            "name": "search",
            "options": [
              {
                "selected": true,
                "text": "",
                "value": ""
              }
            ],
            "query": "",
            "skipUrlSync": false,
            "type": "textbox"
          }
        ]
      },
      "time": {
        "from": "now-1h",
        "to": "now"
      },
      "timepicker": {},
      "timezone": "",
      "title": "Loki / Dashboard for k8 Logs",
      "uid": "vmMdG2vnk",
      "version": 1
    }

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
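
The Logs Panel in the dashboard above builds its Loki query from three template variables: namespace, pod, and search. As a rough illustration of how those values end up in the LogQL expression, the sketch below assembles the same string by hand. The function name build_logql and the search term "error" are made up for this example; the namespace default and the ".*" value for "All" pods come from the dashboard definition.

```shell
# Build the LogQL expression that the Logs Panel sends to Loki.
# Arguments mirror the dashboard variables: namespace, pod (regex), search (regex).
build_logql() {
  printf '{namespace="%s", pod=~"%s"} |~ "%s"\n' "$1" "$2" "$3"
}

# "All" pods in the default namespace, filtered to lines matching "error".
# prints: {namespace="penfield-app", pod=~".*"} |~ "error"
build_logql 'penfield-app' '.*' 'error'
```

Leaving the search box empty results in an empty regular expression, which matches every log line.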

Deploy the prometheus stack using helm:

helm upgrade \
    --install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
    --namespace monitoring \
    --create-namespace \
    --version=59.0.0 \
    --values kube-prometheus-stack-values.yaml

For reference, the kube-prometheus-stack-values.yaml file looks like the following. Replace the domain penfield.example.com with your FQDN.

kube-prometheus-stack-values.yaml
#### EKS Specific #####
## Component scraping the kubelet and kubelet-hosted cAdvisor
##
kubelet:
  serviceMonitor:
    ## This is disabled by default because container metrics are already exposed by cAdvisor
    resource: false

## Component scraping the kube controller manager
##
kubeControllerManager:
  enabled: false

## Component scraping kube scheduler
##
kubeScheduler:
  enabled: false

## Component scraping kube proxy
##
kubeProxy:
  enabled: false
#####################

## Configuration for alertmanager
## ref: https://prometheus.io/docs/alerting/alertmanager/
##
alertmanager:
  enabled: false

## Using default values from https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml
##
grafana:
  enabled: true
  ## Deploy default dashboards
  ##
  defaultDashboardsEnabled: true

  ## Timezone for the default dashboards
  ## Other options are: browser or a specific timezone, i.e. Europe/Luxembourg
  ##
  defaultDashboardsTimezone: utc

  # adminPassword: prom-operator
  adminUser: admin
  adminPassword:
  admin:
    existingSecret: ""
    userKey: admin-user
    passwordKey: admin-password

  grafana.ini:
    server:
      domain: penfield.example.com
      root_url: "%(protocol)s://%(domain)s/grafana"
      serve_from_sub_path: true

  ingress:
    ## If true, Grafana Ingress will be created
    enabled: true
    ingressClassName: internal-ingress-nginx
    # annotations:
    #   nginx.ingress.kubernetes.io/rewrite-target: /$2
    labels: {}
    hosts:
      - penfield.example.com
    ## Path for grafana ingress
    path: "/grafana"
    pathType: ImplementationSpecific

  sidecar:
    dashboards:
      enabled: true
      label: grafana_dashboard
    datasources:
      enabled: true
      defaultDatasourceEnabled: true
      label: grafana_datasource

## Configuration for kube-state-metrics subchart
##
kube-state-metrics:
  namespaceOverride: ""
  rbac:
    create: true
  podSecurityPolicy:
    enabled: true

## Manages Prometheus and Alertmanager components
##
prometheusOperator:
  enabled: true
  ## Resource limits & requests
  ##
  resources:
    limits:
      cpu: 200m
      memory: 200Mi
    requests:
      cpu: 100m
      memory: 100Mi
  nodeSelector: {}

## Deploy a Prometheus instance
##
prometheus:

  enabled: true

  ## Service account for Prometheuses to use.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
  ##
  serviceAccount:
    create: true
    name: "prometheus-operator-prometheus"
    annotations: {}

  ingress:
    enabled: false

  ## Settings affecting prometheusSpec
  ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#prometheusspec
  ##
  prometheusSpec:
    ruleSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    probeSelectorNilUsesHelmValues: false

    ## How long to retain metrics
    ##
    retention: 15d

    ## Maximum size of metrics
    ##
    retentionSize: ""

    ## Resource limits & requests
    ##
    resources:
      requests:
        memory: 400Mi
      limits:
        memory: 800Mi

    ## Prometheus StorageSpec for persistent data
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/storage.md
    ##
    storageSpec:
      ## Using PersistentVolumeClaim
      ##
      volumeClaimTemplate:
        spec:
          # storageClassName: managed-premium-pv
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
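
The values file above contains the example domain penfield.example.com in two places (the grafana.ini server domain and the ingress hosts). One possible way to substitute your FQDN in both is a small sed wrapper, sketched below. The function name set_fqdn, the output file name, and the domain grafana.example.org are placeholders for this example. The sketch assumes the FQDN contains no characters that are special to sed, such as / or &.

```shell
# Print a copy of a values file with the example domain replaced by your FQDN.
# Usage: set_fqdn <values-file> <fqdn>
set_fqdn() {
  sed "s/penfield\.example\.com/$2/g" "$1"
}

# Example invocation (grafana.example.org is a placeholder FQDN):
# set_fqdn kube-prometheus-stack-values.yaml grafana.example.org > values.local.yaml
```

Writing the result to a separate file keeps the reference copy unchanged, so you can diff the two before running helm upgrade.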

After this deployment, Grafana and Prometheus are running. You should be able to access Grafana at https://FQDN/grafana. Grafana sets a random admin password at creation. You can retrieve it with this command:

kubectl get secret --namespace monitoring kube-prometheus-stack-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

Deploy Loki for Logging:

This sets up the Loki stack, which includes Loki and Promtail.

Add helm repo: helm repo add grafana https://grafana.github.io/helm-charts
Update helm repository: helm repo update
Search repo: helm search repo grafana/loki-stack
Deploy the logging stack using helm:

helm upgrade \
    --install loki-stack grafana/loki-stack \
    --namespace monitoring \
    --create-namespace \
    --version=2.10.2 \
    --values loki-stack-values.yaml

For reference, the loki-stack-values.yaml file looks like:

loki-stack-values.yaml
loki:
  enabled: true
  isDefault: false
  persistence:
    enabled: true
    accessModes:
    - ReadWriteOnce
    size: 50Gi
    # storageClassName: managed-premium-pv
  config:
    # The commented-out limits_config below works with newer Loki versions such as 3.0.0;
    # table_manager is deprecated in newer versions.
    # limits_config:
    #   retention_period: 48h

    compactor:
      retention_enabled: true
      retention_delete_delay: 2h
    table_manager:
      retention_deletes_enabled: true
      retention_period: 15d #15 days

promtail:
  enabled: true

fluent-bit:
  enabled: false

## This creates Loki Data source configmap and adds in kube prometheus stack
grafana:
  enabled: false
  sidecar:
    datasources:
      enabled: true
      maxLines: 1000

prometheus:
  enabled: false

filebeat:
  enabled: false

logstash:
  enabled: false

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##

Deploy Penfield App

Deploy the Penfield app components in the following order:

Create penfield-app namespace (skip if you have created it already)
Deploy penfield-app regcred
Deploy penfield-app secrets
Deploy kafka cluster
Deploy penfield app

Create penfield-app namespace (skip if you have created it already)

To create the namespace, run the following command: kubectl create namespace penfield-app

Deploy penfield-app regcred (skip if it is already created)

The regcred secret is used to pull images from the PenfieldAI image registry. To create it, run the following command:

tip

Username and Password must be provided by PenfieldAI.

kubectl create secret docker-registry regcred \
  --docker-server=https://registry.gitlab.com \
  --docker-username=<Username> \
  --docker-password=<Password> \
  -n penfield-app

Deploy penfield-app secrets

Secrets provide sensitive data to the Kubernetes pods. The penfield-secrets secret already exists at this point; this step updates it with the additional secrets that are required.

tip

When deploying secrets, not all of the variables are needed. Depending on which services are deployed in the Penfield app, you only need specific variables.

For reference, the penfield-secrets.yaml file looks like:

apiVersion: v1
kind: Secret
metadata:
  name: penfield-secrets
type: Opaque
data:
  AZURE_OPENAI_API_KEY: <base 64 encoded value> # Needed for multiple services [Pathfinder Only]
  MINIO_ACCESS_KEY: <base 64 encoded value> # Needed for web service [Pathfinder Only]
  MINIO_SECRET_KEY: <base 64 encoded value> # Needed for web service [Pathfinder Only]
  POSTGRES_PASSWORD: <base 64 encoded value> # Needed for datapipeline, soc-data-analytics-service, kratos [Pathfinder Only]
  POSTGRES_USER: <base 64 encoded value> # Needed for datapipeline, soc-data-analytics-service, kratos [Pathfinder Only]
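
Each value under data: in the manifest above must be base64 encoded. The following is a minimal sketch of one way to produce and check such a value. The helper names encode_secret and decode_secret and the value "example-password" are placeholders for illustration, not real credentials.

```shell
# Encode a plain-text value for the data: section of a Secret.
# printf avoids the trailing newline that echo would add to the value,
# and tr removes any line wrapping that base64 applies to long input.
encode_secret() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Decode a value again to confirm that it round-trips.
decode_secret() {
  printf '%s' "$1" | base64 --decode
}

# "example-password" is a placeholder, not a real credential.
encoded="$(encode_secret 'example-password')"
echo "POSTGRES_PASSWORD: ${encoded}"
echo "decoded: $(decode_secret "${encoded}")"
```

Encoding with echo instead of printf is a common source of login failures, because the newline becomes part of the stored secret.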

To deploy secrets, run the following command:

kubectl apply -f penfield-secrets.yaml -n penfield-app

Deploy kafka cluster

At this stage the Kafka operator is already installed. Use the following command to install the Kafka cluster in the penfield-app namespace:

kubectl apply -f kafka-cluster.yaml -n penfield-app

For reference, the kafka-cluster.yaml file looks like:

kafka-cluster.yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    version: 3.7.0
    replicas: 1
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      inter.broker.protocol.version: "3.7"
      log.retention.bytes: 9663676416 # 9 GiB per partition (equal to per topic when a topic has one partition). With ~10 topics that is about 90 GiB of disk in total; old log segments are deleted once a partition reaches 9 GiB.

    storage:
      type: jbod
      volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
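
The retention cap and the 100Gi broker volume in the manifest above are related by simple arithmetic. The sketch below reproduces that sizing. It assumes binary units, roughly 10 topics as stated in the manifest comment, and one partition per topic; adjust topic_count if your deployment differs.

```shell
# Sizing behind log.retention.bytes in kafka-cluster.yaml (binary units).
gib=$((1024 * 1024 * 1024))
retention_bytes=$((9 * gib))    # cap configured in the manifest
topic_count=10                  # approximate topic count, from the manifest comment
volume_bytes=$((100 * gib))     # broker volume size: 100Gi

total_retained=$((retention_bytes * topic_count))
headroom=$((volume_bytes - total_retained))

echo "log.retention.bytes = ${retention_bytes}"
echo "retained at most    = $((total_retained / gib)) GiB"
echo "headroom on volume  = $((headroom / gib)) GiB"
```

Under these assumptions the cap leaves about 10 GiB of the volume free. More topics or more partitions per topic would need a larger volume or a smaller cap.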

Deploy penfield app

tip

Use the same Username and Password provided by PenfieldAI that you used to deploy regcred in the previous step.

The PenfieldAI app is deployed using Helm. Once you receive the Helm chart, go to the Helm chart directory and perform the steps below.

Add helm repo: helm repo add --username <username> --password <password> penfieldai https://gitlab.com/api/v4/projects/28749615/packages/helm/stable
Update helm repository: helm repo update
Find helm repository: helm search repo penfieldai/penfieldai
Deploy penfieldai via helm chart:

helm upgrade \
    --install penfield-app penfieldai/penfieldai \
    --namespace penfield-app \
    --create-namespace \
    --version=1.0.7 \
    --values penfield-values.yaml

tip

For the penfield-values.yaml file, look at the Helm chart values. You can run helm show values penfieldai/penfieldai > penfield-values.yaml to export the values file and edit it as per your requirements. Depending on the type of deployment, enable only the required services in the values file. You will need either a Pathfinder or a Pathfinder+ITSM deployment. To prepare the values file, please refer to this guide.
