
K8s install application

Once the cloud infrastructure is in place and access to the Kubernetes cluster is verified, you can set up the required dependencies and the Penfield application on the cluster. There are two ways to install them on the Kubernetes cluster: with Helmfile, or manually with Helm.

Clone this GitHub repo, which contains the code that you need, and follow the steps below:

warning

Never commit your actual secret values to version control such as Git.
The secrets folder contains the secrets; make sure you never commit this folder.

Make sure you have the following information handy before you proceed:

Name | Description
DOCKER_USERNAME | You should have received this username from Penfield.
DOCKER_PASSWORD | You should have received this password from Penfield.
AZURE_OPENAI_API_KEY | Your Azure OpenAI key from your Azure OpenAI deployment.
AZURE_OPENAI_ENDPOINT | Your Azure OpenAI endpoint URL.
AZURE_OPENAI_DEPLOYMENT | Deployment name of your Azure OpenAI deployment, e.g. gpt-35-turbo-ca
AZURE_OPENAI_MODEL | The model name of the corresponding deployment, e.g. gpt-35-turbo
FQDN | Fully qualified domain name that you are using for your Penfield application.
CLOUD_PROVIDER | Your cloud provider, e.g. azure or aws
SSL_CERTIFICATE | Only required if the cloud provider is Azure. Create files named ssl.key and ssl.crt in the /certs directory with your FQDN SSL certificate.
RELEASE_TAG | You should have received this from Penfield. The release tag that you want to use for the Penfield application, e.g. release-1.0.0
DEPLOYMENT_TIER | Deployment tier that you want to use, e.g. pathfinder or standard. The pathfinder tier is used for a Pathfinder-only deployment; the standard tier is used for a Pathfinder + ITSM deployment.
INTEGRATION_SOURCE | The type of ITSM integration you are using, e.g. helix, sentinel, or jira.
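
As a pre-flight aid, the checklist above can be turned into a small shell check. This is only a sketch under assumptions: it supposes you export the values as environment variables named after the table rows, which run.sh itself may not require, and missing_vars is a hypothetical helper name. SSL_CERTIFICATE is left out because it refers to files rather than a single value.

```shell
# Names taken from the table above (SSL_CERTIFICATE omitted: it is a pair of files).
required_vars="DOCKER_USERNAME DOCKER_PASSWORD AZURE_OPENAI_API_KEY AZURE_OPENAI_ENDPOINT AZURE_OPENAI_DEPLOYMENT AZURE_OPENAI_MODEL FQDN CLOUD_PROVIDER RELEASE_TAG DEPLOYMENT_TIER INTEGRATION_SOURCE"

# Hypothetical helper: prints the name of every required variable
# that is unset or empty, one per line.
missing_vars() {
  for name in $required_vars; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}
```

Calling missing_vars before ./run.sh prints one line per value that still needs to be set, and prints nothing when the list is complete.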

Once you have the above information, go to the folder where you cloned the GitHub repo. Run the bash script in your terminal with the following command and follow the instructions:

./run.sh

Once done, please come back here for the next steps.

Remove deployment

To remove all the deployments:

helmfile destroy

Grafana

Grafana is deployed as part of kube-prometheus-stack. You should be able to access Grafana at https://FQDN/grafana. Grafana sets a random password at creation. You can find the Grafana password using this command:

kubectl get secret \
--namespace monitoring kube-prometheus-stack-grafana \
-o jsonpath="{.data.admin-password}" | base64 --decode ; echo

info

If you deployed the dependencies using Helmfile, skip the next section and go to the next page.


Install using Helm

The following services need to be installed on the Kubernetes cluster:

Nginx Ingress Controller

The Nginx Ingress Controller routes ingress traffic to the cluster. You can use Helm to install it with custom values. Run the commands below:

  • Add helm repo on local: helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
  • Update helm repository: helm repo update
  • Find nginx ingress controller: helm search repo ingress-nginx
  • Install nginx ingress controller:
helm upgrade \
--install internal-ingress-nginx ingress-nginx/ingress-nginx \
--namespace internal-ingress-nginx \
--create-namespace \
--version=4.10.1 \
--values internal_ingress_nginx_values.yaml
  • The values file internal_ingress_nginx_values.yaml looks like this:

caution

Depending on your cloud provider, uncomment either the NodePort implementation (for AWS) or the internal LB implementation (for Azure):

internal_ingress_nginx_values.yaml
controller:
  config:
    server-tokens: "false"
    use-forwarded-headers: "true"
  extraArgs:
    metrics-per-host: "false"
  ingressClassResource:
    name: internal-ingress-nginx
  kind: DaemonSet
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/sh
          - -c
          - sleep 60; /sbin/nginx -c /etc/nginx/nginx.conf -s quit; while pgrep -x nginx; do sleep 1; done
  metrics:
    enabled: true

  ##### NodePort Implementation (For AWS)
  # service:
  #   enableHttps: false
  #   externalTrafficPolicy: Local
  #   type: NodePort
  #   nodePorts:
  #     http: "31080"


  ##### Internal LB Implementation (For Azure)

  # service:
  #   enabled: true
  #   annotations: {
  #     service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  #   }
  #   type: LoadBalancer

  #####


defaultBackend:
  enabled: true
  resources:
    requests:
      memory: 16Mi

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
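
The choice between the two commented service blocks depends only on the cloud provider. As a reminder aid, the mapping can be written as a small shell function. This is a sketch: ingress_service_type is a hypothetical helper, and the provider names follow the CLOUD_PROVIDER examples (azure, aws) given earlier on this page.

```shell
# Hypothetical helper: prints the Service type of the block to uncomment
# in internal_ingress_nginx_values.yaml for a given cloud provider.
ingress_service_type() {
  case "$1" in
    aws)   echo "NodePort" ;;       # "NodePort Implementation (For AWS)"
    azure) echo "LoadBalancer" ;;   # "Internal LB Implementation (For Azure)"
    *)     echo "unsupported cloud provider: $1" >&2; return 1 ;;
  esac
}
```

For example, ingress_service_type "$CLOUD_PROVIDER" prints NodePort for aws and LoadBalancer for azure, and fails for any other value.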

Deploy Kafka controller

Kafka is needed to run the PenfieldAI application. Setting up Kafka requires both a Kafka controller and a Kafka cluster. First you will set up the Kafka controller; later in this section, you will set up the Kafka cluster.

You can use Strimzi to deploy Kafka with Helm. Run the following commands:

  • Add helm repo on local: helm repo add strimzi https://strimzi.io/charts/

  • Update helm repository: helm repo update

  • Find the Strimzi charts: helm search repo strimzi

  • Install the Kafka operator:

    helm upgrade \
    --install strimzi-kafka strimzi/strimzi-kafka-operator \
    --namespace strimzi-kafka \
    --create-namespace \
    --version 0.41.0 \
    --values strimzi-kafka-operator-values.yaml

    strimzi-kafka-operator-values.yaml file looks like:

    strimzi-kafka-operator-values.yaml

    watchNamespaces: []
    watchAnyNamespace: true

    ## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##

    Wait a few minutes until the Kafka operator is running; it is required before you can deploy the Kafka cluster.

    Verify: To check that the operator has started, run the following command:

    kubectl get pods -n strimzi-kafka

    If you see output like the following, the Kafka operator is up and running:

    NAME                                      READY   STATUS    RESTARTS   AGE
    strimzi-cluster-operator-c8ddd59d-rwffr   1/1     Running   0          4m8s
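
The visual check of the pod list can also be scripted. The sketch below parses the text output of kubectl get pods and succeeds only when every listed pod is Running with all containers ready; all_pods_ready is a hypothetical helper name, and the column layout is assumed to match the sample output above.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin.
# Exit status 0 only if at least one pod is listed and every pod
# is in the Running state with all of its containers ready.
all_pods_ready() {
  awk '
    NR == 1 { next }                 # skip the header row
    NF == 0 { next }                 # skip blank lines
    {
      seen = 1
      split($2, ready, "/")          # READY column, for example "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}
```

A possible use is kubectl get pods -n strimzi-kafka | all_pods_ready && echo "Kafka operator is ready", repeated until it succeeds.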

Deploy Postgres DB

You need to create new credentials for the Postgres DB that will be deployed in the cluster. The same secret will be modified later when you deploy the Penfield app.

Deploy penfield-app secrets

Create a penfield-secrets.yaml file with the following content:

penfield-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: penfield-secrets
type: Opaque
data:
  # All values must be base64 encoded.
  POSTGRES_USER: <base 64 encoded value>
  POSTGRES_PASSWORD: <base 64 encoded value>

info

Use this command to encode a plain-text string into a base64 string:
echo -n 'plainTextStringToEncode' | base64

To deploy secrets, run the following command:

kubectl create namespace penfield-app
kubectl apply -f penfield-secrets.yaml -n penfield-app
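
The encoding step can be scripted so that the manifest is generated rather than edited by hand. The sketch below assumes a POSIX shell with the base64 utility available; encode_b64 and render_penfield_secrets are hypothetical helper names, and the user name and password passed to them are placeholders that you choose.

```shell
# Base64-encodes a string without adding a trailing newline to the input,
# and strips any line wrapping that base64 adds to long output.
encode_b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Hypothetical helper: prints a penfield-secrets manifest for the given
# Postgres user ($1) and password ($2).
render_penfield_secrets() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: penfield-secrets
type: Opaque
data:
  POSTGRES_USER: $(encode_b64 "$1")
  POSTGRES_PASSWORD: $(encode_b64 "$2")
EOF
}
```

For example, render_penfield_secrets postgres 'example-password' > penfield-secrets.yaml writes a manifest that can be passed to kubectl apply. Keep the generated file out of version control.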

To deploy Postgres DB in the Kubernetes cluster, follow the steps below:

  • Add helm repo: helm repo add bitnami https://charts.bitnami.com/bitnami
  • Update helm repository: helm repo update
  • Search repo: helm search repo bitnami/postgresql
  • Deploy postgres using helm:
    helm upgrade \
    --install analytics bitnami/postgresql \
    --namespace penfield-app \
    --create-namespace \
    --version 11.9.13 \
    --values penfield-app/postgres-values.yaml

For reference, the postgres-values.yaml file looks like:

postgres-values.yaml
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
global:
  ## Name of the storage class you want to use.
  ## eg: gp3-encrypted-pv or managed-premium-pv
  storageClass: ""
  postgresql:
    auth:
      existingSecret: "penfield-secrets"
      secretKeys:
        adminPasswordKey: "POSTGRES_PASSWORD"

primary:
  persistence:
    enabled: true
    ## Name of the storage class you want to use.
    ## eg: gp3-encrypted-pv or managed-premium-pv
    storageClass: ""
    accessModes:
      - ReadWriteOnce
    size: 50Gi

caution

The Postgres deployment depends on the penfield-app secrets, which are created when the Penfield app is deployed. If the pod is not in the Running state, deploy the secrets as described in the next section, and then verify the Postgres deployment again.

info

Once the pod is set up, export the Postgres password:

export POSTGRES_PASSWORD=$(kubectl get secret --namespace penfield-app analytics-postgresql -o jsonpath="{.data.postgres-password}" | base64 --decode)

To log in to Postgres from your local machine:

kubectl run analytics-postgresql-client --rm --tty -i --restart='Never' \
--namespace penfield-app --image docker.io/bitnami/postgresql:14.5.0-debian-11-r21 \
--env="PGPASSWORD=$POSTGRES_PASSWORD" \
--command -- psql --host analytics-postgresql -U postgres -d postgres -p 5432

tip

Do not forget to add POSTGRES_USER and POSTGRES_PASSWORD to penfield-secrets when you create the secrets in the upcoming steps.

Deploy Kratos (authentication backend)

Kratos is needed for dashboard authentication and requires a Postgres database. Kratos is deployed through Helm charts. Run the following commands:

  • Add helm repo on local: helm repo add ory https://k8s.ory.sh/helm/charts

  • Update helm repository: helm repo update

  • Find the Kratos chart: helm search repo ory/kratos

  • Create the namespace: kubectl create namespace kratos

  • Create the secrets for sensitive information: kubectl apply -f kratos-secrets.yaml -n kratos

    kratos-secrets.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: kratos-secrets
      annotations:
        helm.sh/hook-weight: "0"
        helm.sh/hook: "pre-install, pre-upgrade"
        helm.sh/hook-delete-policy: "before-hook-creation"
        helm.sh/resource-policy: "keep"
    type: Opaque
    data:
      dsn: <base 64 encoded value> # e.g: postgres://<username>:<password>@analytics-postgresql.penfield-app.svc.cluster.local:5432/postgres
      oidcConfig: <base 64 encoded value> # Only needed if you enable SSO, Update the values in this string: [{"id": "microsoft","provider": "microsoft","client_id": "<Update_client_id_value>","client_secret": "<Update_client_secret_value>","microsoft_tenant": "<Update_microsoft_tenant_value>","mapper_url": "base64://bG9jYWwgY2xhaW1zID0gc3RkLmV4dFZhcignY2xhaW1zJyk7CnsKICBpZGVudGl0eTogewogICAgdHJhaXRzOiB7CiAgICAgIFtpZiAnZW1haWwnIGluIGNsYWltcyB0aGVuICdlbWFpbCcgZWxzZSBudWxsXTogY2xhaW1zLmVtYWlsLAogICAgfSwKICB9LAp9Cg==","scope": ["email"]}]
  • Deploy kratos via helm chart:

    helm upgrade \
    --install kratos ory/kratos \
    --namespace kratos \
    --create-namespace \
    --version=0.43.1 \
    --values kratos-values.yaml
    tip
    • Make sure the Postgres DB can be reached from the Kubernetes cluster.
    • Make sure the kratos-values.yaml file is configured correctly:
      • Replace penfield.example.com with your FQDN everywhere.
      • Make sure the Postgres DSN value is correct.
      • Use a unique value for cookie under secrets.
      • (Optional) Configure a proper SMTP mail server. This is not required.

    For reference kratos-values.yaml file looks like:

    kratos-values.yaml
    ingress:
    public:
    enabled: true
    className: internal-ingress-nginx
    annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    hosts:
    - host: "penfield.example.com" # Replace it with your FQDN
    paths:
    - path: /public(/|$)(.*)
    pathType: ImplementationSpecific

    secret:
    enabled: true
    nameOverride: "kratos-secrets"

    kratos:
    automigration:
    enabled: true
    config:
    serve:
    public:
    base_url: https://penfield.example.com/public/
    cors:
    enabled: true
    admin:
    base_url: https://penfield.example.com/admin/

    selfservice:
    default_browser_return_url: https://penfield.example.com/web/
    allowed_return_urls:
    - https://penfield.example.com/web/

    methods:
    password:
    enabled: true
    link:
    enabled: true
    # Enable OIDC only if you use SSO
    oidc:
    enabled: false
    config:
    providers: []

    flows:
    error:
    ui_url: https://penfield.example.com/web/error

    settings:
    ui_url: https://penfield.example.com/web/settings
    privileged_session_max_age: 15m

    recovery:
    enabled: true
    ui_url: https://penfield.example.com/web/recovery

    verification:
    enabled: true
    ui_url: https://penfield.example.com/web/verify
    after:
    default_browser_return_url: https://penfield.example.com/web/

    logout:
    after:
    default_browser_return_url: https://penfield.example.com/web/welcome

    login:
    ui_url: https://penfield.example.com/web/auth/login
    lifespan: 10m

    registration:
    lifespan: 10m
    ui_url: https://penfield.example.com/web/auth/registration
    after:
    password:
    hooks:
    - hook: session

    log:
    level: trace
    format: json
    leak_sensitive_values: true

    hashers:
    argon2:
    parallelism: 1
    memory: 128MB
    iterations: 2
    salt_length: 16
    key_length: 16

    identity:
    default_schema_id: admin
    schemas:
    - id: admin
    url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBZG1pbiIsCiAgInR5cGUiOiAib2JqZWN0IiwKICAicHJvcGVydGllcyI6IHsKICAgICJ0cmFpdHMiOiB7CiAgICAgICJ0eXBlIjogIm9iamVjdCIsCiAgICAgICJwcm9wZXJ0aWVzIjogewogICAgICAgICJlbWFpbCI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAiZm9ybWF0IjogImVtYWlsIiwKICAgICAgICAgICJ0aXRsZSI6ICJFLU1haWwiLAogICAgICAgICAgIm1pbkxlbmd0aCI6IDMsCiAgICAgICAgICAib3J5LnNoL2tyYXRvcyI6IHsKICAgICAgICAgICAgImNyZWRlbnRpYWxzIjogewogICAgICAgICAgICAgICJwYXNzd29yZCI6IHsKICAgICAgICAgICAgICAgICJpZGVudGlmaWVyIjogdHJ1ZQogICAgICAgICAgICAgIH0KICAgICAgICAgICAgfSwKICAgICAgICAgICAgInZlcmlmaWNhdGlvbiI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9LAogICAgICAgICAgICAicmVjb3ZlcnkiOiB7CiAgICAgICAgICAgICAgInZpYSI6ICJlbWFpbCIKICAgICAgICAgICAgfQogICAgICAgICAgfQogICAgICAgIH0sCiAgICAgICAgImZpcnN0TmFtZSI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAidGl0bGUiOiAiRmlyc3QgTmFtZSIKICAgICAgICB9LAogICAgICAgICJsYXN0TmFtZSI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAidGl0bGUiOiAiTGFzdCBOYW1lIgogICAgICAgIH0KICAgICAgICB9LAogICAgICAgICJyZXF1aXJlZCI6IFsiZW1haWwiXSwKICAgICAgICAiYWRkaXRpb25hbFByb3BlcnRpZXMiOiBmYWxzZQogICAgICB9CiAgICB9CiAgfQo=
    - id: analyst
    url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBbmFseXN0IiwKICAidHlwZSI6ICJvYmplY3QiLAogICJwcm9wZXJ0aWVzIjogewogICAgInRyYWl0cyI6IHsKICAgICAgInR5cGUiOiAib2JqZWN0IiwKICAgICAgInByb3BlcnRpZXMiOiB7CiAgICAgICAgImVtYWlsIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJmb3JtYXQiOiAiZW1haWwiLAogICAgICAgICAgInRpdGxlIjogIkUtTWFpbCIsCiAgICAgICAgICAibWluTGVuZ3RoIjogMywKICAgICAgICAgICJvcnkuc2gva3JhdG9zIjogewogICAgICAgICAgICAiY3JlZGVudGlhbHMiOiB7CiAgICAgICAgICAgICAgInBhc3N3b3JkIjogewogICAgICAgICAgICAgICAgImlkZW50aWZpZXIiOiB0cnVlCiAgICAgICAgICAgICAgfQogICAgICAgICAgICB9LAogICAgICAgICAgICAidmVyaWZpY2F0aW9uIjogewogICAgICAgICAgICAgICJ2aWEiOiAiZW1haWwiCiAgICAgICAgICAgIH0sCiAgICAgICAgICAgICJyZWNvdmVyeSI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9CiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICAiZmlyc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJGaXJzdCBOYW1lIgogICAgICAgIH0sCiAgICAgICAgImxhc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJMYXN0IE5hbWUiCiAgICAgICAgfQogICAgICAgIH0sCiAgICAgICAgInJlcXVpcmVkIjogWyJlbWFpbCJdLAogICAgICAgICJhZGRpdGlvbmFsUHJvcGVydGllcyI6IGZhbHNlCiAgICAgIH0KICAgIH0KICB9Cg==
    - id: automationengineer
    url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBdXRvbWF0aW9uIEVuZ2luZWVyIiwKICAidHlwZSI6ICJvYmplY3QiLAogICJwcm9wZXJ0aWVzIjogewogICAgInRyYWl0cyI6IHsKICAgICAgInR5cGUiOiAib2JqZWN0IiwKICAgICAgInByb3BlcnRpZXMiOiB7CiAgICAgICAgImVtYWlsIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJmb3JtYXQiOiAiZW1haWwiLAogICAgICAgICAgInRpdGxlIjogIkUtTWFpbCIsCiAgICAgICAgICAibWluTGVuZ3RoIjogMywKICAgICAgICAgICJvcnkuc2gva3JhdG9zIjogewogICAgICAgICAgICAiY3JlZGVudGlhbHMiOiB7CiAgICAgICAgICAgICAgInBhc3N3b3JkIjogewogICAgICAgICAgICAgICAgImlkZW50aWZpZXIiOiB0cnVlCiAgICAgICAgICAgICAgfQogICAgICAgICAgICB9LAogICAgICAgICAgICAidmVyaWZpY2F0aW9uIjogewogICAgICAgICAgICAgICJ2aWEiOiAiZW1haWwiCiAgICAgICAgICAgIH0sCiAgICAgICAgICAgICJyZWNvdmVyeSI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9CiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICAiZmlyc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJGaXJzdCBOYW1lIgogICAgICAgIH0sCiAgICAgICAgImxhc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJMYXN0IE5hbWUiCiAgICAgICAgfQogICAgICB9LAogICAgICAicmVxdWlyZWQiOiBbCiAgICAgICAgImVtYWlsIgogICAgICBdLAogICAgICAiYWRkaXRpb25hbFByb3BlcnRpZXMiOiBmYWxzZQogICAgfQogIH0KfQo=

    courier:
    smtp:
    connection_uri: smtpConnectionURI

    ## Enable below config if you enable SSO
    # deployment:
    # extraEnv:
    # - name: SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS
    # valueFrom:
    # secretKeyRef:
    # name: kratos-secrets
    # key: oidcConfig
    cleanup:
    enabled: true
    batchSize: 20000
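
The dsn value in kratos-secrets.yaml is easy to get wrong by hand, so it can help to assemble and encode it in the shell. The sketch below follows the example DSN format shown in the secret's comment; kratos_dsn and encode_b64 are hypothetical helper names, and the user name and password are placeholders.

```shell
# Hypothetical helper: builds the Kratos DSN for the in-cluster Postgres
# service, using the host, port, and database from the example in
# kratos-secrets.yaml. $1 is the user name, $2 the password.
kratos_dsn() {
  printf 'postgres://%s:%s@analytics-postgresql.penfield-app.svc.cluster.local:5432/postgres' "$1" "$2"
}

# Base64-encodes a string on a single line, as Secret data values require.
encode_b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}
```

For example, encode_b64 "$(kratos_dsn postgres 'example-password')" prints the value to paste into the dsn field. If the password contains characters such as @, : or /, it must be percent-encoded first, because those characters have a special meaning in a URL.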

Deploy Milvus

Milvus is used as the vector database. You can deploy Milvus using Helm. Run the following commands:

  • Add helm repo on local: helm repo add milvus https://zilliztech.github.io/milvus-helm/

  • Update helm repository: helm repo update

  • Find milvus repository: helm search repo milvus/milvus

  • Deploy milvus using helm:

    helm upgrade \
    --install milvus milvus/milvus \
    --namespace milvus \
    --create-namespace \
    --version=4.1.32 \
    --values milvus-values.yaml

    For reference milvus-values.yaml file looks like:

    milvus-values.yaml
    ## Enable or disable Milvus Cluster mode
    cluster:
      enabled: false

    etcd:
      replicaCount: 1
      persistence:
        storageClass: managed-premium-pv
        size: 10Gi

    minio:
      mode: standalone
      persistence:
        storageClass: managed-premium-pv
        size: 50Gi

    pulsar:
      enabled: false

    standalone:
      persistence:
        persistentVolumeClaim:
          storageClass: managed-premium-pv
          size: 50Gi
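
The example values files on this page set the storage class to managed-premium-pv, while the Postgres values mention gp3-encrypted-pv as another possible name. If your cluster uses a different storage class, the value has to be changed in every place it appears. The following sketch does this with sed; with_storage_class is a hypothetical helper name, and it assumes each storageClass or storageClassName key sits on its own line, as in the examples here.

```shell
# Hypothetical helper: prints the values file $1 with the value of every
# storageClass / storageClassName key replaced by "$2".
# The input file itself is not modified.
with_storage_class() {
  sed -E "s|^([[:space:]]*storageClass(Name)?:[[:space:]]*).*|\1\"$2\"|" "$1"
}
```

For example, with_storage_class milvus-values.yaml gp3-encrypted-pv > milvus-values.aws.yaml writes an adjusted copy (the output file name is only an illustration). Run kubectl get storageclass to list the classes available in your cluster.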

Deploy Minio

To deploy Minio, you can use helm. Run the following commands:

  • Add helm repo on local: helm repo add minio https://charts.min.io

  • Update helm repository: helm repo update

  • Find minio repository: helm search repo minio/minio

  • Create namespace for minio: kubectl create namespace minio

  • Create secrets for minio: kubectl apply -f minio-secrets.yaml -n minio

    minio-secrets.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: minio-secrets # Must match existingSecret in minio-values.yaml
    type: Opaque
    data:
      rootUser: <base 64 encoded value> # Set a random 32-character alphanumeric value.
      rootPassword: <base 64 encoded value> # Set a random 32-character alphanumeric value.
      userPassword: <base 64 encoded value> # Set a random 32-character alphanumeric value.
      secretKey: <base 64 encoded value> # Set a random alphanumeric value of 40 or more characters.
  • Deploy minio using helm:

    helm upgrade \
    --install minio minio/minio \
    --namespace minio \
    --create-namespace \
    --version=5.2.0 \
    --values minio-values.yaml

    For reference minio-values.yaml file looks like:

    minio-values.yaml
    ## Set default rootUser, rootPassword
    ## .data.rootUser and .data.rootPassword are mandatory,
    existingSecret: minio-secrets

    # Number of MinIO containers running
    replicas: 2

    ## List of users to be created after minio install
    users:
      - accessKey: bucketuser
        existingSecret: minio-secrets
        existingSecretKey: userPassword
        policy: readwrite

    ## List of service accounts to be created after minio install
    svcaccts:
      - accessKey: random-32-alphanumeric-characters-accessKey
        existingSecret: minio-secrets
        existingSecretKey: secretKey
        user: bucketuser

    ## List of buckets to be created after minio install
    buckets:
      - name: general
        policy: none
        purge: false
        versioning: true
        objectlocking: false

    ## Enable persistence using Persistent Volume Claims
    persistence:
      enabled: true
      storageClass: managed-premium-pv
      accessMode: ReadWriteOnce
      size: 10Gi
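
The comments in minio-secrets.yaml ask for random alphanumeric values of 32 or 40 characters. One way to produce them is sketched below, assuming /dev/urandom and the standard head, tr, and cut utilities are available; random_alnum is a hypothetical helper name.

```shell
# Hypothetical helper: prints a random alphanumeric string of length $1.
# It filters 4096 random bytes down to letters and digits and keeps the
# first $1 characters, so it is meant for lengths up to a few hundred.
random_alnum() {
  head -c 4096 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-$1"
}
```

For example, random_alnum 32 can supply rootUser, rootPassword, and userPassword, and random_alnum 40 can supply secretKey. Each value still has to be base64 encoded before it is placed in the Secret.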

Deploy Airflow

To deploy Airflow, you can use helm. Run the following commands:

  • Add helm repo on local: helm repo add apache-airflow https://airflow.apache.org

  • Update helm repository: helm repo update

  • Find airflow repository: helm search repo apache-airflow/airflow

  • Create namespace for airflow: kubectl create namespace airflow

  • Create the secrets for Airflow. Airflow needs multiple secrets. Create them using the commands below:

    kubectl apply -f airflow-secrets.yaml -n airflow

    airflow-secrets.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: airflow-secrets
    type: Opaque
    data:
      fernet-key: <base 64 encoded value> # Set a random value.
      webserver-secret-key: <base 64 encoded value> # Set a random value.
      postgres-password: <base 64 encoded value> # Set a random password value.
      connection: <base 64 encoded value> # e.g: postgresql://postgres:<password>@airflow-postgresql:5432/postgres

    kubectl apply -f penfield-secrets.yaml -n airflow

    penfield-secrets.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: penfield-secrets
    type: Opaque
    data:
      AZURE_OPENAI_API_KEY: <base 64 encoded value> # Use the Azure OpenAI key

    ConfigMap for Airflow: this is used to pass environment variables to the executors. kubectl apply -f airflow-configmap.yaml -n airflow

    airflow-configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: penfield-configmap
    data:
      MONGODB_URI: mongodb://mongo-db.penfield-app.svc.cluster.local:27017
      MONGODB_NAME: penfield_appdb
      NLP_ENDPOINT: http://nlp-service.penfield-app.svc.cluster.local:3000/api/v1/ner
      OPENAI_API_TYPE: azure
      AZURE_OPENAI_API_VERSION: 2023-05-15
      AZURE_OPENAI_ENDPOINT: <URL of the Azure Open AI Endpoint>
      AZURE_OPENAI_DEPLOYMENT: <Deployment under the Azure OpenAI resource> # e.g. gpt-35-turbo-ca
      AZURE_OPENAI_MODEL: <The model name of the corresponding deployment> # e.g. gpt-35-turbo

    ## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##

  • Deploy airflow using helm:

    helm upgrade \
    --install airflow apache-airflow/airflow \
    --namespace airflow \
    --create-namespace \
    --version=1.13.1 \
    --values airflow-values.yaml

    For reference airflow-values.yaml file looks like:

    airflow-values.yaml
    # Default airflow tag to deploy
    defaultAirflowTag: "2.9.0"

    # Default airflow digest. If specified, it takes precedence over tag
    defaultAirflowDigest: ~

    # Airflow version (Used to make some decisions based on Airflow Version being deployed)
    airflowVersion: "2.9.0"

    # Images
    images:
      airflow:
        repository: registry.gitlab.com/penfieldai/data-pipeline/airflow-server
        tag: release-1.0.5
        # pullPolicy: Always
      pod_template:
        repository: registry.gitlab.com/penfieldai/data-pipeline/airflow-worker
        tag: release-1.0.5
        # pullPolicy: Always

    registry:
      secretName: regcred

    config:
      webserver:
        base_url: http://penfield.example.com/airflow
      kubernetes_executor:
        worker_container_repository: '{{ .Values.images.pod_template.repository | default .Values.defaultAirflowRepository }}'
        worker_container_tag: '{{ .Values.images.pod_template.tag | default .Values.defaultAirflowTag }}'

    env:
      - name: "AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG"
        value: "1"
      - name: "AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT"
        value: "false"
    extraEnvFrom: |
      - configMapRef:
          name: penfield-configmap
      - secretRef:
          name: penfield-secrets

    # Airflow executor: One of: LocalExecutor, LocalKubernetesExecutor, CeleryExecutor, KubernetesExecutor, CeleryKubernetesExecutor
    executor: KubernetesExecutor

    # Ingress configuration
    ingress:
      # Configs for the Ingress of the web Service
      web:
        enabled: false
        annotations: {}
        path: "/airflow"
        pathType: "Prefix"
        hosts:
          - name: penfield.example.com
        ingressClassName: "internal-ingress-nginx"
        precedingPaths: []
        succeedingPaths: []

    # Fernet key settings
    # Note: fernetKey can only be set during install, not upgrade
    # Airflow uses Fernet to encrypt passwords in the connection configuration and the variable configuration.
    fernetKey: fernet-key
    fernetKeySecretName: airflow-secrets


    # Flask secret key for Airflow Webserver: `[webserver] secret_key` in airflow.cfg
    webserverSecretKey: webserver-secret-key
    webserverSecretKeySecretName: airflow-secrets

    # Airflow Worker Config
    workers:
      # Number of airflow celery workers in StatefulSet
      replicas: 1

      persistence:
        # Enable persistent volumes
        enabled: true
        # Volume size for worker StatefulSet
        size: 10Gi
        storageClassName: managed-premium-pv
        # Execute init container to chown log directory.
        # This is currently only needed in kind, due to usage
        # of local-path provisioner.
        fixPermissions: false
        # Annotations to add to worker volumes
        annotations: {}
        # Detailed default security context for persistence for container level
        securityContexts:
          container: {}
        # container level lifecycle hooks
        containerLifecycleHooks: {}

    # Configuration for postgresql subchart
    # Not recommended for production
    postgresql:
      enabled: true
      auth:
        enablePostgresUser: true
        existingSecret: airflow-secrets
      primary:
        ## PostgreSQL Primary persistence configuration
        persistence:
          enabled: true
          storageClass: "managed-premium-pv"
          size: 10Gi

    # Configuration for the redis provisioned by the chart
    redis:
      enabled: false

    # Airflow database & redis config
    data:
      metadataSecretName: airflow-secrets

    # Airflow Triggerer Config
    triggerer:
      enabled: true
      # Number of airflow triggerers in the deployment
      replicas: 1
      # Max number of old replicasets to retain
      revisionHistoryLimit: ~
      persistence:
        enabled: true
        size: 10Gi
        storageClassName: managed-premium-pv

    # StatsD settings
    statsd:
      enabled: false

    # This runs as a CronJob to cleanup old pods.
    cleanup:
      enabled: true
      schedule: "@daily"

    logs:
      persistence:
        enabled: false

    ## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
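
The fernet-key entry in airflow-secrets.yaml cannot be an arbitrary string: a Fernet key is 32 random bytes in URL-safe base64 form. The sketch below produces such a key with standard utilities, assuming /dev/urandom and base64 are available; generate_fernet_key is a hypothetical helper name.

```shell
# Hypothetical helper: prints a Fernet key, that is, 32 random bytes
# encoded as URL-safe base64 (44 characters, using - and _ in place of + and /).
generate_fernet_key() {
  head -c 32 /dev/urandom | base64 | tr '+/' '-_' | tr -d '\n'
}
```

The printed key is the plain value. As with every other entry under data in a Secret, it still has to be base64 encoded once more before it is placed in airflow-secrets.yaml, for example with generate_fernet_key | base64.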

Deploy Monitoring stack

Deploy Prometheus and Grafana:

Prometheus is used to collect metrics and Grafana to visualize them. This is helpful for troubleshooting. To deploy Prometheus and Grafana, run the following commands:

  • Add helm repo: helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  • Update helm repository: helm repo update
  • Find the kube-prometheus-stack chart: helm search repo prometheus-community/kube-prometheus-stack
  • Deploy configmap for custom dashboard:
kubectl create ns monitoring
kubectl apply -f loki-logs-dashboard.yaml -n monitoring

For reference loki-logs-dashboard.yaml file looks like:

loki-logs-dashboard.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: loki-logs-grafana-dashboard
namespace: monitoring
labels:
grafana_dashboard: "1"
data:
loki-dashboard.json: |-
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "Loki logs panel dashboard with prometheus variables ",
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"id": 29,
"iteration": 1633543882969,
"links": [],
"panels": [
{
"aliasColors": {},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "Loki",
"description": "Time frame",
"fieldConfig": {
"defaults": {
"unit": "dateTimeFromNow"
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 4,
"w": 24,
"x": 0,
"y": 0
},
"hiddenSeries": false,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": false,
"linewidth": 1,
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "8.1.5",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(count_over_time({namespace=\"$namespace\", instance=~\"$pod\"} |~ \"$search\"[$__interval]))",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Time Frame",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:37",
"format": "dateTimeFromNow",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
},
{
"$$hashKey": "object:38",
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"datasource": "Loki",
"description": "Live logs from Kubernetes pod with some delay.",
"gridPos": {
"h": 21,
"w": 24,
"x": 0,
"y": 4
},
"id": 2,
"options": {
"dedupStrategy": "none",
"enableLogDetails": true,
"prettifyLogMessage": false,
"showCommonLabels": false,
"showLabels": false,
"showTime": true,
"sortOrder": "Descending",
"wrapLogMessage": false
},
"targets": [
{
"expr": "{namespace=\"$namespace\", pod=~\"$pod\"} |~ \"$search\"",
"refId": "A"
}
],
"title": "Logs Panel",
"type": "logs"
}
],
"refresh": false,
"schemaVersion": 30,
"style": "dark",
"tags": [],
"templating": {
"list": [
{
"allValue": null,
"current": {
"selected": false,
"text": "penfield-app",
"value": "penfield-app"
},
"datasource": null,
"definition": "label_values(kube_pod_info, namespace)",
"description": "K8 namespace",
"error": null,
"hide": 0,
"includeAll": false,
"label": "",
"multi": false,
"name": "namespace",
"options": [],
"query": {
"query": "label_values(kube_pod_info, namespace)",
"refId": "StandardVariableQuery"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
},
{
"allValue": ".*",
"current": {
"selected": true,
"text": [
"All"
],
"value": [
"$__all"
]
},
"datasource": null,
"definition": "label_values(container_network_receive_bytes_total{namespace=~\"$namespace\"},pod)",
"description": "Pods inside namespace",
"error": null,
"hide": 0,
"includeAll": true,
"label": null,
"multi": true,
"name": "pod",
"options": [],
"query": {
"query": "label_values(container_network_receive_bytes_total{namespace=~\"$namespace\"},pod)",
"refId": "StandardVariableQuery"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
},
{
"current": {
"selected": true,
"text": "",
"value": ""
},
"description": "Search keywords",
"error": null,
"hide": 0,
"label": null,
"name": "search",
"options": [
{
"selected": true,
"text": "",
"value": ""
}
],
"query": "",
"skipUrlSync": false,
"type": "textbox"
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Loki / Dashboard for k8 Logs",
"uid": "vmMdG2vnk",
"version": 1
}

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
  • Deploy the prometheus stack using helm:
helm upgrade \
--install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace \
--version=59.0.0 \
--values kube-prometheus-stack-values.yaml

For reference, the kube-prometheus-stack-values.yaml file looks like the following. Replace the domain penfield.example.com with your FQDN.

kube-prometheus-stack-values.yaml
#### EKS Specific #####
## Component scraping the kubelet and kubelet-hosted cAdvisor
##
kubelet:
## This is disabled by default because container metrics are already exposed by cAdvisor
servicemonitor:
resource: false

## Component scraping the kube controller manager
##
kubeControllerManager:
enabled: false

## Component scraping kube scheduler
##
kubeScheduler:
enabled: false

## Component scraping kube proxy
##
kubeProxy:
enabled: false
#####################

## Configuration for alertmanager
## ref: https://prometheus.io/docs/alerting/alertmanager/
##
alertmanager:
  enabled: false

## Using default values from https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml
##
grafana:
  enabled: true
  ## Deploy default dashboards
  ##
  defaultDashboardsEnabled: true

  ## Timezone for the default dashboards
  ## Other options are: browser or a specific timezone, i.e. Europe/Luxembourg
  ##
  defaultDashboardsTimezone: utc

  # adminPassword: prom-operator
  adminUser: admin
  adminPassword:
  admin:
    existingSecret: ""
    userKey: admin-user
    passwordKey: admin-password

  grafana.ini:
    server:
      domain: penfield.example.com
      root_url: "%(protocol)s://%(domain)s/grafana"
      serve_from_sub_path: true

  ingress:
    ## If true, Grafana Ingress will be created
    enabled: true
    ingressClassName: internal-ingress-nginx
    # annotations:
    #   nginx.ingress.kubernetes.io/rewrite-target: /$2
    labels: {}
    hosts:
      - penfield.example.com
    ## Path for grafana ingress
    path: "/grafana"
    pathType: ImplementationSpecific

  sidecar:
    dashboards:
      enabled: true
      label: grafana_dashboard
    datasources:
      enabled: true
      defaultDatasourceEnabled: true
      label: grafana_datasource

## Configuration for kube-state-metrics subchart
##
kube-state-metrics:
  namespaceOverride: ""
  rbac:
    create: true
  podSecurityPolicy:
    enabled: true

## Manages Prometheus and Alertmanager components
##
prometheusOperator:
  enabled: true
  ## Resource limits & requests
  ##
  resources:
    limits:
      cpu: 200m
      memory: 200Mi
    requests:
      cpu: 100m
      memory: 100Mi
  nodeSelector: {}

## Deploy a Prometheus instance
##
prometheus:

  enabled: true

  ## Service account for Prometheuses to use.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
  ##
  serviceAccount:
    create: true
    name: "prometheus-operator-prometheus"
    annotations: {}

  ingress:
    enabled: false

  ## Settings affecting prometheusSpec
  ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#prometheusspec
  ##
  prometheusSpec:
    ruleSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    probeSelectorNilUsesHelmValues: false

    ## How long to retain metrics
    ##
    retention: 15d

    ## Maximum size of metrics
    ##
    retentionSize: ""

    ## Resource limits & requests
    ##
    resources:
      requests:
        memory: 400Mi
      limits:
        memory: 800Mi

    ## Prometheus StorageSpec for persistent data
    ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/storage.md
    ##
    storageSpec:
      ## Using PersistentVolumeClaim
      ##
      volumeClaimTemplate:
        spec:
          # storageClassName: managed-premium-pv
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
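
The placeholder domain penfield.example.com appears twice in the values file above (under grafana.ini and under the ingress hosts), so it is easy to update one and miss the other. The following is a minimal sketch of a helper that replaces every occurrence in one pass. The function name set_fqdn and the example FQDN are hypothetical and not part of the Penfield repository.

```shell
# Hypothetical helper: replace the placeholder domain in a values file
# with your own FQDN. Usage: set_fqdn <values-file> <fqdn>
set_fqdn() {
  values_file="$1"
  fqdn="$2"
  tmp="$(mktemp)" || return 1
  # The dots are escaped so they match literally. Writing through a temp
  # file avoids the GNU/BSD differences of "sed -i".
  sed "s/penfield\.example\.com/${fqdn}/g" "$values_file" > "$tmp" \
    && cat "$tmp" > "$values_file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example (hypothetical FQDN):
#   set_fqdn kube-prometheus-stack-values.yaml penfield.mycompany.com
#   grep -n "penfield.mycompany.com" kube-prometheus-stack-values.yaml
```

Run the helper before the helm upgrade command so that Grafana's root_url and the ingress host agree.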

After this deployment completes, Grafana and Prometheus are running in the monitoring namespace. You should be able to access Grafana at https://FQDN/grafana. Grafana sets a random admin password at creation; you can retrieve it with this command:

kubectl get secret --namespace monitoring kube-prometheus-stack-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

Deploy Loki for Logging:

This sets up the Loki stack, which includes Loki and Promtail.

  • Add helm repo: helm repo add grafana https://grafana.github.io/helm-charts
  • Update helm repository: helm repo update
  • Search repo: helm search repo grafana/loki-stack
  • Deploy the logging stack using helm:
helm upgrade \
--install loki-stack grafana/loki-stack \
--namespace monitoring \
--create-namespace \
--version=2.10.2 \
--values loki-stack-values.yaml

For reference, the loki-stack-values.yaml file looks like:

loki-stack-values.yaml
loki:
  enabled: true
  isDefault: false
  persistence:
    enabled: true
    accessModes:
      - ReadWriteOnce
    size: 50Gi
    # storageClassName: managed-premium-pv
  config:
    # This will work with newer versions of Loki like Loki 3.0.0;
    # table_manager will be deprecated in the newer versions
    # limits_config:
    #   retention_period: 48h

    compactor:
      retention_enabled: true
      retention_delete_delay: 2h
    table_manager:
      retention_deletes_enabled: true
      retention_period: 15d # 15 days

promtail:
  enabled: true

fluent-bit:
  enabled: false

## This creates the Loki data source configmap and adds it to the kube prometheus stack
grafana:
  enabled: false
  sidecar:
    datasources:
      enabled: true
      maxLines: 1000

prometheus:
  enabled: false

filebeat:
  enabled: false

logstash:
  enabled: false

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##

Deploy Penfield App

Deploy the Penfield app components in the following order:

  • Create penfield-app namespace (skip if you have created it already)
  • Deploy penfield-app regcred
  • Deploy penfield-app secrets
  • Deploy kafka cluster
  • Deploy penfield app
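
The order above can be sketched as a single script. This is only an illustration built from the commands shown in this guide: the run and deploy_penfield function names and the DRY_RUN switch are hypothetical, and DOCKER_USERNAME and DOCKER_PASSWORD are assumed to be exported with the credentials you received from Penfield. By default the sketch only prints the commands, with the credentials replaced by placeholders.

```shell
# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

deploy_penfield() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Never echo real credentials in dry-run output.
    user="<Username>"; pass="<Password>"
  else
    user="$DOCKER_USERNAME"; pass="$DOCKER_PASSWORD"
  fi

  # Steps 1 and 2 may already exist, so a failure here does not stop the run.
  run kubectl create namespace penfield-app || true
  run kubectl create secret docker-registry regcred \
    --docker-server=https://registry.gitlab.com \
    --docker-username="$user" \
    --docker-password="$pass" \
    -n penfield-app || true

  # Steps 3 to 5 stop at the first failure.
  run kubectl apply -f penfield-secrets.yaml -n penfield-app &&
  run kubectl apply -f kafka-cluster.yaml -n penfield-app &&
  run helm upgrade --install penfield-app penfieldai/penfieldai \
    --namespace penfield-app --create-namespace \
    --version=1.0.7 --values penfield-values.yaml
}

# Preview the commands:   deploy_penfield
# Execute them for real:  DRY_RUN=0 deploy_penfield
```

Each step is described in detail in the sections that follow; the sketch assumes the YAML and values files are in the current directory.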

Create penfield-app namespace (skip if you have created it already)

To create namespace, run the following command: kubectl create namespace penfield-app

Deploy penfield-app regcred (skip if it is already created)

Regcred will be used to fetch the images from PenfieldAI image repository. To deploy regcred, run the following command:

tip

Username and Password must be provided by PenfieldAI.

kubectl create secret docker-registry regcred \
--docker-server=https://registry.gitlab.com \
--docker-username=<Username> \
--docker-password=<Password> \
-n penfield-app
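
If image pulls fail later, it can help to confirm what the regcred secret actually contains. A docker-registry secret stores a .dockerconfigjson document whose auth field is the base64 encoding of username:password. The sketch below computes that value locally so you can compare it with the decoded secret; the function name registry_auth is hypothetical.

```shell
# Hypothetical helper: compute the "auth" value that a docker-registry
# secret stores for a given username and password.
registry_auth() {
  # printf avoids the trailing newline that echo would add;
  # tr removes the line wrapping some base64 implementations apply.
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Compare against the deployed secret (requires cluster access):
#   kubectl get secret regcred -n penfield-app \
#     -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode
#   registry_auth "<Username>" "<Password>"
```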

Deploy penfield-app secrets

Secrets provide sensitive data to the pods running in Kubernetes. The penfield-secrets secret has already been created; in this step you modify it to add the additional secrets that are required.

tip

When deploying secrets, not all of the variables are needed. Depending on which services are being deployed in the Penfield app, you only need specific variables.

For reference, the penfield-secrets.yaml file looks like:

apiVersion: v1
kind: Secret
metadata:
  name: penfield-secrets
type: Opaque
data:
  AZURE_OPENAI_API_KEY: <base 64 encoded value> # Needed for multiple services [Pathfinder Only]
  MINIO_ACCESS_KEY: <base 64 encoded value> # Needed for web service [Pathfinder Only]
  MINIO_SECRET_KEY: <base 64 encoded value> # Needed for web service [Pathfinder Only]
  POSTGRES_PASSWORD: <base 64 encoded value> # Needed for datapipeline, soc-data-analytics-service, kratos [Pathfinder Only]
  POSTGRES_USER: <base 64 encoded value> # Needed for datapipeline, soc-data-analytics-service, kratos [Pathfinder Only]

To deploy secrets, run the following command:

kubectl apply -f penfield-secrets.yaml -n penfield-app
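
Every value under data in penfield-secrets.yaml must be base64 encoded. A common mistake is to encode with echo, which appends a newline that then becomes part of the secret. The following is a minimal sketch of an encoding helper; the function name b64 and the sample value are hypothetical.

```shell
# Hypothetical helper: base64-encode a value for the "data" section of a Secret.
b64() {
  # printf '%s' emits the value without a trailing newline;
  # tr strips the line wrapping applied to long values such as API keys.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with a placeholder value:
#   b64 'my-postgres-user'
# Decode a value again to double-check it:
#   b64 'my-postgres-user' | base64 --decode
```

Paste the output in place of the corresponding <base 64 encoded value> placeholder, and keep the resulting file out of version control.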

Deploy kafka cluster

At this stage the Kafka operator is already installed. Use the following command to install the Kafka cluster in the penfield-app namespace:

kubectl apply -f kafka-cluster.yaml -n penfield-app

For reference, the kafka-cluster.yaml file looks like:

kafka-cluster.yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    version: 3.7.0
    replicas: 1
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      inter.broker.protocol.version: "3.7"
      # 9 GiB. Kafka applies this limit per partition, so it equals a per-topic limit only for single-partition topics.
      # Currently we have ~10 topics, so roughly 90 GiB of disk in total. Old log segments are deleted once a log reaches 9 GiB.
      log.retention.bytes: 9663676416

    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}

## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
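
The log.retention.bytes value and the 100Gi volume in the file above are related by simple arithmetic, which is worth re-checking if you change the number of topics or the disk size. The sketch below reproduces that calculation. The function names are hypothetical, and it assumes one partition per topic, since Kafka applies log.retention.bytes per partition.

```shell
# Convert GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print "yes" if <count> logs capped at <per_log_gib> GiB each fit
# into a volume of <volume_gib> GiB, otherwise "no".
retention_fits() {
  per_log_gib="$1"; count="$2"; volume_gib="$3"
  if [ $(( per_log_gib * count )) -le "$volume_gib" ]; then
    echo yes
  else
    echo no
  fi
}

# Examples:
#   gib_to_bytes 9          # value to use for log.retention.bytes
#   retention_fits 9 10 100 # 10 topics at 9 GiB each on a 100Gi volume
```

Kafka deletes whole log segments, so actual disk usage can temporarily exceed the configured limit; leaving some headroom on the volume, as the 100Gi size does, is a reasonable precaution.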

Deploy penfield app

tip

Use the same Username and Password provided by PenfieldAI which are used to deploy regcred in the previous step.

The PenfieldAI app is deployed using Helm. Once you receive the Helm chart, go to the Helm chart directory and perform the steps below.

  • Add helm repo: helm repo add --username <username> --password <password> penfieldai https://gitlab.com/api/v4/projects/28749615/packages/helm/stable
  • Update helm repository: helm repo update
  • Find helm repository: helm search repo penfieldai/penfieldai
  • Deploy penfieldai via helm chart:
helm upgrade \
--install penfield-app penfieldai/penfieldai \
--namespace penfield-app \
--create-namespace \
--version=1.0.7 \
--values penfield-values.yaml

tip

For the penfield-values.yaml file, refer to the Helm chart values. You can run helm show values penfieldai/penfieldai > penfield-values.yaml to export the values file and edit it as required. Depending on the type of deployment, enable only the required services in the values file. You will need either a Pathfinder or a Pathfinder+ITSM deployment. To prepare the values file, please refer to this guide.