Install application on K8s
Once the cloud infrastructure is in place and access to the Kubernetes cluster is verified, you can start setting up the required dependencies and the Penfield application on the cluster. There are two ways to install it on the Kubernetes cluster:
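If you have not verified cluster access yet, a quick sanity check (a minimal sketch, assuming your kubeconfig already points at the target cluster) looks like this:
# Confirm the API server is reachable and the nodes are Ready
kubectl cluster-info
kubectl get nodes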
Install using Helmfile (recommended)
Clone this GitHub repo, which contains the code you need, and follow the steps below:
Never commit your actual secret values to version control such as git. The secrets folder contains the secrets; make sure you never commit this folder.
Make sure you have the following information handy before you proceed:
Name | Description |
---|---|
CLOUD_PROVIDER | Your cloud provider. e.g. azure or aws |
DOCKER_USERNAME | You should have received this username from Penfield. |
DOCKER_PASSWORD | You should have received this password from Penfield. |
AZURE_OPENAI_API_KEY | Your Azure OpenAI Key from your Azure OpenAI deployment. |
AZURE_OPENAI_ENDPOINT | Your Azure OpenAI endpoint URL. |
AZURE_OPENAI_DEPLOYMENT | Deployment name of your Azure OpenAI deployment. e.g. gpt-35-turbo |
AZURE_OPENAI_MODEL | The model name of the corresponding deployment. e.g. gpt-35-turbo |
FQDN | Fully Qualified Domain Name that you are using for your Penfield application. |
SSL_CERTIFICATE | Only required if your cloud provider is Azure. Create files named ssl.key and ssl.crt in the /certs directory with your FQDN SSL certificate. |
RELEASE_TAG | You should have received this from Penfield. The release tag that you want to use for the Penfield application. e.g. 1.0.10 |
DEPLOYMENT_TIER | Deployment tier that you want to use. e.g. pathfinder or standard. The pathfinder tier is used for a Pathfinder-only deployment and the standard tier for a Pathfinder + ITSM deployment. |
INTEGRATION_SOURCE | The type of ITSM integration you are using. e.g. helix or sentinel or jira. |
Once you have the above information, go to the folder where you cloned the GitHub repo. Execute the bash script in your terminal by running the following command and follow the instructions:
./run.sh
Once done, please come back here for the next steps.
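To confirm the install finished cleanly, you can check that all pods came up (a quick check; the namespace name assumes the default used by this guide):
# Application pods should reach the Running state
kubectl get pods -n penfield-app
# Anything cluster-wide that is not Running/Completed is still starting or failed
kubectl get pods -A | grep -Ev 'Running|Completed'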
Uninstall deployment
To delete the deployments from the Kubernetes cluster:
./run.sh -d
Grafana
Grafana is deployed as part of kube-prometheus-stack. You should be able to access Grafana at https://FQDN/grafana.
Grafana sets a random admin password at creation. You can find the grafana password using this command:
kubectl get secret \
--namespace monitoring grafana-secrets \
-o jsonpath="{.data.admin-password}" | base64 --decode ; echo
If you deployed the dependencies using Helmfile, please skip the next section and go to the next page.
Install using Helm
The following services need to be installed on the Kubernetes cluster:
Nginx Ingress Controller
The Nginx Ingress Controller is used to route ingress traffic into the cluster. You can use helm to install the Nginx Ingress Controller with custom values. Run the below commands:
- Add helm repo on local:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
- Update helm repository:
helm repo update
- Find nginx ingress controller:
helm search repo ingress-nginx
- Install nginx ingress controller:
helm upgrade \
--install internal-ingress-nginx ingress-nginx/ingress-nginx \
--namespace internal-ingress-nginx \
--create-namespace \
--version=4.10.1 \
--values internal_ingress_nginx_values.yaml
- The values file internal_ingress_nginx_values.yaml looks like the following. Depending on your cloud provider, you can uncomment either the NodePort implementation (for AWS) or the internal LB implementation (for Azure):
controller:
config:
server-tokens: "false"
use-forwarded-headers: "true"
extraArgs:
metrics-per-host: "false"
ingressClassResource:
name: internal-ingress-nginx
kind: DaemonSet
lifecycle:
preStop:
exec:
command:
- /bin/sh
- -c
- sleep 60; /sbin/nginx -c /etc/nginx/nginx.conf -s quit; while pgrep -x nginx; do sleep 1; done
metrics:
enabled: true
##### NodePort Implementation (For AWS)
# service:
# enableHttps: false
# externalTrafficPolicy: Local
# type: NodePort
# nodePorts:
# http: "31080"
##### Internal LB Implementation (For Azure)
# service:
# enabled: true
# annotations: {
# service.beta.kubernetes.io/azure-load-balancer-internal: "true"
# }
# type: LoadBalancer
#####
defaultBackend:
enabled: true
resources:
requests:
memory: 16Mi
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
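After the install, you can verify that the controller pods are up and the ingress class was registered (assuming the namespace and class name used above):
kubectl get pods -n internal-ingress-nginx
kubectl get ingressclass internal-ingress-nginx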
Deploy Kafka operator
Kafka is needed to run the PenfieldAI application. Setting up Kafka involves two pieces: the Kafka operator, which you set up now, and the Kafka cluster, which you set up later in this section.
You can use strimzi kafka for the deployment via helm. Perform the below steps to deploy kafka by running the following commands:
- Add helm repo on local:
helm repo add strimzi https://strimzi.io/charts/
- Update helm repository:
helm repo update
- Find kafka repository:
helm search repo strimzi
- Create a kafka operator:
helm upgrade \
--install strimzi-kafka strimzi/strimzi-kafka-operator \
--namespace strimzi-kafka \
--create-namespace \
--version 0.41.0 \
--values strimzi-kafka-operator-values.yaml
For reference, the strimzi-kafka-operator-values.yaml file looks like:
watchNamespaces: []
watchAnyNamespace: true
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
Wait a few minutes until the kafka operator starts running; this is needed before you can deploy the kafka cluster.
Verify: It will take a few minutes for the kafka operator to fully start. To verify the setup, you can run the following command:
kubectl get pods -n strimzi-kafka
If you see output like the below, the kafka operator is up and running:
NAME READY STATUS RESTARTS AGE
strimzi-cluster-operator-c8ddd59d-rwffr 1/1 Running 0 4m8s
Deploy Postgres DB
You need to create the new credentials that you want to use for the Postgres DB that will be deployed in the cluster. The same secret will be modified when you deploy the Penfield app later on.
Deploy penfield-app secrets
Create a penfield-secrets.yaml file with the below code:
apiVersion: v1
kind: Secret
metadata:
name: penfield-secrets
type: Opaque
data:
# All values must be base64 encoded.
POSTGRES_USER: <base 64 encoded value>
POSTGRES_PASSWORD: <base 64 encoded value>
Use this command to encode a plain text string into a base64 string:
echo -n 'plainTextStringToEncode' | base64
To deploy secrets, run the following command:
kubectl create namespace penfield-app
kubectl apply -f penfield-secrets.yaml -n penfield-app
To deploy the Postgres DB in the Kubernetes cluster, please follow the below steps:
- Add helm repo:
helm repo add bitnami https://charts.bitnami.com/bitnami
- Update helm repository:
helm repo update
- Search repo:
helm search repo bitnami/postgresql
- Deploy postgres using helm:
helm upgrade \
--install analytics bitnami/postgresql \
--namespace penfield-app \
--create-namespace \
--version 11.9.13 \
--values penfield-app/postgres-values.yaml
For reference, the postgres-values.yaml file looks like:
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
global:
## Name of the storage class you want to use.
## eg: gp3-encrypted-pv or managed-premium-pv
storageClass: ""
postgresql:
auth:
existingSecret: "penfield-secrets"
secretKeys:
adminPasswordKey: "POSTGRES_PASSWORD"
primary:
persistence:
enabled: true
## Name of the storage class you want to use.
## eg: gp3-encrypted-pv or managed-premium-pv
storageClass: ""
accessModes:
- ReadWriteOnce
size: 50Gi
The Postgres deployment depends on the penfield-app secrets that get created when we deploy the Penfield app. If the pod is not in a Running state, deploy the secrets as described in the next section and then verify the Postgres deployment.
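To check the pod status, a quick check (the label assumes the standard labels set by bitnami charts and the release name analytics from the helm command above):
kubectl get pods -n penfield-app -l app.kubernetes.io/instance=analytics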
Once the pod is up, export the postgres password:
export POSTGRES_PASSWORD=$(kubectl get secret --namespace penfield-app analytics-postgresql -o jsonpath="{.data.postgres-password}" | base64 --decode)
To log in to postgres from local:
kubectl run analytics-postgresql-client --rm --tty -i --restart='Never' \
--namespace penfield-app --image docker.io/bitnami/postgresql:14.5.0-debian-11-r21 \
--env="PGPASSWORD=$POSTGRES_PASSWORD" \
--command -- psql --host analytics-postgresql -U postgres -d postgres -p 5432
Do not forget to add POSTGRES_USER and POSTGRES_PASSWORD to penfield-secrets when you create the secrets in the upcoming steps.
Deploy Kratos (authentication backend)
Kratos is needed for dashboard authentication. The postgres database is required before deploying Kratos. Kratos is deployed through helm charts. Perform the below steps to deploy kratos by running the following commands:
- Add helm repo on local:
helm repo add ory https://k8s.ory.sh/helm/charts
- Update helm repository:
helm repo update
- Find kratos repository:
helm search repo ory/kratos
- Create the namespace:
kubectl create namespace kratos
- Create the secrets for sensitive info:
kubectl apply -f kratos-secrets.yaml -n kratos
For reference, the kratos-secrets.yaml file looks like:
apiVersion: v1
kind: Secret
metadata:
name: kratos-secrets
annotations:
helm.sh/hook-weight: "0"
helm.sh/hook: "pre-install, pre-upgrade"
helm.sh/hook-delete-policy: "before-hook-creation"
helm.sh/resource-policy: "keep"
type: Opaque
data:
dsn: <base 64 encoded value> # e.g: postgres://<username>:<password>@analytics-postgresql.penfield-app.svc.cluster.local:5432/postgres
oidcConfig: <base 64 encoded value> # Only needed if you enable SSO. Update the values in this string: [{"id": "microsoft","provider": "microsoft","client_id": "<Update_client_id_value>","client_secret": "<Update_client_secret_value>","microsoft_tenant": "<Update_microsoft_tenant_value>","mapper_url": "base64://bG9jYWwgY2xhaW1zID0gc3RkLmV4dFZhcignY2xhaW1zJyk7CnsKICBpZGVudGl0eTogewogICAgdHJhaXRzOiB7CiAgICAgIFtpZiAnZW1haWwnIGluIGNsYWltcyB0aGVuICdlbWFpbCcgZWxzZSBudWxsXTogY2xhaW1zLmVtYWlsLAogICAgfSwKICB9LAp9Cg==","scope": ["email"]}]
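For example, the dsn value can be produced like this (placeholder credentials shown; substitute your own):
echo -n 'postgres://<username>:<password>@analytics-postgresql.penfield-app.svc.cluster.local:5432/postgres' | base64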
- Deploy kratos via helm chart:
helm upgrade \
--install kratos ory/kratos \
--namespace kratos \
--create-namespace \
--version=0.43.1 \
--values kratos-values.yaml
Tip:
- Make sure the PostgresDB can be reached from the Kubernetes cluster.
- Make sure the kratos-values.yaml file is configured correctly. Replace FQDN with correct values.
- Replace penfield.example.com with your correct FQDN everywhere.
- Make sure the postgres DSN value is correct.
- Use a unique value for cookie under secrets.
- (optional) A proper SMTP mail server; this is not needed.
For reference, the kratos-values.yaml file looks like:
ingress:
public:
enabled: true
className: internal-ingress-nginx
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /$2
hosts:
- host: "penfield.example.com" # Replace it with your FQDN
paths:
- path: /public(/|$)(.*)
pathType: ImplementationSpecific
secret:
enabled: true
nameOverride: "kratos-secrets"
kratos:
automigration:
enabled: true
config:
serve:
public:
base_url: https://penfield.example.com/public/
cors:
enabled: true
admin:
base_url: https://penfield.example.com/admin/
selfservice:
default_browser_return_url: https://penfield.example.com/web/
allowed_return_urls:
- https://penfield.example.com/web/
methods:
password:
enabled: true
link:
enabled: true
# Enable OIDC only if you use SSO
oidc:
enabled: false
config:
providers: []
flows:
error:
ui_url: https://penfield.example.com/web/error
settings:
ui_url: https://penfield.example.com/web/settings
privileged_session_max_age: 15m
recovery:
enabled: true
ui_url: https://penfield.example.com/web/recovery
verification:
enabled: true
ui_url: https://penfield.example.com/web/verify
after:
default_browser_return_url: https://penfield.example.com/web/
logout:
after:
default_browser_return_url: https://penfield.example.com/web/welcome
login:
ui_url: https://penfield.example.com/web/auth/login
lifespan: 10m
registration:
lifespan: 10m
ui_url: https://penfield.example.com/web/auth/registration
after:
password:
hooks:
- hook: session
log:
level: trace
format: json
leak_sensitive_values: true
hashers:
argon2:
parallelism: 1
memory: 128MB
iterations: 2
salt_length: 16
key_length: 16
identity:
default_schema_id: admin
schemas:
- id: admin
url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBZG1pbiIsCiAgInR5cGUiOiAib2JqZWN0IiwKICAicHJvcGVydGllcyI6IHsKICAgICJ0cmFpdHMiOiB7CiAgICAgICJ0eXBlIjogIm9iamVjdCIsCiAgICAgICJwcm9wZXJ0aWVzIjogewogICAgICAgICJlbWFpbCI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAiZm9ybWF0IjogImVtYWlsIiwKICAgICAgICAgICJ0aXRsZSI6ICJFLU1haWwiLAogICAgICAgICAgIm1pbkxlbmd0aCI6IDMsCiAgICAgICAgICAib3J5LnNoL2tyYXRvcyI6IHsKICAgICAgICAgICAgImNyZWRlbnRpYWxzIjogewogICAgICAgICAgICAgICJwYXNzd29yZCI6IHsKICAgICAgICAgICAgICAgICJpZGVudGlmaWVyIjogdHJ1ZQogICAgICAgICAgICAgIH0KICAgICAgICAgICAgfSwKICAgICAgICAgICAgInZlcmlmaWNhdGlvbiI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9LAogICAgICAgICAgICAicmVjb3ZlcnkiOiB7CiAgICAgICAgICAgICAgInZpYSI6ICJlbWFpbCIKICAgICAgICAgICAgfQogICAgICAgICAgfQogICAgICAgIH0sCiAgICAgICAgImZpcnN0TmFtZSI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAidGl0bGUiOiAiRmlyc3QgTmFtZSIKICAgICAgICB9LAogICAgICAgICJsYXN0TmFtZSI6IHsKICAgICAgICAgICJ0eXBlIjogInN0cmluZyIsCiAgICAgICAgICAidGl0bGUiOiAiTGFzdCBOYW1lIgogICAgICAgIH0KICAgICAgICB9LAogICAgICAgICJyZXF1aXJlZCI6IFsiZW1haWwiXSwKICAgICAgICAiYWRkaXRpb25hbFByb3BlcnRpZXMiOiBmYWxzZQogICAgICB9CiAgICB9CiAgfQo=
- id: analyst
url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBbmFseXN0IiwKICAidHlwZSI6ICJvYmplY3QiLAogICJwcm9wZXJ0aWVzIjogewogICAgInRyYWl0cyI6IHsKICAgICAgInR5cGUiOiAib2JqZWN0IiwKICAgICAgInByb3BlcnRpZXMiOiB7CiAgICAgICAgImVtYWlsIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJmb3JtYXQiOiAiZW1haWwiLAogICAgICAgICAgInRpdGxlIjogIkUtTWFpbCIsCiAgICAgICAgICAibWluTGVuZ3RoIjogMywKICAgICAgICAgICJvcnkuc2gva3JhdG9zIjogewogICAgICAgICAgICAiY3JlZGVudGlhbHMiOiB7CiAgICAgICAgICAgICAgInBhc3N3b3JkIjogewogICAgICAgICAgICAgICAgImlkZW50aWZpZXIiOiB0cnVlCiAgICAgICAgICAgICAgfQogICAgICAgICAgICB9LAogICAgICAgICAgICAidmVyaWZpY2F0aW9uIjogewogICAgICAgICAgICAgICJ2aWEiOiAiZW1haWwiCiAgICAgICAgICAgIH0sCiAgICAgICAgICAgICJyZWNvdmVyeSI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9CiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICAiZmlyc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJGaXJzdCBOYW1lIgogICAgICAgIH0sCiAgICAgICAgImxhc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJMYXN0IE5hbWUiCiAgICAgICAgfQogICAgICAgIH0sCiAgICAgICAgInJlcXVpcmVkIjogWyJlbWFpbCJdLAogICAgICAgICJhZGRpdGlvbmFsUHJvcGVydGllcyI6IGZhbHNlCiAgICAgIH0KICAgIH0KICB9Cg==
- id: automationengineer
url: base64://ewogICIkaWQiOiAiaHR0cHM6Ly9zY2hlbWFzLm9yeS5zaC9wcmVzZXRzL2tyYXRvcy9xdWlja3N0YXJ0L2VtYWlsLXBhc3N3b3JkL2lkZW50aXR5LnNjaGVtYS5qc29uIiwKICAiJHNjaGVtYSI6ICJodHRwOi8vanNvbi1zY2hlbWEub3JnL2RyYWZ0LTA3L3NjaGVtYSMiLAogICJ0aXRsZSI6ICJBdXRvbWF0aW9uIEVuZ2luZWVyIiwKICAidHlwZSI6ICJvYmplY3QiLAogICJwcm9wZXJ0aWVzIjogewogICAgInRyYWl0cyI6IHsKICAgICAgInR5cGUiOiAib2JqZWN0IiwKICAgICAgInByb3BlcnRpZXMiOiB7CiAgICAgICAgImVtYWlsIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJmb3JtYXQiOiAiZW1haWwiLAogICAgICAgICAgInRpdGxlIjogIkUtTWFpbCIsCiAgICAgICAgICAibWluTGVuZ3RoIjogMywKICAgICAgICAgICJvcnkuc2gva3JhdG9zIjogewogICAgICAgICAgICAiY3JlZGVudGlhbHMiOiB7CiAgICAgICAgICAgICAgInBhc3N3b3JkIjogewogICAgICAgICAgICAgICAgImlkZW50aWZpZXIiOiB0cnVlCiAgICAgICAgICAgICAgfQogICAgICAgICAgICB9LAogICAgICAgICAgICAidmVyaWZpY2F0aW9uIjogewogICAgICAgICAgICAgICJ2aWEiOiAiZW1haWwiCiAgICAgICAgICAgIH0sCiAgICAgICAgICAgICJyZWNvdmVyeSI6IHsKICAgICAgICAgICAgICAidmlhIjogImVtYWlsIgogICAgICAgICAgICB9CiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICAiZmlyc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJGaXJzdCBOYW1lIgogICAgICAgIH0sCiAgICAgICAgImxhc3ROYW1lIjogewogICAgICAgICAgInR5cGUiOiAic3RyaW5nIiwKICAgICAgICAgICJ0aXRsZSI6ICJMYXN0IE5hbWUiCiAgICAgICAgfQogICAgICB9LAogICAgICAicmVxdWlyZWQiOiBbCiAgICAgICAgImVtYWlsIgogICAgICBdLAogICAgICAiYWRkaXRpb25hbFByb3BlcnRpZXMiOiBmYWxzZQogICAgfQogIH0KfQo=
courier:
smtp:
connection_uri: smtpConnectionURI
## Enable below config if you enable SSO
# deployment:
# extraEnv:
# - name: SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS
# valueFrom:
# secretKeyRef:
# name: kratos-secrets
# key: oidcConfig
cleanup:
enabled: true
batchSize: 20000
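Once deployed, you can confirm that kratos and its automigration job completed (assuming the kratos namespace used above):
kubectl get pods -n kratos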
Deploy Milvus
Milvus is used as the vector database. You can deploy Milvus using helm. Run the following commands:
- Add helm repo on local:
helm repo add milvus https://zilliztech.github.io/milvus-helm/
- Update helm repository:
helm repo update
- Find milvus repository:
helm search repo milvus/milvus
- Deploy milvus using helm:
helm upgrade \
--install milvus milvus/milvus \
--namespace milvus \
--create-namespace \
--version=4.1.32 \
--values milvus-values.yaml
For reference, the milvus-values.yaml file looks like:
## Enable or disable Milvus Cluster mode
cluster:
enabled: false
etcd:
replicaCount: 1
persistence:
storageClass: managed-premium-pv
size: 10Gi
minio:
mode: standalone
persistence:
storageClass: managed-premium-pv
size: 50Gi
pulsar:
enabled: false
standalone:
persistence:
persistentVolumeClaim:
storageClass: managed-premium-pv
size: 50Gi
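To verify the Milvus deployment (in standalone mode, per the values above, this includes milvus plus its etcd and minio dependencies):
kubectl get pods -n milvus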
Deploy Minio
To deploy Minio, you can use helm. Run the following commands:
- Add helm repo on local:
helm repo add minio https://charts.min.io
- Update helm repository:
helm repo update
- Find minio repository:
helm search repo minio/minio
- Create namespace for minio:
kubectl create namespace minio
- Create secrets for minio:
kubectl apply -f minio-secrets.yaml -n minio
For reference, the minio-secrets.yaml file looks like:
apiVersion: v1
kind: Secret
metadata:
name: minio-secrets # This name must match the existingSecret references in minio-values.yaml below.
type: Opaque
data:
rootUser: <base 64 encoded value> # Set a random 32-character alphanumeric value.
rootPassword: <base 64 encoded value> # Set a random 32-character alphanumeric value.
userPassword: <base 64 encoded value> # Set a random 32-character alphanumeric value.
secretKey: <base 64 encoded value> # Set a random alphanumeric value of 40 or more characters.
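One way to generate suitable random values and encode them for the manifest (a sketch using openssl; any random source works):
openssl rand -hex 16 # 32 alphanumeric characters, for rootUser/rootPassword/userPassword
openssl rand -hex 20 # 40 alphanumeric characters, for secretKey
echo -n "$(openssl rand -hex 16)" | base64 # base64 encode a value for the Secret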
- Deploy minio using helm:
helm upgrade \
--install minio minio/minio \
--namespace minio \
--create-namespace \
--version=5.2.0 \
--values minio-values.yaml
For reference, the minio-values.yaml file looks like:
## Set default rootUser, rootPassword
## .data.rootUser and .data.rootPassword are mandatory,
existingSecret: minio-secrets
# Number of MinIO containers running
replicas: 2
## List of users to be created after minio install
users:
- accessKey: bucketuser
existingSecret: minio-secrets
existingSecretKey: userPassword
policy: readwrite
## List of service accounts to be created after minio install
svcaccts:
- accessKey: random-32-alphanumeric-characters-accessKey
existingSecret: minio-secrets
existingSecretKey: secretKey
user: bucketuser
## List of buckets to be created after minio install
buckets:
- name: general
policy: none
purge: false
versioning: true
objectlocking: false
## Enable persistence using Persistent Volume Claims
persistence:
enabled: true
storageClass: managed-premium-pv
accessMode: ReadWriteOnce
size: 10Gi
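To verify the minio deployment (assuming the minio namespace used above):
kubectl get pods -n minio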
Deploy Airflow
To deploy Airflow, you can use helm. Run the following commands:
- Add helm repo on local:
helm repo add apache-airflow https://airflow.apache.org
- Update helm repository:
helm repo update
- Find airflow repository:
helm search repo apache-airflow/airflow
- Create namespace for airflow:
kubectl create namespace airflow
- Create secrets for airflow. Airflow needs multiple secrets to be created. You can create them using the below command:
kubectl apply -f airflow-secrets.yaml -n airflow
For reference, the airflow-secrets.yaml file looks like:
apiVersion: v1
kind: Secret
metadata:
name: airflow-secrets
type: Opaque
data:
fernet-key: <base 64 encoded value> # Set a random value.
webserver-secret-key: <base 64 encoded value> # Set a random value.
postgres-password: <base 64 encoded value> # Set a random password value.
connection: <base 64 encoded value> # e.g: postgresql://postgres:<password>@airflow-postgresql:5432/postgres
Create the penfield secrets for airflow:
kubectl apply -f penfield-secrets.yaml -n airflow
For reference, the penfield-secrets.yaml file looks like:
apiVersion: v1
kind: Secret
metadata:
name: penfield-secrets
type: Opaque
data:
AZURE_OPENAI_API_KEY: <base 64 encoded value> # Use the Azure OpenAI key
Configmap for airflow: This is used to pass env variables to the executors.
kubectl apply -f airflow-configmap.yaml -n airflow
For reference, the airflow-configmap.yaml file looks like:
apiVersion: v1
kind: ConfigMap
metadata:
name: penfield-configmap
data:
MONGODB_URI: mongodb://mongo-db.penfield-app.svc.cluster.local:27017
MONGODB_NAME: penfield_appdb
NLP_ENDPOINT: http://nlp-service.penfield-app.svc.cluster.local:3000/api/v1/ner
OPENAI_API_TYPE: azure
AZURE_OPENAI_API_VERSION: 2023-05-15
AZURE_OPENAI_ENDPOINT: <URL of the Azure Open AI Endpoint>
AZURE_OPENAI_DEPLOYMENT: <Deployment under the Azure OpenAI resource> #e.g. gpt-35-turbo-ca
AZURE_OPENAI_MODEL: <The model name of the corresponding deployment> # e.g. gpt-35-turbo
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
- Deploy airflow using helm:
helm upgrade \
--install airflow apache-airflow/airflow \
--namespace airflow \
--create-namespace \
--version=1.13.1 \
--values airflow-values.yaml
For reference, the airflow-values.yaml file looks like:
# Default airflow tag to deploy
defaultAirflowTag: "2.9.0"
# Default airflow digest. If specified, it takes precedence over tag
defaultAirflowDigest: ~
# Airflow version (Used to make some decisions based on Airflow Version being deployed)
airflowVersion: "2.9.0"
# Images
images:
airflow:
repository: registry.gitlab.com/penfieldai/data-pipeline/airflow-server
tag: release-1.0.5
# pullPolicy: Always
pod_template:
repository: registry.gitlab.com/penfieldai/data-pipeline/airflow-worker
tag: release-1.0.5
# pullPolicy: Always
registry:
secretName: regcred
config:
webserver:
base_url: http://penfield.example.com/airflow
kubernetes_executor:
worker_container_repository: '{{ .Values.images.pod_template.repository | default .Values.defaultAirflowRepository }}'
worker_container_tag: '{{ .Values.images.pod_template.tag | default .Values.defaultAirflowTag }}'
env:
- name: "AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG"
value: "1"
- name: "AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT"
value: "false"
extraEnvFrom: |
- configMapRef:
name: penfield-configmap
- secretRef:
name: penfield-secrets
# Airflow executor: One of: LocalExecutor, LocalKubernetesExecutor, CeleryExecutor, KubernetesExecutor, CeleryKubernetesExecutor
executor: KubernetesExecutor
# Ingress configuration
ingress:
# Configs for the Ingress of the web Service
web:
enabled: false
annotations: {}
path: "/airflow"
pathType: "Prefix"
hosts:
- name: penfield.example.com
ingressClassName: "internal-ingress-nginx"
precedingPaths: []
succeedingPaths: []
# Fernet key settings
# Note: fernetKey can only be set during install, not upgrade
# Airflow uses Fernet to encrypt passwords in the connection configuration and the variable configuration.
fernetKey: fernet-key
fernetKeySecretName: airflow-secrets
# Flask secret key for Airflow Webserver: `[webserver] secret_key` in airflow.cfg
webserverSecretKey: webserver-secret-key
webserverSecretKeySecretName: airflow-secrets
# Airflow Worker Config
workers:
# Number of airflow celery workers in StatefulSet
replicas: 1
persistence:
# Enable persistent volumes
enabled: true
# Volume size for worker StatefulSet
size: 10Gi
storageClassName: managed-premium-pv
# Execute init container to chown log directory.
# This is currently only needed in kind, due to usage
# of local-path provisioner.
fixPermissions: false
# Annotations to add to worker volumes
annotations: {}
# Detailed default security context for persistence for container level
securityContexts:
container: {}
# container level lifecycle hooks
containerLifecycleHooks: {}
# Configuration for postgresql subchart
# Not recommended for production
postgresql:
enabled: true
auth:
enablePostgresUser: true
existingSecret: airflow-secrets
primary:
## PostgreSQL Primary persistence configuration
persistence:
enabled: true
storageClass: "managed-premium-pv"
size: 10Gi
# Configuration for the redis provisioned by the chart
redis:
enabled: false
# Airflow database & redis config
data:
metadataSecretName: airflow-secrets
# Airflow Triggerer Config
triggerer:
enabled: true
# Number of airflow triggerers in the deployment
replicas: 1
# Max number of old replicasets to retain
revisionHistoryLimit: ~
persistence:
enabled: true
size: 10Gi
storageClassName: managed-premium-pv
# StatsD settings
statsd:
enabled: false
# This runs as a CronJob to cleanup old pods.
cleanup:
enabled: true
schedule: "@daily"
logs:
persistence:
enabled: false
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
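To verify the airflow deployment, check that the scheduler, webserver, and triggerer pods reach the Running state (assuming the airflow namespace used above):
kubectl get pods -n airflow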
Deploy Monitoring stack
Deploy Prometheus and Grafana:
Prometheus is used to collect metrics and Grafana to visualize them. This will be helpful for troubleshooting. To deploy Prometheus + Grafana, run the following commands:
- Add helm repo:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
- Update helm repository:
helm repo update
- Find the kube-prometheus-stack repository:
helm search repo prometheus-community/kube-prometheus-stack
- Deploy configmap for custom dashboard:
kubectl create ns monitoring
kubectl apply -f loki-logs-dashboard.yaml -n monitoring
For reference, the loki-logs-dashboard.yaml file looks like:
apiVersion: v1
kind: ConfigMap
metadata:
name: loki-logs-grafana-dashboard
namespace: monitoring
labels:
grafana_dashboard: "1"
data:
loki-dashboard.json: |-
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "Loki logs panel dashbaord with prometheus variables ",
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"id": 29,
"iteration": 1633543882969,
"links": [],
"panels": [
{
"aliasColors": {},
"bars": true,
"dashLength": 10,
"dashes": false,
"datasource": "Loki",
"description": "Time frame",
"fieldConfig": {
"defaults": {
"unit": "dateTimeFromNow"
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 4,
"w": 24,
"x": 0,
"y": 0
},
"hiddenSeries": false,
"id": 4,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": false,
"linewidth": 1,
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "8.1.5",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(count_over_time({namespace=\"$namespace\", instance=~\"$pod\"} |~ \"$search\"[$__interval]))",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Time Frame",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:37",
"format": "dateTimeFromNow",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
},
{
"$$hashKey": "object:38",
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"datasource": "Loki",
"description": "Live logs from Kubernetes pod with some delay.",
"gridPos": {
"h": 21,
"w": 24,
"x": 0,
"y": 4
},
"id": 2,
"options": {
"dedupStrategy": "none",
"enableLogDetails": true,
"prettifyLogMessage": false,
"showCommonLabels": false,
"showLabels": false,
"showTime": true,
"sortOrder": "Descending",
"wrapLogMessage": false
},
"targets": [
{
"expr": "{namespace=\"$namespace\", pod=~\"$pod\"} |~ \"$search\"",
"refId": "A"
}
],
"title": "Logs Panel",
"type": "logs"
}
],
"refresh": false,
"schemaVersion": 30,
"style": "dark",
"tags": [],
"templating": {
"list": [
{
"allValue": null,
"current": {
"selected": false,
"text": "penfield-app",
"value": "penfield-app"
},
"datasource": null,
"definition": "label_values(kube_pod_info, namespace)",
"description": "K8 namespace",
"error": null,
"hide": 0,
"includeAll": false,
"label": "",
"multi": false,
"name": "namespace",
"options": [],
"query": {
"query": "label_values(kube_pod_info, namespace)",
"refId": "StandardVariableQuery"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
},
{
"allValue": ".*",
"current": {
"selected": true,
"text": [
"All"
],
"value": [
"$__all"
]
},
"datasource": null,
"definition": "label_values(container_network_receive_bytes_total{namespace=~\"$namespace\"},pod)",
"description": "Pods inside namespace",
"error": null,
"hide": 0,
"includeAll": true,
"label": null,
"multi": true,
"name": "pod",
"options": [],
"query": {
"query": "label_values(container_network_receive_bytes_total{namespace=~\"$namespace\"},pod)",
"refId": "StandardVariableQuery"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
},
{
"current": {
"selected": true,
"text": "",
"value": ""
},
"description": "Search keywords",
"error": null,
"hide": 0,
"label": null,
"name": "search",
"options": [
{
"selected": true,
"text": "",
"value": ""
}
],
"query": "",
"skipUrlSync": false,
"type": "textbox"
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Loki / Dashboard for k8 Logs",
"uid": "vmMdG2vnk",
"version": 1
}
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
- Deploy the prometheus stack using helm:
helm upgrade \
--install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace \
--version=59.0.0 \
--values kube-prometheus-stack-values.yaml
For reference, the kube-prometheus-stack-values.yaml file looks like the following. Update the domain penfield.example.com with your FQDN.
#### EKS Specific #####
## Component scraping the kubelet and kubelet-hosted cAdvisor
##
kubelet:
## This is disabled by default because container metrics are already exposed by cAdvisor
serviceMonitor:
resource: false
## Component scraping the kube controller manager
##
kubeControllerManager:
enabled: false
## Component scraping kube scheduler
##
kubeScheduler:
enabled: false
## Component scraping kube proxy
##
kubeProxy:
enabled: false
#####################
## Configuration for alertmanager
## ref: https://prometheus.io/docs/alerting/alertmanager/
##
alertmanager:
enabled: false
## Using default values from https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml
##
grafana:
enabled: true
## Deploy default dashboards
##
defaultDashboardsEnabled: true
## Timezone for the default dashboards
## Other options are: browser or a specific timezone, i.e. Europe/Luxembourg
##
defaultDashboardsTimezone: utc
# adminPassword: prom-operator
adminUser: admin
adminPassword:
admin:
existingSecret: ""
userKey: admin-user
passwordKey: admin-password
grafana.ini:
server:
domain: penfield.example.com
root_url: "%(protocol)s://%(domain)s/grafana"
serve_from_sub_path: true
ingress:
## If true, Grafana Ingress will be created
enabled: true
ingressClassName: internal-ingress-nginx
# annotations:
# nginx.ingress.kubernetes.io/rewrite-target: /$2
labels: {}
hosts:
- penfield.example.com
## Path for grafana ingress
path: "/grafana"
pathType: ImplementationSpecific
sidecar:
dashboards:
enabled: true
label: grafana_dashboard
datasources:
enabled: true
defaultDatasourceEnabled: true
label: grafana_datasource
## Configuration for kube-state-metrics subchart
##
kube-state-metrics:
namespaceOverride: ""
rbac:
create: true
podSecurityPolicy:
enabled: true
## Manages Prometheus and Alertmanager components
##
prometheusOperator:
enabled: true
## Resource limits & requests
##
resources: {}
limits:
cpu: 200m
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
nodeSelector: {}
## Deploy a Prometheus instance
##
prometheus:
enabled: true
## Service account for Prometheuses to use.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
##
serviceAccount:
create: true
name: "prometheus-operator-prometheus"
annotations: {}
ingress:
enabled: false
## Settings affecting prometheusSpec
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#prometheusspec
##
prometheusSpec:
ruleSelectorNilUsesHelmValues: false
serviceMonitorSelectorNilUsesHelmValues: false
podMonitorSelectorNilUsesHelmValues: false
probeSelectorNilUsesHelmValues: false
## How long to retain metrics
##
retention: 15d
## Maximum size of metrics
##
retentionSize: ""
## Resource limits & requests
##
resources: {}
requests:
memory: 400Mi
limits:
memory: 800Mi
## Prometheus StorageSpec for persistent data
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/storage.md
##
storageSpec:
## Using PersistentVolumeClaim
##
volumeClaimTemplate:
# spec:
# storageClassName: managed-premium-pv
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 50Gi
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
After this step, Grafana and Prometheus will be deployed. You should be able to access Grafana at https://FQDN/grafana.
Grafana sets a random admin password at creation. You can find the grafana password using this command:
kubectl get secret --namespace monitoring kube-prometheus-stack-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
Deploy Loki for Logging:
This will set up the loki stack, which includes loki and promtail.
- Add helm repo:
helm repo add grafana https://grafana.github.io/helm-charts
- Update helm repository:
helm repo update
- Search repo:
helm search repo grafana/loki-stack
- Deploy the logging stack using helm:
helm upgrade \
--install loki-stack grafana/loki-stack \
--namespace monitoring \
--create-namespace \
--version=2.10.2 \
--values loki-stack-values.yaml
For reference, the loki-stack-values.yaml file looks like:
loki:
enabled: true
isDefault: false
persistence:
enabled: true
accessModes:
- ReadWriteOnce
size: 50Gi
# storageClassName: managed-premium-pv
config:
# This will work with newer versions of loki, like loki 3.0.0;
# table_manager will be deprecated in the newer versions.
# limits_config:
# retention_period: 48h
compactor:
retention_enabled: true
retention_delete_delay: 2h
table_manager:
retention_deletes_enabled: true
retention_period: 15d #15 days
promtail:
enabled: true
fluent-bit:
enabled: false
## This creates Loki Data source configmap and adds in kube prometheus stack
grafana:
enabled: false
sidecar:
datasources:
enabled: true
maxLines: 1000
prometheus:
enabled: false
filebeat:
enabled: false
logstash:
enabled: false
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
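To verify the monitoring stack, check that the prometheus, grafana, loki, and promtail pods are Running (assuming the monitoring namespace used above):
kubectl get pods -n monitoring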
Deploy Penfield App
To deploy the penfield app, please deploy in the following order:
- Create penfield-app namespace (skip if you have created it already)
- Deploy penfield-app regcred
- Deploy penfield-app secrets
- Deploy kafka cluster
- Deploy penfield app
Create penfield-app namespace (skip if you have created it already)
To create namespace, run the following command:
kubectl create namespace penfield-app
Deploy penfield-app regcred (skip if it is already created)
Regcred will be used to fetch the images from the PenfieldAI image repository. To deploy regcred, run the following command:
Username and Password must be provided by PenfieldAI.
kubectl create secret docker-registry regcred \
--docker-server=https://registry.gitlab.com \
--docker-username=<Username> \
--docker-password=<Password> \
-n penfield-app
Deploy penfield-app secrets
Secrets will be used to provide sensitive data inside kubernetes pods. penfield-secrets is already created; this step modifies it and adds the additional secrets that are required.
While deploying secrets, not all of the variables are needed. Depending on which services are being deployed in the penfield app, you will only need specific variables.
For reference, the penfield-secrets.yaml file looks like:
apiVersion: v1
kind: Secret
metadata:
name: penfield-secrets
type: Opaque
data:
AZURE_OPENAI_API_KEY: <base 64 encoded value> # Needed for multiple services [Pathfinder Only]
MINIO_ACCESS_KEY: <base 64 encoded value> # Needed for web service [Pathfinder Only]
MINIO_SECRET_KEY: <base 64 encoded value> # Needed for web service [Pathfinder Only]
POSTGRES_PASSWORD: <base 64 encoded value> # Needed for datapipeline, soc-data-analytics-service, kratos [Pathfinder Only]
POSTGRES_USER: <base 64 encoded value> # Needed for datapipeline, soc-data-analytics-service, kratos [Pathfinder Only]
To deploy secrets, run the following command:
kubectl apply -f penfield-secrets.yaml -n penfield-app
Deploy kafka cluster
At this stage the kafka operator is already installed. Now use the following command to install the kafka cluster in the penfield-app namespace:
kubectl apply -f kafka-cluster.yaml -n penfield-app
For reference, the kafka-cluster.yaml file looks like:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
name: my-cluster
spec:
kafka:
version: 3.7.0
replicas: 1
listeners:
- name: plain
port: 9092
type: internal
tls: false
- name: tls
port: 9093
type: internal
tls: true
config:
offsets.topic.replication.factor: 1
transaction.state.log.replication.factor: 1
transaction.state.log.min.isr: 1
inter.broker.protocol.version: "3.7"
log.retention.bytes: 9663676416 # 9GB, per Topic, Currently we have ~10 topics. So 90GB in total disk for 10 topics, logs will be deleted once storage reaches to 9GB for each topic.
storage:
type: jbod
volumes:
- id: 0
type: persistent-claim
size: 100Gi
deleteClaim: false
zookeeper:
replicas: 1
storage:
type: persistent-claim
size: 100Gi
deleteClaim: false
entityOperator:
topicOperator: {}
## Copyright (C) 2024 Penfield.AI INC. - All Rights Reserved ##
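It can take a few minutes for strimzi to bring the cluster up. A quick check (assuming the cluster name my-cluster from the manifest above):
kubectl get kafka -n penfield-app
kubectl get pods -n penfield-app | grep my-cluster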
Deploy penfield app
Use the same Username and Password provided by PenfieldAI that were used to deploy regcred in the previous step.
The PenfieldAI app will be deployed using helm. Once you receive the helm chart, go to the helm chart directory and perform the below steps.
- Add helm repo:
helm repo add --username <username> --password <password> penfieldai https://gitlab.com/api/v4/projects/28749615/packages/helm/stable
- Update helm repository:
helm repo update
- Find helm repository:
helm search repo penfieldai/penfieldai
- Deploy penfieldai via helm chart:
helm upgrade \
--install penfield-app penfieldai/penfieldai \
--namespace penfield-app \
--create-namespace \
--version=1.0.7 \
--values penfield-values.yaml
For penfield-values.yaml, look into the helm chart values. You can use helm show values penfieldai/penfieldai > penfield-values.yaml to export the values file and edit it as per your requirements. Depending on the type of deployment, you only need to enable the required services in the values file. You will need either a Pathfinder or a Pathfinder+ITSM deployment. To prepare the values file, please refer to this guide.
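After the helm release is installed, you can watch the application pods come up (a quick check, assuming the penfield-app namespace used throughout this guide):
kubectl get pods -n penfield-app -w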