Install Obliq AI SRE Agent
Obliq is an AI-powered infrastructure management platform that brings intelligence, automation, and reliability to operations across Kubernetes and AWS environments.
Install Obliq AI SRE Agent using Helm charts.
Set up Installation Prerequisites
Installation Packages
Before starting deployment, ensure you have the following packages installed:
Kubernetes Tools
Tool | Version | Notes |
---|---|---|
kubectl | v1.24+ | Compatible with your cluster version |
helm | v3.8+ | — |
helmfile | v0.148+ | — |
Commands to Install Kubernetes Tools
Use the following commands to install the Kubernetes tools listed above:
# kubectl - Kubernetes command line tool
brew install kubectl # macOS
# or
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl" # Linux
chmod +x kubectl
sudo mv kubectl /usr/local/bin/
# helm - Kubernetes package manager
brew install helm # macOS
# or
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash # Linux
# helmfile - Helm chart deployment tool
brew install helmfile # macOS
# or
curl -Lo helmfile https://github.com/helmfile/helmfile/releases/download/v0.148.0/helmfile_linux_amd64 # Linux
chmod +x helmfile
sudo mv helmfile /usr/local/bin/
Network and DNS Tools
The following network and DNS tools are required to test connectivity, resolve issues, and validate configurations.
Tool | Purpose | Availability |
---|---|---|
nslookup and dig | DNS lookup tools | Usually pre-installed on most systems |
curl | HTTP client (API testing) | Usually pre-installed on most systems |
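As a quick sanity check before installing, you can exercise each of these tools. The hostname ui.yourdomain.com is a placeholder; substitute your own domain:
# Resolve a hostname with nslookup and dig
nslookup ui.yourdomain.com
dig +short ui.yourdomain.com
# Fetch only the HTTP response headers with curl
curl -I https://ui.yourdomain.com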
Verify the Required Packages Installation
1. Check versions using the following commands:
   kubectl version --client
   helm version
   helmfile version
2. Test kubectl connectivity using the following command:
   kubectl cluster-info
Create ACR Secret
Create the ACR image-pull secret using the following command:
kubectl create secret docker-registry registry \
--docker-server=avesha.azurecr.io \
--docker-username=<your-acr-username> \
--docker-password=<your-acr-password> \
--docker-email=<your-email> \
--namespace=avesha
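The command above assumes the avesha namespace already exists. If it does not, create it first, then confirm the secret was created:
# Create the target namespace if it does not already exist
kubectl create namespace avesha
# Verify the image-pull secret
kubectl get secret registry -n avesha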
Download the Helm Charts Package
1. Download the AI SRE Agent Helm charts using the following command:
   wget https://smartscaler.nexus.aveshalabs.io/repository/agents/obliq-sre-agent-charts-v1.0.0.zip
   All released versions are available at this location.
2. Install the unzip package and extract the file you just downloaded using the following commands:
   sudo apt-get install -y unzip
   unzip obliq-sre-agent-charts-v1.0.0.zip -d agents
   cd agents
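To confirm the extraction, you can list the package contents. The layout below is an approximation inferred from the paths referenced in this guide (apps/, env/sandbox/, env.sample); your package may differ:
# Inspect the extracted package layout
ls
# Expected (approximate): apps/  env/  env.sample  helmfile.yaml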
Platforms Installed From the Helm Charts
Name | Namespace | Enabled | Installed | Labels | Chart | Version |
---|---|---|---|---|---|---|
cert-manager | cert-manager | true | true | step:platform | cert-manager/cert-manager | v1.13.3 |
cert-manager-clusterissuer | cert-manager | true | true | step:platform | ./apps/cert-manager-clusterissuer | 0.1.0 |
nginx-ingress | ingress-nginx | true | true | step:platform | ingress-nginx/ingress-nginx | 4.7.1 |
Applications Installed From the Helm Charts
Name | Namespace | Enabled | Installed | Labels | Chart | Version |
---|---|---|---|---|---|---|
anomaly-detection | avesha | true | true | step:apps | ./apps/anomaly-detection | — |
active-inventory | avesha | true | true | step:apps | ./apps/active-inventory | — |
aws-ec2-cloudwatch-alarms | avesha | true | true | step:apps | ./apps/aws-ec2-cloudwatch-alarms | — |
kubernetes-events-ingester | avesha | true | true | step:apps | ./apps/kubernetes-events-ingester | — |
slack-ingester | avesha | true | true | step:apps | ./apps/slack-ingester | — |
loki-mcp | avesha | true | true | step:apps | ./apps/loki-mcp | — |
cloudwatch-mcp | avesha | true | true | step:apps | ./apps/cloudwatch-mcp | — |
neo4j-mcp | avesha | true | true | step:apps | ./apps/neo4j-mcp | — |
opentelemetry-collector | avesha | true | true | step:apps | open-telemetry/opentelemetry-collector | 0.130.2 |
incident-manager | avesha | true | true | step:apps | ./apps/incident-manager | — |
rca-agent | avesha | true | true | step:apps | ./apps/rca-agent | — |
auto-remediation | avesha | true | true | step:apps | ./apps/auto-remediation | — |
aws-mcp | avesha | true | true | step:apps | ./apps/aws-mcp | — |
k8s-mcp | avesha | true | true | step:apps | ./apps/k8s-mcp | — |
prometheus-mcp | avesha | true | true | step:apps | ./apps/prometheus-mcp | — |
neo4j | avesha | true | true | step:apps | neo4j/neo4j | 5.26.10 |
mongodb | avesha | true | true | step:apps | bitnami/mongodb | 13.17.0 |
backend | avesha | true | true | step:apps | ./apps/backend | — |
infra-agent | avesha | true | true | step:apps | ./apps/infra-agent | — |
avesha-unified-ui | avesha | true | true | step:apps | ./apps/avesha-unified-ui | — |
service-graph-engine | avesha | true | true | step:apps | ./apps/service-graph-engine | — |
orchestrator | avesha | true | true | step:apps | ./apps/orchestrator | — |
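Once the package is extracted and the environment is set up (see the next section), you can preview the releases helmfile will manage with its standard list subcommand:
# List every release defined for the sandbox environment
helmfile -e sandbox list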
Set up an Environment
Currently, we provide env.sample only for the Sandbox environment.
Set up an environment using the following commands:
cp env.sample .env
vim .env
We recommend reading through env.sample to better understand the environment variables. To learn more, see Explore the env.sample Helm Chart.
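To verify that your edits are picked up, load the file into your shell and spot-check a variable. DOMAIN_NAME is one of the variables referenced in the SSL Setup section of this guide:
# Export every variable defined in .env into the current shell
set -a; source .env; set +a
# Spot-check a variable
echo "$DOMAIN_NAME"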
Environments
- sandbox - Development/testing
- env/sandbox/ - Sandbox environment values (.gotmpl files)
Configure Kubernetes Access
A few services require kubeconfig files to access Kubernetes clusters. Edit the following files in the env/sandbox/ directory.
For detailed information, see Kubernetes Permissions.
Edit the env/sandbox/k8s-mcp.yaml.gotmpl File
configMap:
  enabled: true
  files:
    config: |
      # Add your kubeconfig content here
      apiVersion: v1
      kind: Config
      clusters:
        - name: your-cluster
          cluster:
            server: https://your-cluster-server:6443
            certificate-authority-data: LS0tLS1CRUdJTi...
      users:
        - name: your-user
          user:
            client-certificate-data: LS0tLS1CRUdJTi...
            client-key-data: LS0tLS1CRUdJTi...
      contexts:
        - name: your-context
          context:
            cluster: your-cluster
            user: your-user
      current-context: your-context
Edit the env/sandbox/kubernetes-events-ingester.yaml.gotmpl File
configMap:
  enabled: true
  files:
    config: |
      # Add your kubeconfig content here
      apiVersion: v1
      kind: Config
      clusters:
        - name: your-cluster
          cluster:
            server: https://your-cluster-server:6443
            certificate-authority-data: LS0tLS1CRUdJTi...
      users:
        - name: your-user
          user:
            client-certificate-data: LS0tLS1CRUdJTi...
            client-key-data: LS0tLS1CRUdJTi...
      contexts:
        - name: your-context
          context:
            cluster: your-cluster
            user: your-user
      current-context: your-context
Get your KubeConfig
You can either get the current cluster configuration or copy from your local kubeconfig using the corresponding commands listed below:
- Get the current cluster configuration using the following command:
  kubectl config view --raw
- Copy from your local kubeconfig using the following command:
  cat ~/.kube/config
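If your kubeconfig references certificate files on disk instead of inline data, the embedded copy will not work inside the pod. The standard kubectl flags --minify and --flatten produce a self-contained config for the current context:
# Emit a self-contained kubeconfig with certificates inlined
kubectl config view --raw --minify --flatten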
Install Obliq AI SRE Agent in a Green Field Environment
Currently, the AI SRE Agent supports only the Jira tracking tool.
Initialize Helmfile
Before installing the Agent, initialize the Helmfile environment using the following command:
# Initialize Helmfile (required before first deployment)
helmfile init
Install Obliq AI SRE Agent Quickly
1. Load environment variables using the following command:
   set -a; source .env; set +a
2. Install the Helm chart using the following command:
   helmfile -e sandbox apply
The Helm chart deployment installs platforms and applications on your cluster. See their complete list in Platforms Installed From the Helm Charts and Applications Installed From the Helm Charts.
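To watch the rollout, you can monitor the application pods until they report Running; this is a routine check rather than a documented step:
# Watch application pods come up (press Ctrl+C to stop)
kubectl get pods -n avesha -w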
Install Obliq AI SRE Agent in Phases
Install in two phases using step labels:
1. Deploy the platform using the following command:
   helmfile -e sandbox -l step=platform apply
   This command installs only the platform components, nginx-ingress and cert-manager. See the complete list in Platforms Installed From the Helm Charts. Before moving on, confirm the platform pods are ready, as shown in the sketch after this list.
2. Install applications using the following command:
   helmfile -e sandbox -l step=apps apply
   This command installs all applications. See the complete list in Applications Installed From the Helm Charts.
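A quick readiness check between the two phases, using the namespaces from the platforms table above:
# Confirm the platform components are ready before installing applications
kubectl get pods -n cert-manager
kubectl get pods -n ingress-nginx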
Install Platform and Applications
Alternatively, install both platforms and applications in a single step using the following command:
helmfile -e sandbox apply
This command installs platforms and applications. See their complete list in Platforms Installed From the Helm Charts and Applications Installed From the Helm Charts.
Get IP Address for the Obliq AI SRE Agent Console
Get the IP address of the LoadBalancer; it is used to access the Obliq AI SRE Agent console.
To get the external IP address from nginx-ingress LoadBalancer for DNS configuration, use the following commands:
# Get the LoadBalancer IP
kubectl get service -n ingress-nginx nginx-ingress-ingress-nginx-controller
# Or watch until the external IP is assigned
kubectl get service -n ingress-nginx nginx-ingress-ingress-nginx-controller -w
# Alternative: Get all services in ingress-nginx namespace
kubectl get service -n ingress-nginx
Example Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-ingress-ingress-nginx-controller LoadBalancer 10.96.123.45 203.0.113.10 80:30080/TCP,443:30443/TCP 5m
Use the EXTERNAL-IP (203.0.113.10) for your DNS A records, as shown in the example below:
*.yourdomain.com → 203.0.113.10
See Access the Obliq AI SRE Agent Console to learn how to add a port number to the LoadBalancer IP address to create the console access URL.
Troubleshooting and Cleanup
Check Access Issues
Check access and communication issues using the commands listed in the following table.
Task | Command |
---|---|
Check DNS resolution | nslookup ui.yourdomain.com |
Check SSL certificates | kubectl get certificates -n avesha |
Check ingress status | kubectl get ingress -n avesha |
Delete Persistent Volume Claims and Persistent Volumes
When you need to completely clean up your deployment or troubleshoot storage issues, you may need to delete all Persistent Volume Claims (PVCs) and Persistent Volumes (PVs).
⚠️ This will permanently delete all data!
Delete All PVCs in a Namespace
Delete all PVCs in a namespace using the following commands:
# List all PVCs in the namespace
kubectl get pvc -n avesha
# Delete all PVCs in the namespace
kubectl delete pvc --all -n avesha
# Or delete specific PVCs
kubectl delete pvc <pvc-name> -n avesha
Delete All PVs
Delete all PVs using the following commands:
# List all PVs
kubectl get pv
# Delete all PVs (be careful - this affects the entire cluster)
kubectl delete pv --all
# Or delete specific PVs
kubectl delete pv <pv-name>
Force Delete Stuck PVCs/PVs
If PVCs or PVs are stuck in the Terminating state, force delete them using the following commands:
# Force delete stuck PVC
kubectl patch pvc <pvc-name> -n avesha -p '{"metadata":{"finalizers":null}}' --type=merge
# Force delete stuck PV
kubectl patch pv <pv-name> -p '{"metadata":{"finalizers":null}}' --type=merge
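If many objects are stuck, a small loop (a convenience sketch, not part of the shipped tooling) applies the same patch to every PVC in the namespace:
# Strip finalizers from every PVC in the avesha namespace
for pvc in $(kubectl get pvc -n avesha -o name); do
  kubectl patch "$pvc" -n avesha -p '{"metadata":{"finalizers":null}}' --type=merge
done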
Complete Cleanup Script
Use this cleanup script in the following scenarios:
- Complete environment reset: Starting fresh with no data
- Storage troubleshooting: Resolving persistent storage issues
- Environment migration: Moving to a different storage class
- Development cleanup: Removing test data
Use the following script for a complete cleanup:
#!/bin/bash
NAMESPACE="avesha"

echo "⚠️ WARNING: This will delete ALL data in namespace: $NAMESPACE"
read -p "Are you sure? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "Deleting all PVCs in namespace: $NAMESPACE"
  kubectl delete pvc --all -n $NAMESPACE
  echo "Deleting all PVs"
  kubectl delete pv --all
  echo "Cleanup complete!"
else
  echo "Cleanup cancelled."
fi
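To run it, save the script under a name of your choice (cleanup.sh below is only an example) and make it executable:
chmod +x cleanup.sh
./cleanup.sh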
Post Cleanup
After deleting PVCs and PVs, redeploy your applications using the following commands:
# Load environment variables
set -a; source .env; set +a
# Redeploy
helmfile -e sandbox apply
Charts
- service-graph-engine - Service graph engine
- cert-manager-clusterissuer - SSL certificate issuers
Global Configuration
Each environment includes a global.yaml.gotmpl file with common settings:
- Image registry configuration
- Resource limits
- Security contexts
- Monitoring settings
- Ingress configuration
- Environment variables from the .env file
Service-Specific Configuration
Each service has its own .gotmpl file in the environment directory:
- backend.yaml.gotmpl - Backend service configuration
- anomaly-detection.yaml.gotmpl - Anomaly detection configuration
- incident-manager.yaml.gotmpl - Incident manager configuration
- service-graph-engine.yaml.gotmpl - Service graph engine configuration
- avesha-unified-ui.yaml.gotmpl - Unified UI configuration
- infra-agent.yaml.gotmpl - Infrastructure agent configuration
Customization
# Override individual values (passed as helmfile state values)
helmfile --environment sandbox --state-values-set backend.image.tag=v1.2.0 apply
# Use a custom state values file
helmfile --environment sandbox --state-values-file custom-values.yaml apply
SSL Setup
Add the following properties to your .env file:
CERT_MANAGER_EMAIL=admin@yourdomain.com
DOMAIN_NAME=yourdomain.com
Services will be available at:
https://ui.yourdomain.com
https://api.yourdomain.com
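Once DNS points at the LoadBalancer and cert-manager has issued the certificates, you can inspect the certificate actually served; a standard openssl check, with your own hostname substituted:
# Show the issuer and validity window of the certificate served by the UI endpoint
openssl s_client -connect ui.yourdomain.com:443 -servername ui.yourdomain.com </dev/null 2>/dev/null \
  | openssl x509 -noout -issuer -dates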