What is SRE?

Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. In general, an SRE team is responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of their service(s).

SRE Principles

  1. Find Service Level
    Service Level Indicator (SLI), Service Level Objective (SLO), and Service Level Agreement (SLA) are the parameters by which the reliability, availability, and performance of a service are measured.
  2. Error Budgets
    An error budget is 1 minus the SLO of the service. A service with a 99.9% SLO has a 0.1% error budget. If the service receives 1,000,000 requests in four weeks, a 99.9% availability SLO gives a budget of 1,000 errors over that period.
  3. Eliminate Toil
    Toil is the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows. The SRE's job is to eliminate as much toil as possible through automation.
  4. Automate Everything
    Automation by the SRE team provides:
       – Consistency as systems scale
       – A platform for extending to other systems
       – Faster repairs for common problems
       – Faster action than humans
       – Time savings by decoupling operator from
  5. Support Releases
    Running reliable services requires reliable release processes.
    Continuously build and deploy, including
    – Automating check gates
    – A/B deployments and other methods for checking sanity
    SREs are not afraid to roll back a problematic release.
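
The error budget arithmetic from principle 2 can be checked with a few lines of shell. This is only a minimal sketch: the function name error_budget and its two arguments (request count, SLO as a percentage) are assumptions made for illustration, not part of any SRE tooling.

```shell
#!/usr/bin/env bash
# error_budget REQUESTS SLO_PERCENT   (hypothetical helper)
# Prints how many failed requests the SLO allows in the measurement window.
error_budget() {
    local requests=$1 slo=$2
    # budget = requests * (1 - SLO); awk does the floating-point math,
    # and adding 0.5 before truncation rounds to the nearest whole request.
    awk -v r="$requests" -v s="$slo" \
        'BEGIN { printf "%d\n", r * (100 - s) / 100 + 0.5 }'
}

error_budget 1000000 99.9   # the example from principle 2: prints 1000
```

Tightening the SLO to 99.99% for the same traffic would shrink the budget to 100 errors, which is why each extra "nine" is expensive.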

DevOps to SRE Transformation

It is not as easy as you think. Let me explain why.

There are many misconceptions that SRE = DevOps. They are NOT equal, and SRE covers many things that DevOps does not. For example, DevOps focuses more on deployment velocity and application uptime, while SRE focuses on SLOs and error budgets. DevOps has no authority over deployments, nor does it influence deployment velocity, whereas SRE can STOP deployments of the application when the error budget is exceeded. This shows that SRE has authority in the SDLC process and can also influence the business owner’s view.
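
To make the "stop deploying when the error budget is exceeded" rule concrete, here is a minimal sketch of a release gate in shell. The function name deploy_allowed and its arguments are hypothetical; a real pipeline would read the request and failure counts from its monitoring system rather than take them as literals.

```shell
#!/usr/bin/env bash
# deploy_allowed REQUESTS FAILED SLO_PERCENT   (hypothetical helper)
# Exit status 0 while failures are within the error budget, 1 once it is exceeded.
deploy_allowed() {
    local requests=$1 failed=$2 slo=$3
    awk -v r="$requests" -v f="$failed" -v s="$slo" 'BEGIN {
        budget = int(r * (100 - s) / 100 + 0.5)   # allowed failures, rounded
        if (f <= budget) exit 0
        exit 1
    }'
}

# 1,200 failures against a budget of 1,000 (99.9% of 1,000,000 requests)
if deploy_allowed 1000000 1200 99.9; then
    echo "deploy"
else
    echo "freeze: error budget exceeded"
fi
```

A CI job could call such a check as its first step, so that a spent budget blocks the release without anyone having to argue about it.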

There is a saying: “class SRE implements DevOps”. If we take SRE as one big function…

SRE(DevOps) {

I would assume that most organizations that want to jump into SRE are already practicing DevOps and want to increase their application or product uptime and focus on SLOs.

DevOps to SRE
1. Not many changes are required, so it is easy to get started on your SRE journey. DevOps is mainly focused on CI/CD, automation, and monitoring apps. With this foundation, a DevOps team can easily adopt the SRE culture by implementing the additional controls that SRE requires. This is still a big change, but I would say it should be the starting point.

2. You can practice and adopt the SRE approach as an experiment in your environment (product) at low cost. As I mentioned above, you can start with the DevOps controls; practicing the SRE controls from there adds little cost.

3. Full-stack to SRE journey. Small and medium enterprises have a limited number of DevOps engineers, who follow the full-stack engineering model. In that case, implementing SRE will be a five-step process.

Fullstack to SRE Journey

4. No knowledge or coverage gaps between SRE and DevOps teams. DevOps acts as the glue between teams that create solutions, depend on each other, or work on distinct pieces of software. So moving from DevOps to SRE is not going to be challenging.

DevOps to SRE Model

Again, this transformation depends on how teams collaborate with each other and how fast they can adapt to change. There are a few good books available for learning the SRE approach, but in my view no textbook or theory can teach you how to apply it to the specific team structure that you have. Understanding the current state is the starting point for the SRE journey.

Here are some SRE books:
Site Reliability Engineering: How Google Runs Production Systems (known as “The SRE Book”)
The Site Reliability Workbook: Practical Ways to Implement SRE (known as “The SRE Workbook”)
Seeking SRE: Conversations About Running Production Systems at Scale

Let me know your thoughts in comments.


Vicidial install on Ubuntu 18.04

Updated: Oct-18-2021

Note: The steps below cover only a standalone server installation on Ubuntu 18.04.

I am using a DigitalOcean VPC; the installation should be similar on AWS EC2 instances.

Make sure TCP ports 8088, 8089, 80, and 443 and UDP ports 10000-20000 are open in your firewall.
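
If the host firewall is ufw (an assumption; the post does not say which firewall is in use, and on DigitalOcean or AWS you may also need to open the same ports in the cloud firewall or security group), the port list above translates into the rules printed by this sketch. The function name vicidial_ufw_rules is made up for illustration, and the script only prints the commands so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Print (do not run) the ufw rules for the ports listed above.
vicidial_ufw_rules() {
    local port
    # Web interface and Asterisk HTTP/WebSocket ports (TCP)
    for port in 80 443 8088 8089; do
        echo "ufw allow ${port}/tcp"
    done
    # UDP port range from the post
    echo "ufw allow 10000:20000/udp"
}

vicidial_ufw_rules
```

After checking the output, the rules can be applied with: vicidial_ufw_rules | sudo sh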

git clone https://github.com/jaganthoutam/vicidial-install-centos7.git
cd vicidial-install-centos7
chmod +x vicidial-install-ubuntu18.sh
./vicidial-install-ubuntu18.sh

While installing, enter the details below:

#Go back to the root directory of vicidial
cd ..
perl install.pl

#Follow the setup with appropriate

#Configuration example

#Populate ISO country codes

cd /usr/src/astguiclient/trunk/bin
perl ADMIN_area_code_populate.pl

#update the Server IP with latest IP address.(VICIDIAL DEFAULT IP IS 

perl /usr/src/astguiclient/trunk/bin/ADMIN_update_server_ip.pl --old-server_ip= #Say 'Yes' to all
ViciDial processes run in screen sessions. There should be 9 sessions running.
root@<hostname>:~# screen -ls

There are screens on:
 2240.ASTVDremote (03/21/2019 02:16:03 AM) (Detached)
 2237.ASTVDauto (03/21/2019 02:16:03 AM) (Detached)
 2234.ASTlisten (03/21/2019 02:16:02 AM) (Detached)
 2231.ASTsend (03/21/2019 02:16:02 AM) (Detached)
 2228.ASTupdate (03/21/2019 02:16:02 AM) (Detached)
 2025.ASTconf3way (03/21/2019 02:15:02 AM) (Detached)
 2019.ASTVDadapt (03/21/2019 02:15:02 AM) (Detached)
 1826.asterisk (03/21/2019 02:14:51 AM) (Detached)
 1819.astshell20190321021448 (03/21/2019 02:14:49 AM) (Detached)
9 Sockets in /var/run/screen/S-root.
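
Counting the sessions by eye is error-prone, so here is a small sketch that counts them from the screen -ls output. The helper name count_screens is hypothetical; it simply counts lines of the form "<pid>.<name>" and is fed a shortened sample here so it can be tried without a ViciDial server.

```shell
#!/usr/bin/env bash
# count_screens: read "screen -ls" output on stdin, print the number of sessions.
count_screens() {
    # Session lines are indented and start with "<pid>.<name>";
    # grep -c exits non-zero on zero matches, so "|| true" keeps the status clean.
    grep -cE '^[[:space:]]+[0-9]+\.[^[:space:]]+' || true
}

# Shortened sample of the output shown above
count_screens <<'EOF'
There are screens on:
 2240.ASTVDremote (03/21/2019 02:16:03 AM) (Detached)
 1826.asterisk (03/21/2019 02:14:51 AM) (Detached)
2 Sockets in /var/run/screen/S-root.
EOF
```

On the server itself the check would be: [ "$(screen -ls | count_screens)" -eq 9 ] && echo "all ViciDial screens are up"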
All set. Now you can configure the web interface and logins.
Vicidial admin login:
user: 6666
Pass: 1234
Continue on to the initial setup.
#Add a secure password for admin and SIP
#Give super admin access to the 6666 user
Users -> 6666 -> change all 0 to 1 in Interface Options.
For WebRTC, run the script below:
chmod +x vicidial-enable-webrtc.sh
./vicidial-enable-webrtc.sh

#Next steps
1. Create Campaign
2. Create SIP Trunk
3. Create Dialplan
4. Upload Leads
5. Register Users to Softphone
6. Create Agents/users
Note: If WebRTC is enabled, you don’t need a softphone anymore.
And enjoy…
Note: If you are building the server for more than 30 agents, I recommend using bare-metal servers rather than a VPC.
Please let me know if you have any issues.

ViciDial CentOS 7 Installation Script

OS Version: CentOS Linux release 7.9.2009 (Core)

In so many years I have never seen a quick installation script for ViciDial, so I decided to create one.
There are two scripts in the repo:

vicidial-install-centos7.sh contains the full ViciDial installation.
vicidial-enable-webrtc.sh contains the WebRTC with WebPhone configuration.

There are a few prerequisites to complete before running the ViciDial installation scripts.

yum check-update
yum update -y
yum install epel-release -y
yum update -y
yum groupinstall 'Development Tools' -y
yum install git -y
yum install kernel* -y

#Disable SELINUX
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config   

********Reboot is Necessary********

Now you can download the repo and start the ViciDial installation.

git clone https://github.com/jaganthoutam/vicidial-install-scripts.git
cd vicidial-install-scripts
chmod +x vicidial-install-centos7.sh
./vicidial-install-centos7.sh

If you want to run ViciDial with WebRTC and WebPhone, you need to run the script below.

chmod +x vicidial-enable-webrtc.sh
./vicidial-enable-webrtc.sh

Gitlab & Runner Install with Private CA SSL

This installation method is used on an AWS EKS cluster to install Gitlab and the Gitlab Kubernetes executors.

Tech stack used in this installation:

  • EKS Cluster(2 Node with )
  • Controller EC2 Instance (To Manage the EKS cluster)
  • Helm (Gitlab Installation)
  • SSL certs(Self-Signed/SSL Provider/Private CA)

EKS Cluster:

Creating the EKS cluster is not part of this discussion. Please follow this EKS cluster creation doc.

Controller EC2 Instance:

Create an EC2 instance with your preferred AMI; in this case I am using the Amazon Linux AMI. (Make sure that the EKS cluster and the controller are in the same VPC.) To maintain the EKS cluster you need kubectl installed on the EC2 instance, and you also need to import the kubeconfig from the cluster. Let's see how to do that.

We will also be using Helm to install Gitlab.

Install Kubectl:

curl -o kubectl https://amazon-eks.s3.us-west-2.amazonaws.com/1.18.9/2020-11-02/bin/linux/amd64/kubect
chmod +x ./kubectl
mkdir -p $HOME/bin && cp ./kubectl $HOME/bin/kubectl && export PATH=$PATH:$HOME/bin
yum install bash-completion
kubectl version --client

Install Kubectl bash completion:

yum install bash-completion
type _init_completion
source /usr/share/bash-completion/bash_completion
type _init_completion
echo 'source <(kubectl completion bash)' >>~/.bashrc
kubectl completion bash >/etc/bash_completion.d/kubectl

Get EKS Cluster list and Import kubeconfig:
(replace the --name value with your cluster name)

aws eks update-kubeconfig --name <NAME OF THE EKS CLUSTER >

Install Helm:

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
cp /usr/local/bin/helm /usr/bin/

Install Helm Auto completion:

helm completion bash >> ~/.bash_completion
. /etc/profile.d/bash_completion.sh
. ~/.bash_completion
source <(helm completion bash)

Now the EC2 instance is ready for the Gitlab installation. Before installing Gitlab in EKS, let's create the TLS and generic secrets for Gitlab and Gitlab-Runner.

You can use any SSL provider (Let's Encrypt, DigiCert, Comodo, …). Here I am using self-signed certificates. You can generate self-signed certificates with this link.

Create TLS Secret for Gitlab’s Helm chart Global Values:

kubectl create secret tls gitlab-self-signed --cert=gitlab.gitlabtesting.com.crt --key=gitlab.gitlabtesting.com.key

Here we created a secret named gitlab-self-signed from the certificate and key. This is a better way of mounting the SSL certificate to the Ingress.

Create SSL Generic cert Secret:

This will be used for communication between the Gitlab server and Gitlab-runner via SSL. (IMPORTANT: Make sure the filename you are mounting matches the domain.) In this case my domain name is gitlab.gitlabtesting.com.

kubectl create secret generic gitlabsr-runner-certs-secret-3 --from-file=gitlab.gitlabtesting.com.crt=gitlab.gitlabtesting.com.crt

Create service account:(This will be used for gitlab-runner to perform actions)

vim gitlab-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gitlab
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gitlab-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: gitlab
    namespace: kube-system
kubectl apply -f gitlab-serviceaccount.yaml

Now that everything is ready, let's create values.yaml for the Gitlab values.

An example file looks like this:

  email: [email protected]
  install: false
        cpu: 50m
        memory: 650M
        secretName: gitlab-self-signed #TLS Secret we created above
        memory: 1.5G
  install: false
    privileged: true
    domain: gitlabtesting.com
      enabled: true
  enabled: false
  install: false
      secretName: gitlab-self-signed #TLS Secret we created above

Add the Gitlab Helm repo:

helm repo add gitlab https://charts.gitlab.io/

Install Gitlab with Helm using the values file we created above:

helm install gitlab gitlab/gitlab -f values.yaml

After about 5 minutes, all the pods will be up. You can check with the command below and also get the root password for the Gitlab login:

kubectl get po

#Get Root password:

kubectl get secret gitlab-gitlab-initial-root-password -ojsonpath='{.data.password}' | base64 --decode ; echo

The Gitlab installation is now complete. You can access Gitlab at https://gitlab.gitlabtesting.com



Kubernetes pods dns issue with kube-flannel.

kubectl -n kube-system logs coredns-6fdfb45d56-2rsxc

[INFO] plugin/reload: Running configuration MD5 = 8b19e11d5b2a72fb8e63383b064116a1
linux/amd64, go1.13.6, da7f65b
[ERROR] plugin/errors: 2 1898610492461102613.3835327825105568521. HINFO: read udp> i/o timeout
[ERROR] plugin/errors: 2 1898610492461102613.3835327825105568521. HINFO: read udp> i/o timeout
[ERROR] plugin/errors: 2 1898610492461102613.3835327825105568521. HINFO: read udp> i/o timeout

Debugging DNS Resolution

kubectl exec -ti dnsutils -- nslookup kubernetes.default
Address 1:

nslookup: can't resolve 'kubernetes.default'

How to solve it?

iptables -P INPUT ACCEPT
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
systemctl restart docker
systemctl restart kubelet

Apply this on all nodes and the master, then check the CoreDNS logs again.


Docker Swarm Cluster Kong and Konga Cluster (Centos/8)

We are using kong-konga-compose to deploy the Cluster Kong with Konga.

Preparation: Execute below commands on All nodes.

 systemctl stop firewalld
 systemctl disable firewalld
 systemctl status firewalld
 sed -i 's/^SELINUX=.*$/SELINUX=permissive/' /etc/selinux/config
 setenforce 0
 yum update -y
 yum install -y https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm
 sudo curl  https://download.docker.com/linux/centos/docker-ce.repo -o /etc/yum.repos.d/docker-ce.repo
 sudo yum makecache
 sudo dnf -y install docker-ce
 sudo dnf -y install  git
 sudo systemctl enable --now docker
 sudo curl -L https://github.com/docker/compose/releases/download/1.25.0/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
 sudo chmod +x /usr/local/bin/docker-compose && sudo ln -sv /usr/local/bin/docker-compose /usr/bin/docker-compose
 sudo docker-compose --version
 sudo docker --version

in node01:

docker swarm init --advertise-addr MASTERNODEIP

# The init output prints a join command for the other nodes, for example:
# docker swarm join --token SWMTKN-1-1t1u0xijip6l33wdtt7jpq51blwx0hx3t54088xa4bxjy3yx42-90lf5b4nyyw4stbvcqyrde9sf MASTERNODEIP:2377

in node02:

# Run the join command printed on the master node.
  docker swarm join --token SWMTKN-1-1t1u0xijip6l33wdtt7jpq51blwx0hx3t54088xa4bxjy3yx42-90lf5b4nyyw4stbvcqyrde9sf MASTERNODEIP:2377

in node01:

docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
m55wcdrkq0ckmtovuxwsjvgl1 *   master01            Ready               Active              Leader              19.03.8
e9igg0l9tru83ygoys5qcpjv2     node01              Ready               Active                                  19.03.8

git clone https://github.com/jaganthoutam/kong-konga-compose.git
cd kong-konga-compose

docker stack deploy --compose-file=docker-compose-swarm.yaml kong

#Check Services
docker service ls
ID                  NAME                  MODE                REPLICAS            IMAGE                             PORTS
ahucq8qru2xx        kong_kong             replicated          1/1                 kong:1.4.3                        *:8000-8001->8000-8001/tcp, *:8443->8443/tcp
bhf0tdd36isg        kong_kong-database    replicated          1/1                 postgres:9.6.11-alpine
tij6peru7tb8        kong_kong-migration   replicated          0/1                 kong:1.4.3
n0gaj0l6jyac        kong_konga            replicated          1/1                 pantsel/konga:latest              *:1337->1337/tcp
83q1eybkhvvy        kong_konga-database   replicated          1/1                 mongo:4.1.5   

ElasticSearch Filebeat custom index

Custom Template and Index pattern setup.

    setup.ilm.enabled: false               #Set ilm to False
    setup.template.name: "k8s-dev"         #Create Custom Template
    setup.template.pattern: "k8s-dev-*"    #Create Custom Template pattern
    setup.template.settings:
      index.number_of_shards: 1      #Set number_of_shards to 1, ONLY if you have a ONE NODE ES
      index.number_of_replicas: 0    #Set number_of_replicas to 0, ONLY if you have a ONE NODE ES
    output.elasticsearch:
       hosts: ['']
       index: "k8s-dev-%{+yyyy.MM.dd}" #Set k8s-dev-2020.01.01 as Index name
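
To see which index name the date pattern in the index setting resolves to, the same format can be reproduced with date. This is only a sketch: the function name filebeat_index_for_today is made up, and it assumes the date part is taken from the event timestamp in UTC.

```shell
#!/usr/bin/env bash
# Print the index name that the pattern k8s-dev-%{+yyyy.MM.dd} yields for today (UTC).
filebeat_index_for_today() {
    # yyyy.MM.dd in the Filebeat pattern corresponds to %Y.%m.%d in date(1)
    date -u +"k8s-dev-%Y.%m.%d"
}

filebeat_index_for_today
```

The printed name is what you would look for in Kibana, and the index pattern k8s-dev-* matches every daily index created this way.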


apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

    # To enable hints based autodiscover, remove `filebeat.inputs` configuration and uncomment this:
    #filebeat.autodiscover:
    #  providers:
    #    - type: kubernetes
    #      node: ${NODE_NAME}
    #      hints.enabled: true
    #      hints.default_config:
    #        type: container
    #        paths:
    #          - /var/log/containers/*${data.kubernetes.container.id}.log

    processors:
      - add_cloud_metadata:
      - add_host_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    setup.ilm.enabled: false
    setup.template.name: "k8s-dev"
    setup.template.pattern: "k8s-dev-*"
    setup.template.settings:
      index.number_of_shards: 1
      index.number_of_replicas: 0

    output.elasticsearch:
       hosts: ['']
       index: "k8s-dev-%{+yyyy.MM.dd}"
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:7.6.2
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: ""
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlog
        hostPath:
          path: /var/log
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat

The index will appear in Kibana:

Create index pattern in Kibana:



Clone all git repos from Organisation

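Using the command below you can clone all the git repos of an organisation at once.

# GitHub returns at most 100 repos per page; add &page=2, &page=3, ... for larger organisations.
for i in $(curl -u USERNAME:TOKEN_HERE -s "https://api.github.com/orgs/ottonova/repos?per_page=100" | grep ssh_url | cut -d ':' -f 2-3 | tr -d '",'); do git clone "$i"; done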
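
For organisations with more repos than fit on one page, the one-liner can be split into a parsing step and a paging loop. The sketch below is one possible way to do that under stated assumptions: the function names extract_ssh_urls and clone_org_repos and the environment variables GH_USER and GH_TOKEN are hypothetical, and the parsing relies on the API's pretty-printed JSON having one "ssh_url" field per line.

```shell
#!/usr/bin/env bash
# extract_ssh_urls: read the GitHub "list organization repositories" JSON on stdin
# and print one SSH clone URL per line.
extract_ssh_urls() {
    grep -o '"ssh_url": *"[^"]*"' | cut -d '"' -f 4
}

# clone_org_repos ORG: clone every repo of ORG, following pages of 100 repos
# until a page comes back empty. GH_USER and GH_TOKEN are assumed to be set.
clone_org_repos() {
    local org=$1 page=1 urls url
    while :; do
        urls=$(curl -s -u "${GH_USER}:${GH_TOKEN}" \
            "https://api.github.com/orgs/${org}/repos?per_page=100&page=${page}" \
            | extract_ssh_urls)
        [ -z "$urls" ] && break
        for url in $urls; do
            git clone "$url"
        done
        page=$((page + 1))
    done
}
```

Usage would be: GH_USER=USERNAME GH_TOKEN=TOKEN_HERE clone_org_repos ottonova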


Mac cleanup

#user cache file
echo "cleaning user cache file from ~/Library/Caches"
rm -rf ~/Library/Caches/*
echo "done cleaning from ~/Library/Caches"
#user logs
echo "cleaning user log file from ~/Library/logs"
rm -rf ~/Library/logs/*
echo "done cleaning from ~/Library/logs"
#user preference log
echo "cleaning user preference logs"
#rm -rf ~/Library/Preferences/*
echo "done cleaning from /Library/Preferences"
#system caches
echo "cleaning system caches"
sudo rm -rf /Library/Caches/*
echo "done cleaning system cache"
#system logs
echo "cleaning system logs from /Library/logs"
sudo rm -rf /Library/logs/*
echo "done cleaning from /Library/logs"
echo "cleaning system logs from /var/log"
sudo rm -rf /var/log/*
echo "done cleaning from /var/log"
echo "cleaning from /private/var/folders"
sudo rm -rf /private/var/folders/*
echo "done cleaning from /private/var/folders"
#ios photo caches
echo "cleaning ios photo caches"
rm -rf ~/Pictures/iPhoto\ Library/iPod\ Photo\ Cache/*
echo "done cleaning from ~/Pictures/iPhoto Library/iPod Photo Cache"
#application caches
echo "cleaning application caches"
for x in $(ls ~/Library/Containers/)
do
    echo "cleaning ~/Library/Containers/$x/Data/Library/Caches/"
    rm -rf ~/Library/Containers/"$x"/Data/Library/Caches/*
    echo "done cleaning ~/Library/Containers/$x/Data/Library/Caches"
done
echo "done cleaning for application caches"
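
Since the script deletes with rm -rf, it can be worth checking how much space each target actually holds before running it. The following is a small sketch for that; the helper name cache_size is hypothetical and it only reads, never deletes.

```shell
#!/usr/bin/env bash
# cache_size DIR: print the disk usage of DIR in KiB, or 0 if DIR does not exist.
cache_size() {
    local dir=$1
    [ -d "$dir" ] || { echo 0; return; }
    # du -sk prints "<KiB><TAB><path>"; keep only the number
    du -sk "$dir" 2>/dev/null | cut -f 1
}

# Report the size of two of the directories the script cleans
for d in "$HOME/Library/Caches" "$HOME/Library/Logs"; do
    echo "$(cache_size "$d") KiB  $d"
done
```

Directories that report only a few KiB are probably not worth deleting at all.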