
Monitoring your containers in an AKS cluster with Prometheus

Monitoring and alerting is arguably one of the most important things in Cloud Engineering and DevOps. It's the difference between your client's stack being up and a client being down, and most of us have SLAs to abide by (for good reason). Today we're going to learn how to spin up Prometheus in an AKS cluster to monitor our applications.

Pre-reqs:
1. Intermediate knowledge of Kubernetes
2. An AKS cluster spun up in Azure

AKS now supports installing Prometheus via Helm, so we'll use that as an automated way to spin this up. We'll be installing kube-prometheus, which packages the Prometheus Operator together with a pre-configured monitoring stack. With the raw Prometheus Operator, there are a few custom resources you would need to define yourself:

1. Prometheus: Defines a desired Prometheus deployment.
2. ServiceMonitor: Specifies how groups of services should be monitored.
3. Alertmanager: Defines a desired Alertmanager deployment for handling alerts.

With kube-prometheus, all of that is packaged for you, so none of it needs to be configured by hand. It also comes with Grafana, an open-source UI for visualizing metrics from your deployments/services.
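To give a sense of what these custom resources look like, here is a minimal ServiceMonitor sketch; the name, label, and port below are hypothetical and only for illustration:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app           # hypothetical name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app        # any Service carrying this label gets scraped
  endpoints:
    - port: web               # named Service port that exposes /metrics
      interval: 30s           # scrape every 30 seconds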

First, let's go into our Cloud Shell in Azure to confirm we can access our AKS cluster. If you cannot see anything after running kubectl get nodes -o wide, you will need to authenticate by running:
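az aks get-credentials --name AKSTest10 --resource-group AKSTest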


Your --name and --resource-group will of course be different, so please put in your own cluster name and resource group.

Next we'll have to configure Helm by running helm init in your Cloud Shell. If all is well, you will see that Helm and Tiller are installed. Helm is a package manager for Kubernetes, much like yum or apt in Linux. Tiller is the server-side component that talks directly to the Kubernetes API to install your Helm resources into the cluster.
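To confirm both halves are talking to each other, you can run helm version, which in Helm 2 prints both a Client and a Server (Tiller) version:

helm version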

After that, we'll add the CoreOS Helm chart repository to our Helm client. CoreOS (the company behind Container Linux and the Prometheus Operator) publishes the stable charts we'll use to install Prometheus. For more info, please visit: https://coreos.com/why/

helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
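If you want to double-check that the repository was added and browse the charts it provides, these standard Helm 2 commands will do it:

helm repo list
helm search coreos/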




We need a namespace for Prometheus to live in, so let's go ahead and create a monitoring namespace by doing the following:

kubectl create namespace monitoring
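You can confirm the namespace exists with:

kubectl get namespace monitoring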

Now let's go ahead and install the Prometheus Operator and kube-prometheus. For this, we will need to give Tiller an RBAC (Role-Based Access Control) policy for the kube-system and monitoring namespaces.

Please apply the following YAML files with kubectl apply -f. For example, if you name a file rbac.yml, you will run kubectl apply -f rbac.yml to apply the RBAC policies.


apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system


# Service account and binding for the monitoring namespace.
# Note: this ClusterRoleBinding needs its own name so it does not overwrite the kube-system binding above.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller-monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: monitoring


After the above has been applied, you should be able to install Prometheus.
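One thing to keep in mind: helm init was run before the tiller service account existed, so Tiller may not be using it yet. If Helm complains about permissions, re-initializing with the service account (Helm 2 syntax) is one way to fix it:

helm init --service-account tiller --upgrade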

The following two commands will install the Prometheus Operator and kube-prometheus:

helm install coreos/prometheus-operator --name prometheus-operator --namespace monitoring

helm install coreos/kube-prometheus --name kube-prometheus --set global.rbacEnable=false --namespace monitoring

You should see Helm output confirming that each release was deployed successfully.
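Before moving on, you can also confirm the releases and their pods from the Cloud Shell; assuming the release names used above, something like the following will list them:

helm ls
kubectl get pods --namespace monitoring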



By installing kube-prometheus, we got the following:

1. One Prometheus server
2. A ServiceMonitor that allows us to monitor the cluster itself
3. Grafana, with a set of pre-configured dashboards
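If you'd like to see these pieces in the cluster, the operator registers its own resource types that kubectl can list (the exact objects you see will depend on the chart version):

kubectl get prometheus,servicemonitors,alertmanagers --namespace monitoring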

Now that we have everything set up, I'd like to access my Grafana dashboard from my localhost. Ideally, in a production environment you'd have a jumpbox in Azure for this purpose, but for a demo, localhost is fine. You'll have to connect to the AKS cluster by running the following in a PowerShell window:

az aks get-credentials --name AKSTest10 --resource-group AKSTest

Please run the above with the proper metadata (your AKS cluster name and Resource Group)

Now that we have our Prometheus server up, we have to connect to the pod that is running it. We can do this with Kubernetes port forwarding. Please use the following command:

kubectl --namespace monitoring port-forward $(kubectl get pod --namespace monitoring -l prometheus=kube-prometheus -l app=prometheus -o template --template "{{(index .items 0).metadata.name}}") 9090:9090

This forwards the Prometheus server's port 9090 to our local host, so we can access the Prometheus UI from our own computer.


If all is working as it should, you should be able to go to http://localhost:9090/targets and see the Prometheus targets page.
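If you'd rather check from the command line (with curl or a similar tool), Prometheus also exposes a simple health endpoint and an HTTP query API on the same port:

curl http://localhost:9090/-/healthy
curl "http://localhost:9090/api/v1/query?query=up"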

Open up your Cloud Shell again and this time, go to Bash. Run the following to get your username and password to log in.

echo username:$(kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.user}" | base64 --decode)

echo password:$(kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.password}" | base64 --decode)

Next we'll forward port 3000 to the pod that hosts Grafana (please do this on your localhost in another PowerShell window, keeping the first window open so the 9090 forward stays running):

kubectl --namespace monitoring port-forward $(kubectl get pod --namespace monitoring -l app=kube-prometheus-grafana -o template --template "{{(index .items 0).metadata.name}}") 3000:3000
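As a quick sanity check before logging in, you can hit Grafana's health endpoint once this forward is up:

curl http://localhost:3000/api/health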

Now open up http://localhost:3000/login and log into your dashboard with the username and password from the previous step.




And there you have it! Your container monitoring solution is up and running!

