DevOps 4: Monitoring with Prometheus and Grafana
As our microservices architecture on Amazon EKS matures, gaining visibility into its health, performance, and efficiency becomes indispensable. This is where an effective observability strategy comes into play, with Prometheus and Grafana at its core. In this installment, we dive into setting up Prometheus to collect metrics across our Kubernetes cluster and using Grafana to create insightful visualizations of these metrics. Together, they form a powerful duo for monitoring our applications and infrastructure.
Why Prometheus and Grafana?
Prometheus is an open-source monitoring solution that collects and stores its metrics as time-series data. It offers a flexible query language, PromQL, for querying this data. Grafana, on the other hand, is an open-source platform for analytics and monitoring that can integrate with Prometheus to provide rich visualizations of the collected data.
This combination allows for detailed monitoring and alerting on the health and performance of both the applications and the infrastructure, enabling developers and operators to detect and respond to issues more quickly and effectively.
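To give a feel for PromQL, here is a minimal sketch that sends a query to the Prometheus HTTP API. It assumes a Prometheus server is reachable at http://localhost:9090 (for example through the port-forward described in Step 2). The helper name prom_query and the PROM_URL variable are made up for this illustration.

```shell
# Base URL of the Prometheus server. localhost:9090 is an assumption;
# override PROM_URL if your server is reachable elsewhere.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query (hypothetical helper): run an instant PromQL query through the
# /api/v1/query endpoint and print the raw JSON response.
prom_query() {
  # -G turns the URL-encoded data into a GET query string.
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Example calls (they need a reachable Prometheus server):
#   prom_query 'up'
#   prom_query 'sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))'
```

The first example returns the up series, which is 1 for every scrape target that Prometheus can currently reach. The second approximates CPU usage per namespace over the last five minutes, assuming the cluster exposes the usual cAdvisor container metrics.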
Setting Up Prometheus and Grafana on Amazon EKS
Step 1: Deploying Prometheus
We’ll start by deploying Prometheus using the Prometheus Operator, which simplifies the configuration of Prometheus instances within Kubernetes.
Using Helm to Deploy Prometheus Operator
First, add the Prometheus Community Helm chart repository and update your repo list:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Next, use Helm to install the kube-prometheus-stack chart, which bundles the Prometheus Operator together with Prometheus, Alertmanager, and Grafana:
helm install prometheus prometheus-community/kube-prometheus-stack
This command deploys the stack into the current namespace with default settings, which are suitable for many use cases. You can customize the installation by creating a values file and passing it to Helm with the -f flag.
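As a rough sketch of such a values file, the snippet below writes one that overrides two settings. The file name custom-values.yaml and the chosen values are illustrative assumptions. The keys grafana.adminPassword and prometheus.prometheusSpec.retention are taken from the chart's values; check the values file of the chart version you install before relying on them.

```shell
# Write an illustrative values file; the file name and values are placeholders.
cat > custom-values.yaml <<'EOF'
grafana:
  # Placeholder only; prefer a Kubernetes secret for real deployments.
  adminPassword: "change-me"
prometheus:
  prometheusSpec:
    # How long Prometheus keeps samples before deleting them.
    retention: 7d
EOF
```

The file can then be applied to the release with: helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -f custom-values.yaml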
Step 2: Accessing Prometheus and Grafana
The kube-prometheus-stack chart automatically deploys Grafana and configures it to use Prometheus as a data source.
- Accessing Prometheus: To access the Prometheus UI, you can port-forward the Prometheus server service to your local machine:
kubectl port-forward svc/prometheus-operated 9090:9090
Then, visit http://localhost:9090 in your browser.
- Accessing Grafana: Similarly, to access Grafana, port-forward the Grafana service:
kubectl port-forward svc/prometheus-grafana 3000:80
Then, visit http://localhost:3000 in your browser. The default username is admin. With this chart, the default password comes from the grafana.adminPassword value (prom-operator at the time of writing, unless you override it) rather than admin. Change it immediately.
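If you are unsure which Grafana password is in effect, it can usually be read back from the cluster. The sketch below assumes the Helm release is named prometheus, so the Grafana secret is called prometheus-grafana and lives in the current namespace. The function name grafana_admin_password is made up for this example.

```shell
# grafana_admin_password (hypothetical helper): print the Grafana admin
# password stored in the chart-managed secret.
# The first argument is the secret name; prometheus-grafana is an assumption
# based on a release named "prometheus".
grafana_admin_password() {
  kubectl get secret "${1:-prometheus-grafana}" \
    -o jsonpath='{.data.admin-password}' | base64 --decode
}

# Example call (needs kubectl access to the cluster):
#   grafana_admin_password
```

Secret values are base64-encoded, not encrypted, which is why the output is piped through base64 --decode.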
Step 3: Configuring Dashboards in Grafana
Once in Grafana, you can start creating dashboards to visualize the metrics collected by Prometheus. Grafana offers a wide range of pre-built dashboards, which you can import via their dashboard IDs found on the Grafana Dashboards page.
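To import a dashboard in older Grafana versions, click the + icon on the left sidebar and select “Import”. In newer versions, the Import option is under the “New” menu on the Dashboards page. Then enter the dashboard’s ID and choose Prometheus as the data source.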
Conclusion
With Prometheus and Grafana set up, you now have a robust observability framework integrated into your EKS cluster. This setup empowers you to monitor the performance and health of your microservices and infrastructure, ensuring that you can quickly identify and address potential issues before they impact your users.
Gotchas and Tips
- Storage for Prometheus: By default, the chart runs Prometheus on ephemeral (emptyDir) storage, so collected metrics are lost whenever the Prometheus pod is rescheduled. For production environments, configure persistent storage to prevent data loss.
- Secure Your Dashboards: Grafana and Prometheus should not be exposed publicly without proper authentication and authorization. Look into Kubernetes Ingress controllers or cloud-native solutions to securely expose these services.
- Custom Metrics: Beyond default metrics, you can configure Prometheus to scrape custom metrics from your applications, providing deeper insights into their behavior.
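The storage and custom-metrics tips above both come down to a few lines of configuration. The sketch below writes two illustrative files. The first is a values file that asks the chart for a persistent volume. The second is a ServiceMonitor that tells the Prometheus Operator to scrape an application. The storage class gp2, the 50Gi size, the app label my-app, and the port name http-metrics are all assumptions, as are the file names. Adjust them to your cluster and application.

```shell
# Values file requesting a persistent volume for Prometheus.
# Assumes a StorageClass named gp2 exists and can provision EBS volumes.
cat > storage-values.yaml <<'EOF'
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: gp2
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi
EOF

# ServiceMonitor for a hypothetical application.
# It selects Services labelled app=my-app and scrapes the port named
# http-metrics. The release label matches a Helm release named "prometheus",
# which the chart's default selector expects.
cat > my-app-servicemonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: http-metrics
      path: /metrics
      interval: 30s
EOF
```

The values file would be passed to helm upgrade with the -f flag, and the ServiceMonitor applied with kubectl apply -f my-app-servicemonitor.yaml. The application itself still has to expose metrics in the Prometheus text format on the chosen port.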
Implementing Prometheus and Grafana significantly enhances your ability to operate and maintain a healthy, efficient microservices architecture on Amazon EKS. By leveraging these tools, you can ensure your applications are performing optimally and are reliable for your users.