Today, let’s discuss a key post-K8S setup task: Monitoring.
Whether it’s a platform, service, software, or hardware, in order to achieve the highest level of availability, we rely on various monitoring methods to help us understand the status of the target. This enables us to respond quickly when issues arise.
Done well, monitoring can even predict potential issues before they occur so that we can address them proactively. This article introduces some basic concepts of Loki, a log monitoring solution for the K8S platform.
As usual, here are the sections this article will cover:
- Basic Concepts of Loki
- PLG Stack
- Deployment Modes
1. Basic Concepts of Loki
In the past, the common choice for logging on Kubernetes was the EFK stack (Elasticsearch, Fluentd, Kibana). However, for smaller K8S clusters, this setup can be resource-intensive and complex to manage. In such cases, an alternative K8S log monitoring solution with Grafana Loki at its core can be a more suitable option.
Loki is inspired by Prometheus and, like EFK, offers high availability and horizontal scalability. The key difference is that Loki doesn’t index the full log content; it indexes only a set of labels attached to each log stream. This makes Loki lighter and less resource-hungry than a traditional EFK stack.
The differences between Loki and other monitoring solutions are as follows:
- It doesn’t index the entire log content, instead storing compressed, unstructured logs and indexing only the metadata. This results in a lighter and more cost-effective operation.
- It indexes and groups log streams with the same labels already in use with Prometheus, so switching between metrics and logs is seamless.
- Ideal for storing K8S Pod logs.
- Natively supported in Grafana v6.0 and later.
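The label-only indexing idea in the list above can be illustrated with a toy model. This is a minimal sketch, not Loki's actual storage engine: the class `TinyLabelIndexedStore` and its methods are hypothetical names, and `zlib` merely stands in for whatever compression Loki applies to its chunks. The point it shows is that the index holds one entry per label set, while the log lines themselves are stored compressed and are only scanned at query time.

```python
import zlib
from collections import defaultdict


class TinyLabelIndexedStore:
    """Toy model of label-only indexing (hypothetical, not Loki's real engine)."""

    def __init__(self):
        # The "index": one key per unique label set (a stream).
        # The values are compressed log lines that are never indexed.
        self._streams = defaultdict(list)

    def push(self, labels, line):
        """Append a log line to the stream identified by its labels."""
        key = frozenset(labels.items())
        self._streams[key].append(zlib.compress(line.encode("utf-8")))

    def query(self, selector, contains=""):
        """Pick streams by label match, then brute-force scan their lines."""
        wanted = set(selector.items())
        results = []
        for key, chunks in self._streams.items():
            if wanted <= key:  # every selector label must match the stream
                for chunk in chunks:
                    line = zlib.decompress(chunk).decode("utf-8")
                    if contains in line:
                        results.append(line)
        return results

    def index_size(self):
        """Number of index entries, which grows with label sets, not log volume."""
        return len(self._streams)


store = TinyLabelIndexedStore()
store.push({"namespace": "prod", "app": "api"}, "GET /health 200")
store.push({"namespace": "prod", "app": "api"}, "POST /orders 500 error")
store.push({"namespace": "dev", "app": "api"}, "GET /health 200")
print(store.index_size())
print(store.query({"namespace": "prod"}, contains="error"))
```

Three log lines produce only two index entries here, because two of the lines share the same label set. That is the trade-off behind the lighter footprint: a small index, at the cost of scanning line content when a query filters on it.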
2. PLG Stack
The term ‘PLG stack’ refers to the combination of the following three projects:
- P: Promtail
- L: Loki
- G: Grafana
The three components work together, and their functions can be described as follows:
- Promtail: Equivalent to “Filebeat/Fluentd” in EFK, responsible for collecting logs and sending them to Loki.
- Loki: Equivalent to Elasticsearch, used for storing logs and processing queries.
- Grafana: Equivalent to Kibana, the user interface (UI) for querying and visualizing the logs.
After Promtail collects the logs, it sends them to the main component, Loki. Users then query Loki with LogQL, a query language modeled on PromQL that can both filter log lines and derive Prometheus-style metrics from them, and view the results in the Grafana UI.
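To make the flow above more concrete, here is a minimal sketch of the data a log shipper sends to Loki and of the LogQL a user might run afterwards. It only builds the request body and URL and sends nothing over the network. The address in `LOKI_URL`, the label values, and the helper names `build_push_payload` and `build_query_url` are assumptions for illustration; the endpoints `/loki/api/v1/push` and `/loki/api/v1/query_range` are Loki's HTTP API paths.

```python
import json
import time
from urllib.parse import urlencode

# Hypothetical address; replace with the address of your own Loki gateway.
LOKI_URL = "http://loki.example.internal:3100"


def build_push_payload(labels, lines, now_ns=None):
    """Build the JSON body for a POST to /loki/api/v1/push."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [timestamp in nanoseconds as a string, log line].
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


def build_query_url(logql, limit=100):
    """Build a GET URL for /loki/api/v1/query_range."""
    params = urlencode({"query": logql, "limit": limit})
    return f"{LOKI_URL}/loki/api/v1/query_range?{params}"


# Log query: select streams by label, then keep lines containing "error".
log_query = '{namespace="prod", app="api"} |= "error"'
# Metric query: per-second rate of matching lines over 5 minutes, summed by app.
metric_query = 'sum by (app) (rate({namespace="prod"} |= "error" [5m]))'

payload = build_push_payload(
    {"namespace": "prod", "app": "api"},
    ["POST /orders 500 error"],
    now_ns=1700000000000000000,
)
print(json.dumps(payload, indent=2))
print(build_query_url(log_query))
```

In a real cluster Promtail builds and sends the push request for you, and Grafana issues the queries; the sketch only shows what travels between the three components.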
3. Deployment Modes
Loki offers three main deployment modes:
(1) Monolithic (All-in-One): All components run within a single process in a single container, and communication between components occurs via localhost (gRPC). Typically used for testing and small-scale setups. Deployment can be done using Helm. (Recommended for log volumes of up to about 100GB per day).
(2) Simple Scalable Mode (Simple HA): Deployed across multiple read/write replica nodes. (Recommended for daily usage of several terabytes)
- Read node: Handles log query responses (Read Path).
- Write node: Responsible for storing logs and backend indexing (Write Path).
- Gateway node: An Nginx-based load balancer that directs push traffic to the write node and other traffic to read nodes using round-robin.
(3) Microservices: Each component runs independently in its own container and can be scaled separately, which makes this mode a natural fit for Kubernetes. (Recommended for very large-scale deployments)
For production environments, it’s currently recommended to use the Simple Scalable Mode or Microservices mode for better performance and scalability.
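The gateway's role in Simple Scalable Mode can be sketched in a few lines. This is only an illustration of the routing rule described above, not the actual Nginx configuration: the class `GatewaySketch` and the node names are hypothetical, and rotating across several write nodes is an assumption on my part, since the description above only specifies round-robin for read traffic.

```python
from itertools import cycle

PUSH_PATH = "/loki/api/v1/push"  # Loki's log ingestion endpoint


class GatewaySketch:
    """Toy router: push traffic goes to write nodes, everything else to read nodes."""

    def __init__(self, write_nodes, read_nodes):
        # cycle() yields the nodes in order, forever: a simple round-robin.
        self._write = cycle(write_nodes)
        self._read = cycle(read_nodes)

    def route(self, path):
        """Return the node that should handle a request for this path."""
        if path == PUSH_PATH:
            return next(self._write)
        return next(self._read)


gateway = GatewaySketch(["write-0", "write-1"], ["read-0", "read-1", "read-2"])
for path in [PUSH_PATH, "/loki/api/v1/query_range", "/loki/api/v1/labels", PUSH_PATH]:
    print(path, "->", gateway.route(path))
```

Separating the two paths this way is what lets read and write capacity be scaled independently: adding read replicas speeds up queries without touching ingestion.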
Traditionally, after setting up a Kubernetes cluster, I would adopt the EFK stack for the entire log monitoring system. However, deploying and fine-tuning it often required significant effort, and creating suitable dashboards was another challenge. The traditional EFK architecture also tends to be resource-intensive, so more system resources have to be allocated during planning.
The Loki-based PLG Stack offers me a better alternative. I can set up the entire system in Monolithic mode for testing environments, and in production, I have more scalable options available. For administrators, this not only simplifies deployment but also provides the advantages of being lightweight and flexible.
I hope to gain a deeper understanding of monitoring and to share what I learn along the way.
Your encouragement keeps me motivated, so please stay tuned for more!