Optimize K8S for Your System: Quick Tips! (En)

Albert Weng
Apr 8, 2024

This article briefly discusses what to do after installing a Kubernetes cluster, once you start thinking about how to make the cluster run more smoothly.

The points provided in this article can also be considered for post-installation adjustments. However, in practice, further adjustments may be needed based on different enterprise requirements, the purpose of the cluster, and its actual use.

The following are the topics that this article will cover:

  1. Why optimization is necessary
  2. The importance of optimization
  3. Basic recommendations
  4. Conclusion

1. Why optimization is necessary

As Kubernetes gains popularity among businesses, the need to fine-tune existing platforms to accommodate various scenarios grows. After all, there’s a diverse range of applications with unique behaviors running on this platform.

So, let’s start by defining what K8S optimization means. Optimization refers to the process of making something (such as a design, system, or decision) as perfect, practical, or efficient as possible in its behavior, process, or method.

From the perspective of Kubernetes, our aim is to ensure that application services run as reliably and efficiently as possible in the environment. Each organization may have its own definition, but it primarily comes down to the following two components:

  • Performance and reliability of the application: This includes factors such as response time and downtime.
  • Cost of running the application: This includes operational costs such as CPU, RAM, and disk usage.

In simple terms, it’s all about meeting the SLA/SLO of the running application services.

2. The importance of optimization

With the increasing adoption of K8S, many container scheduling, runtime, and automation tasks are becoming more complex.

The primary reasons for adopting K8S typically include:

  • Rapid deployment of applications
  • Reduction of IT costs
  • Easier application updates

By optimizing the Kubernetes platform, the following benefits can be achieved:

  • Cost savings: Platforms are often built on public clouds, where the long-term cost of wasted resources could exceed 47%.
  • User satisfaction: Improved SLA/SLO
  • More efficient resource utilization: Balancing cost savings with efficiency
  • Environmental responsibility: Less resource wastage

3. Basic recommendations

In practice, optimization can be very time-consuming and challenging. Completing a comprehensive tuning process may require days to weeks of adjustments and debugging.

(1) Achieving Cluster Auto-scaling

Implementing automatic scaling of cluster nodes lets resource utilization track actual workload demand, so that the resources you provision align closely with what you actually need.

Automatic scaling encompasses several aspects:

  • Cluster Autoscaler: Automatically adjusts the number of cluster nodes to ensure there is sufficient capacity for Pods to be scheduled.
  • Vertical Pod Autoscaler (VPA): Automatically adjusts the CPU/RAM requests of Pods based on their observed usage.
  • Horizontal Pod Autoscaler (HPA): Automatically scales the number of application service Pods based on observed CPU and other load factors. This approach handles sudden spikes in demand and automatically reduces resources once the demand subsides.
  • Addon Resizer: A simplified version of the Vertical Pod Autoscaler that adjusts the resource requests of a deployment based on the number of nodes in the cluster.
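As an illustration of the HPA item above, here is a minimal sketch. The Deployment name web, the replica range, and the 70% CPU target are all assumptions made for the example, not values from a real cluster, and the cluster needs a metrics source such as metrics-server for CPU utilization to be available.

```yaml
# Hypothetical HPA: scales a Deployment named "web" between 2 and 10 replicas.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa              # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                # assumed target Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the Pods' CPU requests
```

Because utilization is measured against CPU requests, the target Pods need CPU requests set for this metric to work.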

Furthermore, leveraging node or pod affinity and anti-affinity can spread application services more evenly across nodes and help maintain high availability.
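A sketch of pod anti-affinity, assuming a hypothetical Deployment labeled app: web. It asks the scheduler to prefer placing replicas on different nodes; the "preferred" form is used so that Pods can still be scheduled when there are fewer nodes than replicas.

```yaml
# Hypothetical Deployment whose replicas prefer to land on different nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          # Soft rule: spread if possible, but do not block scheduling.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: web
                topologyKey: kubernetes.io/hostname   # one replica per node where possible
      containers:
        - name: web
          image: nginx:1.25  # placeholder image
```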

(2) Optimizing Container Requests & Limits

Kubernetes allows specifying the CPU/RAM requests and limits for containers. Adjusting these parameters appropriately ensures that there are sufficient resources for operation.

However, in practice, it’s advisable not to set CPU limits too tightly. I would recommend letting the cluster handle CPU resource scheduling autonomously.
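One way to read this recommendation is sketched below, with a hypothetical Pod named app and illustrative values: CPU and memory requests are set, memory has a limit, and the CPU limit is deliberately left out so the container can use spare CPU on the node.

```yaml
# Hypothetical Pod: requests for CPU and memory, a limit for memory only.
apiVersion: v1
kind: Pod
metadata:
  name: app                  # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.25      # placeholder image
      resources:
        requests:
          cpu: 250m          # used by the scheduler to place the Pod
          memory: 256Mi
        limits:
          memory: 512Mi      # memory is capped; no cpu limit is set on purpose
```

Note that this only works as written if the namespace does not force CPU limits: a ResourceQuota on limits.cpu would reject a Pod without one, and a LimitRange with a default CPU limit would add one automatically.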

(3) Accurate Workload Estimation

Properly estimate the actual resource usage by analyzing workloads. Overestimation can lead to increased costs, while underestimation can result in instability of the application system. The following tools can assist in accurate analysis:

  • Prometheus / Grafana
  • Regular reviews of resource monitoring data
  • Third-party tools like “Kubecost”

Another aspect of optimization is reducing the size of container images, since smaller images are faster to pull and start, which benefits every deployed workload.

(4) Utilizing Namespace Quotas and Limits

Defining quotas and limits directly at the namespace level allows for better allocation of resources across departments or projects, ensuring that one project doesn’t consume the majority of resources.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-demo
spec:
  hard:                      # totals allowed across all Pods in the namespace
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi

[root]# kubectl get resourcequota mem-cpu-demo --output=yaml
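The "limits" half of this section can be covered with a LimitRange. The sketch below is an assumption-laden example (the namespace team-a and all values are hypothetical): it gives containers default requests and limits when they declare none, which also keeps them admissible under a quota like the one above.

```yaml
# Hypothetical LimitRange: per-container defaults and a ceiling in one namespace.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # illustrative name
  namespace: team-a          # assumed namespace
spec:
  limits:
    - type: Container
      defaultRequest:        # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
      default:               # applied when a container sets no limits
        cpu: 500m
        memory: 256Mi
      max:                   # no single container may exceed this
        cpu: "1"
        memory: 1Gi
```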

(5) Optimizing Storage Costs

Storage significantly impacts the operational costs of Kubernetes. Reduce running costs by using appropriate storage classes and dynamic provisioning. Regularly review unused PVCs to release resources.
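As a rough sketch of dynamic provisioning, the manifests below define a StorageClass and a PVC that uses it. The provisioner shown (the AWS EBS CSI driver) and its gp3 parameter are only one possible backend chosen for illustration; the names and sizes are hypothetical, and your cluster's provisioner will likely differ.

```yaml
# Hypothetical StorageClass; replace the provisioner with your cluster's CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-expandable  # illustrative name
provisioner: ebs.csi.aws.com # assumption: AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Delete        # the backing volume is removed when the PVC is deleted
allowVolumeExpansion: true   # start small and grow the claim later
volumeBindingMode: WaitForFirstConsumer   # provision only when a Pod needs the volume
---
# A claim against that class; the volume is created on demand.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data             # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard-expandable
  resources:
    requests:
      storage: 10Gi
```

With reclaimPolicy: Delete, removing an unused PVC also releases the underlying volume, which is what makes the periodic PVC review pay off.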

(6) Adopting Multi-tenancy

Running all application services on the same Kubernetes cluster can effectively save costs. However, for more efficient service provision, it’s advisable to distribute different types of application services across different clusters.

Multi-tenancy allows for:

  • Separate RBAC rules per tenant to protect resource access
  • Isolation using namespaces
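The two points above can be combined as in the following sketch, which assumes a hypothetical tenant namespace team-a and a group team-a-devs coming from your identity provider. The Role grants access only inside that namespace.

```yaml
# Hypothetical Role: manage common workload resources in one namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-editor           # illustrative name
  namespace: team-a          # assumed tenant namespace
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "services", "deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Binds the Role to an assumed group; members get no rights in other namespaces.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-editor-binding
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-devs        # hypothetical group name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: app-editor
  apiGroup: rbac.authorization.k8s.io
```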

(7) Implementing Network Policies

Implementing network policies effectively controls traffic costs and resource access. They can prevent unnecessary inter-Pod communication, reducing the load on the underlying network infrastructure and ensuring that only necessary traffic is allowed.
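A common pattern is to deny all ingress in a namespace and then allow specific flows. The sketch below assumes a hypothetical namespace team-a, Pods labeled app: frontend and app: api, and port 8080. It only takes effect if the cluster's network plugin enforces NetworkPolicy.

```yaml
# Hypothetical default-deny: blocks all ingress to every Pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a          # assumed namespace
spec:
  podSelector: {}            # empty selector matches all Pods
  policyTypes:
    - Ingress
---
# Then allow only frontend Pods to reach the api Pods on one port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: api               # assumed label on the target Pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend  # assumed label on the client Pods
      ports:
        - protocol: TCP
          port: 8080         # illustrative port
```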

(8) Using Pod Disruption Budgets

Using Pod Disruption Budgets (PDB) ensures that too many pods are not taken down at once by voluntary disruptions, such as node drains during maintenance tasks. PDBs control the number of pods that can be disrupted at any given time, ensuring system availability and reliability.

PDBs primarily have two attributes: minAvailable specifies the minimum number (or percentage) of pods that must remain available during a disruption, while maxUnavailable specifies the maximum number (or percentage) of pods that can be unavailable at any given time. Only one of the two can be set in a single PDB.
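A minimal PDB might look like the sketch below, assuming a hypothetical application whose Pods carry the label app: web and run with at least three replicas.

```yaml
# Hypothetical PDB: keep at least 2 "web" Pods running during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb              # illustrative name
spec:
  minAvailable: 2            # alternatively use maxUnavailable, but not both
  selector:
    matchLabels:
      app: web               # assumed Pod label
```

With this in place, a node drain evicts a matching Pod only if at least two others remain available; it does not protect against involuntary failures such as a node crash.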

(9) Monitoring and Analyzing Costs

Regularly monitor the operational costs of the entire architecture with appropriate tools and services, so that cost analysis is more accurate and optimizations can be targeted.

  • Utilize cost analysis tools provided by public cloud providers.
  • For on-premises Kubernetes deployments, use third-party tools for estimation (e.g., Kubecost).

4. Conclusion

Most Kubernetes cluster administrators have their own way of optimizing. It’s not about right or wrong; it’s based on experience and the nature of the services. The suggestions in this article are gathered from various sources.

Personally, I think storage optimization is crucial. As the volume of services grows, managing storage becomes burdensome. Many operational issues stem from backend PV/PVC not binding correctly. Optimizing storage can also help with backup/restore and disaster recovery strategies, maximizing benefits in one go.

That’s all for this article. See you next time!


