Local volume vs HostPath(en)

Albert Weng

3 min read · Sep 21, 2023

This article explains the differences between two Kubernetes volume types, local volumes and hostPath, and some considerations for using each.

Let’s get started!

1. HostPath

A hostPath volume lets you mount a file, directory, socket, or block device from the node's filesystem into a Pod.

Here is an example of hostPath usage:

apiVersion: v1
kind: Pod
metadata:
  name: hostpath-example
spec:
  containers:
  - name: hostpath-container
    image: nginx
    volumeMounts:
    - name: hostpath-volume
      mountPath: /var/www/html
  volumes:
  - name: hostpath-volume
    hostPath:
      path: /var/data
      type: Directory

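This example mounts the /var/data directory from the node at /var/www/html inside the container. The Directory type requires that /var/data already exist on the node; if it does not, the volume fails to mount and the container does not start.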

When using hostPath volumes:

Pair a hostPath volume with a nodeSelector so the Pod is scheduled onto the node that actually holds the data. This becomes cumbersome with a large number of Pods or nodes.
If you use the DirectoryOrCreate or FileOrCreate type, make sure the kubelet has permission to create files or directories at that path on the node.
If files or directories on the node were created by root and are then mounted into a container, make sure the container process has permission to read and write them. You may need to adjust the ownership or mode of the path, or the user the container runs as.
The Kubernetes scheduler does not consider the size of hostPath volumes, and there is no built-in way to set a size limit on them.
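
The first two points above can be sketched as a variation of the earlier manifest. This is a minimal illustration, not a recommended production setup: the node name node-1 and the Pod name are hypothetical, and it assumes the node carries the standard kubernetes.io/hostname label.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-pinned-example   # hypothetical name
spec:
  # Pin the Pod to the node that holds /var/data.
  # "node-1" is a placeholder; use the hostname label of your own node.
  nodeSelector:
    kubernetes.io/hostname: node-1
  containers:
  - name: hostpath-container
    image: nginx
    volumeMounts:
    - name: hostpath-volume
      mountPath: /var/www/html
  volumes:
  - name: hostpath-volume
    hostPath:
      path: /var/data
      # Created by the kubelet if missing, so the kubelet needs
      # write permission on the parent directory.
      type: DirectoryOrCreate
```

With a nodeSelector like this, the Pod stays Pending if node-1 is unavailable, which is the trade-off of tying a Pod to node-local data.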

2. Local volume

A local volume represents local storage on a node, such as a disk, partition, or directory, and lets Kubernetes use it as a statically created Persistent Volume (PV).

Its main purpose is to address the hostPath issues listed above.

Local volumes require additional logic and handling by the PV controller and scheduler to ensure that when a Pod needs to be rescheduled, it can be placed on the same node where the local volume resides.

A key benefit of local volumes is that the scheduler uses the PV's node information to place the Pod on the worker node where the volume resides, which is crucial for applications with data locality requirements.
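
As a rough sketch of what such a static PV can look like, the manifest below declares a local volume and ties it to one node. The PV name, the path /mnt/disks/ssd1, the capacity, and the node name node-1 are all placeholders chosen for illustration; the storageClassName follows the local-storage name used later in this article.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv          # hypothetical name
spec:
  capacity:
    storage: 10Gi                 # placeholder size
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1         # placeholder: a disk or directory on the node
  # Tells the scheduler which node holds this volume.
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node-1                # placeholder node name
```

Unlike the hostPath case, the Pod itself needs no nodeSelector here: the node constraint lives on the PV and is applied to any Pod that uses a claim bound to it.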

3. Appropriate scenarios

Preloading data from remote storage: Data is pre-loaded from remote storage into a local directory to speed up access by Pods (caching). Because access is read-only, data integrity is not a concern. Some AI training workloads take this approach.
Local software-defined storage (SDS) solutions: When a local SDS solution that provides its own data replicas is in place (for example, Ceph or Portworx), local volumes can be used effectively.
Not suitable for scalable environments: Local volumes are a poor fit where scaling and dynamic resource allocation are essential, because each volume ties its Pod to a single node.
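
For the read-only caching scenario in the first item, a Pod can mount the pre-loaded data without write access. The following is only a sketch: the Pod name, the image, the mount path, and the claim name dataset-cache-pvc are hypothetical, and it assumes a PVC bound to a local PV that already contains the data.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: training-job-example      # hypothetical name
spec:
  containers:
  - name: trainer
    image: nginx                  # placeholder image
    volumeMounts:
    - name: dataset-cache
      mountPath: /data
      readOnly: true              # the container cannot modify the cached data
  volumes:
  - name: dataset-cache
    persistentVolumeClaim:
      claimName: dataset-cache-pvc   # assumed to exist and be bound to a local PV
      readOnly: true
```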

4. Considerations for Using Local Volumes

When defining a Persistent Volume (PV) for a local volume, use .spec.nodeAffinity to declare which node the volume lives on. This field is required for local PVs.
If you use a StorageClass for local volumes (named local-storage here), set volumeBindingMode: WaitForFirstConsumer to delay binding. The PV controller then does not bind a PV to a PVC immediately. Instead, it waits until a Pod that uses the PVC has been scheduled, so the binding can take the Pod's scheduling constraints into account.
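
A minimal sketch of the delayed-binding setup might look like the following. The StorageClass name local-storage comes from the text above; the PVC name, Pod name, and requested size are placeholders. Local volumes are statically provisioned, so the class uses the kubernetes.io/no-provisioner provisioner and a matching local PV must already exist.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # no dynamic provisioning for local volumes
volumeBindingMode: WaitForFirstConsumer     # bind only once a consuming Pod is scheduled
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-local-pvc         # hypothetical name
spec:
  storageClassName: local-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi               # placeholder size
---
apiVersion: v1
kind: Pod
metadata:
  name: local-volume-example      # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /var/www/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: example-local-pvc
```

With this configuration the PVC stays in the Pending state until the Pod is created, which is expected behavior rather than an error.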
