I wish to experiment with elasticity and scalability. I have run the following command on my system:
kubectl create -f https://raw.githubusercontent.com/robertobruzzese/Project_Cloud/main/simple
This creates a PyTorchJob with 6 worker replicas. The system correctly creates 7 pods (1 master plus the 6 workers), which I can watch while they are running with the command
kubectl get pods -l job-name=pytorch-simple -n kubeflow
The number of pods matches the 7 replicas specified in the manifest (simple).
(see figure)
Now I want to experiment with scalability. By that I mean setting a CPU utilization threshold so that, when it is reached in the pods, the system automatically creates more replicas.
To enable this autoscaling I created the following YAML file:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pytorch-simple-hpa
  namespace: kubeflow
spec:
  scaleTargetRef:
    apiVersion: kubeflow.org/v1
    kind: PyTorchJob
    name: pytorch-simple
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
After creating this YAML file I applied it with the following command:
kubectl apply -f pytorch-simple-hpa.yaml
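To verify that the HPA object exists and can read its metric, I can inspect it with:

kubectl get hpa pytorch-simple-hpa -n kubeflow
kubectl describe hpa pytorch-simple-hpa -n kubeflow

(If the TARGETS column shows "unknown/50%", the HPA cannot read CPU metrics for the target, for example because the pods have no CPU resource requests set or the target does not expose a scale subresource.)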
Unfortunately I did not observe any increase in the number of replicas.
My question is: what is wrong with my setup? How should I automatically increase the number of replicas once the CPU utilization threshold is exceeded?
Thanks a lot.
Bye
Hi! The metrics server is already installed. I now know that there are two kinds of autoscaling: node autoscaling and pod autoscaling. In the first, I suppose the scaling happens in the number of nodes, while in the second it scales out the number of pods. I want to experiment with both. Since I am running a PyTorchJob that creates a given number of workers, I expect that if CPU utilization exceeds the threshold, the system will automatically increase the number of pods (or nodes) beyond the configured number. But neither the pods nor the nodes increase. Should I always create an artificial load?
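For example, to generate an artificial CPU load I could run a busy loop inside one of the worker pods with something like (the pod name here is an assumption based on the job name):

kubectl exec -n kubeflow pytorch-simple-worker-0 -- sh -c 'yes > /dev/null &'

The yes command pins a CPU core, which should push the measured utilization above the 50% target.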
Hey Roberto,
Great, if you have the metrics server already installed, you can monitor the load on your pods with the
kubectl top pod -n kubeflow
command. If the load exceeds 50%, the HPA should kick in and create more pods in your namespace.
For node autoscaling, you will need to install either Cluster Autoscaler or Karpenter; they act on any pending pods in your cluster. That is, if the HPA creates pods but the scheduler has no node with the required resources available, those pods go into the Pending state, and then CAS/Karpenter creates a new node for your cluster so the Kubernetes scheduler can place the pending pods on it.
Please let me know if this still does not work for you as required, or if you have any further questions.
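As a quick check for the node-autoscaling path, you can list any pods stuck waiting for capacity with:

kubectl get pods -n kubeflow --field-selector=status.phase=Pending

If pods stay Pending and no autoscaler is installed, no new nodes will ever be added; with CAS/Karpenter running, these are exactly the pods that trigger a scale-up.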