
Problem while experimenting with elasticity


I want to experiment with elasticity and scalability. I ran the following command on my system:

kubectl create -f https://raw.githubusercontent.com/robertobruzzese/Project_Cloud/main/simple

This creates 6 worker replicas. The system correctly creates 7 PyTorchJob pods, which I can watch while they are running with the command

kubectl get pods -l job-name=pytorch-simple -n kubeflow

The number of replicas matches the 7 given in the manifest (simple). (See figure: the command-line output shows 7 replicas.) Now I want to experiment with scalability, and by that I mean setting a CPU utilization threshold so that, when the pods reach it, the system automatically creates more replicas. To set up this autoscaling I created and applied the following YAML file:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pytorch-simple-hpa
  namespace: kubeflow
spec:
  scaleTargetRef:
    apiVersion: kubeflow.org/v1
    kind: PyTorchJob
    name: pytorch-simple
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

After creating this YAML file, I applied it with the following command:

kubectl apply -f pytorch-simple-hpa.yaml  
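One thing worth checking right after applying the manifest: an HPA can only act on a target that exposes the `scale` subresource, and `kubectl describe` surfaces any errors the HPA hits while fetching metrics or scaling the target. A quick diagnostic sketch (names taken from the manifest above; the output shown depends on your cluster):

```shell
# Does the HPA report current utilization, or "<unknown>"?
kubectl get hpa pytorch-simple-hpa -n kubeflow

# The Conditions and Events sections report metric-fetch failures
# or "the server could not find the requested resource" if the
# PyTorchJob CRD does not expose a scale subresource.
kubectl describe hpa pytorch-simple-hpa -n kubeflow
```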

Unfortunately, I did not observe any increase in the number of replicas. My question is: what is wrong with my setup? How should I make the number of replicas increase automatically once the CPU utilization threshold is exceeded? Thanks a lot.

asked 3 years ago · 375 views
1 Answer

Hello Roberto,

The Horizontal Pod Autoscaler needs to fetch CPU metrics from the Metrics Server, which is not installed by default on an EKS cluster. Please install the Metrics Server first; you can use this workshop documentation to deploy it.
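For reference, the Metrics Server can also be deployed directly from its official release manifest. A minimal sketch, assuming the upstream URL is still current and your cluster can pull the image:

```shell
# Deploy the Metrics Server from the kubernetes-sigs release manifest.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Confirm the deployment is up in kube-system.
kubectl get deployment metrics-server -n kube-system

# This only returns numbers once the Metrics Server is serving metrics.
kubectl top nodes
```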

Once the Metrics Server is deployed, please generate load on your application to observe the HPA scale out the replicas.
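Note that the load must land on the pods the HPA actually targets, since it averages utilization across them. A hedged sketch for driving CPU inside one of the PyTorchJob pods (this assumes the container image ships a shell and the `yes` utility, which may not hold for minimal images):

```shell
# Pick one pod from the job, using the label selector from the question.
POD=$(kubectl get pods -l job-name=pytorch-simple -n kubeflow \
  -o jsonpath='{.items[0].metadata.name}')

# Burn CPU in the background inside that pod.
kubectl exec -n kubeflow "$POD" -- sh -c 'yes > /dev/null &'

# Watch per-pod utilization and the HPA's reaction to it.
kubectl top pod -n kubeflow
kubectl get hpa pytorch-simple-hpa -n kubeflow --watch
```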

Please let me know in case of any further queries.

Thanks, Manish

answered 3 years ago
  • Hi! The Metrics Server is already installed. Now I know that there are two kinds of autoscaling: node autoscaling and pod autoscaling. In the first, the scaling happens in the number of nodes, while the second scales out the number of pods. I want to experiment with both. Since I am running a PyTorchJob on the pod, which creates a given number of workers, I expect that if CPU utilization exceeds the threshold, the system automatically increases the number of pods (or worker nodes) beyond the set number. But it increases neither the pods nor the nodes. Should I always create an artificial load?

  • Hey Roberto,

    Great, since you already have the Metrics Server installed, you can monitor the load on your pods with the kubectl top pod -n kubeflow command. If the load exceeds 50%, HPA should spin up more pods in your namespace. For node autoscaling, you will need to install either Cluster Autoscaler or Karpenter; they act on pending pods in your cluster. That is, if HPA creates pods but the scheduler does not have a node with the required resources available, those pods go into the Pending state, CAS/Karpenter then provisions a new node for your cluster, and the Kubernetes scheduler places the pending pods on it.
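    The pending-pod trigger described above can be observed directly; a sketch (the pod name in the second command is a placeholder for whatever the first command returns):

    ```shell
    # List pods stuck in Pending; with Cluster Autoscaler or Karpenter
    # installed, these are the pods that trigger a new node.
    kubectl get pods -n kubeflow --field-selector=status.phase=Pending

    # The Events section typically shows why scheduling failed,
    # e.g. "0/3 nodes are available: 3 Insufficient cpu".
    kubectl describe pod <pending-pod-name> -n kubeflow
    ```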

    Please let me know if this still does not work for you as required, or if you have any further questions.
