Questions tagged with Amazon Elastic Kubernetes Service



Browse through the questions and answers listed below or filter and sort to narrow down your results.

Hello!

# Short summary of context and issue

I am using EFS to mount a PV (ReadWriteMany [access-mode](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes)) via a PVC into EKS pods. The issue I'm having is that write updates propagate with big delays across pods: one pod may successfully write a file to the shared directory, but other pods see it some 10-60 seconds later (this delay varies across experiments, seemingly at random).

## Experiment & concrete results

I run two simple pods. [Pod1](https://github.com/RonaldGalea/my-eks-issue/blob/main/issue_preparation/debugging_pods/pod1.yaml) runs first and continuously checks whether `/workdir/share_point/example.txt` exists via the `stat` command. [Pod2](https://github.com/RonaldGalea/my-eks-issue/blob/main/issue_preparation/debugging_pods/pod2.yaml) runs second, writes the file, then performs the same checks. As the logs below show, the file created at `16:52:36.544` becomes visible in Pod1 only at ~`16:52:57.694`.

Logs of Pod1: [pod1.log](https://github.com/RonaldGalea/my-eks-issue/blob/main/issue_preparation/logs/pod1.log)
Logs of Pod2: [pod2.log](https://github.com/RonaldGalea/my-eks-issue/blob/main/issue_preparation/logs/pod2.log)

## Expected results

I expected Pod1 to see the file as soon as it is successfully written, as is the case for Pod2. As far as I understand, this would fit the [consistency](https://docs.aws.amazon.com/efs/latest/ug/how-it-works.html#consistency) model described in the docs.

## Worth mentioning

If I manually `kubectl exec` into the pods and attempt something similar, the problem does not seem to occur; see [manual_test.log](https://github.com/RonaldGalea/my-eks-issue/blob/main/issue_preparation/logs/manual_test.log)

1. Pod2: `echo "Manual test" > /workdir/share_point/manual_test.txt`
2. Pod1: `date +\"%T.%3N\" && stat /workdir/share_point/manual_test.txt`

## Steps to reproduce

Below is the simplest setup that reproduces the issue. Following the AWS docs, I set up a VPC, an EKS cluster, and EFS as a storage provider for the cluster. Each section references the documentation I followed and lists the commands used.

### VPC

Follows [creating-a-vpc](https://docs.aws.amazon.com/eks/latest/userguide/creating-a-vpc.html). Creates a VPC from a template, with 2 private and 2 public subnets configured suitably to host an EKS cluster.

```sh
aws cloudformation create-stack --stack-name public-private-subnets \
  --template-url https://s3.us-west-2.amazonaws.com/amazon-eks/cloudformation/2020-10-29/amazon-eks-vpc-private-subnets.yaml
```

### EKS cluster

Follows [create-cluster](https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html). I specify the cluster name and region, and for simplicity manually copy the subnet IDs of the above VPC.

```sh
eksctl create cluster --name my-demo-cluster --region eu-central-1 \
  --with-oidc --version 1.24 --node-ami-family Ubuntu2004 \
  --vpc-private-subnets private_subnet1_id,private_subnet2_id \
  --vpc-public-subnets public_subnet1_id,public_subnet2_id \
  --node-private-networking --managed
```

### EFS setup

Follows the [efs-csi-page](https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html).

#### Create a policy

`curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/docs/iam-policy-example.json`

```sh
aws iam create-policy \
  --policy-name AmazonEKS_EFS_CSI_Driver_Policy \
  --policy-document file://iam-policy-example.json
```

#### Create a ServiceAccount

Replace account-id accordingly in the command below.

```sh
eksctl create iamserviceaccount \
  --cluster my-demo-cluster \
  --namespace kube-system \
  --name efs-csi-controller-sa \
  --attach-policy-arn arn:aws:iam::account-id:policy/AmazonEKS_EFS_CSI_Driver_Policy \
  --approve \
  --region eu-central-1
```

#### Install the EFS CSI driver

```sh
helm repo add aws-efs-csi-driver https://kubernetes-sigs.github.io/aws-efs-csi-driver/
helm repo update
```

```sh
helm upgrade -i aws-efs-csi-driver aws-efs-csi-driver/aws-efs-csi-driver \
  --namespace kube-system \
  --set image.repository=602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/aws-efs-csi-driver \
  --set controller.serviceAccount.create=false \
  --set controller.serviceAccount.name=efs-csi-controller-sa
```

#### Creating the EFS, SG and mount points

For simplicity, manually copy the subnet IDs of the VPC.

`./complete_efs_setup.sh private_subnet1_id private_subnet2_id`

### Kubernetes StorageClass and PVC

Replace the filesystem ID in [kubernetes_storage/efs-storageclass.yaml](https://github.com/RonaldGalea/my-eks-issue/blob/main/issue_preparation/kubernetes_storage/efs-storageclass.yaml):

`kubectl apply -f kubernetes_storage/efs-storageclass.yaml`
`kubectl apply -f kubernetes_storage/efs-pvc.yaml`

### Deploy pods

`kubectl apply -f debugging_pods/pod1.yaml`

After the first one is running:

`kubectl apply -f debugging_pods/pod2.yaml`

### Exec into pods

`kubectl exec --stdin --tty pod1 -- /bin/bash`
`kubectl exec --stdin --tty pod2 -- /bin/bash`

### Relevant system information

Output of `aws --version`:
```
aws-cli/2.8.7 Python/3.9.11 Linux/5.15.0-58-generic exe/x86_64.ubuntu.22 prompt/off
```

Output of `eksctl version`:
```
0.125.0
```

Output of `helm version`:
```
version.BuildInfo{Version:"v3.10.3", GitCommit:"835b7334cfe2e5e27870ab3ed4135f136eecc704", GitTreeState:"clean", GoVersion:"go1.18.9"}
```

### Thank you

I'd be very thankful for any hint or pointer as to where the issue may lie. Thank you in advance.
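For reference, here is a minimal sketch of what the two debugging pods look like (the actual manifests are in the linked repo; the busybox image, one-second poll interval, and the PVC name `efs-claim` are assumptions for illustration):

```yaml
# Hypothetical sketch of the two debugging pods; see the linked repo for the
# real manifests. Assumes a PVC named efs-claim bound via the EFS StorageClass.
apiVersion: v1
kind: Pod
metadata:
  name: pod1
spec:
  containers:
    - name: checker
      image: busybox
      # Poll once per second for the shared file, logging a timestamp each time.
      command: ["sh", "-c", "while true; do date; stat /workdir/share_point/example.txt; sleep 1; done"]
      volumeMounts:
        - name: shared
          mountPath: /workdir/share_point
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: efs-claim
---
apiVersion: v1
kind: Pod
metadata:
  name: pod2
spec:
  containers:
    - name: writer
      image: busybox
      # Write the file once, then run the same check loop as pod1.
      command: ["sh", "-c", "echo hello > /workdir/share_point/example.txt; while true; do date; stat /workdir/share_point/example.txt; sleep 1; done"]
      volumeMounts:
        - name: shared
          mountPath: /workdir/share_point
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: efs-claim
```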
0
answers
1
votes
38
views
Ronald
asked 2 months ago
I installed AWS Load Balancer Controller through Helm. The ingress is created but the ALB is not, and I am getting an error. I followed the guide below: https://docs.aws.amazon.com/ko_kr/eks/latest/userguide/aws-load-balancer-controller.html

* Deployment / Service logs error:

```
{"level":"error","ts":1674024616.2905765,"logger":"controller.ingress","msg":"Reconciler error","name":...,"namespace":...,"error":"UnauthorizedOperation: You are not authorized to perform this operation.\n\tstatus code: 403}
```

* Ingress error:

```
Warning  FailedBuildModel  19s  ingress  Failed build model due to UnauthorizedOperation: You are not authorized to perform this operation. status code: 403
```
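A 403 `UnauthorizedOperation` from the controller's reconciler usually means the controller pod is not running with the IAM permissions the guide sets up. One thing worth checking (a sketch, assuming IRSA is used; the account ID and role name below are the guide's placeholders, not necessarily yours) is whether the controller's service account carries the role annotation:

```yaml
# Sketch only: the service account the controller runs as should be annotated
# with an IAM role that has AWSLoadBalancerControllerIAMPolicy attached.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: aws-load-balancer-controller
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/AmazonEKSLoadBalancerControllerRole
```

If the annotation is missing or points at a role without the policy, the controller falls back to whatever credentials the node provides, which typically cannot create ALBs.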
4
answers
0
votes
267
views
ari
asked 2 months ago
Kubernetes version: 1.23

Hi everyone, I have several services running inside AWS EKS, exposed through one ingress (AWS Load Balancer Controller). My ingress file is:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
    alb.ingress.kubernetes.io/certificate-arn: arn-of-certificate
    alb.ingress.kubernetes.io/healthcheck-path: /healthcheck
    alb.ingress.kubernetes.io/ssl-redirect: '443'
  name: app-ingress
  namespace: namespace
spec:
  rules:
    - host: my-domain.com
      http:
        paths:
          - path: /app1
            pathType: Prefix
            backend:
              service:
                name: app1-service
                port:
                  name: app1-port
          - path: /app2
            pathType: Prefix
            backend:
              service:
                name: app2-service
                port:
                  name: app2-port
          - path: /app3
            pathType: Prefix
            backend:
              service:
                name: app3-service
                port:
                  name: app3-port
          ...
  tls:
    - hosts:
        - my-domain.com
```

Everything works fine, but I want, for example, app3 to be more private: only specified IP addresses should be able to access that application. I haven't found anything helpful regarding this. For example, if a random person tries to access app1, he/she should be able to via https://my-domain.com/app1, but if he/she tries https://my-domain.com/app3 and his/her IP address is not in the allowed list, access should be denied. The thing is, I want one ALB for several applications. Anything would be helpful: some links, or what I should be looking for. I'm wondering if this is even possible, or is the only solution to create multiple ALBs and, for app3, change the network configuration to allow only selected IPs?
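One approach that may be worth exploring (a hedged sketch, assuming AWS Load Balancer Controller v2.x; the CIDR is a placeholder) is the controller's `conditions` annotation, which is keyed by the backend service name and can add a source-ip condition to just the app3 rule:

```yaml
# Sketch: attach a source-ip condition to the rule whose backend is
# app3-service; requests from other IPs fall through to the ALB default
# action instead of reaching app3.
metadata:
  annotations:
    alb.ingress.kubernetes.io/conditions.app3-service: >
      [{"field":"source-ip","sourceIpConfig":{"values":["203.0.113.0/24"]}}]
```

This keeps a single ALB for all applications; note that source-ip matches the client address as the ALB sees it, so any proxy in front of the ALB changes the effective IP.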
2
answers
0
votes
63
views
mc2609
asked 2 months ago
Hi AWS, I have installed Grafana on an EKS cluster via Helm. I need to add an ingress to it so that I can access Grafana via DNS. Please provide a link to an AWS blog post, or the steps to perform this. Thanks, Arjun Goel
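In the meantime, a minimal sketch of what such an Ingress could look like (assuming the AWS Load Balancer Controller is installed, and that the Helm release created a Service named `grafana` on port 80 in the `grafana` namespace; adjust names and host to your setup):

```yaml
# Sketch only: exposes the Grafana service through an internet-facing ALB.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  namespace: grafana
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - host: grafana.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: 80
```

After the ALB is provisioned, point a DNS record (for example a Route 53 alias) at its address.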
1
answers
0
votes
21
views
asked 2 months ago
I am deploying the ALB ingress controller on my cluster and its pod is stuck on Pending (0/1). The output of `kubectl describe pod <pod-name> -n kube-system` shows:

```
Warning  FailedScheduling  116s  default-scheduler  0/3 nodes are available: 3 Too many pods, 3 node(s) had untolerated taint {eks.amazonaws.com/compute-type: fargate}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
```

1. What could be the issue?
2. How can I fix it?
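On a Fargate-only cluster, a pod is only scheduled onto Fargate when a Fargate profile's selector matches its namespace (and labels); otherwise the `eks.amazonaws.com/compute-type: fargate` taint keeps it Pending. A hedged sketch of an eksctl config adding a profile for `kube-system`, where the controller usually runs (cluster name, profile name, and region are placeholders):

```yaml
# Sketch (eksctl ClusterConfig): a Fargate profile whose selector matches the
# namespace the stuck pod lives in.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: eu-west-1  # placeholder
fargateProfiles:
  - name: fp-kube-system
    selectors:
      - namespace: kube-system
```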
1
answers
0
votes
41
views
Joash
asked 2 months ago
What does EKS Fargate mean by "one pod, one node"? How many replicas of my deployment can I have on Fargate? I made a deployment which resulted in pending pods. `kubectl describe pod frontend-app-79fd8cfd9f-4jjg5 -n dev` gave me the following output:

```
Warning  FailedScheduling  3m7s  default-scheduler  0/2 nodes are available: 2 Too many pods, 2 node(s) had untolerated taint {eks.amazonaws.com/compute-type: fargate}. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
```
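Regarding "one pod, one node": each Fargate pod runs on its own dedicated Fargate-managed node, so the replica count is not capped by node capacity; pods go Pending when no Fargate profile matches their namespace. A hedged one-liner to add a profile for the `dev` namespace (cluster and profile names are placeholders):

```sh
# Sketch: create a Fargate profile matching the dev namespace so its pods can
# be scheduled onto Fargate.
eksctl create fargateprofile \
  --cluster my-cluster \
  --name fp-dev \
  --namespace dev
```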
2
answers
0
votes
291
views
Joash
asked 2 months ago
I am trying to access S3 buckets (DEV account) from an EMR on EKS (INT account) cluster running in a different account. I have created the IAM roles and configuration following the [EMR on EKS cross-account access setup guide](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/security-cross-account.html). When I start my Spark job, I get access-denied errors on the S3 bucket read operation. On further debugging, I also get an access-denied error when I manually run `aws s3 ls` inside the Spark job pod shell.

For the DEV account, the IAM role is **TestEMRCA** with the following policy and trust relationship:

```
// permission policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*",
        "s3-object-lambda:Get*",
        "s3-object-lambda:List*"
      ],
      "Resource": "*"
    }
  ]
}

// trust policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AR",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::INT_ID:role/emr_on_eks" // Job execution role
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

For the INT account, the IAM role is **emr_on_eks** with the following policy and trust relationship. It is also the job execution role EMR uses to run jobs:

```
// permission policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "logs:CreateLogStream",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "s3:ListBucket",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor2",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::DEV_ID:role/TestEMRCA"
    }
  ]
}

// trust policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::DEV_ID:oidc-provider/<OIDC_URL>/id/<OIDC_ID>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "<OIDC_URL>/id/<OIDC_ID>:sub": "system:serviceaccount:<NAMESPACE>:<SA>"
        }
      }
    }
  ]
}
```

To test whether the problem is in the DEV account's IAM role, I created a new role and associated it with a service account in the EKS cluster in the INT account. When I run a pod shell annotated with that service account, I can access the buckets (using `aws s3 ls`). I don't know what I'm missing in the EMR on EKS case, as there is only one tutorial from AWS that I found and followed. I hope someone can help me.

UPDATE: I can access the S3 buckets if I manually assume the DEV role and set the AWS environment variables:

```sh
aws sts assume-role --role-arn arn:aws:iam::DEV_ID:role/TestEMRCA --role-session-name s3-access-example
export AWS_ACCESS_KEY_ID=VAL_FROM_ABOVE_CMD
export AWS_SECRET_ACCESS_KEY=VAL2
export AWS_SESSION_TOKEN=VAL3
```

But this is a manual step that I don't want to perform, since EMR on EKS has configuration parameters to assume the role automatically, if I understand correctly:

```
--conf spark.hadoop.fs.s3.customAWSCredentialsProvider=com.amazonaws.emr.AssumeRoleAWSCredentialsProvider
--conf spark.kubernetes.driverEnv.ASSUME_ROLE_CREDENTIALS_ROLE_ARN=arn:aws:iam::DEV_ID:role/TestEMRCA
--conf spark.executorEnv.ASSUME_ROLE_CREDENTIALS_ROLE_ARN=arn:aws:iam::DEV_ID:role/TestEMRCA
```

I think either this is a bug or I am still missing something.
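For completeness, a hedged sketch of how those configuration parameters could be passed when submitting the job (the virtual cluster ID, entrypoint, bucket, and release label are placeholders):

```sh
# Sketch only: submit an EMR on EKS job with the cross-account
# AssumeRoleAWSCredentialsProvider configuration from the guide.
aws emr-containers start-job-run \
  --virtual-cluster-id <virtual-cluster-id> \
  --name cross-account-s3-test \
  --execution-role-arn arn:aws:iam::INT_ID:role/emr_on_eks \
  --release-label emr-6.9.0-latest \
  --job-driver '{
    "sparkSubmitJobDriver": {
      "entryPoint": "s3://my-bucket/job.py",
      "sparkSubmitParameters": "--conf spark.hadoop.fs.s3.customAWSCredentialsProvider=com.amazonaws.emr.AssumeRoleAWSCredentialsProvider --conf spark.kubernetes.driverEnv.ASSUME_ROLE_CREDENTIALS_ROLE_ARN=arn:aws:iam::DEV_ID:role/TestEMRCA --conf spark.executorEnv.ASSUME_ROLE_CREDENTIALS_ROLE_ARN=arn:aws:iam::DEV_ID:role/TestEMRCA"
    }
  }'
```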
1
answers
0
votes
44
views
asked 2 months ago
We enabled stickiness for an NLB in the Kubernetes Service YAML using the annotations below, but when we check the target group, stickiness always shows as off. Do you know what the reason is?

```yaml
service.beta.kubernetes.io/aws-load-balancer-type: "external"
service.beta.kubernetes.io/aws-load-balancer-name: fcubs-gateway
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: stickiness.enabled=true,stickiness.type=source_ip
service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: proxy_protocol_v2.enabled=true
service.beta.kubernetes.io/aws-load-balancer-attributes: load_balancing.cross_zone.enabled=true
```
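One detail that stands out: `aws-load-balancer-target-group-attributes` appears twice, and YAML map keys must be unique, so the second occurrence silently replaces the first and the stickiness attributes never reach the target group. A sketch with all target-group attributes merged into a single annotation (values taken from the question):

```yaml
# Sketch: one annotation carrying all target-group attributes, comma-separated.
service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: >-
  stickiness.enabled=true,stickiness.type=source_ip,proxy_protocol_v2.enabled=true
```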
0
answers
0
votes
29
views
An Tran
asked 2 months ago
I have a Fargate profile in EKS, but my CoreDNS pods are stuck in a Pending state. How do I go about fixing this?
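By default, EKS deploys CoreDNS with an annotation that pins it to EC2 compute, so on a Fargate-only cluster it stays Pending until a Fargate profile matches `kube-system` and that annotation is removed. A sketch of the commonly documented fix (assuming such a profile already exists):

```sh
# Remove the compute-type annotation so CoreDNS is eligible for Fargate...
kubectl patch deployment coredns \
  -n kube-system \
  --type json \
  -p='[{"op": "remove", "path": "/spec/template/metadata/annotations/eks.amazonaws.com~1compute-type"}]'

# ...then recycle the pods so the scheduler re-evaluates them.
kubectl rollout restart -n kube-system deployment coredns
```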
2
answers
0
votes
253
views
Joash
asked 2 months ago
I have an Angular and a Spring Boot application that I'd like to deploy on EKS. Angular is on port 80 with an Nginx image in front of it; Spring Boot is on port 8000. The container images are on ECR. I want to use Fargate. How do I go about deploying the application as a whole?
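The general shape is one Deployment and Service per application, with an Ingress in front. A hedged sketch for the Angular frontend (image URI, names, and replica count are placeholders; the Spring Boot backend follows the same pattern on port 8000):

```yaml
# Sketch only: Deployment + Service for the Nginx/Angular image from ECR.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: nginx-angular
          image: <account-id>.dkr.ecr.<region>.amazonaws.com/frontend:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend
  ports:
    - port: 80
      targetPort: 80
```

Note that on Fargate an ALB must use IP targets (`alb.ingress.kubernetes.io/target-type: ip`), since there are no EC2 instances to register.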
1
answers
1
votes
130
views
Joash
asked 2 months ago
Hi team, I want to create a Red Hat OpenShift Service on AWS (ROSA) cluster, and I'm not sure whether I should give the Machine CIDR and the Pod CIDR the same IP range (the subnet CIDR), since machines and pods run inside the same VPC and the same subnets, or whether those values should be different. My understanding is that machines and pods run inside the same subnet, so the Machine CIDR and Pod CIDR would logically share the IP range of the subnet in which they run. For the Service CIDR, can I always use 172.30.0.0/16? Also, is there a specific workshop that walks through the detailed steps of creating a ROSA cluster inside a **private VPC** with a **private link**? Thank you!
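For context, a hedged sketch of how the rosa CLI takes these ranges as separate flags. All values below are illustrative placeholders (10.128.0.0/14 and 172.30.0.0/16 are the OpenShift defaults); the Pod and Service CIDRs are internal overlay ranges and should not overlap the Machine CIDR that covers the VPC subnets:

```sh
# Sketch: a PrivateLink ROSA cluster in existing private subnets.
rosa create cluster \
  --cluster-name my-rosa-cluster \
  --private-link \
  --subnet-ids subnet-111,subnet-222 \
  --machine-cidr 10.0.0.0/16 \
  --pod-cidr 10.128.0.0/14 \
  --service-cidr 172.30.0.0/16
```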
1
answers
1
votes
66
views
Jess
asked 2 months ago
Hi all. Seeing that EKS no longer supports Docker from version 1.24, is it advisable to use cri-dockerd? How does one go about configuring it in EKS? And are there any other alternatives? Thanks in advance.
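For what it's worth, a quick way to confirm what the nodes are actually running (on EKS 1.24+ AMIs the CONTAINER-RUNTIME column reports containerd):

```sh
# The -o wide output includes each node's container runtime and version.
kubectl get nodes -o wide
```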
0
answers
0
votes
23
views
asked 3 months ago