App Mesh + ECS Fargate: when a task starts in ECS Fargate integrated with AWS App Mesh, the Cloud Map service discovery instance is not registered (the target group returns a 503 error)


When I run a Java microservice in ECS Fargate on its own, the application starts and runs smoothly. After integrating AWS App Mesh with ECS Fargate, the ECS task runs for a few minutes and is then restarted continuously. I found the following line in the CloudWatch logs for the Envoy container:

level=error msg="Couldn't determine the AZ ID due to: unable to fetch placement/availability-zone-id from IMDSv1, Get "http://169.254.169.254/latest/meta-data/placement/availability-zone-id": dial tcp 169.254.169.254:80: connect: invalid argument"

For reference: the Envoy container health status is HEALTHY and the application container health status is UNKNOWN. I'm not sure where I'm stuck. Is there a solution that would let me run the ECS Fargate service with AWS App Mesh integrated?

3 Answers

Hello. I got the same error message in the logs for my instance, but it worked fine anyway. Which container quits first when your task goes down? If your Java app is unhealthy or quits, both containers stop, since you want both to be "essential" in the task definition. What logs do you get when it quits?

I have that exact setup and a "demo" that uses it together with ECS Compose-X, and I have had no major issues with App Mesh + ECS + Fargate + ALB.

Maybe the wrong ports are set?

answered 2 years ago
  • "Which container quits first when your task goes down?" Only the Java app container goes into the UNKNOWN state. I added the proxy (healthy status) as a dependency of the Java container, and the proxy always stays healthy until the task is stopped.

    "Maybe the wrong ports are set?" The container listens on port 8080, so I configured the App Mesh virtual router to listen on port 8080 as well.
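To make the "essential" and dependency settings discussed above concrete, here is a minimal sketch of the two container definitions written as plain TypeScript objects. The field names follow the ECS task-definition JSON; the container name `searchservice` and both image placeholders are assumptions, and the Envoy health check command queries the admin port 9901 that appears in the proxy log. Treat it as an illustration of the shape, not as the asker's actual task definition.

```typescript
// Local types mirroring a small subset of the ECS task-definition JSON.
interface HealthCheck {
  command: string[];
  interval: number;    // seconds between checks
  timeout: number;     // seconds before a check counts as failed
  retries: number;
  startPeriod: number; // grace period in seconds before failures count
}

interface ContainerDependency {
  containerName: string;
  condition: 'START' | 'COMPLETE' | 'SUCCESS' | 'HEALTHY';
}

interface ContainerDefinition {
  name: string;
  image: string;
  essential: boolean;
  user?: string;
  healthCheck?: HealthCheck;
  dependsOn?: ContainerDependency[];
}

const envoy: ContainerDefinition = {
  name: 'envoy',
  image: '<envoy-image-uri>', // placeholder, not a real image URI
  essential: true,
  user: '1337', // should match ignoredUID in the proxy configuration
  healthCheck: {
    // Asks the Envoy admin endpoint whether the server state is LIVE.
    command: [
      'CMD-SHELL',
      'curl -s http://localhost:9901/server_info | grep state | grep -q LIVE',
    ],
    interval: 5,
    timeout: 2,
    retries: 3,
    startPeriod: 10,
  },
};

const app: ContainerDefinition = {
  name: 'searchservice',    // hypothetical container name
  image: '<app-image-uri>', // placeholder, not a real image URI
  essential: true,
  // Start the application only once the proxy reports HEALTHY.
  dependsOn: [{ containerName: envoy.name, condition: 'HEALTHY' }],
};

const containerDefinitions: ContainerDefinition[] = [envoy, app];
```

With both containers marked essential, ECS stops the whole task as soon as either one exits, which is why knowing which container stops first matters for diagnosis.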


Application container health - UNKNOWN

Service discovery service instance health - UNKNOWN

AWS_INIT_HEALTH_STATUS (attribute from the service discovery service) - UNHEALTHY

App container log:

:: Spring Boot :: (v2.3.1.RELEASE)
16 12:20:27.451 INFO 1 --- [main] c.h.s.SearchserviceApplication : Starting SearchserviceApplication v0.0.1-SNAPSHOT on ip-10-0-20-215.ec2.internal with PID 1 (/app.jar started by root in /)
16 12:20:27.527 INFO 1 --- [main] c.h.s.SearchserviceApplication : The following profiles are active: dev
16 12:20:41.042 INFO 1 --- [main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port(s): 8080 (http)
16 12:20:41.136 INFO 1 --- [main] o.apache.catalina.core.StandardService : Starting service [Tomcat]
16 12:20:41.137 INFO 1 --- [main] org.apache.catalina.core.StandardEngine : Starting Servlet engine: [Apache Tomcat/9.0.36]
16 12:20:41.541 INFO 1 --- [main] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
16 12:20:41.542 INFO 1 --- [main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 13701 ms
16 12:20:53.352 DEBUG 1 --- [main] s.w.s.m.m.a.RequestMappingHandlerMapping : 21 mappings in 'requestMappingHandlerMapping'
16 12:20:54.133 INFO 1 --- [main] o.s.b.a.e.web.EndpointLinksResolver : Exposing 2 endpoint(s) beneath base path '/actuator'
16 12:20:54.544 INFO 1 --- [main] pertySourcedRequestMappingHandlerMapping : Mapped URL path [/v2/api-docs] onto method [springfox.documentation.swagger2.web.Swagger2Controller#getDocumentation(String, HttpServletRequest)]
16 12:20:57.142 DEBUG 1 --- [main] s.w.s.m.m.a.RequestMappingHandlerAdapter : ControllerAdvice beans: 0 @ModelAttribute, 0 @InitBinder, 1 RequestBodyAdvice, 1 ResponseBodyAdvice
16 12:20:57.927 DEBUG 1 --- [main] o.s.w.s.handler.SimpleUrlHandlerMapping : Patterns [/webjars/, /] in 'resourceHandlerMapping'
16 12:20:57.942 DEBUG 1 --- [main] .m.m.a.ExceptionHandlerExceptionResolver : ControllerAdvice beans: 1 @ExceptionHandler, 1 ResponseBodyAdvice
16 12:20:58.832 INFO 1 --- [main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8080 (http) with context path ''
16 12:20:58.834 INFO 1 --- [main] d.s.w.p.DocumentationPluginsBootstrapper : Context refreshed
16 12:20:59.138 INFO 1 --- [main] d.s.w.p.DocumentationPluginsBootstrapper : Found 1 custom documentation plugin(s)
16 12:20:59.652 INFO 1 --- [main] s.d.s.w.s.ApiListingReferenceScanner : Scanning for api listing references
16 12:21:01.229 INFO 1 --- [main] c.h.s.SearchserviceApplication : Started SearchserviceApplication in 37.392 seconds (JVM running for 41.148)

Proxy container log:

time="16T11:30:53Z" level=info msg="App Mesh Environment Variables: [APPMESH_VIRTUAL_NODE_NAME=mesh/hypaiq/virtualNode/mesh-searchservice-vnode]"
time="16T11:30:53Z" level=info msg="Envoy Environment Variables: []"
time="16T11:30:53Z" level=info msg="Agent Environment Variables: []"
time="16T11:30:53Z" level=error msg="Couldn't determine the AZ ID due to: unable to fetch placement/availability-zone-id from IMDSv1, Get "http://169.254.169.254/latest/meta-data/placement/availability-zone-id": dial tcp 169.254.169.254:80: connect: invalid argument"
time="16T11:30:53Z" level=info msg="Generated Envoy Bootstrap Yaml Config: admin:\n accessLog:\n - typedConfig:\n '@type': type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog\n path: /tmp/envoy_admin_access.log\n address:\n socketAddress:\n address: 0.0.0.0\n portValue: 9901\nclusterManager:\n outlierDetection:\n eventLogPath: /dev/stdout\ndynamicResources:\n adsConfig:\n apiType: GRPC\n grpcServices:\n - googleGrpc:\n callCredentials:\n - fromPlugin:\n name: envoy.grpc_credentials.aws_iam\n typedConfig:\n '@type': type.googleapis.com/envoy.config.grpc_credential.v3.AwsIamConfig\n region: <REGION>\n serviceName: appmesh\n channelArgs:\n args:\n grpc.http2.max_pings_without_data:\n intValue: "0"\n grpc.keepalive_time_ms:\n intValue: "10000"\n grpc.keepalive_timeout_ms:\n intValue: "20000"\n channelCredentials:\n sslCredentials:\n rootCerts:\n filename: /etc/pki/tls/cert.pem\n credentialsFactoryName: envoy.grpc_credentials.aws_iam\n statPrefix: ads\n targetUri: appmesh-envoy-management.<REGION>.amazonaws.com:443\n transportApiVersion: V3\n cdsConfig:\n ads: {} \n initialFetchTimeout: 0s\n resourceApiVersion: V3\n ldsConfig:\n ads: {} \n initialFetchTimeout: 0s\n resourceApiVersion: V3\nlayeredRuntime:\n layers:\n - name: static_layer_0\n staticLayer:\n envoy.features.enable_all_deprecated_features: true\n envoy.reloadable_features.http_set_tracing_decision_in_request_id: true\n envoy.reloadable_features.no_extension_lookup_by_name: false\n re2.max_program_size.error_level: 1000\n - adminLayer: {} \n name: admin_layer\nnode:\n cluster: mesh/hypaiq/virtualNode/mesh-searchservice-vnode\n id: mesh/hypaiq/virtualNode/mesh-searchservice-vnode\n metadata:\n aws.appmesh.platformInfo:\n AvailabilityZone: <REGION>a\n ecsPlatformInfo:\n ecsClusterArn: arn:aws:ecs:<REGION>:xxxxxxxxxx:cluster/mesh-searchservice-cluster\n ecsLaunchType: AWS_ECS_FARGATE\n ecsTaskArn: arn:aws:ecs:<REGION>:xxxxxxxxxx:task/mesh-searchservice-cluster\n systemInfo:\n systemKernelVersion: 4.14.276-211.499.amzn2.x86_64\n systemPlatform: x86_64\n aws.appmesh.task.interfaces:\n ipv4:\n eth0:\n - 169.254.172.2/22\n eth1:\n - 10.0.20.212/24\n lo:\n - 127.0.0.1/8\n ipv6:\n eth0:\n - fe80::107e:24ff:fe49:c69f/64\n eth1:\n - fe80::4a:3aff:fe3f:2d6d/64\n lo:\n - ::1/128\n\n"
[16 11:30:53.364][1][info] [AppNet Agent] Executing command: [/usr/bin/envoy -c /tmp/envoy-config-3872489680.yaml -l info --drain-time-s 20]

[16 11:30:53.455][18][info][main] [source/server/server.cc:786] runtime: layers:
  - name: static_layer_0
    static_layer:
      re2.max_program_size.error_level: 1000
      envoy.reloadable_features.no_extension_lookup_by_name: false
      envoy.features.enable_all_deprecated_features: true
      envoy.reloadable_features.http_set_tracing_decision_in_request_id: true
  - name: admin_layer
    admin_layer:
      {}
[16 11:30:53.458][18][info][admin] [source/server/admin/admin.cc:134] admin address: 0.0.0.0:9901
[16 11:30:53.459][18][info][config] [source/server/configuration_impl.cc:127] loading tracing configuration
[16 11:30:53.459][18][info][config] [source/server/configuration_impl.cc:87] loading 0 static secret(s)
[16 11:30:53.459][18][info][config] [source/server/configuration_impl.cc:93] loading 0 cluster(s)
[16 11:30:53.461][18][info][config] [source/server/configuration_impl.cc:97] loading 0 listener(s)
[16 11:30:53.461][18][info][config] [source/server/configuration_impl.cc:109] loading stats configuration
[16 11:30:53.462][18][info][runtime] [source/common/runtime/runtime_impl.cc:462] RTDS has finished initialization
[16 11:30:53.462][18][info][upstream] [source/common/upstream/cluster_manager_impl.cc:205] cm init: initializing cds
[16 11:30:53.464][18][warning][main] [source/server/server.cc:761] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
[16 11:30:53.467][18][info][main] [source/server/server.cc:882] starting main dispatch loop
[16 11:30:53.569][18][warning][misc] [source/common/protobuf/message_validator_impl.cc:21] Deprecated field: type envoy.config.cluster.v3.Cluster Using deprecated option 'envoy.config.cluster.v3.Cluster.http2_protocol_options' from file cluster.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/version_history/version_history for details. If continued use of this field is absolutely necessary, see https://www.envoyproxy.io/docs/envoy/latest/configuration/operations/runtime#using-runtime-overrides-for-deprecated-features for how to apply a temporary and highly discouraged override.
[16 11:30:53.569][18][info][upstream] [source/common/upstream/cds_api_helper.cc:30] cds: add 3 cluster(s), remove 0 cluster(s)
[16 11:30:53.573][18][info][upstream] [source/common/upstream/cds_api_helper.cc:67] cds: added/updated 3 cluster(s), skipped 0 unmodified cluster(s)
[16 11:30:53.573][18][info][upstream] [source/common/upstream/cluster_manager_impl.cc:209] cm init: all clusters initialized
[16 11:30:53.573][18][info][main] [source/server/server.cc:863] all clusters initialized. initializing init manager
[16 11:30:53.640][18][info][upstream] [source/server/lds_api.cc:77] lds: add/update listener 'lds_ingress_0.0.0.0_15000'
[16 11:30:53.641][18][info][upstream] [source/server/lds_api.cc:77] lds: add/update listener 'lds_egress_0.0.0.0_15001'
[16 11:30:53.643][18][info][config] [source/server/listener_manager_impl.cc:789] all dependencies initialized. starting workers
[16 11:35:49.716][1][info] [AppNet Agent] Draining Envoy listeners...
[16 11:35:49.717][1][info] [AppNet Agent] Waiting 20s for Envoy to drain listeners.
[16 11:36:09.718][18][warning][main] [source/server/server.cc:821] caught ENVOY_SIGTERM
[16 11:36:09.718][18][info][main] [source/server/server.cc:952] shutting down server instance
[16 11:36:09.718][18][info][main] [source/server/server.cc:887] main dispatch loop exited
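ECS reports a container's health as UNKNOWN while its health check is still being evaluated or when no health check is defined for it, so one thing worth trying is an explicit health check on the application container. The sketch below builds such a check in TypeScript using the ECS task-definition field names. It rests on several assumptions: that `/actuator/health` is one of the two actuator endpoints the log shows being exposed, that `curl` is available in the application image, and that doubling the observed startup time (37.392 seconds in the log above) is a reasonable grace period. The function name `appHealthCheck` is hypothetical.

```typescript
// Subset of the ECS container health check fields.
interface HealthCheck {
  command: string[];
  interval: number;    // seconds between checks
  timeout: number;     // seconds before a check counts as failed
  retries: number;
  startPeriod: number; // grace period in seconds before failures count
}

// Builds a health check for a Spring Boot container.
// Assumes /actuator/health is exposed and curl exists in the image.
function appHealthCheck(port: number, observedStartupSeconds: number): HealthCheck {
  // Give the JVM twice its observed startup time; ECS caps startPeriod at 300.
  const startPeriod = Math.min(300, Math.ceil(observedStartupSeconds * 2));
  return {
    command: [
      'CMD-SHELL',
      `curl -sf http://localhost:${port}/actuator/health || exit 1`,
    ],
    interval: 30,
    timeout: 5,
    retries: 3,
    startPeriod,
  };
}

// Port and startup time taken from the application log above.
const healthCheck = appHealthCheck(8080, 37.392);
console.log(JSON.stringify(healthCheck, null, 2));
```

If the target group health check is what stops the task, the ECS service's health check grace period may need similar headroom, but that is a separate setting from the container health check sketched here.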

answered 2 years ago

I had the same situation and resolved it by checking the following.

  • Set these environment variables on the Envoy container:
  1. AWS_REGION
  2. APPMESH_RESOURCE_ARN
  3. ENABLE_ENVOY_STATS_TAGS
  • Make sure proxyConfiguration is set on the task definition. If you use the AWS CDK with TypeScript, sample code follows.
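    // CDK v2 import; with CDK v1 these classes come from '@aws-cdk/aws-ecs'.
    import { AppMeshProxyConfiguration, FargateTaskDefinition } from 'aws-cdk-lib/aws-ecs';

    // The statement below runs inside a construct or stack: `this`, executionRole,
    // taskRole and containerPort are defined elsewhere.
    const taskDefinition = new FargateTaskDefinition(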
      this,
      `fargate-task`,
      {
        executionRole,
        taskRole,
        cpu: 512,
        memoryLimitMiB: 2048,
        proxyConfiguration: new AppMeshProxyConfiguration({
          containerName: 'envoy',
          properties: {
            appPorts: [containerPort],
            proxyEgressPort: 15001,
            proxyIngressPort: 15000,

            // The App Mesh proxy runs with this user ID, and this keeps its
            // own outbound connections from recursively attempting to infinitely proxy.
            ignoredUID: 1337,

            // This GID is ignored and any outbound traffic originating from containers that
            // use this group ID will be ignored by the proxy. This is primarily utilized by
            // the FireLens extension, so that outbound application logs don't have to go through Envoy
            // and therefore add extra burden to the proxy sidecar. Instead the logs can go directly
            // to CloudWatch
            ignoredGID: 1338,

            egressIgnoredIPs: [
              '169.254.170.2', // Allow services to talk directly to the ECS metadata endpoint
              '169.254.169.254', // and the EC2 instance metadata endpoint
            ],

            // If there is outbound traffic to specific ports that should
            // bypass the proxy, those ports can be added here.
            egressIgnoredPorts: [],
          },
        }),
      },
    );
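
The three environment variables listed above can be assembled in one place before being passed to the Envoy container. The helper below is a minimal sketch in plain TypeScript: the function name `envoyEnvironment`, the region and the account ID are hypothetical, and the ARN check is only a loose sanity test, not a full validation. The asker's proxy log shows `APPMESH_VIRTUAL_NODE_NAME` being set, which is the variable that `APPMESH_RESOURCE_ARN` supersedes in newer Envoy images.

```typescript
// Builds the environment map for the Envoy sidecar container.
function envoyEnvironment(region: string, resourceArn: string): Record<string, string> {
  // Loose sanity check only: the value should look like an App Mesh ARN.
  if (!/^arn:[^:]+:appmesh:/.test(resourceArn)) {
    throw new Error(`Not an App Mesh ARN: ${resourceArn}`);
  }
  return {
    AWS_REGION: region,
    APPMESH_RESOURCE_ARN: resourceArn,
    ENABLE_ENVOY_STATS_TAGS: '1', // enables App Mesh defined stats tags
  };
}

// Hypothetical region and account ID; mesh and node names come from the log.
const env = envoyEnvironment(
  'us-east-1',
  'arn:aws:appmesh:us-east-1:111122223333:mesh/hypaiq/virtualNode/mesh-searchservice-vnode',
);
console.log(env);
```

In a CDK stack, a map like this would typically be passed as the `environment` option when the Envoy container is added to the task definition.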
answered 2 years ago
