EndpointConnectionError from Python app when trying to get sampling rules

I have a Python / Flask application running in a Docker container under Elastic Beanstalk. The X-Ray daemon (v3.2.0) is installed inside the container. The daemon's logs show that it is successfully sending batches of data to the X-Ray service. However, my application logs show a failure when the Python X-Ray SDK (v2.6.0) attempts to get sampling rules, with a stack dump ending as follows:

  File "/usr/local/lib/python3.8/site-packages/botocore/endpoint.py", line 269, in _send
    return self.http_session.send(request)
  File "/usr/local/lib/python3.8/site-packages/botocore/httpsession.py", line 283, in send
    raise EndpointConnectionError(endpoint_url=request.url, error=e)
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://127.0.0.1:2000/GetSamplingRules"
[2021-02-11 19:53:22 +0000] [17] [INFO] No effective centralized sampling rule match. Fallback to local rules.

I can confirm that the daemon is running, although a request to the URL shown above returns a 403 Forbidden error. I don't know if that's expected or not.

Can anyone suggest what might be going on here? Is there an incompatibility between the 3.x daemon that's running and the 2.x SDK? Any help would be greatly appreciated. Thank you!

Asked 3 years ago · 1112 views
4 Answers

Hi jgarbers1
The X-Ray daemon needs to expose two ports (one UDP, one TCP) when it is running in a container. The X-Ray SDK uses the UDP port to send trace data to the daemon, and the TCP port to fetch the sampling rules. From your description of the problem, it seems you may have opened the daemon's UDP port but not the TCP port.
Can you verify this and try exposing both ports? If you're using a Dockerfile, you can follow this doc to do so: https://docs.aws.amazon.com/xray/latest/devguide/xray-daemon-ecs.html#xray-daemon-ecs-build
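One quick way to check the TCP side from inside the container is a plain socket connect. This is only a minimal sketch: the helper name `tcp_port_open` is made up for illustration, and 127.0.0.1:2000 is taken from the endpoint URL in the error message above. A successful connect only shows that something is accepting TCP connections on that port; it does not prove the daemon can fetch sampling rules.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it is not part of the X-Ray SDK.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `tcp_port_open("127.0.0.1", 2000)` run in the same container as the app reports whether the address from the traceback is reachable over TCP at that moment.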

If you still experience the issue, please provide details on the configuration of your app and daemon.
Thanks!

AWS
Answered 3 years ago

Thanks for the help, prashataws! I'll review the information at that link today. In the meantime, though, to clarify: the application using the SDK and the daemon are both running in the same container, so it doesn't seem like any ports would need to be exposed...? I know this is somewhat at odds with the Docker "one process per container" guideline, but I didn't want to take on the task of converting both my front end and worker applications into multi-container EB projects.

Could the "could not connect" situation just be transient, if the SDK is trying to connect to it before it's completely up and running?

Answered 3 years ago

I see. If the app and the daemon are running in the same container and you didn't need to expose the UDP port for sending segments to the daemon, then you may not need to expose the TCP port for sampling rules either. Still, in my opinion it would be worth trying with both ports exposed.
What makes you think this issue could be transient? Do you see the error only during the first few calls to your application and then it works fine afterwards? Ideally the daemon process should be up and running before the application starts creating segments/subsegments.
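If startup ordering is the suspect, one option is to have the app's entrypoint wait until the daemon's TCP port accepts connections before launching the app. The following is a rough sketch under assumptions: `wait_for_port` is a hypothetical helper, the default deadline and polling interval are arbitrary, and 127.0.0.1:2000 is the address from the traceback.

```python
import socket
import time


def wait_for_port(host: str, port: int,
                  deadline_s: float = 30.0, interval_s: float = 0.5) -> bool:
    """Poll host:port until a TCP connect succeeds or the deadline passes.

    Hypothetical helper for illustration. Returns True once a connection
    is accepted, False if the deadline elapses first.
    """
    end = time.monotonic() + deadline_s
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Not accepting connections yet; give up once the deadline passes.
            if time.monotonic() >= end:
                return False
            time.sleep(interval_s)
```

Calling `wait_for_port("127.0.0.1", 2000)` before the app starts, and logging the result, would also help tell a transient startup race apart from a port that never becomes reachable.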

AWS
Answered 3 years ago

I'll experiment with the ports shortly. It's been a few weeks since I had the problem, and my notes aren't detailed enough for me to recall whether the errors went away on their own or not. I do start the daemon before launching my app, but it's possible that the daemon is still initializing when the app starts trying to talk to it. I'll be revising the system later in the week and will follow up here if I'm still having issues. Thanks again for the help!

Answered 3 years ago
