
AWS Neptune OpenSearch Replication


We are using https://docs.aws.amazon.com/neptune/latest/userguide/full-text-search-cfn-setup.html for Neptune-to-OpenSearch replication, but we are now encountering errors in the Lambda function. The issue is intermittent. The first error is:

java.lang.RuntimeException: java.net.SocketException: Connection reset
    at com.sunpharma.streamreplicator.lambda.LambdaHandler.handleRequest(LambdaHandler.java:77)
    at com.sunpharma.streamreplicator.LambdaHandler.handleRequest(LambdaHandler.java:102)
    at jdk.internal.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)


The second error is:

com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to monitoring.ap-south-1.amazonaws.com:443 [monitoring.ap-south-1.amazonaws.com/15.206.137.240, monitoring.ap-south-1.amazonaws.com/15.206.137.255, monitoring.ap-south-1.amazonaws.com/15.206.137.249, monitoring.ap-south-1.amazonaws.com/15.206.137.202, monitoring.ap-south-1.amazonaws.com/15.206.137.203, monitoring.ap-south-1.amazonaws.com/15.206.137.210, monitoring.ap-south-1.amazonaws.com/15.206.137.244, monitoring.ap-south-1.amazonaws.com/15.206.137.247] failed: connect timed out

2 Answers

The errors you're encountering with Neptune-to-OpenSearch replication appear to be network connectivity problems, most likely caused by VPC misconfiguration between your Lambda function and the AWS services it needs to reach.

Based on the error messages, there are two main issues:

  1. Connection reset errors when the Lambda function is trying to communicate with Neptune or OpenSearch
  2. Connection timeout when trying to reach CloudWatch monitoring endpoints

Here are some potential solutions:

  1. Verify VPC configuration:
  • Ensure that the OpenSearch cluster and Lambda function are located in the same VPC as your Neptune DB cluster
  • Confirm that the OpenSearch cluster is configured in VPC mode (not internet mode)
  • Check that security groups allow proper communication between Lambda, Neptune, and OpenSearch
  2. Check security group rules:
  • Make sure the security group for your Lambda function allows outbound traffic to Neptune (port 8182) and OpenSearch (port 443)
  • Verify that the security groups for Neptune and OpenSearch allow inbound traffic from the Lambda function's security group
  3. Network connectivity:
  • Ensure your VPC has a properly configured NAT Gateway so the Lambda function can reach AWS services like CloudWatch
  • If your Lambda is in a private subnet, it needs a route to AWS services via a NAT Gateway or VPC endpoints
  4. VPC endpoints:
  • Consider adding VPC endpoints for services like CloudWatch, Neptune, and OpenSearch to improve connectivity reliability
  • This can help resolve the timeout errors when connecting to monitoring.ap-south-1.amazonaws.com
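To narrow down which hop is failing, you can run a small TCP reachability probe from inside the Lambda's subnets (for example, in a temporary test Lambda or on an EC2 instance in the same VPC). This is an illustrative sketch, not part of the replication stack; any endpoint you pass in is your own:

```python
import socket

def check_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connect timeouts, refusals, and DNS failures
        return False
```

For example, `check_tcp("monitoring.ap-south-1.amazonaws.com", 443)` run from the Lambda's subnet should return True once routing, security groups, and any VPC endpoint are configured correctly; the same check against your Neptune endpoint on 8182 and your OpenSearch endpoint on 443 isolates which leg is being blocked.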

Since the issue is intermittent, it suggests potential resource constraints or temporary network issues. You might want to increase the Lambda function's timeout and memory allocation to handle these occasional connectivity problems.
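Because the failures are transient, it can also help to wrap outbound calls in a retry with exponential backoff so that an occasional connection reset does not fail the whole batch. A minimal, generic sketch — the exception types, attempt count, and delays below are illustrative assumptions, not the stream poller's actual configuration:

```python
import time

def with_retries(fn, attempts=4, base_delay=0.5,
                 retriable=(ConnectionError, TimeoutError)):
    """Call fn(), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retriable:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

A wrapper like this only papers over truly transient faults; if the resets become frequent, the underlying network path still needs to be fixed.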
Sources
Amazon Neptune-to-OpenSearch replication - Amazon Neptune
Lambda (inside VPC) using Neptune API results in error: EndpointConnectionError: Could not connect to the endpoint URL | AWS re:Post
Cannot curl OpenSearch Service Endpoint from VPN Client Endpoint | AWS re:Post

answered 2 months ago
  • Since the issue is intermittent, the network seems fine: replication works as expected most of the time, and the error appears only occasionally. As for the Lambda settings, the timeout is already set to 15 minutes and the memory to 2048 MB. Please see the report below: REPORT RequestId: fae489a6-0cfb-41db-a864-af24f24fb713 Duration: 108.12 ms Billed Duration: 109 ms Memory Size: 2048 MB Max Memory Used: 380 MB

  • Hi - I'm checking to see if I can find out what the root cause of that error might be.


Looking at the error messages, it appears the Stream Poller is failing with a timeout while publishing metrics to CloudWatch. This can happen if:

  • The VPC endpoint for monitoring has been removed, or its configuration has changed
  • The VPC security group rules do not allow outbound traffic to the CloudWatch endpoints
  • Network ACLs are blocking the connection
  • Something else is causing the writes to time out

Please also note that the Java version of the poller was deprecated about two years ago, and the Python version is the recommended/supported one to use.

AWS
MODERATOR
answered 2 months ago
