I have set up OpenSearch Serverless collections in our two non-production accounts and have verified that our Python app, running in ECS, can call its API using opensearch-py.
Today, I tried to replicate the same setup in our production account, but every request to the collection's endpoint through opensearch-py fails with a 401 error. I have double-checked that:
- The network access policy allows VPC access to the collection's endpoint through the VPC endpoint (created in the app's VPC), and Reachability Analyzer finds a network path from a running container's ENI to the VPC endpoint
- The data access policy allows the correct role to perform all operations on the collection and all indices
One oddity I noticed: the VPC endpoint that I created is gone from the OpenSearch Serverless console's list of VPC endpoints. It is not in the AWS CLI's output either:
aws opensearchserverless list-vpc-endpoints
{
    "vpcEndpointSummaries": []
}
In our non-production accounts, the VPC endpoint is listed in the command output.
I was able to find the VPC endpoint in the production account's VPC console, so I tried re-creating it. I was able to see it in OpenSearch Serverless console right after it was created, but it's gone again.
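For anyone who wants to run the same check from Python rather than the CLI, here is a rough sketch. It assumes boto3's `opensearchserverless` client and its `list_vpc_endpoints` call; the helper name `find_vpc_endpoint` and the endpoint ID in the usage comment are placeholders of mine, not values from my setup.

```python
def find_vpc_endpoint(client, endpoint_id):
    """Return the summary dict for endpoint_id, or None if it is not listed.

    `client` is expected to behave like boto3.client("opensearchserverless").
    """
    token = None
    while True:
        # Only pass nextToken once the service has handed one back.
        kwargs = {"nextToken": token} if token else {}
        page = client.list_vpc_endpoints(**kwargs)
        for summary in page.get("vpcEndpointSummaries", []):
            if summary.get("id") == endpoint_id:
                return summary
        token = page.get("nextToken")
        if not token:
            return None


# Usage (needs boto3 and AWS credentials; the ID below is a made-up placeholder):
#   import boto3
#   client = boto3.client("opensearchserverless", region_name="us-west-2")
#   print(find_vpc_endpoint(client, "vpce-0123456789abcdef0"))
```

In my production account this kind of lookup would come back empty, while the same endpoint ID still shows up in the VPC console.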
The collection is configured to allow public access to Dashboards and I can log in to it using the link found in OpenSearch Serverless console and run queries using the dev tools.
What could be the cause of the 401 error? What can I look into further?
Python code I used to try to connect to the collection in production:
>>> import os, boto3
>>> from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
>>> host = REDACTED
>>> service, region = ('aoss', 'us-west-2')
>>> credentials = boto3.Session().get_credentials()
>>> auth = AWSV4SignerAuth(credentials, region, service)
>>> client = OpenSearch(
...     hosts=[{'host': host, 'port': 443}],
...     http_auth=auth,
...     use_ssl=True,
...     verify_certs=True,
...     connection_class=RequestsHttpConnection,
...     pool_maxsize=20,
... )
>>> q = "miller"
>>> query = {
...     'size': 5,
...     'query': {
...         'multi_match': {
...             'query': q,
...             'fields': ['title^2', 'director']
...         }
...     }
... }
>>> client.search(query)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/opensearchpy/client/utils.py", line 178, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/opensearchpy/client/__init__.py", line 1556, in search
    body=body,
  File "/usr/local/lib/python3.7/site-packages/opensearchpy/transport.py", line 408, in perform_request
    raise e
  File "/usr/local/lib/python3.7/site-packages/opensearchpy/transport.py", line 376, in perform_request
    timeout=timeout,
  File "/usr/local/lib/python3.7/site-packages/opensearchpy/connection/http_requests.py", line 222, in perform_request
    response.headers.get("Content-Type"),
  File "/usr/local/lib/python3.7/site-packages/opensearchpy/connection/base.py", line 302, in _raise_error
    status_code, error_message, additional_info
opensearchpy.exceptions.AuthenticationException: AuthenticationException(401, '')
I found that enabling public access to the collection's endpoint in the network policy allowed the Python code to connect successfully. So I believe the data access policy is configured correctly.
The VPC endpoint is still missing from the OpenSearch Serverless console. When I ran traceroute against the collection's endpoint from within an ECS container, the hostname resolved to a public address (52.x.x.x). When I tried the same in one of the non-production environments, it resolved to a private address (10.x.x.x). So traffic in production is not going through the VPC endpoint, which points to a problem with the VPC endpoint setup. I'll see if I can open a support ticket.
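The same DNS check can be done without traceroute. Below is a small sketch using only the standard library; the function names are mine, and the `resolver` parameter exists only so the lookup can be swapped out when testing. It resolves the hostname and reports whether every returned address is private, which is what I would expect when the VPC endpoint's private DNS is working.

```python
import ipaddress
import socket


def resolved_addresses(host, port=443, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses that host resolves to."""
    infos = resolver(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def uses_private_addresses(addresses):
    """True only if there is at least one address and all of them are private."""
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )


# Usage from inside the container (host is the collection endpoint, redacted here):
#   addrs = resolved_addresses(host)
#   print(addrs, uses_private_addresses(addrs))
```

A `False` result from inside the VPC would match what I saw in production: the name resolves to the public endpoint, so the request never goes through the VPC endpoint that the network policy allows.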
I was not able to open a support ticket because our account is on the Basic support plan, but I was able to fix the problem by giving my user full access to Route 53 and re-creating the VPC endpoint. I'll post this as a separate answer.