Neptune Signed IAM URL Expiry


I'm trying to work with Neptune and Spark, following the IAM authentication guides. However, I have a large dataset that I'd like to write with the Neo4j Spark connector, and the write would take longer than the 5-minute expiry of the signed URL. I have no problem adding the X-Amz-Expires tag to an HTTPS request (e.g. against the /status endpoint), but I have never had success doing the same over Bolt; below is my class that is failing. Is the expires tag supported for this, or am I forced to leave Spark and instead use the credential refreshing provided by Neo4j?

The URL I'm attempting to hit is of the format bolt://{cluster_endpoint}:8182/opencypher

# Imports/constants below are reconstructed from the AWS Neptune IAM-auth sample
# this class is based on; adjust values to your setup.
import json
from urllib.parse import urlparse
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from botocore.credentials import Credentials
from neo4j import Auth

SERVICE_NAME = "neptune-db"
HTTP_METHOD = "GET"
HTTP_METHOD_HDR = "HttpMethod"
AUTHORIZATION = "Authorization"
X_AMZ_DATE = "X-Amz-Date"
X_AMZ_SECURITY_TOKEN = "X-Amz-Security-Token"
X_AMZ_EXPIRES = "X-Amz-Expires"
HOST = "Host"
SCHEME = "basic"
DUMMY_USERNAME = "username"
REALM = "realm"

def _host_from_url(url: str) -> str:
  return urlparse(url).netloc

class NeptuneAuthToken(Auth):
  def __init__(
    self,
    credentials: Credentials,
    region: str,
    url: str,
    with_expire: bool = True,
    **parameters
  ):
    # Do NOT add "/opencypher" in the line below if you're using an engine version older than 1.2.0.0
    request = AWSRequest(method=HTTP_METHOD, url=url)
    request.headers.add_header("Host", _host_from_url(request.url))
    if with_expire:
      request.headers.add_header(X_AMZ_EXPIRES, "1200")
    # Sign with the region passed in (the original hard-coded "us-west-2",
    # silently ignoring the `region` parameter)
    sigv4 = SigV4Auth(credentials, SERVICE_NAME, region)
    sigv4.add_auth(request)

    auth_obj = {
      hdr: request.headers[hdr]
      for hdr in [AUTHORIZATION, X_AMZ_DATE, X_AMZ_SECURITY_TOKEN, HOST] + ([X_AMZ_EXPIRES] if with_expire else [])
    }
    auth_obj[HTTP_METHOD_HDR] = request.method
    creds: str = json.dumps(auth_obj)
    super().__init__(SCHEME, DUMMY_USERNAME, creds, REALM, **parameters)
MarkT
asked 14 days ago · 79 views

1 Answer

You can find the latest info on Bolt support for Neptune here: https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-opencypher-bolt.html

There's an orange box at the beginning of that doc that provides the current status on credential refresh in the Bolt drivers.

As a workaround, you may want to investigate using the Neptune Data SDK within boto3: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/neptunedata/client/execute_open_cypher_query.html
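For illustration, the call shape of that API is sketched below. The endpoint and query are hypothetical, and a stand-in mock replaces the real client so the snippet runs without a cluster; in practice you would create the client with boto3.client("neptunedata", endpoint_url="https://{cluster_endpoint}:8182"):

```python
from unittest.mock import Mock

def run_open_cypher(client, query: str):
    # execute_open_cypher_query is the neptunedata API operation;
    # it returns a dict whose "results" key holds the query results.
    response = client.execute_open_cypher_query(openCypherQuery=query)
    return response["results"]

# Stand-in for boto3.client("neptunedata", endpoint_url=...) so this runs offline
client = Mock()
client.execute_open_cypher_query.return_value = {"results": [{"n": {"~id": "1"}}]}

results = run_open_cypher(client, "MATCH (n) RETURN n LIMIT 1")
print(results)
```

Since each call is an independent signed HTTPS request, there is no long-lived connection to expire, which sidesteps the 5-minute Bolt token issue.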

AWS
answered 14 days ago
EXPERT reviewed 14 days ago
  • Though I've seen this note - it seems disingenuous to link to an issue that's closed (https://github.com/neo4j/neo4j-python-driver/issues/834) but wasn't actually resolved beyond a workaround (auth refreshes). Is there an issue open for this, or any work that might happen?

  • I know things were being worked on to address this in the Neo4j drivers. Seems 5.14.0 now supports re-authentication: https://github.com/neo4j/neo4j-python-driver/pull/890

  • That's what I was referencing re: auth refreshes, unfortunately - it's just a 'hack' that refreshes the creds on the driver side when they're about to expire. For the Spark access, the auth would need to maintain an extended validity window, e.g. through X-Amz-Expires, rather than being refreshed on an ongoing basis.

  • Not really a "hack" so much as this is how sessioned IAM credentials work. You can assume an IAM role, but the maximum length of time that the session is valid is 12 hours. After credentials expire, you'll need a way to either refresh them proactively in your app or implement a refresh/retry mechanism (typically easier). So the way they've created a "refresher" is for that purpose.

    Tangentially, are you attempting to use Spark to run analytics against data stored in Neptune? Have you investigated Neptune Analytics?

  • If we were talking 12 hours, I would agree, but the signed URLs are the issue, with them being valid for 5 minutes only, and there's no refresh mechanism that can be used on a Spark cluster. I'll look into Neptune Analytics, but ultimately this is primarily a complaint about huge dataset loading, which is really easy with Spark.
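For reference, the proactive refresh discussed in the comments above can be sketched with nothing but the standard library. The refresh callback, TTL, and margin here are placeholders; a real implementation would re-sign the Bolt token (or call STS) inside make_token:

```python
import time
from typing import Callable, Optional

class RefreshingToken:
    """Caches a token and re-creates it shortly before it expires."""

    def __init__(self, make_token: Callable[[], str], ttl_seconds: float,
                 refresh_margin: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._make_token = make_token
        self._ttl = ttl_seconds
        self._margin = refresh_margin
        self._clock = clock  # injectable for testing
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Refresh proactively, before the credential actually lapses
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token = self._make_token()
            self._expires_at = self._clock() + self._ttl
        return self._token

# Example with a fake clock: a 5-minute token refreshed 60 s before expiry
fake_now = [0.0]
counter = {"n": 0}

def make_token() -> str:
    counter["n"] += 1
    return f"token-{counter['n']}"

tok = RefreshingToken(make_token, ttl_seconds=300, refresh_margin=60,
                      clock=lambda: fake_now[0])
first = tok.get()        # creates token-1, valid until t=300
fake_now[0] = 100.0
second = tok.get()       # still within the window, token-1 reused
fake_now[0] = 250.0
third = tok.get()        # past 300 - 60 = 240, refreshed to token-2
```

This only helps on the client side, though: as noted above, it cannot extend the validity window of a single long-lived Bolt session the way an honored X-Amz-Expires would.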
