Neptune Writer Instance does not recover freeable memory after successful load


Hi, I am experimenting with loading openCypher data from S3 (2 MB of node data and about 12 MB of edge data) into a Neptune instance we have set up. I am using the %load line magic in a Neptune Notebook to perform the load. The loads succeed, but the freeable memory of our writer instance (db.t3.medium) does not recover afterwards, which eventually leads to failed loads due to out-of-memory errors.
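
For reference, the %load magic wraps the Neptune bulk loader HTTP API. Below is a minimal sketch of the equivalent direct request; the cluster endpoint, account ID, IAM role name, and region are placeholders, and it assumes IAM database authentication is disabled (otherwise the request would need SigV4 signing).

# Rough equivalent of the %load line magic: a POST to the Neptune bulk loader endpoint.
# Endpoint, IAM role ARN, and region below are placeholders, not values from this cluster.
import requests

NEPTUNE_ENDPOINT = "https://<cluster-endpoint>:8182"

loader_request = {
    "source": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
    "format": "opencypher",   # openCypher CSV format
    "iamRoleArn": "arn:aws:iam::<account-id>:role/<neptune-load-role>",
    "region": "<aws-region>",
    "failOnError": "TRUE",
    "parallelism": "MEDIUM",
}

response = requests.post(f"{NEPTUNE_ENDPOINT}/loader", json=loader_request)
print(response.json())  # the payload includes the loadId used by %load_status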

These are the out-of-memory errors I get when loading additional data using the %load line magic (output from %load_status <load-id> --details --errors):

{
  "status": "200 OK",
  "payload": {
    "feedCount": [
      {
        "LOAD_FAILED": 1
      }
    ],
    "overallStatus": {
      "fullUri": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
      "runNumber": 1,
      "retryNumber": 0,
      "status": "LOAD_FAILED",
      "totalTimeSpent": 7,
      "startTime": 1663677230,
      "totalRecords": 153958,
      "totalDuplicates": 0,
      "parsingErrors": 0,
      "datatypeMismatchErrors": 0,
      "insertErrors": 153958
    },
    "failedFeeds": [
      {
        "fullUri": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
        "runNumber": 1,
        "retryNumber": 0,
        "status": "LOAD_FAILED",
        "totalTimeSpent": 4,
        "startTime": 1663677233,
        "totalRecords": 153958,
        "totalDuplicates": 0,
        "parsingErrors": 0,
        "datatypeMismatchErrors": 0,
        "insertErrors": 153958
      }
    ],
    "errors": {
      "startIndex": 1,
      "endIndex": 5,
      "loadId": "<load-id>",
      "errorLogs": [
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        },
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        },
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        },
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        },
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        }
      ]
    }
  }
}
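
For completeness, the same status payload can also be fetched outside the notebook with a GET against the loader endpoint, which is what %load_status wraps. A minimal sketch, with the endpoint and load ID as placeholders:

# Fetch bulk load status directly from the loader API.
import requests

NEPTUNE_ENDPOINT = "https://<cluster-endpoint>:8182"
LOAD_ID = "<load-id>"

status = requests.get(
    f"{NEPTUNE_ENDPOINT}/loader/{LOAD_ID}",
    params={"details": "true", "errors": "true"},
)
print(status.json())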

This is the freeable memory metric of the Writer instance at the time I received the out-of-memory errors above:

[Image: Freeable Memory metric of the writer instance since the creation of the Neptune cluster]
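
To track this outside the console, the same FreeableMemory metric can be pulled from CloudWatch. A minimal sketch using boto3; the region and writer instance identifier are placeholders.

# Pull the writer instance's FreeableMemory metric (in bytes) for the last 24 hours.
import datetime
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="<aws-region>")
end = datetime.datetime.now(datetime.timezone.utc)
start = end - datetime.timedelta(hours=24)

metrics = cloudwatch.get_metric_statistics(
    Namespace="AWS/Neptune",
    MetricName="FreeableMemory",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "<writer-instance-id>"}],
    StartTime=start,
    EndTime=end,
    Period=300,
    Statistics=["Average"],
)
for point in sorted(metrics["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])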

After restarting the Writer instance and loading some data from S3, the same behavior starts again.

[Image: Freeable Memory metric of the writer instance after rebooting and loading some data]

  • Can you please provide further details on the exact error(s) that you're receiving on the failed bulk load jobs? If using the notebooks, you can use %load_status <load_id> --details --errors to see the full error output. Is this where you are seeing out-of-memory errors? Or are you seeing those errors when running other queries? If the latter, can you provide examples of the types of queries that you're attempting to execute when receiving out-of-memory errors? Thank you!

  • Thanks, Taylor! I have added the errors from the loads that failed when the freeable memory metric was at its lowest point yesterday. Queries on the data that was successfully loaded into Neptune always worked without issues.

  • Likely would need some further info on your account and your cluster/instance(s) to look into this further. Are you able to open a support case? If so, please do and I'll be on the lookout for that.

kutschs
asked 2 years ago · 254 views
1 Answer

Hello,

I understand that you are trying to load openCypher data from S3. The loads are successful, but the freeable memory of your writer instance does not recover after successfully loading the data, which eventually leads to failed loads due to out-of-memory errors.

Please note that the db.t3.medium instance type you are using is intended for development and test scenarios only.

  • This instance type has 4 GiB of memory and may run into out-of-memory errors, resulting in instance restarts, under production workloads. Bulk loads are highly resource-intensive and are not expected to run optimally on small Neptune instances.

It is recommended to use a larger instance, such as an r5.large, to run the bulk loads. After the bulk load is complete, the instance can be scaled back down for normal database usage. Please refer to the following documentation on performing temporary instance scaling [1][2].
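
A minimal sketch of that temporary-scaling pattern using boto3 is below; the instance identifier and region are placeholders, and changing the instance class briefly restarts the instance.

# Resize the writer to a larger class before a bulk load, then scale it back afterwards.
import boto3

neptune = boto3.client("neptune", region_name="<aws-region>")

def set_instance_class(instance_id: str, instance_class: str) -> None:
    """Change the instance class and apply the change immediately."""
    neptune.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=instance_class,
        ApplyImmediately=True,
    )

# Scale up before the bulk load, scale back down once the load has finished.
set_instance_class("<writer-instance-id>", "db.r5.large")
# ... run the bulk load and wait for it to complete ...
set_instance_class("<writer-instance-id>", "db.t3.medium")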


References:

[1] https://docs.aws.amazon.com/neptune/latest/userguide/best-practices-general-basic.html#best-practices-loader-tempinstance

[2] https://docs.aws.amazon.com/neptune/latest/userguide/best-practices-general-basic.html#best-practices-resize-instance

AWS SUPPORT ENGINEER
Yash_C
answered 2 years ago
