Big Gremlin insert query (sometimes) results in InternalFailureException

Hi,

We sometimes experience issues with big insert queries. We are running Neptune Engine 1.1.0.0 on a db.r5.large instance.

Our initial approach is to insert new data together with its master data, some of which may be new. To ensure master data is not created twice, we use coalesce steps. A resulting query contains, for example, ~8000 Gremlin steps and takes more than 20 seconds to execute on an empty database.
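
For readers unfamiliar with the pattern, a coalesce-based "create only if missing" step is commonly written as shown below. This is a generic sketch, not the query from the pastebin links: the label `masterdata`, the key `code`, and both helper functions are hypothetical names chosen for illustration. The Python code only builds the Gremlin text and does not connect to a database.

```python
def gremlin_str(value: str) -> str:
    """Quote a Python string as a single-quoted Gremlin (Groovy) literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def upsert_vertex_step(label: str, key: str, value: str) -> str:
    """Build the steps that return the matching vertex or create it.

    fold() turns "zero or one match" into a single list, so coalesce()
    always runs once: unfold() yields the existing vertex when there is
    one, otherwise the addV() branch creates it.
    """
    l, k, v = gremlin_str(label), gremlin_str(key), gremlin_str(value)
    return (
        f"V().has({l}, {k}, {v}).fold()."
        f"coalesce(unfold(), addV({l}).property({k}, {v}))"
    )


# Hypothetical label, key, and value.
query = "g." + upsert_vertex_step("masterdata", "code", "A-1")
print(query)
```

Each such check adds several steps per master-data item, which is one way a query can grow to thousands of steps.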

Our second approach was to assume that all master data already exists in the database, so no coalesce steps are needed to insert the new data. The query corresponding to our first attempt then shrinks to ~5000 Gremlin steps. This query fails with an InternalFailureException, and no descriptive reason is given.

To analyze why the query from our second approach fails, we cut it in half; that half also failed at first (InternalFailureException). Next we tried a quarter of the query, which succeeded. We then retried half of the query, which this time also succeeded. After that we were even able to run the full query. Please note that after each try we reset the database so that it contained only our master data.

We are wondering why our insert statement takes over 20 seconds on an empty database. Are we doing something wrong here?

We are also wondering why our second approach sometimes works and sometimes fails with the InternalFailureException. The behavior seems nondeterministic. What could be the reason for this? How can we fix the query so that it works every time we retry it on our reset database, which includes our master data?

You can access the queries here:

  1. Initial, full query with COALESCE steps: https://pastebin.com/d4u6edTH
  2. Query to drop all data except master data (to execute after execution of query 1): https://pastebin.com/z8iLeumM
  3. Our second approach query without COALESCE: https://pastebin.com/71WCC1iM
asked 2 years ago, 452 views
1 Answer

Hello - it would be very helpful to see an example query that is failing. In general, processing Gremlin steps uses space on the worker thread's stack, and that stack space can run out for very long queries. That may be what you are running into, but it could be something else. If you do not need "upsert" checks (using coalesce), an alternative is to inject a list of maps into the query. Each map contains all of the information for one vertex or edge. There are several ways a query can be built to take advantage of this "map injection" pattern. There is some discussion of it here. For large creates, using a map can be quite effective, and it avoids the problem of the Gremlin query itself becoming very long.
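
As a rough illustration of the map injection pattern described above, the sketch below builds a single Gremlin query in which the data travels as a list of maps and the traversal has one `addV()` plus one `property()` per key, however many rows are inserted. All names are assumptions made for this example (the `device` label, the `name` and `site` keys, and the helper functions), and the Python code only produces the query text. Submitting it to Neptune is left out.

```python
from typing import Dict, List, Union

Scalar = Union[str, int, bool]


def to_gremlin_literal(value: Scalar) -> str:
    """Render a Python scalar as a Gremlin (Groovy) literal."""
    if isinstance(value, bool):  # bool must be checked before int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_gremlin_map(row: Dict[str, Scalar]) -> str:
    """Render one row as a Groovy map literal, e.g. ['name': 'a']."""
    if not row:
        return "[:]"  # Groovy syntax for an empty map
    entries = ", ".join(
        f"{to_gremlin_literal(k)}: {to_gremlin_literal(v)}" for k, v in row.items()
    )
    return "[" + entries + "]"


def build_injection_query(label: str, keys: List[str], rows: List[Dict[str, Scalar]]) -> str:
    """Build one query that creates a vertex per row.

    Assumes every row contains every key listed in `keys`.
    """
    maps = ", ".join(to_gremlin_map(row) for row in rows)
    steps = [
        f"g.inject([{maps}]).unfold().as('row')",  # one traverser per map
        f"addV({to_gremlin_literal(label)})",
    ]
    for key in keys:
        k = to_gremlin_literal(key)
        # Read the property value from the current map rather than
        # embedding it as its own step.
        steps.append(f"property({k}, select('row').select({k}))")
    return ".\n  ".join(steps)


# Hypothetical sample data.
rows = [
    {"name": "sensor-1", "site": "A"},
    {"name": "sensor-2", "site": "B"},
]
query = build_injection_query("device", ["name", "site"], rows)
print(query)
```

The query text still grows with the amount of data, but the number of Gremlin steps stays fixed, which is the property the pattern relies on. Edges can be handled in a similar way, with each map carrying the identifiers of the two endpoints.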

We can discuss further by e-mail or PM if you are able to share a sample query, but it may also be best to open a support case so that someone can investigate the exact cause of the exception.

AWS
AWS-KRL
answered 2 years ago
  • I was not able to find a way to send you a PM. Is there any way to get in contact without posting details publicly?

  • Are you able to open a support case?

  • We contacted our service provider to create a support case. In the meantime, I also edited my question to attach the queries via pastebin.

  • Thanks for sharing the queries. I think it is quite likely that you are running into the issue discussed at the link I shared. The support team will be able to look at that once the support case arrives. If you don't need the coalesce(), the map injection pattern is most likely a very good alternative to using a large number of Gremlin steps. I have used it to create many thousands of vertices and properties in a single query.
