InternalFailureException on SPARQL Update query


I have two Neptune DBs with the same engine version (1.2.0.2), instance size (db.r5.xlarge), and parameters. The only difference I can find is their region (us-west-1 vs. us-east-1). I am submitting the same SPARQL update to both. One responds fine; on the other I repeatedly get this 500 error:

Internal Server Error (500): {"requestId":"b8c4da08-36bb-a293-33d5-ac4065c22cad","code":"InternalFailureException","detailedMessage":"An unexpected error occurred while handling RDF data."}

The SPARQL update only sets a few triples in one named graph, but one of the triples has a large JSON object as its literal value. I suspect that the length of the JSON object (~5000 lines when pretty-printed) is what's causing the error, because when I reduce it significantly (to ~1000 lines), the query succeeds on both databases. However, I'm confused about (1) why the two databases don't behave the same, and (2) why the error message contains nothing more specific than "InternalFailureException". Any help debugging would be greatly appreciated.

Here is an abbreviated version of the SPARQL update request from my CloudWatch logs (with some info redacted):

1690932548620, REDACTED_IP:37829, REDACTED_IP:8182, HTTP_POST, [unknown], [unknown], "HttpObjectAggregator$AggregatedFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: CompositeByteBuf(ridx: 0, widx: 75532, cap: 75532, components=10)) POST /sparql HTTP/1.1 accept: */* content-type: application/x-www-form-urlencoded User-Agent: node-fetch/1.0 (+https://github.com/bitinn/node-fetch) Accept-Encoding: gzip,deflate Connection: close Host: NEPTUNE_ENDPOINT_URL:8182 content-length: 75532", "update=CLEAR SILENT GRAPH <https://example.com/data/WeaviateOpenApiDescription>; INSERT {   GRAPH <https://example.com/data/WeaviateOpenApiDescription> {     <https://example.com/data/WeaviateOpenApiDescription> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://standardknowledge.com/ontologies/core/OpenApiDescription>.     <https://example.com/data/WeaviateOpenApiDescription> <https://standardknowledge.com/ontologies/core/integration> <https://example.com/integrations/Weaviate>.     <https://example.com/data/WeaviateOpenApiDescription> <https://standardknowledge.com/ontologies/core/openApiDescription> '{'openapi':'3.0.1','info':{'title':'Weaviate','description':'Cloud-native, modular vector search engine','contact':{'name':'Weaviate','url':'https://github.com/semi-technologies','email':'hello@semi.technology'},'version':'1.15.3'},'externalDocs':{'url':'https://github.com/semi-technologies/weaviate'},'servers':[{'url':'/v1'}],'security':[{},{'oidc':[]}],'tags':[{'name':'objects'},{'name':'batch','description':'These operations allow to execute batch requests for Objects and Objects. Mostly used for importing large datasets.'},{'name':'graphql'},{'name':'meta'},{'name':'schema','description':'These operations enable manipulation of the schema in Weaviate schema.'}],'paths':{'/objects':{'get':{'tags':['objects'],'summary':'Get a list of Objects.','description':'Lists all Objects in reverse order of creation, owned by the user that belongs to the used token.','operationId':'objects.list','parame...loat64'}}}},'description':'filter within a distance of a georange'}},'securitySchemes':{'oidc':{'type':'oauth2','description':'OIDC (OpenConnect ID - based on OAuth2)','flows':{'implicit':{'authorizationUrl':'http://to-be-configured-in-the-application-config','scopes':{}}}}}}}'^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#JSON>.     <https://concierge.standard.storage/data/WeaviateOpenApiDescription> <http://purl.org/dc/terms/created> ?now.     <https://concierge.standard.storage/data/WeaviateOpenApiDescription> <http://purl.org/dc/terms/modified> ?now.   } } WHERE { BIND(NOW() AS ?now) }"
  • Thanks for reporting this @adlerfaulkner. We are doing some investigating.

asked 9 months ago · 247 views
1 Answer

Your issue looks like a bug, or some limitation in the Amazon Neptune SPARQL engine that isn't handled gracefully. RDF stores commonly have limits on the size of the literals they can handle, but those limits are usually documented and produce more informative error messages.

You mentioned a few possible causes in your description:

  • Size of the JSON object: You noticed that significantly reducing the size of the JSON object made the query succeed on both databases, so the size of the literal itself may be what triggers the error.

  • Different regions: It's also possible that the difference in behavior is related to the two databases being in different regions. This would be unusual, but even with the same reported engine version, fixes can roll out to regions at different times, so the two instances may not be running identical code.

Here are a few suggestions to further debug the issue:

  • Size limit: Try gradually reducing the size of your JSON literal to see whether there is a specific threshold at which the error starts occurring. If there is, that points to a size limit; see the sketch after this list for one way to automate the search.

  • JSON complexity: It's also possible that the complexity of the JSON object (in terms of nested objects, arrays, etc.) could be causing the error. Try simplifying the JSON object to see if that resolves the issue.

  • Contact AWS Support: If the above strategies do not work, I would recommend reaching out to AWS support. They have access to more detailed logs and internal metrics and can potentially provide more insight into the issue.
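
To automate the size-limit probe from the first suggestion, here is a minimal sketch. It assumes Node 18+ (which ships a global fetch, matching your node-fetch setup); NEPTUNE_ENDPOINT and the probe graph IRI are placeholders, not values from your system. It bisects on the length of a synthetic literal to find the size at which updates start failing:

    // probe-literal-limit.ts: bisect the literal size at which the update fails.
    // Assumes Node 18+ (global fetch). NEPTUNE_ENDPOINT and the probe graph IRI
    // are placeholders, not values from the question.
    const NEPTUNE_ENDPOINT = 'https://NEPTUNE_ENDPOINT_URL:8182/sparql';
    const GRAPH = 'https://example.com/data/SizeProbe';

    async function updateSucceeds(literalLength: number): Promise<boolean> {
      // A single repeated character keeps quoting and escaping out of the
      // picture, so only the raw size of the literal is being tested.
      const literal = 'x'.repeat(literalLength);
      const update =
        `CLEAR SILENT GRAPH <${GRAPH}>; ` +
        `INSERT DATA { GRAPH <${GRAPH}> { <${GRAPH}> <https://example.com/probe> '${literal}' } }`;
      const res = await fetch(NEPTUNE_ENDPOINT, {
        method: 'POST',
        body: new URLSearchParams({ update }), // sent as application/x-www-form-urlencoded
      });
      return res.ok; // the failing database returned a 500 here
    }

    async function findLimit(lo: number, hi: number): Promise<number> {
      // Invariant: updates of size lo succeed, updates of size hi fail.
      while (hi - lo > 1) {
        const mid = Math.floor((lo + hi) / 2);
        if (await updateSucceeds(mid)) lo = mid; else hi = mid;
      }
      return lo; // the largest literal size that still succeeded
    }

    findLimit(1_000, 1_000_000).then((n) =>
      console.log(`largest literal size that succeeded: ${n} characters`));

If the probe converges on a stable threshold, a size limit is the likely culprit. If the synthetic literal succeeds at every size, the problem is more likely the content of the literal (for example the rdf:JSON datatype or the escaping of quotes inside it), which points at the complexity suggestion instead.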

Keep in mind that storing a large JSON blob as a single literal in an RDF triple is generally not best practice. If the size of the JSON object turns out to be the issue, you might want to consider a different data modeling approach: for instance, representing your JSON data as a collection of smaller triples rather than one large JSON literal, as in the sketch below.
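
As a rough sketch of that approach (reusing the subject IRI from your update, but with a hypothetical apiPath predicate that is not part of your ontology), you could emit one small JSON literal per OpenAPI path instead of one literal for the whole document:

    // split-openapi.ts: sketch of storing one small JSON literal per OpenAPI
    // path instead of the whole document in a single triple. The apiPath
    // predicate is hypothetical; the subject IRI comes from the question.
    const SUBJECT = 'https://example.com/data/WeaviateOpenApiDescription';
    const PATH_PRED = 'https://standardknowledge.com/ontologies/core/apiPath'; // hypothetical

    function escapeLiteral(s: string): string {
      // Minimal SPARQL escaping for a double-quoted literal.
      return s.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n');
    }

    function buildUpdate(openapi: { paths: Record<string, unknown> }): string {
      // One triple per path, each carrying a much smaller JSON fragment.
      const triples = Object.entries(openapi.paths).map(([path, spec]) =>
        `<${SUBJECT}> <${PATH_PRED}> "${escapeLiteral(JSON.stringify({ [path]: spec }))}" .`);
      return `INSERT DATA { GRAPH <${SUBJECT}> {\n  ${triples.join('\n  ')}\n} }`;
    }

    // Example: each path becomes its own, much smaller literal.
    console.log(buildUpdate({ paths: { '/objects': { get: { operationId: 'objects.list' } } } }));

The trade-off is that each literal stays well below whatever limit is being hit, at the cost of reassembling the full document on read.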

answered 9 months ago
