Neptune takes forever to create 5,000 nodes with openCypher


With `$data` an array of 5,000 elements, the following query takes forever to finish on Neptune, but Neo4j runs it in about 30 seconds. What is wrong with the AWS Neptune openCypher interface?

    UNWIND $data AS param
    MERGE (n:{label} {id: param.id})
    ON CREATE SET n = param
    ON MATCH SET n += param

The Python code:

    def merge_nodes(tx):
        # The label is interpolated with an f-string because openCypher
        # does not support parameterized labels; the doubled braces escape
        # the Cypher map literal inside the f-string.
        query = f"""
        UNWIND $data AS param
        MERGE (n:{label} {{id: param.id}})
        ON CREATE SET n = param
        ON MATCH SET n += param
        """
        tx.run(query, data=data)

    with self.driver.session() as session:
        return session.write_transaction(merge_nodes)
1 Answer
Accepted Answer

Any mutation query in Neptune is executed on a single thread. For a more performant way to write data to Neptune in batches, I would suggest using Neptune's bulk loader [1], or issuing concurrent/parallel write requests in smaller batches. In our testing, parallel write requests of 100 to 200 objects per request, with parallelism matching the query-execution thread count on Neptune's writer instance (2x the number of vCPUs), yielded the best write throughput.
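A minimal sketch of the parallel-batch approach described above, using the same neo4j Python driver as the question. The `chunk` and `parallel_merge` helpers and their parameters are illustrative names, not part of any API; the batch size of 200 and worker count follow the guidance in the answer:

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def merge_batch(driver, label, batch):
    # One small MERGE transaction per batch. The label is interpolated
    # because openCypher does not support parameterized labels.
    query = f"""
    UNWIND $data AS param
    MERGE (n:{label} {{id: param.id}})
    ON CREATE SET n = param
    ON MATCH SET n += param
    """
    with driver.session() as session:
        session.write_transaction(lambda tx: tx.run(query, data=batch))


def parallel_merge(driver, label, data, batch_size=200, workers=4):
    # `workers` should roughly match the writer's execution threads
    # (2x its vCPU count), per the answer above.
    batches = chunk(data, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so exceptions from workers propagate.
        list(pool.map(lambda b: merge_batch(driver, label, b), batches))
```

On newer 5.x drivers, `session.write_transaction` is named `session.execute_write`; the structure is otherwise the same.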

Both the bulk loader and parallel requests scale linearly with the size of the writer instance (more vCPUs means more query-execution threads). So if you only need to perform batch loads occasionally, you can scale up the writer instance for the load and scale it back down afterwards for steady-state write workloads.

[1] https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load.html
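For the bulk-loader route, a job is started by POSTing to the cluster's `/loader` HTTP endpoint with the location of files staged in S3. A minimal sketch with `urllib`; the endpoint, field names, and values here follow the bulk-load documentation in [1], but check that page for the full set of options, and note that `endpoint`, `s3_source`, and `iam_role_arn` are placeholders for your own values:

```python
import json
import urllib.request


def build_load_request(s3_source, iam_role_arn, region):
    """Build the JSON body for a Neptune bulk-load request."""
    return {
        "source": s3_source,             # e.g. an s3:// prefix with load files
        "format": "opencypher",          # openCypher CSV load format
        "iamRoleArn": iam_role_arn,      # role Neptune assumes to read S3
        "region": region,
        "failOnError": "FALSE",
        "parallelism": "OVERSUBSCRIBE",  # use all available load threads
    }


def start_bulk_load(endpoint, s3_source, iam_role_arn, region):
    """Kick off a bulk-load job and return the loader's JSON response."""
    body = json.dumps(build_load_request(s3_source, iam_role_arn, region))
    req = urllib.request.Request(
        f"https://{endpoint}:8182/loader",
        data=body.encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The response contains a loadId you can poll for status
        # via GET /loader/<loadId>.
        return json.load(resp)
```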

AWS
Answered 6 months ago

