Neptune takes forever to create 5000 nodes with openCypher


With `data` being an array of 5000 elements, the following script takes forever to complete, but with Neo4j I can run it within 30 seconds. What is wrong with the AWS Neptune openCypher interface?

            UNWIND $data AS param
            MERGE (n:{label} {{id: param.id}})
            ON CREATE SET n = param
            ON MATCH SET n += param

The Python code:

        def merge_nodes(tx):
            # `label` and `data` come from the enclosing scope;
            # only the label is interpolated into the query string,
            # the node properties are passed as a parameter.
            query = f"""
            UNWIND $data AS param
            MERGE (n:{label} {{id: param.id}})
            ON CREATE SET n = param
            ON MATCH SET n += param
            """
            tx.run(query, data=data)

        with self.driver.session() as session:
            return session.write_transaction(merge_nodes)
Asked 6 months ago · 285 views
1 Answer

Accepted Answer

Any mutation query in Neptune is executed on a single thread. If you are looking for a more performant way to write data to Neptune in batches, I would suggest using Neptune's bulk loader [1], or issuing concurrent/parallel write requests in smaller batches. In our testing, parallel write requests in batches of 100 to 200 objects per request, with a degree of parallelism matching the query-execution thread count on Neptune's writer instance (2x the number of vCPUs), yielded the best write throughput.

Both the bulk loader and parallel requests scale linearly with the size of the writer instance (more vCPUs means more query-execution threads). So if you only need to perform batch loads temporarily, scale up the writer instance for the load and scale it back down afterwards for steady-state write workloads.
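The parallel-batch approach described above can be sketched as follows. This is a minimal illustration, not Neptune-specific API: the batch size and worker count are the values suggested above, and `write_batch` is a hypothetical stand-in for the real driver call (shown in a comment) so the sketch stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 200    # 100-200 objects per request, per the testing above
MAX_WORKERS = 8     # assumption: ~2x the writer instance's vCPU count

def chunk(items, size):
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def write_batch(batch):
    # In real code this would run the same UNWIND ... MERGE query
    # through the driver, e.g.:
    #   with driver.session() as session:
    #       session.write_transaction(
    #           lambda tx: tx.run(query, data=batch))
    return len(batch)   # stand-in so the sketch is runnable

def parallel_merge(data):
    # Submit each batch to the pool; requests run concurrently,
    # so Neptune's writer threads are kept busy.
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return sum(pool.map(write_batch, chunk(data, BATCH_SIZE)))
```

With 5000 elements this issues 25 requests of 200 objects each, up to 8 in flight at a time.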

[1] https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load.html

AWS
Answered 6 months ago
