By using AWS re:Post, you agree to the Terms of Use

Questions tagged with Database

Sort by most recent

Browse through the questions and answers listed below or filter and sort to narrow down your results.

Neptune and Cypher - Poor Query Performance

I am wanting to use Neptune for an application with cypher as my query language. I have a pretty small dataset of around ~8500 nodes and ~8500 edges edges. I am trying to do what seem to be fairly straightforward queries, but the latency is very high (~6-8 seconds for around 1000 rows). I have tried with various instance types, enabling and disabling caches, enabling and disabling the OSGP index to no avail. I'm really at a loss as to why the query performance is so poor. Does anyone have any experience with poor query query performance using Neptune? I feel I must be doing something incorrect to have such high query latency. Here is some more detailed information on my graph structure and my query. I have a graph with 2 node types `A` and `B` and a single edge type `MAPS_TO` which always is directed from an `A` node to a `B` node. The relation `MAPS_TO` is many to many, but with the current dataset it is primarily one-to-one, i.e. the graph is mainly disconnected subgraphs of the form: ``` (A)-[MAPS_TO]-(B) ``` What I would like to do is for all A nodes to collect the distinct B nodes which they map to satisfying some conditions. I've experimented with my queries a bit and the fastest one I've been able to arrive at is: ``` MATCH (a:A) WHERE a.Owner = $owner AND a.IsPublic = true WITH a MATCH (a)-[r:MAPS_TO]->(b:B) WHERE (b)<-[:MAPS_TO {CreationReason: "origin"}]-(:A {Owner: $owner}) OR (b)<-[:MAPS_TO {CreationReason: "origin"}]-(:A {IsPublic: true}) WITH a, r, b ORDER BY a.AId SKIP 0 LIMIT 1000 RETURN a { .AId } AS A, collect(distinct b { B: {BId: b.BId, Name: b.Name, other properties on B nodes...} R: {CreationReason: r.CreationReason, other relation properties} }) ``` The above query takes ~6 seconds (looking at explain and http timing) on the `t4g.medium` instance type. I tried upping to a `r5d.2xlarge` instance type and this cut the query time in half to 3-4 seconds. However, using such a large instance type seems quite excessive for such a small amount of data. Really I am just trying to figure out why my query seems to perform so poorly. It seems to me that with the amount of data I have it should not really be possible to have a Neptune configuration with such performance. EDIT for more info: We are using the `t4g.medium` instance type with 3 reader instances and the query going to the reader instances. Again we have around ~8500 nodes split approximately equally into `A` nodes and `B` nodes. There are around ~8500 edges of a single type `MAPS_TO` all going from `A` to `B`. The output of the status endpoint for OpenCypher is: ``` {'status': 'healthy', 'startTime': 'Mon Sep 19 18:56:50 UTC 2022', 'dbEngineVersion': '1.1.1.0.R5', 'role': 'reader', 'dfeQueryEngine': 'viaQueryHint', 'gremlin': {'version': 'tinkerpop-3.5.2'}, 'sparql': {'version': 'sparql-1.1'}, 'opencypher': {'version': 'Neptune-9.0.20190305-1.0'}, 'labMode': {'ObjectIndex': 'enabled', 'ReadWriteConflictDetection': 'enabled'}, 'features': {'ResultCache': {'status': 'enabled'}, 'IAMAuthentication': 'disabled', 'Streams': 'disabled', 'AuditLog': 'enabled'}, 'settings': {'clusterQueryTimeoutInMs': '120000'}} ``` I have tried this with the `ObjectIndex` enabled and disabled and do not see much difference in performance. I have also tried the query on a larger instance type, the `r5d.2xlarge` to see if performance was improved by the result cache. The response time roughly cut in half, but that still seems to be very slow and a larger instance type then should be necessary. The only thing being run against the database currently are the above queries so I do not see how it could be a concurrency issue. We have looked at the output of explain (too long to post). It is not clear to me that there is a single place where the query is spending a large amount of time. The `DFEPipelineJoin` taking the longest makes sense to me based on the description in the documentation. What is not clear to me would be how to eliminate all the `DFEPipelineJoin`'s from the query.
2
answers
0
votes
42
views
asked 13 days ago

How to rename a database (Not the instance) in mssql

I'm just trying to rename a created database on a 2019 MS SQL running on RDS. I have a database named `mydb` that for multi-tenant purposes, I need to rename to `mydb2` as a workaround for hardcoded values in the app. However, in SSMS, when i try to rename the database, I get this error: ``` TITLE: Microsoft SQL Server Management Studio ------------------------------ Unable to rename mydb2. (ObjectExplorer) ------------------------------ ADDITIONAL INFORMATION: Rename failed for Database 'mydb'. (Microsoft.SqlServer.Smo) For help, click: https://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&ProdVer=16.100.47021.0&EvtSrc=Microsoft.SqlServer.Management.Smo.ExceptionTemplates.FailedOperationExceptionText&EvtID=Rename+Database&LinkId=20476 ------------------------------ An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo) ------------------------------ User does not have permission to alter database 'mydb', the database does not exist, or the database is not in a state that allows access checks. (Microsoft SQL Server, Error: 5011) For help, click: https://docs.microsoft.com/sql/relational-databases/errors-events/mssqlserver-5011-database-engine-error ------------------------------ BUTTONS: OK ------------------------------ ``` An using for forceful ways like changing the db to single-access, then altering the db doesn't work either because the error says my `admin` user doesn't have permissions to change that on a table. Is the background user `rdsadmin` logged in and preventing me from changing this? Could I restart the service and close any zombie connections? Maybe I make a copy of the database and give it another name? These seems like overkill but I'm not sure what permissions on RDS are preventing me.
1
answers
0
votes
18
views
asked 14 days ago

Can't Modify RDS Postgres "R6i" Stand-Alone instance to Multi-AZ, Fails

I am working on some terraform to launch RDS instances. As part of this project I need to import data from another database. The recommended setup from AWS to have a FAST `pg_restore` is to launch the RDS instance in Standalone mode for the import and then modify to Mutli-AZ there after. I have done this process before in my career and the methodology does work quite well. With this project we have chosen the R6i instance types as the match quite well with where we are migrating from. How ever I feel that we have found a bug in the AWS provisioning system. If you launch an RDS Postgres `db.r6i.24xlarge` instance type in the Stand-Alone Configuration and the modify it to be Multi-AZ you get the following error in the Web Console RDS - Log & Events Tab. ``` Instance type db.r6i.24xlarge in availability zone us-east-1c is unsupported; putting database instance into available ``` We know that to be false since the Main DB launched just fine. I have tested using the AWS Web Console instead of the Terraform code and I can reproduce this. I have also found via the Web Console that you can get a Multi-AZ DB, but only if you start with that configuration. So the steps to reproduce: 1. Launch a, RDS Postgres `db.r6i.24xlarge` instance in Stand-Alone. 2. After Completion; Modify the instance to be Multi-AZ. The process with take a little over 30 min, but then you will get the above error. My question: Is this a Bug?, and how can we get past it? Update: I also Attempted the same tests in the Default VPC. In the Default both Starting as a Mutli-AZ AND Starting as a Stand-Alone and Modify failed. :-(
1
answers
0
votes
31
views
asked 14 days ago