Amazon Redshift concurrency scaling - how long does scaling take, and can a threshold be set to trigger it?
Hi Team, I have an existing Redshift cluster where I want to enable concurrency scaling. I have a few questions:
1. My cluster, with 2 on-demand ra3.4xlarge nodes, has been running since March 2021. The AWS docs mention that a running Redshift cluster accrues 1 hour of free concurrency scaling credit every 24 hours, and that these credits never expire. Does that mean my cluster has already accrued roughly 18 months * 30 credit usage hours, given that concurrency scaling was never enabled for this cluster?
2. When does the concurrency scaling feature kick in? Is it only when queries start getting queued? Can we define some kind of threshold, such as CPU % utilization or memory % utilization, that would automatically start concurrency scaling?
3. How much time does it take for the cluster to complete the scaling process and start serving queries?
Thanks!
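For reference, concurrency scaling is enabled per WLM queue rather than by a CPU or memory threshold, and it kicks in when queries begin queuing on a queue whose `concurrency_scaling` mode is `auto`. A minimal sketch of turning it on with boto3, assuming a hypothetical parameter group named `my-params` attached to the cluster (queue layout and names are placeholders, not a definitive configuration):

```python
import json
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")

# Hypothetical WLM configuration: one user queue with concurrency scaling
# set to "auto", plus the short query accelerator queue.
wlm_config = [
    {"query_group": [], "user_group": [], "query_concurrency": 5,
     "concurrency_scaling": "auto"},
    {"short_query_queue": True},
]

redshift.modify_cluster_parameter_group(
    ParameterGroupName="my-params",  # assumed name; use the group attached to your cluster
    Parameters=[{
        "ParameterName": "wlm_json_configuration",
        "ParameterValue": json.dumps(wlm_config),
    }],
)
```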
Migrating partitioned table from postgres to Redshift with pglogical
I've created a DMS task doing CDC and full load, migrating data from Postgres 14 to Redshift. According to the documentation, when using pglogical and creating a Postgres publication with the 'publish_via_partition_root' parameter on my partitioned table, changes should be published via the parent table rather than the individual child tables. However, the data is still migrated to the child tables in Redshift and not to the parent table. Am I missing something that needs to be configured, or is it just not possible with DMS?
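For reference, a publication of the kind the question describes would typically be created on the source like the sketch below. All names and connection details are placeholders, and this assumes a standard Postgres client library (psycopg2) is available; it only illustrates the `publish_via_partition_root` setting, not DMS behavior.

```python
import psycopg2  # assumes the psycopg2 Postgres driver is installed

# Placeholder connection details and object names, for illustration only.
conn = psycopg2.connect(host="source-host", dbname="mydb", user="dms_user", password="...")
conn.autocommit = True

with conn.cursor() as cur:
    # Publish changes under the partitioned parent's identity, so downstream
    # consumers see one logical table instead of each child partition.
    cur.execute("""
        CREATE PUBLICATION my_publication
        FOR TABLE my_partitioned_parent
        WITH (publish_via_partition_root = true);
    """)
```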
Redshift Serverless timeout when connecting with Python
I tried to use redshift_connector to connect to my Redshift Serverless workgroup from Python, but I get a timeout error.

```python
import redshift_connector

conn = redshift_connector.connect(
    host="default.XXXXXXXXX.us-east-1.redshift-serverless.amazonaws.com",
    database='dev',
    access_key_id="XXXXXXXXX",
    secret_access_key="XXXXXXX",
    port=5439,
    region="us-east-1"
)
```

Result:

```
redshift_connector.error.InterfaceError: ('communication error', TimeoutError(10060,
```

1. I am using my default workgroup, which uses my default VPC with open inbound rules for the IP and port.
2. I enabled public access to the workgroup.

Wasted 3 hours on this and finally used Google BigQuery...
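One way to sidestep the VPC networking path entirely is the Redshift Data API, which runs over HTTPS rather than port 5439. A minimal sketch with boto3, assuming the serverless workgroup is named `default` and the database is `dev` as in the question:

```python
import time
import boto3

data_api = boto3.client("redshift-data", region_name="us-east-1")

# Submit a query against the serverless workgroup; no VPC or port-5439
# connectivity is required from the client.
resp = data_api.execute_statement(
    WorkgroupName="default",   # workgroup name from the question
    Database="dev",
    Sql="SELECT current_user, current_date;",
)

# The Data API is asynchronous: poll until the statement finishes, then fetch rows.
while True:
    desc = data_api.describe_statement(Id=resp["Id"])
    if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(1)

if desc["Status"] == "FINISHED":
    print(data_api.get_statement_result(Id=resp["Id"])["Records"])
else:
    print(desc.get("Error"))
```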
AWS Schema Conversion Tool connection error with Redshift as target database
I have selected Oracle DB as my source database and AWS Redshift as the target in the Schema Conversion Tool. The connection to the source is successful, but while connecting to Redshift I get an error. Where can I find the server name in the Redshift service dashboard?
Server port: 5439
Database: dev
Connection name: Redshift
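For reference, the server name SCT expects is the cluster's endpoint address, which is shown on the cluster detail page and can also be read programmatically. A short sketch with boto3, assuming a provisioned cluster whose identifier you know (the identifier below is a placeholder):

```python
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")

# "my-cluster" is a placeholder identifier; use your own cluster's ID.
cluster = redshift.describe_clusters(ClusterIdentifier="my-cluster")["Clusters"][0]

endpoint = cluster["Endpoint"]
# Prints something like: my-cluster.xxxx.us-east-1.redshift.amazonaws.com 5439
print(endpoint["Address"], endpoint["Port"])
```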
Problem with Federated Query to RDS: Assert code 1000
We are connecting Redshift to RDS using federated queries. When we try to query very simple tables like Month (id (int4) / name (text)) or Practices (text, text, text) from Postgres, we get errors like:

```
ERROR: -----------------------------------------------
error: Assert code: 1000
context: reltuples >= 0.0 - Number of rows cannot be negative
query: 0
location: pgclient.cpp:288
process: padbmaster [pid=14019]
-----------------------------------------------
[ErrorId: 1-630676b5-594e911339e1d0341291f074]
```

One piece of possibly useful information: we enabled enable_case_sensitive_identifier = true because the table names on RDS are PascalCase. The tables are small, so I don't know whether the query optimization engine is causing these errors. Any information would help us. Thanks in advance.
[Python UDF] Failed to import library after CREATE LIBRARY
I want to write a Python UDF that uses `scikit-learn`. Here's the command I'm running:

```sql
CREATE OR REPLACE LIBRARY scikit_learn LANGUAGE plpythonu FROM '...'
```

I've uploaded the Python 2.7 package as a `.zip` file to http://file.io. Afterwards, when I try to run a function that uses `sklearn`, I get: `ImportError: No module named sklearn.covariance. Please look at svl_udf_log for more information`. Why does the import fail even though the package was downloaded?
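As context (an assumption about the failure mode, not a confirmed diagnosis): CREATE LIBRARY only installs pure-Python modules, and scikit-learn ships compiled C extensions, so it generally cannot be imported inside a Redshift Python UDF; file.io links also expire after a single download, so the cluster may have fetched nothing. A sketch of the workflow that does work for a pure-Python package, uploading the zip to S3 and registering it from Python (bucket, key, library, connection details, and the IAM role ARN are all placeholders):

```python
import boto3
import redshift_connector

# Upload a pure-Python package zip to S3 (placeholder bucket/key).
boto3.client("s3").upload_file(
    "my_pure_python_pkg.zip", "my-bucket", "udf-libs/my_pure_python_pkg.zip"
)

conn = redshift_connector.connect(host="...", database="dev", user="...", password="...")
conn.autocommit = True
with conn.cursor() as cur:
    # Register the library from S3; the cluster's IAM role needs read access to the object.
    cur.execute("""
        CREATE OR REPLACE LIBRARY my_pure_python_pkg
        LANGUAGE plpythonu
        FROM 's3://my-bucket/udf-libs/my_pure_python_pkg.zip'
        IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role';
    """)
```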
Complex lag window function help needed
I'm racking my brain about how to do this and I don't think it's possible, but I'm not sure what to even search for. Below is a subset of the data (using example values). I am trying to use the lag function to populate the flag column.

```
Status transition                       Flag
Connected -> Exiting                    Maillink
Exiting -> Not equal to Connected       OnCorp  (all rows stay OnCorp until we see Connected)
Connected -> Reconnecting               Mailink (it should stay Mailink for all rows until we see "Exiting")
```

Is it possible in SQL? Using the lag function in a CASE statement to compare the previous state is causing an error.

[Sample Data]: https://i.stack.imgur.com/tdTOm.png

SQL code (but erroring out):

```sql
WITH base AS (
    SELECT
        o.computername,
        o.currentuser,
        o.datetime,
        message,
        CASE
            WHEN LEN(split_part(substring(split_part(message, 'STATE', 2), 13), ',', 1)) <> 0
                THEN split_part(substring(split_part(message, 'STATE', 2), 13), ',', 1)
            WHEN (message LIKE 'WARNING%' OR message LIKE 'OpenVPN%') THEN 'Start'
            WHEN message = 'SIGTERM[hard,] received, process exiting' THEN 'Exit1'
            WHEN message = 'Closing TUN/TAP interface' THEN 'Exit2'
            ELSE 'NO Status'
        END AS State1,
        CASE
            WHEN State1 = 'Start' THEN 1
            WHEN State1 = 'RESOLVE' THEN 2
            WHEN State1 = 'WAIT' THEN 3
            WHEN State1 = 'AUTH' THEN 4
            WHEN State1 = 'GET_CONFIG' THEN 5
            WHEN State1 = 'ASSIGN_IP' THEN 6
            WHEN State1 = 'ADD_ROUTES' THEN 7
            WHEN State1 = 'CONNECTED' THEN 8
            WHEN State1 = 'EXITING' THEN 9
        END AS orderofoperation
        --row_number() over (partition by o.computername, o.currentuser, DATE(o.datetime) order by o.computername, o.currentuser, o.datetime) as rownumber
    FROM maillink_openvpn_logs_ext_schema.open_vpn_filtered o
    WHERE o.message != 'message'
      AND currentuser = 'wuellie'
      --AND State1 IN ('EXITING', 'OpenVPN', 'RESOLVE', 'WAIT', 'AUTH', 'ASSIGN_IP', 'GET_CONFIG', 'ADD_ROUTES')
    ORDER BY o.datetime, orderofoperation
)
SELECT
    final.*,
    CASE
        WHEN final.previous_record_state IS NULL THEN 'oncorp'
        WHEN final.currentstate = 'CONNECTED' THEN 'maillink'
        WHEN final.previous_record_state IS NOT NULL
             AND final.previous_record_state NOT IN ('CONNECTED')
            THEN lag(flag) OVER (PARTITION BY computername, currentuser, nextstateddatetime::DATE ORDER BY nextstateddatetime)
        WHEN previous_record_state IN ('EXITING')
             AND lag(flag) OVER (PARTITION BY computername, currentuser, nextstateddatetime::DATE ORDER BY nextstateddatetime) = 'maillink'
            THEN 'oncorp'
        ELSE lag(flag) OVER (PARTITION BY computername, currentuser, nextstateddatetime::DATE ORDER BY nextstateddatetime)
    END AS flag
FROM (
    SELECT
        b.computername,
        b.currentuser,
        b.State1 AS currentstate,
        b.datetime,
        lag(State1) OVER (PARTITION BY b.computername, b.currentuser, (b.datetime)::DATE
                          ORDER BY b.computername, b.currentuser, (b.datetime)::TIMESTAMP, orderofoperation) AS previous_record_state
        --lag(datetime) over (partition by b.computername, b.currentuser, b.datetime::DATE order by b.computername, b.currentuser, (b.datetime)::TIMESTAMP, orderofoperation) as nextdatetime
    FROM base b
    ORDER BY b.datetime, orderofoperation
) final
```

Expected output:

```
Computername  username  State        datetime              onCorp/mailink
xyx           ads       start        2022-07-06T06:45:51   ONCORP
xyx           ads       RESOLVE      2022-07-06T07:06:45   ONCORP
xyx           ads       WAIT         2022-07-06T07:06:45   ONCORP
xyx           ads       AUTH         2022-07-06T07:07:00   ONCORP
xyx           ads       GET_CONFIG   2022-07-06T07:07:00   ONCORP
xyx           ads       ADD_ROUTES   2022-07-06T07:07:01   ONCORP
xyx           ads       CONNECTED    2022-07-06T07:07:01   MAILINK
xyx           ads       EXITING      2022-07-06T07:07:01   MAILINK
xyx           ads       RESOLVE      2022-07-06T07:07:46   ONCORP
xyx           ads       WAIT         2022-07-06T07:07:46   ONCORP
xyx           ads       AUTH         2022-07-06T07:07:50   ONCORP
xyx           ads       GET_CONFIG   2022-07-06T07:07:51   ONCORP
xyx           ads       ADD_ROUTES   2022-07-06T07:07:51   ONCORP
xyx           ads       CONNECTED    2022-07-06T07:07:52   MAILINK
xyx           ads       RECONNECTED  2022-07-06T07:08:01   MAILINK
xyx           ads       WAIT         2022-07-06T07:08:02   MAILINK
xyx           ads       AUTH         2022-07-06T07:08:09   MAILINK
xyx           ads       RECONNECTED  2022-07-06T07:08:10   MAILINK
xyx           ads       CONNECTED    2022-07-06T07:08:15   MAILINK
xyx           ads       RECONNECTED  2022-07-06T07:08:20   MAILINK
xyx           ads       CONNECTED    2022-07-06T07:09:01   MAILINK
xyx           ads       EXITING      2022-07-06T07:10:50   MAILINK
xyx           ads       START        2022-07-06T07:11:50   ONCORP
```
Spectrum on Redshift Serverless with enhanced VPC routing
For Amazon Redshift provisioned clusters, traffic between Spectrum and Amazon S3 does not use enhanced VPC routing even if enhanced VPC routing is enabled on the cluster. Does Redshift Serverless use enhanced VPC routing if it is enabled on the Workgroup?  https://docs.aws.amazon.com/redshift/latest/mgmt/spectrum-enhanced-vpc.html
Incremental copy of RDS Aurora MySQL table to Redshift
Hi! To copy, and incrementally copy, data from Aurora MySQL 5.7 to Redshift, can anyone share implementation steps? I couldn't do it with Data Pipeline, Glue, or DMS due to certificate issues on the DB. 1. Scheduled jobs that do S3 exports of initial/changed data. This is the least expensive option, since you can decide which tables to export data from and how often; you could run it every 10 minutes or so. AWS Batch can be used with Lambda to accomplish this. Are there any references for achieving this? Any other, better optimised ways? Thanks!
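A minimal sketch of the scheduled export-and-load pattern described above, suitable for a Lambda or Batch job. Everything here is an assumption for illustration: the table, watermark column, bucket, role ARN, and cluster details are placeholders, and the merge step into the target table is omitted.

```python
import csv
import io
import boto3
import pymysql  # assumes the MySQL client library is bundled with the job

S3_BUCKET = "my-export-bucket"                                       # placeholder
S3_KEY = "exports/orders/latest.csv"                                 # placeholder
IAM_ROLE = "arn:aws:iam::123456789012:role/my-redshift-copy-role"    # placeholder

def export_and_copy(last_watermark):
    # 1) Pull rows changed since the last run from Aurora MySQL (placeholder table/columns).
    src = pymysql.connect(host="aurora-endpoint", user="app_user",
                          password="...", database="appdb")
    with src.cursor() as cur:
        cur.execute(
            "SELECT id, amount, updated_at FROM orders WHERE updated_at > %s",
            (last_watermark,),
        )
        rows = cur.fetchall()

    # 2) Write the changed rows to S3 as CSV.
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    boto3.client("s3").put_object(Bucket=S3_BUCKET, Key=S3_KEY, Body=buf.getvalue())

    # 3) COPY into a Redshift staging table via the Data API; merge into the
    #    target table afterwards as needed.
    boto3.client("redshift-data").execute_statement(
        ClusterIdentifier="my-cluster", Database="dev", DbUser="awsuser",  # placeholders
        Sql=f"COPY staging_orders FROM 's3://{S3_BUCKET}/{S3_KEY}' IAM_ROLE '{IAM_ROLE}' CSV;",
    )
```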
Glue job - how to read from JDBC Redshift tables in parallel?
In the AWS document * https://docs.aws.amazon.com/glue/latest/dg/run-jdbc-parallel-read-job.html about reading from JDBC tables in parallel, it says the parallel-read properties are ignored for Redshift. So how can I read Redshift JDBC tables in parallel in a Glue job? Does anyone know?
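One possible workaround (an assumption, not documented Glue behavior for Redshift) is to drop down to Spark's own JDBC reader inside the Glue job, which supports `partitionColumn`/`lowerBound`/`upperBound`/`numPartitions`. This assumes a suitable Redshift JDBC driver is available to the job; the endpoint, credentials, table, and partition bounds below are placeholders.

```python
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Placeholder endpoint, credentials, and table; the numeric partitioning
# column and its bounds are assumptions about the schema.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:redshift://my-cluster.xxxx.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("dbtable", "public.big_table")
    .option("user", "awsuser")
    .option("password", "...")
    .option("driver", "com.amazon.redshift.jdbc42.Driver")
    .option("partitionColumn", "id")   # numeric column to split on
    .option("lowerBound", "1")
    .option("upperBound", "100000000")
    .option("numPartitions", "8")      # 8 parallel JDBC connections
    .load()
)
print(df.rdd.getNumPartitions())
```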
How do I solve a "Failed" message on a Redshift Serverless workgroup?
My workgroup has shown a status of "Failed" for a few days. I can still access the data in the query editor, I can successfully create a new workgroup, and the namespace associated with the workgroup shows as Available. There are no controls for restarting or recreating the workgroup, and no way I can see to get out of the failed state. I think a related issue is that the workgroup no longer has a VPC security group or subnets associated with it, which seems to have happened around the time the workgroup changed from publicly accessible back to private (I didn't change this). I can't add a VPC security group or subnets, because attempting to change those settings shows an error banner saying "ValidationException: Cannot update multiple configurations at the same time for workgroup default", despite my having completed only the minimum required fields.
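The validation error suggests applying one configuration change per request. A sketch of doing that with boto3 against the `redshift-serverless` API is below; it assumes the API will accept updates while the workgroup is in a Failed state (which it may not, in which case this likely needs AWS Support), and the security group and subnet IDs are placeholders.

```python
import boto3

rs_serverless = boto3.client("redshift-serverless", region_name="us-east-1")

# Inspect the workgroup's current state first.
wg = rs_serverless.get_workgroup(workgroupName="default")["workgroup"]
print(wg["status"], wg.get("subnetIds"), wg.get("securityGroupIds"))

# Apply the security group and the subnets in separate calls, since the
# update API rejects multiple configuration changes in one request.
rs_serverless.update_workgroup(
    workgroupName="default",
    securityGroupIds=["sg-0123456789abcdef0"],   # placeholder
)
rs_serverless.update_workgroup(
    workgroupName="default",
    subnetIds=["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"],  # placeholders
)
```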
DMS 3.4.7 fails on some table uploads to Redshift: access denied to KMS key
Hello, since the upgrade of DMS from 3.4.6 to 3.4.7, some of our migration tasks are failing, but only on some specific tables. We migrate tables from Aurora MySQL, MySQL RDS, and Postgres RDS databases to a Redshift cluster, all in the same VPC. Since the upgrade, some tables fail to migrate while others succeed; looking at the logs, it seems DMS is denied access to a KMS key used to write to S3 before the Redshift load. Looking at permissions, DMS can indeed use the key for encrypting data to S3, and it has been able to upload files for other tables. The failing tables appear in multiple replication tasks running against different types of databases, while the successful tables do end up in Redshift. We found nothing unusual about the failing tables; they don't use any other KMS key, and nothing particularly relevant shows up in CloudTrail. Reverting to 3.4.6 solved the issue. Is there something in this update related to how DMS handles keys, or to writing tables to the S3 bucket for Redshift targets? Thanks a lot, best regards.