Questions tagged with Amazon Redshift
Redshift gives me an error for the following query:
```
select table_schema, table_name, LISTAGG(column_name, ', ')
  within group (order by ordinal_position asc)
from information_schema.columns
where table_name = 'abcde'
  and table_schema = 'xyz'
group by 1, 2
```
I tried to create mytable:
```
insert into mytable
select table_schema, table_name, ordinal_position as colpos,
       column_name as ColumnName
from information_schema.columns
where table_name = 'abcde'
  and table_schema = 'xyz'
group by 1, 2
```
This gives me the error:
Function "has_column_privilege(oid,smallint,text)" not supported.
Function "has_column_privilege(oid,smallint,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.
I want to achieve the output below, which will later be used in my stored procedure:
```
table_schema | tablename | distkey | sortkey   | columns
xyz          | abcde     | col1    | col2,col3 | col1,col2,col3,col4,col5,col6,col7
```
I also tried:
```
select schema_name as databasename, table_name as tablename, ordinal_position as colpos, column_name
from pg_catalog.svv_all_columns
where database_name = 'prod123' and schema_name = 'xyz' and table_name = 'abcde'
order by 1, 2, 3, 4
```
and I get the error:
Function "has_column_privilege(oid,smallint,text)" not supported.
Function "has_column_privilege(oid,smallint,text)" not supported.
Failed to get redshift columns from *******
Thanks
KN
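The errors appear to come from privilege functions that only run on the leader node, which information_schema views call under the hood. A possible workaround, just a sketch and not guaranteed to avoid the limitation in every case: stage the column metadata from SVV_REDSHIFT_COLUMNS into a regular table and run LISTAGG against the staged copy (the staging table name below is hypothetical). The per-table distkey and sortkey can usually be read separately from SVV_TABLE_INFO (diststyle, sortkey1) or pg_table_def.
```
-- Sketch: stage the catalog metadata, then LISTAGG over the staged copy.
create temp table col_meta as
select schema_name as table_schema,
       table_name,
       column_name,
       ordinal_position
from svv_redshift_columns
where schema_name = 'xyz'
  and table_name = 'abcde';

select table_schema,
       table_name,
       listagg(column_name, ',') within group (order by ordinal_position) as columns
from col_meta
group by 1, 2;
```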
I have a lookup table that contains the database name and table name with indicators. I want to loop through each table, first checking for duplicates; if any are found, I will have to do some cleanup.
Can someone help me with the stored procedure?
```
CREATE OR REPLACE PROCEDURE record_example()
LANGUAGE plpgsql
AS $$
DECLARE
    rec RECORD;
BEGIN
    FOR rec IN SELECT db_nm, tbl_nm FROM Tbl_Lookup WHERE dupcheck_ind <> 'Y'
    LOOP
        -- placeholders for the per-table duplicate check and cleanup statements
        DML1;
        DML2;
        DML3;
    END LOOP;
END;
$$;
```
Thanks
KN
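Since the table names are only known inside the loop, the per-table statements have to be issued as dynamic SQL. Below is a minimal sketch under that assumption: Tbl_Lookup, db_nm, tbl_nm, and dupcheck_ind come from the question, while the procedure name, the duplicate-check query, and the dedup_key column are hypothetical placeholders for your own cleanup logic.
```
CREATE OR REPLACE PROCEDURE dedupe_from_lookup()
LANGUAGE plpgsql
AS $$
DECLARE
    rec       RECORD;
    dup_count BIGINT;
BEGIN
    FOR rec IN SELECT db_nm, tbl_nm FROM Tbl_Lookup WHERE dupcheck_ind <> 'Y'
    LOOP
        -- Hypothetical duplicate check: count keys that appear more than once.
        EXECUTE 'SELECT COUNT(*) FROM (SELECT dedup_key FROM ' || quote_ident(rec.tbl_nm)
             || ' GROUP BY dedup_key HAVING COUNT(*) > 1) d'
           INTO dup_count;

        IF dup_count > 0 THEN
            -- Replace this with the real cleanup DML (DML1/DML2/DML3 from the question).
            RAISE INFO 'Table % has % duplicated keys', rec.tbl_nm, dup_count;
        END IF;

        -- Mark the table as checked so it is skipped on the next run.
        UPDATE Tbl_Lookup SET dupcheck_ind = 'Y' WHERE tbl_nm = rec.tbl_nm;
    END LOOP;
END;
$$;
```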
Is it possible to disable auto vacuum on a Redshift cluster?
I have a Step Function that uses the ExecuteStatement integration from the Redshift Data API. I want to execute a SQL statement that includes named parameters, which I want to pass in to the API using JSONPath.
According to the API reference, the Parameters should look like this:
```json
{
  "Parameters": [
    {
      "name": "string",
      "value": "string"
    }
  ]
}
```
However, when I try to pass parameters in this way, the Step Function refuses to start the task:
```
..."Parameters":[{"name":"foo","value":"bar"},{"name":"baz","value":"bat"}] ...could not be used to start the Task: [The field "name" is not supported by Step Functions]
```
Is there any way to pass parameters in without using the field "name"?
In Redshift, I'm trying to update a table using another table from another database. The error details:
```
SQL Error [XX000]: ERROR: Assert
  Detail:
  -----------------------------------------------
  error:     Assert
  code:      1000
  context:   scan->m_src_id == table_id -
  query:     17277564
  location:  xen_execute.cpp:5251
  process:   padbmaster [pid=30866]
```
The context is not helpful. I have used a similar join-based approach for other tables, and the update statement works fine there.
Update syntax used:
```
UPDATE ods.schema.tablename
SET
    "TimeStamp" = GETDATE(),
    "col" = S."col"
FROM ods.schema.tablename T
INNER JOIN stg.schema.tablename S
    ON T.Col = S.Col;
```
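For comparison, the more common Redshift pattern aliases the update target directly and joins the staging table in the WHERE clause instead of repeating the target in FROM; I can't say for certain this avoids the assert, but it removes the extra self-join. A sketch using the same placeholder names from the question:
```
-- Sketch: join the staging table directly to the update target.
UPDATE ods.schema.tablename AS T
SET "TimeStamp" = GETDATE(),
    "col" = S."col"
FROM stg.schema.tablename AS S
WHERE T.Col = S.Col;
```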
Is there a Boto3 Python script available that gives the date and time a table in Amazon Redshift was last written to (INSERT, UPDATE, and DELETE)? I just need the date and time, not the content that was written.
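As far as I know there is no single API call for this, so a Boto3 script would typically just run a query against the system logs through the Data API (execute_statement / get_statement_result on the redshift-data client). The interesting part is the SQL; a sketch, assuming the STL_INSERT and STL_DELETE logs are acceptable (they only retain a few days of history, and UPDATEs show up as a delete plus an insert), with mytable as a placeholder table name:
```
-- Sketch: most recent write time for a table from the STL logs (short retention).
SELECT ti."schema",
       ti."table",
       MAX(w.endtime) AS last_written
FROM (
    SELECT tbl, endtime FROM stl_insert
    UNION ALL
    SELECT tbl, endtime FROM stl_delete
) w
JOIN svv_table_info ti ON ti.table_id = w.tbl
WHERE ti."table" = 'mytable'   -- placeholder table name
GROUP BY 1, 2;
```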
We have been able to connect to a Microsoft SQL Server DB using both Glue's DynamicFrame and Spark's own JDBC write option, thanks to the Glue connection option. However, considering the transactional nature of the data, we wanted to switch to sending data programmatically using a Python library, allowing us to move from a Glue ETL Job to a Python Shell Job. Our initial option was pyodbc; however, we were unsuccessful due to our inability to integrate the required driver with Glue. Another option we looked at was pymssql. Our experience connecting to Microsoft SQL Server using pymssql was seamless, but it was restricted to Python 3.6 and we were unable to import it with Python 3.9. Our requirement for Python 3.9 comes from boto3 version compatibility, as we are also trying to access the boto3 redshift-data client in the same script. Having considered all of the above, is it possible to query Microsoft SQL Server using a Python library with Python 3.9 in an AWS Glue Python Shell Job?
Is concurrency scaling supported with Redshift Serverless? There seems to be no documentation around this for AWS Redshift Serverless.
If I have configured the base RPU, will it be changed automatically based on the workload?
What happens when I am at the max RPU and there is more workload or more users accessing the data warehouse?
With provisioned clusters, the maximum storage limit on RA3 (considering the maximum number of nodes) is around 16 PB.
What is the maximum limit on the amount of data that can be stored with AWS Redshift Serverless?
I am trying the code below using the SUPER data type, but it is not parsing it. Please help me with this.
```
SELECT id, val, val."SS"
FROM temp_json c, c.d AS val
WHERE id = '0010a191-bf42-11ed-9178-ddc4f20ae2d7';
```
Table script:
```
CREATE TABLE reporting_owner.temp_json (
    id character varying(200) ENCODE lzo,
    d super
)
DISTSTYLE AUTO;
```
Output attached:

Example data:
```
[{"SS":[{"data":[{"label":"checksum","value":"2553"},{"label":"numSettings","value":"10"}],"metadata":{"type":"SS"}}]},{"SHV":[{"data":[{"label":"rawPayload","value":"544d555f544553545630332c32315f30345f32325f3031"}]}]},{"TT":[{"data":[{"label":"trailerId","value":"0"},{"label":"highTempThreshold","value":"115"},{"label":"highPressureThreshold","value":"97"},{"label":"lowPressureThreshold","value":"48"}],"metadata":{"type":"TT"}}]},{"BS1":[{"data":[{"label":"frontRight","value":"58","units":"%","type":"Brake Pad Thickness Remaining"},{"label":"rearLeft","value":"92","units":"%","type":"Brake Pad Thickness Remaining"},{"label":"frontLeft","value":"15","units":"%","type":"Brake Pad Thickness Remaining"},{"label":"rearRight","value":"100","units":"%","type":"Brake Pad Thickness Remaining"}],"metadata":{"label":"BS1_0","index":"0","type":"BMS"}}]},{"BC1":[{"data":[{"label":"parkingBrakeStatus","value":"UNLOCK"},{"label":"frontLeftServiceBrakePressure","value":"69.00","units":"kPa","type":"Pressure"},{"label":"frontLeftParkingBrakePressure","value":"103.00","units":"kPa","type":"Pressure"},{"label":"frontRightServiceBrakePressure","value":"345.00","units":"kPa","type":"Pressure"},{"label":"frontRightParkingBrakePressure","value":"379.00","units":"kPa","type":"Pressure"},{"label":"rearLeftServiceBrakePressure","value":"552.00","units":"kPa","type":"Pressure"},{"label":"rearLeftParkingBrakePressure","value":"586.00","units":"kPa","type":"Pressure"},{"label":"rearRightServiceBrakePressure","value":"896.00","units":"kPa","type":"Pressure"},{"label":"rearRightParkingBrakePressure","value":"931.00","units":"kPa","type":"Pressure"}],"metadata":{"label":"BC1_0","index":"0","type":"BMS"}}]},{"SHS1":[{"data":[{"label":"rawPayload","value":"0001c607"}]}]},{"SHN":[{"data":[{"label":"sensorHubS/N","value":"6.14336.10496"}],"metadata":{"type":"SHN"}}]}]
```
I am trying to use AWS Glue Studio to build a simple ETL workflow. Basically, I have a bunch of `csv` files in different directories in S3.
I want those csvs to be accessible via a database and have chosen Redshift for the job. The directories will be updated every day with new csv files. The file structure is:
```
YYYY-MM-DD (e.g. 2023-03-07)
|---- groupName1
|       |---- groupName1.csv
|---- groupName2
|       |---- groupName2.csv
...
|---- groupNameN
|       |---- groupNameN.csv
```
We will be keeping historical data, so every day I will have a new date-based directory.
I've read that AWS Glue can automatically copy data on a schedule, but I can't see my Redshift databases or tables (screenshot below). I'm using my AWS admin account and I do have the `AWSGlueConsoleFullAccess` permission (screenshot below).
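Separate from the Glue Studio visibility issue, the daily load itself typically ends up as a Redshift COPY from the dated S3 prefix, which Glue (or any scheduler) can issue once the connection works. A sketch, where the target table, bucket, and IAM role are hypothetical placeholders:
```
-- Sketch: load one day's CSV files for one group into a Redshift table.
COPY analytics.group_data
FROM 's3://my-bucket/2023-03-07/groupName1/'
IAM_ROLE 'arn:aws:iam::111122223333:role/my-redshift-copy-role'
FORMAT AS CSV
IGNOREHEADER 1;
```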


Querying from Redshift to serverless Aurora cross-account, I am facing a "timeout expired" error. Debug driver logs:
```
com.amazon.redshift.core.v3.QueryExecutorImpl.receiveErrorResponse: <=BE ErrorMessage(ERROR: timeout expired
  Location: File: ../src/external_catalog/external_catalog_api.cpp, Routine: localize_external_table, Line: 1267
  Server SQLState: D;600)
```
The cross-account setup has been verified; anonymized details below:
```
nc -vz auroradnsname.rds.amazonaws.com 5432
Connection to auroradnsname.rds.amazonaws.com 5432 port [tcp/postgres] succeeded!
```
P.S.: Instead of VPC peering, we are using AWS Transit Gateway.
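The failing routine (localize_external_table) runs on the Redshift side, so a timeout there usually suggests the Redshift cluster's own subnets and security groups cannot reach the Aurora endpoint on 5432 over the transit gateway path, even when a manual nc test succeeds from wherever it was run. For reference, a sketch of the federated external schema definition involved; the schema name, database, IAM role, and secret ARN are hypothetical placeholders:
```
-- Sketch: federated external schema over the Aurora PostgreSQL endpoint.
CREATE EXTERNAL SCHEMA aurora_fed
FROM POSTGRES
DATABASE 'sourcedb'
SCHEMA 'public'
URI 'auroradnsname.rds.amazonaws.com'
PORT 5432
IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-federated-role'
SECRET_ARN 'arn:aws:secretsmanager:us-east-1:111122223333:secret:aurora-creds';
```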