Questions tagged with Amazon Redshift
There is a text and I need to find the word SUBSTR( in it.
How can I achieve this?
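A minimal sketch of one way to do this, assuming the text lives in a column src_text of a table my_table (both placeholder names): Redshift's CHARINDEX returns the 1-based position of a substring, or 0 when it is not found.
```
-- Sketch: find rows whose text contains the literal string 'SUBSTR('
-- (my_table and src_text are placeholder names).
SELECT src_text,
       CHARINDEX('SUBSTR(', src_text) AS first_occurrence
FROM my_table
WHERE CHARINDEX('SUBSTR(', src_text) > 0;
```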
I have cluster A and cluster B. Cluster A has an external schema called 'landing_external' that contains many tables from our glue data catalog. Cluster A also has a local schema that is comprised of views that leverage data from 'landing_external' - this schema is called 'landing'.
Cluster A has a datashare that Cluster B is the consumer of. The 'landing' schema is shared with Cluster B; however, any time a user attempts to select data from any of the views in the 'landing' schema, they receive the error `ERROR: permission denied for schema landing_external`. I thought that creating all of the views with the option 'WITH NO SCHEMA BINDING' would address this permission gap, but it does not.
Any ideas on what I am missing?
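For views shared through a datashare, the producer normally also has to add the schemas of the underlying base objects to the share, roughly as sketched below ('landing_share' is a placeholder name); whether the same applies when that schema is an external (Glue Data Catalog) schema is worth confirming against the data sharing documentation.
```
-- Producer (cluster A) side, sketch only: also expose the schema that the
-- shared views reference ('landing_share' is a placeholder datashare name).
ALTER DATASHARE landing_share ADD SCHEMA landing_external;
```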
We are currently writing a number of queries in QuickSight, using SPICE to store the results. This is fine for one-off QuickSight reports, but we would need to write new queries and store the results in new SPICE datasets whenever we change parameters, such as the date range (i.e. changing from 1-week to 1-month reporting periods). What is best practice for what to store in SPICE: querying and storing the data from single queries, or joining all the data from the underlying tables and storing everything in SPICE (so that we can change query parameters in the future)?
We are using Redshift + QuickSight
If a customer paid for a 3-year upfront RI for Amazon Redshift, does the full payment count towards a 1-year EDP, or is it prorated across the 3 years?
Redshift gives me an error with this query:
```
select table_schema, table_name, LISTAGG(column_name, ', ')
  within group (order by ordinal_position asc)
from information_schema.columns
where table_name = 'abcde'
  and table_schema = 'xyz'
group by 1, 2
```
I tried to create mytable and populate it:
```
insert into mytable
select table_schema, table_name, ordinal_position as colpos,
       column_name as ColumnName
from information_schema.columns
where table_name = 'abcde'
  and table_schema = 'xyz'
group by 1, 2
```
This gives me the error:
```
Function "has_column_privilege(oid,smallint,text)" not supported.
Function "has_column_privilege(oid,smallint,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.
Function "has_table_privilege(oid,text)" not supported.
```
I want to achieve the following output, which will later be used in my stored procedure:
```
table_schema | tablename | distkey | sortkey   | columns
xyz          | abcde     | col1    | col2,col3 | col1,col2,col3,col4,col5,col6,col7
```
I also tried:
```
select schema_name as databasename, table_name as tablename, ordinal_position as colpos, column_name
from pg_catalog.svv_all_columns
where database_name = 'prod123' and schema_name = 'xyz' and table_name = 'abcde'
order by 1, 2, 3, 4
```
and get the error:
```
Function "has_column_privilege(oid,smallint,text)" not supported.
Function "has_column_privilege(oid,smallint,text)" not supported.
Failed to get redshift columns from *******
```
thanks
KN
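One hedged starting point: the pg_table_def catalog view exposes the column list together with the distkey and sortkey flags without touching the unsupported privilege functions, although it only lists schemas on the search_path and, as a leader-node catalog view, may still refuse to combine with LISTAGG or INSERT ... SELECT.
```
-- Sketch: columns plus dist/sort key flags for one table via pg_table_def.
-- The schema must be on the search_path for pg_table_def to show it.
SET search_path TO xyz;

SELECT tablename,
       "column",
       distkey,   -- true for the distribution key column
       sortkey    -- > 0 gives the column's position in the sort key
FROM pg_table_def
WHERE schemaname = 'xyz'
  AND tablename  = 'abcde';
```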
I have a lookup table that contains database name and table name with indicator columns. I want to loop through each table, first check for duplicates, and if any are found, do some clean-up.
Can someone help me with the stored procedure (SP)?
```
CREATE OR REPLACE PROCEDURE record_example()
LANGUAGE plpgsql
AS $$
DECLARE
    rec RECORD;
BEGIN
    FOR rec IN select db_nm, tbl_nm from Tbl_Lookup where dupcheck_ind <> 'Y'
    LOOP
        DML1;
        DML2;
        DML3;
    END LOOP;
END;
$$;
```
Thanks
KN
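A minimal sketch of the loop, assuming Tbl_Lookup has the columns db_nm, tbl_nm and dupcheck_ind, that the listed tables live in the current database, that each has an id column to dedupe on, and that dynamic EXECUTE ... INTO is available; the real clean-up DML (DML1/DML2/DML3) would go where the RAISE is.
```
CREATE OR REPLACE PROCEDURE check_dups_example()
LANGUAGE plpgsql
AS $$
DECLARE
    rec     RECORD;
    dup_cnt BIGINT;
BEGIN
    FOR rec IN SELECT db_nm, tbl_nm FROM Tbl_Lookup WHERE dupcheck_ind <> 'Y'
    LOOP
        -- Count duplicated ids in the current table via dynamic SQL
        -- (assumes every listed table has an "id" column).
        EXECUTE 'SELECT COUNT(*) - COUNT(DISTINCT id) FROM '
                || quote_ident(rec.tbl_nm)
           INTO dup_cnt;

        IF dup_cnt > 0 THEN
            RAISE INFO 'Table % has % duplicate ids', rec.tbl_nm, dup_cnt;
            -- clean-up DML (DML1 / DML2 / DML3) would go here
        END IF;
        -- Optionally set dupcheck_ind = 'Y' for this row of Tbl_Lookup here.
    END LOOP;
END;
$$;
```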
Is it possible to disable auto vacuum on a Redshift cluster?
I have a Step Function that uses the ExecuteStatement integration from the Redshift Data API. I want to execute a SQL statement that includes named parameters, which I want to pass in to the API using JSONPath.
According to the API reference, the Parameters should look like this:
```json
{
  "Parameters": [
    {
      "name": "string",
      "value": "string"
    }
  ]
}
```
However, when I try to pass parameters in this way, the Step Function refuses to start the task:
```
..."Parameters":[{"name":"foo","value":"bar"},{"name":"baz","value":"bat"}] ...could not be used to start the Task: [The field "name" is not supported by Step Functions]
```
Is there any way to pass parameters in without using the field "name"?
In Redshift, I'm trying to update a table using another table from another database. The error details:
```
SQL Error [XX000]: ERROR: Assert
Detail:
-----------------------------------------------
error: Assert
code: 1000
context: scan->m_src_id == table_id -
query: 17277564
location: xen_execute.cpp:5251
process: padbmaster [pid=30866]
```
The context is not helpful. I have used a similar join-based approach for other tables, and the update statement has been working fine there.
Update syntax used:
```
UPDATE ods.schema.tablename
SET
"TimeStamp" = GETDATE(),
"col" = S."col",
FROM ods.schema.tablename T
INNER JOIN stg.schema.tablename S
ON T.Col = S.Col;
```
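One hedged workaround that sometimes sidesteps cross-database quirks is to stage the source rows into a temp table first and update from that; the object and column names below are just the placeholders from the question.
```
-- Sketch only: materialize the cross-database source locally, then update.
CREATE TEMP TABLE stg_copy AS
SELECT * FROM stg.schema.tablename;

UPDATE ods.schema.tablename AS T
SET "TimeStamp" = GETDATE(),
    "col"       = S."col"
FROM stg_copy AS S
WHERE T.Col = S.Col;
```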
Is there a Boto3 Python script available that gives the date and time when a table in Amazon Redshift was last written to (INSERT, UPDATE, or DELETE)? I just need the date and time, not the content that was written.
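As far as I know there is no dedicated Boto3 call for this; the usual approach is to read it from the STL system logs (which keep only a few days of history) with a query along the lines below, executed for example through the redshift-data ExecuteStatement API. 'myschema' and 'mytable' are placeholders.
```
-- Sketch: approximate last write (insert/update/delete) time for one table.
-- Updates show up as a delete plus an insert, so both logs are checked.
SELECT MAX(w.endtime) AS last_write_time
FROM (
    SELECT tbl, endtime FROM stl_insert
    UNION ALL
    SELECT tbl, endtime FROM stl_delete
) AS w
JOIN svv_table_info AS t ON t.table_id = w.tbl
WHERE t."schema" = 'myschema'
  AND t."table"  = 'mytable';
```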
We have been able to connect to a Microsoft SQL Server DB using both Glue's DynamicFrame and Spark's own JDBC write option, thanks to the Glue connection option. However, considering the transactional nature of the data, we wanted to switch to sending data programmatically using a Python library, allowing us to move from a Glue ETL job to a Python Shell job. Our initial option was pyodbc; however, we were unsuccessful due to our inability to integrate the required driver with Glue. Another option we looked at was pymssql. Our experience connecting to Microsoft SQL Server using pymssql was seamless; however, it was restricted to Python 3.6 and we were unable to import it with Python 3.9. Our requirement for Python 3.9 comes from boto3 version compatibility with that Python version, as we are also trying to access the boto3 redshift-data client in the same script. Having considered all of the above, is it possible to query Microsoft SQL Server using a Python library with Python 3.9 in an AWS Glue Python Shell job?
Is concurrency scaling supported with Redshift Serverless? There seems to be no documentation around this for AWS Redshift Serverless.
If I have configured the base RPU, will it be changed automatically based on the workload?
What happens when I am at the max RPU and there is more workload or a greater number of users accessing the data warehouse?