How to set a shorter lock_wait_time on Redshift


We use the scheduler in the Redshift query editor v2 to run a stored procedure (SP) every 5 minutes. The SP typically completes in 1-2 minutes. Occasionally it takes more than 5 minutes for various reasons (CPU or I/O spikes, etc.), which by itself is not a problem at all. The problem is that the scheduler invokes the same SP every 5 minutes, so the second invocation blocks waiting for the first to complete and release its lock. This causes a cascading lock-wait problem, with lock_wait_time increasing for each additional invocation. For an SP that typically completes in 1-2 minutes, we've seen it spend hours in lock_wait_time (e.g., 8013092421 microseconds = 2.2 hours). We'd prefer the second invocation to simply throw an exception instead of waiting that long for the lock to be released.

  1. Is there a way to set a lock wait timeout value (something like 1 or 2 minutes)?
  2. What are other solutions folks have tried successfully? (We considered a couple of options. One is implementing a re-entrance check in the SP to prevent re-entrance, but that introduces other complexities, such as an aborted SP not cleaning up after itself. A simpler option is checking a system table to make sure the same SP isn't already running concurrently, but we wonder how reliable that would be.)
Yi
asked 3 months ago · 448 views
4 Answers

Hello Yi. Thank you for your question. You can take an alternative approach to handle this conflict.

It's essential to implement a dependency check when scheduling a process at regular intervals. You can set up an audit table with one entry per process (or per table) to track the status of the latest run of your stored procedure. In that audit table you can maintain columns like:

  1. Process_Name
  2. Process_Start_Time
  3. Process_End_Time
  4. Process_State

For a given stored procedure, set Process_State to “Ready” before it runs for the first time.

As a first step inside the stored procedure, you can check the **Process_State** of the latest run from the audit table and implement logic within the stored procedure to check the following:

  1. If Process_State is “Running”, end the procedure successfully; this indicates that the previous iteration has not completed yet, so the next iteration is not ready to start.
  2. If Process_State is “Ready”, start the load process; this indicates that the previous iteration completed successfully and the next iteration is ready to start. The next step is then to update the audit table, changing Process_State to “Running”. Optionally, you can record the start and end time of the process for easier debugging (a minimal sketch follows below).
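A minimal sketch of this pattern, assuming a hypothetical schema ops, audit table process_audit, and procedure my_sp (all names are illustrative only):

-- Hypothetical audit table; schema and names are illustrative
CREATE TABLE IF NOT EXISTS ops.process_audit (
    process_name       VARCHAR(128),
    process_start_time TIMESTAMP,
    process_end_time   TIMESTAMP,
    process_state      VARCHAR(16)    -- 'Ready' or 'Running'
);

-- Seed one row per process before the first scheduled run
INSERT INTO ops.process_audit VALUES ('my_sp', NULL, NULL, 'Ready');

CREATE OR REPLACE PROCEDURE ops.my_sp()
AS $$
DECLARE
    v_state VARCHAR(16);
BEGIN
    -- Check the state recorded by the latest run
    SELECT process_state INTO v_state
    FROM ops.process_audit
    WHERE process_name = 'my_sp';

    IF v_state = 'Running' THEN
        RETURN;  -- previous iteration still in flight; exit successfully
    END IF;

    -- Mark this run as in progress
    UPDATE ops.process_audit
    SET process_state = 'Running', process_start_time = GETDATE()
    WHERE process_name = 'my_sp';

    -- ... load logic goes here ...

    -- Mark this run as complete so the next invocation can proceed
    UPDATE ops.process_audit
    SET process_state = 'Ready', process_end_time = GETDATE()
    WHERE process_name = 'my_sp';
END;
$$ LANGUAGE plpgsql;

Note that there is still a small window between the check and the update, and, as the question points out, an aborted run can leave the state stuck at “Running”, so you still need a reset path (for example, treating a very old Process_Start_Time as stale).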

Also consider referring to this blog for a more sophisticated implementation using Step Functions, Lambda, DynamoDB, and the Redshift Data API: https://aws.amazon.com/blogs/big-data/accelerate-orchestration-of-an-elt-process-using-aws-step-functions-and-amazon-redshift-data-api/

AWS
EXPERT
answered 3 months ago

Hello Yi. The query you shared should be able to detect an existing lock and will serve your purpose, except during the very first run of the stored procedure when the target table is empty: SVV_TABLE_INFO only shows a record for a given table after at least one row has been loaded into it. Note, however, that SVV objects are complex views, and they can be slow to return data when the system is busy or when you have a large number of objects. You can consider using PG_LOCKS, a much lighter-weight system table, keyed on your transaction ID (see the sketch below). While system objects can serve your short-term purpose, the recommendation is to implement solid auditing controls either inside Redshift (e.g., a call-audit stored procedure) or outside it (e.g., DynamoDB) if you have strict ingestion SLAs or data-freshness requirements.
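For example, a lighter-weight check might look like the following sketch (it assumes the table name is unique across schemas; join pg_namespace if it is not):

-- Sketch: count granted write locks on the target table using pg_locks
SELECT COUNT(1) AS lock_count
FROM pg_locks l
JOIN pg_class c ON l.relation = c.oid
WHERE c.relname = 'email_send_metadata'
  AND l.mode IN ('ShareRowExclusiveLock', 'ExclusiveLock')
  AND l.granted = true;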

AWS
EXPERT
answered 3 months ago

Thanks Poulomi!

My understanding is that once Redshift acquires a lock, it holds the lock for the entire transaction, even if the lock is only needed at the beginning of the transaction. Given that, we know exactly which table lock to check for our SP. We went with a simple solution that checks whether the lock is already held (see below); if it is, the SP errors out without proceeding:

WITH LockedTables AS (
    -- Transactions in database 'app' currently holding write locks
    SELECT relation AS table_id
    FROM SVV_TRANSACTIONS
    WHERE txn_db = 'app'
      AND lock_mode IN ('ShareRowExclusiveLock', 'ExclusiveLock')
      AND granted = true
      AND lockable_object_type = 'relation'
)
SELECT COUNT(1) AS lock_count
FROM SVV_TABLE_INFO
INNER JOIN LockedTables USING (table_id)
WHERE database = 'app' AND "schema" = 'ingestion' AND "table" = 'email_send_metadata';
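A rough sketch of how we embed this check so the SP errors out early (the procedure name and error message are simplified for illustration):

CREATE OR REPLACE PROCEDURE ingestion.load_email_send_metadata()
AS $$
DECLARE
    v_lock_count INT;
BEGIN
    -- Same lock check as above, captured into a variable
    SELECT COUNT(1) INTO v_lock_count
    FROM SVV_TABLE_INFO
    INNER JOIN (
        SELECT relation AS table_id
        FROM SVV_TRANSACTIONS
        WHERE txn_db = 'app'
          AND lock_mode IN ('ShareRowExclusiveLock', 'ExclusiveLock')
          AND granted = true
          AND lockable_object_type = 'relation'
    ) LockedTables USING (table_id)
    WHERE database = 'app' AND "schema" = 'ingestion' AND "table" = 'email_send_metadata';

    IF v_lock_count > 0 THEN
        RAISE EXCEPTION 'email_send_metadata is locked by a prior run; aborting.';
    END IF;

    -- ... actual load logic goes here ...
END;
$$ LANGUAGE plpgsql;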

Any drawbacks you can think of for the above approach?

Yi
answered 3 months ago

Hello,

Thank you for posting your question.

Given your use case and the way Redshift works, you cannot avoid locking on tables. You would have to schedule the SP in a way that avoids concurrent invocations. Please review the document below on preventing lock conflicts and limiting the statements that cause them:

https://repost.aws/knowledge-center/prevent-locks-blocking-queries-redshift

Here is a query to identify sessions that are holding locks:

SELECT a.txn_owner,
       a.txn_db,
       a.xid,
       a.pid,
       a.txn_start,
       a.lock_mode,
       a.relation AS table_id,
       NVL(TRIM(c."name"), d.relname) AS tablename,
       a.granted,
       b.pid AS blocking_pid,
       DATEDIFF(s, a.txn_start, GETDATE()) / 86400 || ' days ' ||
       DATEDIFF(s, a.txn_start, GETDATE()) % 86400 / 3600 || ' hrs ' ||
       DATEDIFF(s, a.txn_start, GETDATE()) % 3600 / 60 || ' mins ' ||
       DATEDIFF(s, a.txn_start, GETDATE()) % 60 || ' secs' AS txn_duration
FROM svv_transactions a
LEFT JOIN (SELECT pid, relation, granted FROM pg_locks GROUP BY 1, 2, 3) b
       ON a.relation = b.relation AND a.granted = 'f' AND b.granted = 't'
LEFT JOIN (SELECT * FROM stv_tbl_perm WHERE slice = 0) c
       ON a.relation = c.id
LEFT JOIN pg_class d
       ON a.relation = d.oid
WHERE a.relation IS NOT NULL;
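If a blocking session must be cleared manually, the blocking_pid value from this query can be passed to PG_TERMINATE_BACKEND:

-- Terminate a blocking session by its process ID (12345 is a placeholder)
SELECT PG_TERMINATE_BACKEND(12345);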

Please comment if you have any further concerns. Thanks.

AWS
answered 3 months ago
