- Newest
- Most votes
- Most comments
I guess, the most probable cause is a edge case with Aurora Parallel Query (PQ) on reader instances. Under high concurrency, the coordination between the DB engine and the Storage Layer can occasionally fail to reconstruct the read-view correctly for a parallel scan, resulting in an empty result set instead of an error.
Try disable Parallel Query at the session level within your Lambda function to test this hypothesis:
SET aurora_parallel_query = OFF;
If the intermittent "0 records" issue disappears, you have identified the root cause. This is a much more likely culprit than transaction isolation levels, especially since your data isn't changing during execution.
Based on your description, this intermittent issue with parallel queries returning 0 records on Aurora MySQL reader instances is likely related to transaction isolation levels and how parallel query execution interacts with reader instances.
Key Issue: Isolation Level on Reader Instances
Aurora MySQL reader instances have specific isolation level requirements for parallel query execution. By default, reader instances use the REPEATABLE READ isolation level. However, parallel query on reader instances can also work with READ COMMITTED isolation level (available in Aurora MySQL version 2.09 and higher, which includes your version 3.08.0).
The critical point is that READ COMMITTED isolation level behaves differently on reader instances compared to writer instances - it's less strict on readers. This difference in isolation behavior could explain why you're seeing inconsistent results during parallel execution on readers while the writer returns consistent results.
Potential Causes:
-
Long-running transactions: If there are long-running transactions on your reader instance, they can prevent parallel query execution for subsequent queries, potentially causing inconsistent behavior.
-
Temporary table issues: Your query uses DISTINCT, multiple LEFT JOINs, and LIMIT with OFFSET, which can generate temporary tables during execution. Under parallel load, there have been known issues with temporary table handling that can cause unexpected behavior, including database restarts or incorrect query results in certain scenarios.
-
Parallel query execution plan instability: There have been documented cases where Aurora MySQL reader instances can experience issues while executing queries that utilize parallel query execution plans, particularly under concurrent load.
Recommended Solutions:
-
Set explicit isolation level: Try setting the isolation level explicitly at the session level in your Lambda functions when connecting to the reader. You might want to test with READ COMMITTED isolation level using the
aurora_read_replica_read_committedconfiguration option. -
Optimize your query: Your query has several characteristics that could be problematic:
- The LIMIT with large OFFSET (60000) is inefficient and can cause issues
- Multiple LEFT JOINs with COALESCE operations increase complexity
- Consider using appropriate indexes for the JOIN conditions and WHERE clauses
- The DISTINCT operation combined with ORDER BY and LIMIT can generate large temporary tables
-
Consider disabling parallel query temporarily: Test whether disabling parallel query for this specific query resolves the issue. This would help confirm if parallel query execution is the root cause.
-
Monitor for specific conditions: Check if the issue correlates with:
- Large UPDATE or DDL operations running on the writer instance
- High concurrent query load on the reader
- Buffer pool state and table statistics
- Query rewrite: Consider rewriting the query to avoid the large OFFSET. Use a keyset pagination approach instead (WHERE id < last_seen_id ORDER BY id DESC LIMIT 30000), which is more efficient and less prone to consistency issues.
The fact that retrying resolves the issue suggests a timing-related problem with how the reader instance handles parallel execution under load, possibly related to temporary table materialization or isolation level handling during concurrent access.
Sources
SQL constructs for parallel query in Aurora MySQL - Amazon Aurora
Troubleshoot table errors and storage full issues in Aurora My-SQL Compatible databases | AWS re:Post
Aurora MySQL database engine updates 2023-04-17 (version 3.02.3) (Deprecated) - Amazon Aurora
Relevant content
- asked 2 years ago
- asked 2 years ago
