Athena ICEBERG_CURSOR_ERROR on valid Security Lake parquet files — engine v3 vs. compactor output regression?

Has anyone else seen ICEBERG_CURSOR_ERROR: Failed to read Parquet file from Athena engine v3 when querying Apache Iceberg tables managed by Amazon Security Lake (the OCSF v2.0 source tables in amazon_security_lake_glue_db_<region>.amazon_security_lake_table_<region>_<source>_2_0), even though the offending files read cleanly via pyarrow?

Looking for: AWS confirmation of a known regression, or peer customers seeing the same symptom, or a customer-side workaround that doesn't require Lake Formation DDL access on Security-Lake-managed tables.

Setup

  • Region: us-east-1
  • Athena workgroup engine: v3
  • Security Lake delegated admin account ingesting all 7 OCSF v2.0 AWS sources: sh_findings, vpc_flow, s3_data, route53, waf, lambda_execution, cloud_trail_mgmt
  • Tables are Iceberg (format-version 2), Lake-Formation-managed — we cannot ALTER TABLE, OPTIMIZE, or VACUUM them
  • Dashboards on Athena have been working fine for months; failures started recently and are spreading

Symptom

Athena queries that read the parquet body (column projection, GROUP BY on data columns, UNNEST) over a window that includes one or more affected files fail with ICEBERG_CURSOR_ERROR. Manifest-only queries succeed because they don't open parquet bodies.

Reproduction shape:

-- Manifest-only — SUCCEEDS (0 bytes scanned)
SELECT COUNT(*) FROM <sh_findings_table>;

-- Body read, narrow window — SUCCEEDS today (affected files happen to be older than 7 days)
SELECT severity, COUNT(*)
FROM <sh_findings_table>
WHERE region = 'us-east-1'
  AND time_dt > current_timestamp - INTERVAL '7' DAY
GROUP BY severity;

-- Body read, wider window — FAILS
SELECT severity, COUNT(*)
FROM <sh_findings_table>
WHERE region = 'us-east-1'
  AND time_dt > current_timestamp - INTERVAL '30' DAY
GROUP BY severity;
-- ICEBERG_CURSOR_ERROR: Failed to read Parquet file:
-- s3://<security-lake-bucket>/aws/SH_FINDINGS/2.0/region=us-east-1/accountId=<account>/eventDay=20260424/<file>.gz.parquet

The pattern is consistent across all 7 OCSF v2.0 sources, across multiple eventDay partitions, and across multiple producer accounts in our org. Athena reports only the first failing file per query and aborts, so the actual count of affected files is likely larger than what error-message enumeration finds.
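
Because each failed query names only one file, it helps to record every error message and pull the file location out of it. The following is a minimal sketch, assuming the error text has the same layout as the sample above. The function name and the bucket, account, and file names in the usage line are hypothetical placeholders.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Matches the first s3:// URI ending in ".parquet" inside an error message.
S3_PARQUET = re.compile(r"s3://\S+?\.parquet")


def parse_failing_file(error_message: str) -> Optional[dict]:
    """Extract the S3 URI and its key=value path segments from an Athena error."""
    match = S3_PARQUET.search(error_message)
    if match is None:
        return None
    uri = match.group(0)
    parsed = urlparse(uri)
    # Path segments such as "region=us-east-1" become dictionary entries.
    partitions = dict(
        segment.split("=", 1)
        for segment in parsed.path.split("/")
        if "=" in segment
    )
    return {
        "uri": uri,
        "bucket": parsed.netloc,
        "key": parsed.path.lstrip("/"),
        "partitions": partitions,
    }


# Hypothetical message; bucket, account, and file name are placeholders.
sample = (
    "ICEBERG_CURSOR_ERROR: Failed to read Parquet file: "
    "s3://example-bucket/aws/SH_FINDINGS/2.0/region=us-east-1/"
    "accountId=111122223333/eventDay=20260424/part-0001.gz.parquet"
)
info = parse_failing_file(sample)
```

Collecting the `partitions` values across many failed queries gives a lower bound on the affected eventDay partitions, not a complete list.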

What I've validated

The parquet files themselves are fine. Downloaded a representative sample of offending files from S3 and read them with pyarrow on Ubuntu 24.04 LTS. They open in milliseconds with valid OCSF v2.0 schema and row counts that match the asl_rows value in S3 object metadata. Schema metadata on every file shows writer.model.name = 'avro' and a populated parquet.avro.schema key, indicating Security Lake's Avro→Parquet compactor path.
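
The check above can be sketched roughly as follows. This is an illustration, not the exact script I ran: the function names are hypothetical, the file is assumed to be already downloaded locally, and the expected row count is assumed to have been read separately from the asl_rows value in the S3 object metadata.

```python
from typing import Optional


def read_file_facts(path: str) -> dict:
    """Fully read one downloaded Parquet file with pyarrow and collect its facts."""
    import pyarrow.parquet as pq  # imported lazily; only needed for real files

    table = pq.read_table(path)  # decodes every column, not just the footer
    kv = pq.read_metadata(path).metadata or {}
    # pyarrow returns the key-value metadata as bytes -> bytes.
    decoded = {k.decode(): v.decode(errors="replace") for k, v in kv.items()}
    return {"num_rows": table.num_rows, "kv": decoded}


def check_file(facts: dict, expected_rows: Optional[int]) -> list:
    """Return a list of discrepancies; an empty list means nothing looked wrong."""
    problems = []
    if expected_rows is not None and facts["num_rows"] != expected_rows:
        problems.append(f"row count {facts['num_rows']} != asl_rows {expected_rows}")
    if facts["kv"].get("writer.model.name") != "avro":
        problems.append("writer.model.name is not 'avro'")
    if "parquet.avro.schema" not in facts["kv"]:
        problems.append("parquet.avro.schema key is missing")
    return problems
```

For a real file, `check_file(read_file_facts("local-copy.gz.parquet"), expected_rows)` would be the call, where the local file name is a placeholder.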

The Iceberg metadata is fine. Downloaded the current metadata.json for one of the affected tables:

  • format-version: 2
  • 1 partition spec, 4 fields — asl_version (identity), region (identity), accountid (identity), time_dt_day (day-transform on time_dt)
  • Properties: amazon_securitylake_append_strategy = "append", history.expire.max-snapshot-age-ms = 21600000 (6h), write.metadata.delete-after-commit.enabled = true
  • Manifests reference the affected parquet files coherently
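
A minimal sketch of how those fields can be pulled from a downloaded metadata.json using only the standard library. The key names follow the Iceberg table metadata layout; the function name is hypothetical, and the sample document is a trimmed illustration built from the values listed above, not the real file.

```python
import json


def summarize_metadata(text: str) -> dict:
    """Summarize format version, default partition spec, and snapshot retention."""
    meta = json.loads(text)
    spec_id = meta.get("default-spec-id", 0)
    specs = meta.get("partition-specs", [])
    spec = next((s for s in specs if s.get("spec-id") == spec_id), {"fields": []})
    props = meta.get("properties", {})
    max_age_ms = props.get("history.expire.max-snapshot-age-ms")
    return {
        "format_version": meta.get("format-version"),
        "spec_count": len(specs),
        "partition_fields": [(f["name"], f["transform"]) for f in spec["fields"]],
        # Property values are stored as strings, so convert before dividing.
        "snapshot_max_age_hours": int(max_age_ms) / 3_600_000 if max_age_ms else None,
    }


# Trimmed, illustrative document; real files carry many more keys.
sample = json.dumps({
    "format-version": 2,
    "default-spec-id": 0,
    "partition-specs": [{
        "spec-id": 0,
        "fields": [
            {"name": "asl_version", "transform": "identity"},
            {"name": "region", "transform": "identity"},
            {"name": "accountid", "transform": "identity"},
            {"name": "time_dt_day", "transform": "day"},
        ],
    }],
    "properties": {"history.expire.max-snapshot-age-ms": "21600000"},
})
summary = summarize_metadata(sample)
```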

The compactor is actively producing affected files. S3 LastModified on offending parquet objects clusters in the last 1–12 hours and re-touches old eventDay partitions. Lifecycle rule is SecurityLake_Generated_Rule_<source>_2.0 with ServerSideEncryption AES256, confirming Security-Lake-managed writes (not user uploads).

What I've ruled out

  • Parquet corruption — pyarrow proves the bytes are valid
  • Iceberg metadata corruption — metadata.json downloads, parses, and lists the affected files coherently
  • Schema drift on the consumer side — we don't write to these tables and our queries match the published OCSF v2.0 schema
  • Single-day blip — affects 7 sources, multiple accounts, multiple eventDays, growing daily
  • Engine version misalignment — engine v3 is the only option in our workgroup
  • Tunable read settings — Lake Formation blocks ALTER TABLE ... SET TBLPROPERTIES ('read.parquet.vectorization.enabled' = 'false') on Security Lake tables

What I've tried for workaround

  • OPTIMIZE table REWRITE DATA USING BIN_PACK — blocked by Lake Formation
  • Snapshot time travel FOR TIMESTAMP AS OF — 6h snapshot retention doesn't reach a pre-failure state
  • Direct S3 rewrite via pyarrow — risky for compliance, and the compactor would overwrite on the next run
  • Per-query time_dt-range exclusion of bad partitions — works briefly, brittle as new affected files keep appearing
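
For reference, the per-query exclusion can be generated rather than hand-edited. This is a rough sketch with a hypothetical function name. It assumes the eventDay value in the S3 path corresponds to the same UTC calendar day of time_dt, which I have not verified for every file.

```python
from datetime import datetime, timedelta


def exclusion_predicate(bad_event_days, column="time_dt"):
    """Build a SQL predicate that skips whole days known to contain bad files."""
    clauses = []
    for day in sorted(set(bad_event_days)):
        start = datetime.strptime(day, "%Y%m%d")
        end = start + timedelta(days=1)
        clauses.append(
            f"NOT ({column} >= TIMESTAMP '{start:%Y-%m-%d} 00:00:00' "
            f"AND {column} < TIMESTAMP '{end:%Y-%m-%d} 00:00:00')"
        )
    # With nothing to exclude, return a predicate that filters no rows.
    return " AND ".join(clauses) if clauses else "TRUE"


predicate = exclusion_predicate(["20260424"])
```

The result is appended to the existing WHERE clause with AND. It drops the whole day, including rows from healthy files, so dashboard totals for excluded days are understated.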

The actual questions

  1. Has AWS acknowledged a regression in engine v3's Iceberg-aware vectorized parquet reader against Security Lake's Avro→Parquet compactor output format (writer.model.name = 'avro')?
  2. For Security-Lake-managed tables where customers cannot run DDL, is there a documented customer-side workaround other than registering the same S3 parquet files in a separate Hive-style external table (which works, but is operationally messy)?
  3. Is anyone else seeing this, and roughly when did it start on your account?

Happy to provide pyarrow output, S3 HEAD output, or full Athena QueryExecutionId traces in a follow-up.

2 Answers
1

Potential Regression: Athena v3 Vectorized Parquet Reader vs. Security Lake Compactor

Based on your diagnostic, this appears to be a regression in Athena Engine v3's Vectorized Parquet Reader specifically handling the Parquet encoding produced by the Security Lake Avro-to-Parquet compactor. Since pyarrow reads the files correctly, the underlying data is valid, but the Trino-based engine is likely stumbling on page-level metadata or specific encodings (like RLE or Dictionary encoding) used in the managed compaction.

Key points to consider:

  • Validation of the Bug: The fact that manifest-only queries succeed indicates the Iceberg metadata layer is intact. A failure that appears only during body reads points at the Parquet read path rather than at the catalog or the manifests.

  • The "Managed Table" Trap: Because Security Lake restricts DDL, the standard fix (disabling vectorization via TBLPROPERTIES) is unavailable.

  • Immediate Workaround: Try disabling vectorization at the session level instead of the table level. Run this before your query in the Athena console:

    SET SESSION parquet_vectorized_reader_enabled = false;
    

    If this allows the query to complete, it confirms a bug in the Athena v3 engine's optimized reader.

  • Service-Linked Issue: Since you cannot OPTIMIZE or VACUUM, this requires an AWS internal ticket. The Security Lake team likely deployed a new compactor version that produces Parquet features (or Avro-specific metadata keys) that the current Athena v3 build in us-east-1 is misinterpreting.

My recommendation: Open an AWS Support case under Amazon Security Lake (not just Athena). Explicitly state that the issue is with service-managed compaction and provide the QueryExecutionId where the session-level vectorization toggle was tested. This forces the internal teams to look at the compatibility between the compactor and the engine.

EXPERT
answered 10 days ago
  • Thanks — really useful triage. I tested the session-level toggle directly:

    SET SESSION parquet_vectorized_reader_enabled = false;
    SET SESSION parquet_optimized_reader_enabled = false;
    SET SESSION parquet_use_column_index = false;

    All three return InvalidRequestException: Queries of this type are not supported, both via the StartQueryExecution API and in the Athena console. Athena engine v3 does not expose session-level parquet-reader tunables to customers, which is consistent with the very limited SET SESSION surface documented in the engine v3 reference. That means the lockdown is total, which actually strengthens the support-case angle: there is no customer-side path to mitigate, so resolution has to come from AWS.

    Agree on routing. When I escalate to a paid case I'll file under Amazon Security Lake as primary, Athena as secondary, and Lake Formation as tertiary, so the right teams have to coordinate at the boundary between the managed Avro→Parquet compactor and the engine v3 Iceberg-aware reader.

    I will see what they say. I was hoping to garner a bit of attention on here so I wouldn't have to use the paid services.

0

Based on the available documentation, there is a known issue with Athena SQL reading Iceberg tables created by certain processes. Specifically, Athena SQL cannot read Iceberg table type V3 created by Amazon EMR Spark, resulting in a "GENERIC_INTERNAL_ERROR: Cannot read unsupported version 3" error. This suggests there are compatibility challenges between different Iceberg implementations and Athena's engine.

Additionally, there's a documented issue where Iceberg tables created outside Athena using Apache Iceberg SDK versions earlier than 0.13.0 can cause "Access Denied: Cannot select from columns" errors in Athena engine version 3. This occurs because earlier SDK versions don't properly populate column information in AWS Glue, preventing Lake Formation from determining authorized columns.

However, the specific regression you're describing—ICEBERG_CURSOR_ERROR when reading Security Lake's Avro-to-Parquet compactor output with engine v3—is not explicitly documented in the available sources. The pattern you're observing (manifest-only queries succeeding while body-read queries fail, files validating correctly with pyarrow, and the issue spreading across multiple sources and partitions) suggests a compatibility issue between Athena engine v3's vectorized Parquet reader and the specific Parquet format produced by Security Lake's compactor.

For Iceberg tables created outside Athena with older SDK versions, the documented solution is to perform an update using ALTER TABLE SET TBLPROPERTIES or use the latest Iceberg SDK to fix the table and update column information in AWS Glue. However, this approach wouldn't be available for Lake Formation-managed Security Lake tables where you lack DDL permissions.

Given the specificity of your issue and the lack of documented workarounds for Security Lake-managed tables, I recommend opening an AWS Support case with your QueryExecutionId traces and sample file details to determine if this is a known regression that requires a service-side fix.
Sources
Amazon EMR release 7.12.0 - Amazon EMR
Athena engine version 3 - Amazon Athena

answered 10 days ago
EXPERT
reviewed 10 days ago
  • Thanks for digging through this. Two of the linked issues don't apply to our case — flagging here for anyone else who lands on the same pattern:

    1. Iceberg format-version 3 / EMR-Spark angle. Our tables are format-version: 2 per the downloaded metadata.json (the standard Athena-supported variant). The "Cannot read unsupported version 3" error is from EMR-Spark-written V3 tables. We don't use Amazon EMR or Apache Spark anywhere in our stack — the only writer to these tables is Security Lake's own internal Avro→Parquet compactor.
    2. Old Iceberg SDK / Lake Formation column-access regression. Different error class. The documented Access Denied: Cannot select from columns issue (Iceberg SDK <0.13.0 not populating Glue column metadata) doesn't apply: our error is ICEBERG_CURSOR_ERROR: Failed to read Parquet file, the SDK is current (Security Lake writes these tables, not us), and Glue column metadata is intact and queryable.

    Agree the path forward is a support case for service-side investigation. The pattern — manifest-only queries succeed, body-reads fail, files validate cleanly via pyarrow, the broken set spreading across all 7 OCSF v2.0 sources and multiple producer accounts daily — doesn't match any of the documented Iceberg/Athena issues I've found. Reads like a recent regression in engine v3's vectorized parquet reader against the current Security Lake compactor output format specifically.
