Create a writable table from a read-only replica in Amazon S3 Tables
Use Athena CTAS and Iceberg time travel to recover a writable table from a read-only S3 Tables replica
If you run production analytics workloads on Amazon S3 Tables, you might replicate your Apache Iceberg tables across AWS Regions and accounts for data protection and compliance. S3 Tables replication creates read-only replicas that are always queryable, but a common question follows: How do I create a writable copy of my data from a replica?
In this post, we show how to create a writable table from a read-only S3 Tables replica using Amazon Athena CREATE TABLE AS SELECT (CTAS). We also show how to combine CTAS with Iceberg time travel to target a specific point in time, and we cover controls to protect replicas from accidental or malicious deletion.
Overview
S3 Tables replication (launched December 2, 2025) provides automatic replication of Apache Iceberg tables across AWS Regions and accounts. The feature creates read-only replicas, backfills them with the latest source state, and replicates subsequent updates chronologically while preserving parent-child snapshot relationships. File paths in Iceberg metadata are rewritten for the destination automatically.
Replicas are read-only. You can query them with any Iceberg-compatible engine, but you cannot write to them directly. When you need a writable copy (for example, after a source table is deleted or corrupted), you use CTAS to create a new, independent table from the replica. You can run CTAS against any replica; the source does not need to be deleted first.
This post covers:
- Setting up cross-Region replication (source in us-east-1, replica in us-west-2)
- Deleting the source table and verifying the replica persists independently
- Creating a writable table from the replica using CTAS, including point-in-time recovery
- Protecting the replica with IAM, table policies, and SCPs
Important: CTAS creates a writable copy of a single table. It does not provide multi-table orchestration, referential integrity across tables, or application-level failover. For multi-table workloads, you need automation around CTAS for each table plus coordination logic for your application layer.
Background: What S3 Tables replication solves
Before S3 Tables replication, maintaining a cross-Region copy of an Iceberg table on Amazon S3 required a complex architecture. You had to replicate data and metadata files with S3 Replication, re-register the table in the destination AWS Glue Data Catalog, and rewrite the absolute S3 paths embedded in Iceberg metadata.json and manifest files. Because S3 Replication has no awareness of Iceberg commit ordering, files could also arrive out of order and leave the replica in an inconsistent state.
S3 Tables replication replaces that with a managed, table-aware service. Key capabilities for this walkthrough:
- Ordered commits. Replication commits updates to replicas in the same order as the source. Updates replicate within minutes of source commits.
- Cross-Region and cross-account. Replicate across AWS Regions and accounts with up to five destinations per rule. Snapshots that expire on the source (including ones you manually expired) remain queryable on the replica.
- Independent encryption. Replica tables support independent encryption policies from their source tables.
- Intelligent-Tiering on replicas. Destination buckets can use the S3 Tables Intelligent-Tiering storage class to automatically tier rarely-queried replica data to lower-cost access tiers.
- Observability. Monitor replication status via the console, the GetTableReplicationStatus API, and AWS CloudTrail. See Managing S3 Tables replication.
For the full list of what is replicated and replication-specific limitations, see How S3 Tables replication works.
Prerequisites
- An AWS account with permissions for Amazon S3 Tables, Amazon Athena, AWS Identity and Access Management (IAM), and AWS Glue Data Catalog. See integration prerequisites.
- AWS CLI v2 (latest).
- Two AWS Regions (this walkthrough uses us-east-1 as source, us-west-2 as destination).
- An Amazon Athena workgroup configured in both Regions.
This walkthrough uses a single AWS account with two Regions for simplicity. The destination table bucket could also be in a separate AWS account.
Step 1: Create table buckets
Source table bucket (us-east-1)
In the Amazon S3 console, choose Table buckets, then Create table bucket:
- Name: demo-source-tablebucket
- Select Enable integration under "Integration with AWS analytics services"
- Storage class: Standard or S3 Intelligent-Tiering (set at creation; cannot be changed later)
- Encryption: SSE-S3 or SSE-KMS (set at creation)
Destination table bucket (us-west-2)
Switch to us-west-2 and repeat with the name demo-dest-tablebucket.
The first time you enable integration in a Region, Amazon S3 creates the IAM service role, registers the data location, and creates the s3tablescatalog federated catalog in AWS Glue. This is a one-time per-Region operation. For more information, see Integrating S3 Tables with AWS analytics services.
Step 2: Create and populate the source table
In the Athena console in us-east-1, select s3tablescatalog/demo-source-tablebucket as the catalog.
CREATE DATABASE analytics;
CREATE TABLE orders (
  order_id STRING,
  customer_id STRING,
  order_date DATE,
  amount DOUBLE,
  status STRING
)
PARTITIONED BY (month(order_date))
TBLPROPERTIES ('table_type' = 'iceberg');
Keep table and column names lowercase. Uppercase names break the AWS Glue Data Catalog integration and are not visible to Athena. See the integration prerequisites.
INSERT INTO orders VALUES
  ('ORD-001', 'CUST-A', DATE '2025-01-15', 150.00, 'completed'),
  ('ORD-002', 'CUST-B', DATE '2025-01-20', 275.50, 'completed'),
  ('ORD-003', 'CUST-A', DATE '2025-02-10', 89.99, 'pending'),
  ('ORD-004', 'CUST-C', DATE '2025-02-14', 432.00, 'completed'),
  ('ORD-005', 'CUST-B', DATE '2025-03-01', 67.25, 'cancelled');
Verify:
SELECT COUNT(*) FROM "analytics"."orders"; -- Expected: 5
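If you would rather script these statements than paste them into the console, a minimal Python sketch might look like the following. It relies on the boto3 Athena client calls start_query_execution and get_query_execution. The function name run_athena_query, the workgroup name "primary", and passing the federated catalog name as the Catalog value are assumptions for illustration, so check them against your own setup.

```python
import time


def run_athena_query(athena, sql, catalog, database, workgroup="primary",
                     poll_seconds=2.0, sleep=time.sleep):
    """Start an Athena query and block until it reaches a terminal state.

    `athena` is expected to be a boto3 Athena client, for example
    boto3.client("athena", region_name="us-east-1"). It is passed in so
    the function can be exercised with a stub.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Catalog": catalog, "Database": database},
        WorkGroup=workgroup,
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] == "SUCCEEDED":
            return query_id
        if status["State"] in ("FAILED", "CANCELLED"):
            reason = status.get("StateChangeReason", "no reason given")
            raise RuntimeError(
                f"query {query_id} ended as {status['State']}: {reason}")
        sleep(poll_seconds)  # still QUEUED or RUNNING
```

For the verification query above, a call could pass "s3tablescatalog/demo-source-tablebucket" as the catalog and "analytics" as the database. The sketch assumes the workgroup already has a query result location configured.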
Step 3: Configure replication
This walkthrough configures table-level replication on a single table. For multiple tables, use bucket-level replication because new tables created after the configuration is set are automatically covered.
Create the replication IAM role
Create an IAM role that the S3 Tables replication service assumes to read the source and write the destination. The setup documentation provides the exact trust policy.
If either table bucket uses SSE-KMS, add kms:Decrypt on the source key and kms:GenerateDataKey on the destination key.
Shortcut: The Amazon S3 console Create table replication configuration dialog can create this role automatically with the required permissions.
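The KMS additions described above can be sketched as policy statements. This is only an illustration of the two permissions named in this section: the function name and Sid values are hypothetical, the key ARNs are placeholders, and the S3 Tables permissions and trust policy still come from the setup documentation. Your KMS key policies may also need to allow the role.

```python
import json


def kms_statements(source_key_arn=None, dest_key_arn=None):
    """Return IAM policy statements for the replication role's KMS access.

    Pass a key ARN only for a bucket that uses SSE-KMS; a bucket on
    SSE-S3 needs no KMS statement.
    """
    statements = []
    if source_key_arn:
        statements.append({
            "Sid": "DecryptSourceTableData",  # hypothetical Sid
            "Effect": "Allow",
            "Action": "kms:Decrypt",
            "Resource": source_key_arn,
        })
    if dest_key_arn:
        statements.append({
            "Sid": "EncryptReplicaTableData",  # hypothetical Sid
            "Effect": "Allow",
            "Action": "kms:GenerateDataKey",
            "Resource": dest_key_arn,
        })
    return statements


if __name__ == "__main__":
    # Placeholder ARNs: substitute your own account ID and key IDs.
    policy = {
        "Version": "2012-10-17",
        "Statement": kms_statements(
            "arn:aws:kms:us-east-1:<ACCOUNT_ID>:key/<SOURCE_KEY_ID>",
            "arn:aws:kms:us-west-2:<ACCOUNT_ID>:key/<DEST_KEY_ID>",
        ),
    }
    print(json.dumps(policy, indent=2))
```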
Enable replication
In the Amazon S3 console (us-east-1), navigate to Table buckets → demo-source-tablebucket → Tables → orders → Management → Create table replication configuration:
- Destination table bucket ARN: arn:aws:s3tables:us-west-2:<ACCOUNT_ID>:bucket/demo-dest-tablebucket
- IAM role: Select the role created above
Amazon S3 begins the initial backfill. Wait until the Table replication status section shows Replication status = Completed. You can also check via CLI:
aws s3tables get-table-replication-status \
  --table-arn arn:aws:s3tables:us-east-1:<ACCOUNT_ID>:bucket/demo-source-tablebucket/table/<TABLE_ID>
Verify the replica
In the Athena console in us-west-2, select catalog s3tablescatalog/demo-dest-tablebucket:
SELECT COUNT(*) FROM "analytics"."orders"; -- Expected: 5
Step 4: Delete the source table
This step is optional: you can run CTAS against any replica without deleting the source. We include it here to confirm that the replica persists and remains queryable after source deletion.
First, insert one more row and wait for it to replicate. This confirms the replica is current before deletion.
In Athena (us-east-1):
INSERT INTO "analytics"."orders" VALUES ('ORD-006', 'CUST-D', DATE '2025-03-15', 199.99, 'completed');
Wait until the Last replicated timestamp is after your INSERT commit time, then verify in us-west-2:
SELECT COUNT(*) FROM "analytics"."orders"; -- Expected: 6
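The wait in this step (hold off until the last replicated timestamp is later than the INSERT commit time) can be sketched as a small polling loop. How you obtain the timestamp is left open here: get_last_replicated is a hypothetical callable that you would back with the console value or the replication status API, and the timeout and poll interval are arbitrary defaults.

```python
import time
from datetime import datetime, timezone


def wait_for_replication(get_last_replicated, commit_time, timeout_seconds=900,
                         poll_seconds=15, sleep=time.sleep, clock=time.monotonic):
    """Block until the replica's last-replicated time passes commit_time.

    get_last_replicated is a hypothetical zero-argument callable that
    returns a timezone-aware datetime, or None if nothing has replicated.
    """
    deadline = clock() + timeout_seconds
    while True:
        last = get_last_replicated()
        if last is not None and last > commit_time:
            return last  # replica has caught up past the commit
        if clock() >= deadline:
            raise TimeoutError(
                f"replica still behind {commit_time.isoformat()} "
                f"after {timeout_seconds} seconds")
        sleep(poll_seconds)
```

Only delete the source after this returns, since a replica that has not caught up will stay at its last replicated state.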
Now delete the source:
aws s3tables delete-table \
  --table-bucket-arn arn:aws:s3tables:us-east-1:<ACCOUNT_ID>:bucket/demo-source-tablebucket \
  --namespace analytics \
  --name orders \
  --region us-east-1
Verify the replica persists
In Athena (us-west-2):
-- All 6 rows present
SELECT * FROM "analytics"."orders" ORDER BY order_id;

-- Time travel works
SELECT * FROM "analytics"."orders$snapshots";
SELECT * FROM "analytics"."orders" FOR TIMESTAMP AS OF TIMESTAMP '<timestamp_between_inserts>';

-- Replica remains read-only
INSERT INTO "analytics"."orders" VALUES ('ORD-999', 'CUST-X', DATE '2025-04-01', 10.00, 'test');
-- Error: Updates to service managed tables are not allowed
Deleting the source does not delete the replica. The DeleteTableReplication API documentation states: "existing replicated copies will remain in destination buckets." Data, metadata, and snapshot history remain intact. The replica stops receiving new updates.
Step 5: Create a writable table from the replica
The replica cannot be promoted to writable in place. To create a writable table, run CTAS from the replica. See CTAS for S3 Tables.
Run the following in Athena in us-west-2. S3 Tables integration uses a federated catalog in AWS Glue, and Athena does not support federated queries across Regions, so the CTAS must run in the same Region as the replica.
CREATE TABLE "analytics"."orders_writable" AS SELECT * FROM "analytics"."orders";
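CTAS does not carry over the source table's partition spec on its own. To preserve the partitioning, add partitioning = ARRAY['<transform(col)>', ...] to a WITH clause on the CTAS statement. For the orders table in this walkthrough, that would be partitioning = ARRAY['month(order_date)'].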
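If you generate CTAS statements for several tables, a small builder keeps the WITH clause consistent. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it only formats a string, and the partition transforms you pass in must match what your source table actually uses.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_ctas(database, source, target, partitioning=(), file_format="PARQUET"):
    """Return a CTAS statement that copies `source` into a new table `target`.

    `partitioning` holds Iceberg partition transforms as strings, such as
    "month(order_date)"; an empty sequence creates an unpartitioned table.
    """
    props = [f"format = '{file_format}'"]
    if partitioning:
        transforms = ", ".join(
            "'" + p.replace("'", "''") + "'" for p in partitioning)
        props.append(f"partitioning = ARRAY[{transforms}]")
    return (
        f"CREATE TABLE {quote_ident(database)}.{quote_ident(target)}\n"
        f"WITH ({', '.join(props)})\n"
        f"AS SELECT * FROM {quote_ident(database)}.{quote_ident(source)}"
    )


if __name__ == "__main__":
    print(build_ctas("analytics", "orders", "orders_writable",
                     partitioning=["month(order_date)"]))
```

The generated statement still has to run in Athena in the replica's Region, with the destination table bucket selected as the catalog.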
Point-in-time recovery with time travel
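You can combine CTAS with time travel to create a writable copy at a specific point in time rather than the latest state. This is why longer snapshot retention on replicas is valuable: the snapshot you want must still exist on the replica. If you already created orders_writable in the previous step, drop it or choose a different table name before running these statements.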
CREATE TABLE "analytics"."orders_writable" WITH (format = 'PARQUET') AS SELECT * FROM "analytics"."orders" FOR TIMESTAMP AS OF TIMESTAMP '<desired_point_in_time>';
For more precision, use FOR VERSION AS OF with a specific snapshot ID:
SELECT snapshot_id, committed_at, operation FROM "analytics"."orders$snapshots" ORDER BY committed_at;
CREATE TABLE "analytics"."orders_writable" WITH (format = 'PARQUET') AS SELECT * FROM "analytics"."orders" FOR VERSION AS OF <snapshot_id>;
Both FOR TIMESTAMP AS OF and FOR VERSION AS OF work on S3 Tables replicas via Athena.
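When you know the moment you want to recover to but need the exact snapshot, the selection rule is "newest snapshot committed at or before the target time". A minimal sketch of that rule follows. It assumes you have already fetched snapshot_id and committed_at from the $snapshots query above and converted committed_at to timezone-aware datetimes; the function names and the sample IDs used to exercise it are hypothetical.

```python
from datetime import datetime, timezone


def snapshot_as_of(snapshots, target_time):
    """Return the ID of the newest snapshot committed at or before target_time.

    `snapshots` is an iterable of (snapshot_id, committed_at) pairs with
    timezone-aware datetimes, as read from the $snapshots metadata table.
    """
    eligible = [(committed_at, snapshot_id)
                for snapshot_id, committed_at in snapshots
                if committed_at <= target_time]
    if not eligible:
        raise ValueError("no snapshot exists at or before the target time")
    return max(eligible)[1]


def ctas_for_snapshot(snapshot_id, target="orders_writable"):
    """Build the FOR VERSION AS OF statement shown above for one snapshot."""
    return (
        'CREATE TABLE "analytics"."' + target + '" '
        "WITH (format = 'PARQUET') "
        'AS SELECT * FROM "analytics"."orders" '
        f"FOR VERSION AS OF {int(snapshot_id)}"
    )
```

Pinning the snapshot ID this way makes the recovery repeatable: rerunning the statement later selects the same data even if new snapshots have arrived on the replica.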
Verify the writable table
-- Row count matches
SELECT COUNT(*) FROM "analytics"."orders_writable"; -- Expected: 6

-- Symmetric difference is empty
SELECT * FROM "analytics"."orders_writable" EXCEPT SELECT * FROM "analytics"."orders";
SELECT * FROM "analytics"."orders" EXCEPT SELECT * FROM "analytics"."orders_writable";

-- Writes succeed
INSERT INTO "analytics"."orders_writable" VALUES ('ORD-007', 'CUST-E', DATE '2025-04-01', 55.00, 'pending');
UPDATE "analytics"."orders_writable" SET status = 'completed' WHERE order_id = 'ORD-007';
DELETE FROM "analytics"."orders_writable" WHERE order_id = 'ORD-007';
Considerations
- CTAS is a full data copy. Time to create the writable table scales linearly with table size.
- Snapshot history stays on the replica. CTAS creates a fresh table with a single initial snapshot. Query the replica directly for historical time travel.
- One table at a time. For multi-table workloads, you need automation around CTAS for each table and coordination logic for application-level cutover.
Protecting the replica
Replication protects the data path. It does not protect the replica from deletion or snapshot-retention changes. Add explicit Deny controls on the replica resources.
Actions to deny
| Action | Risk |
|---|---|
| s3tables:DeleteTable, s3tables:DeleteTableBucket, s3tables:DeleteNamespace | Destroys the replica, its bucket, or namespace |
| s3tables:PutTableMaintenanceConfiguration, s3tables:PutTableBucketMaintenanceConfiguration | Can shorten MaxSnapshotAgeHours or MinSnapshotsToKeep, collapsing the point-in-time recovery window |
For action and resource details, see Actions, resources, and condition keys for Amazon S3 Tables.
Layered controls
- Table policy on the replica table. Explicit Deny on the actions above. A table policy Deny overrides a table bucket policy Allow for the same action on that table.
- Table bucket policy on the destination table bucket. Deny bucket-level destructive actions and maintenance-configuration changes for all tables in the bucket.
- IAM identity-based policies in the replica account. Remove destructive actions from every role that doesn't strictly need them.
- Service Control Policies (SCPs) at the organization level. Deny the same destructive actions for all principals in the replica account's OU except a break-glass role.
Explicit Deny overrides Allow in IAM policy evaluation, so these layers hold even if an over-broad Allow exists elsewhere.
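As a rough illustration of two of these layers, the sketch below builds a table policy and an SCP as Python dictionaries. Treat it as a starting point rather than a vetted policy: the function names, Sid values, and the break-glass role exemption via aws:PrincipalArn are assumptions, and the split of actions between the table-level and organization-level statements should be checked against the service authorization reference. Test any Deny in a non-production account first, because an overly broad statement can also block operations you rely on.

```python
import json

# Actions that target a single table.
TABLE_ACTIONS = [
    "s3tables:DeleteTable",
    "s3tables:PutTableMaintenanceConfiguration",
]
# Full set from the table above, used for the organization-wide layer.
ALL_ACTIONS = TABLE_ACTIONS + [
    "s3tables:DeleteTableBucket",
    "s3tables:DeleteNamespace",
    "s3tables:PutTableBucketMaintenanceConfiguration",
]


def _unless(break_glass_role_arn):
    """Condition that exempts one (hypothetical) break-glass role."""
    return {"ArnNotEquals": {"aws:PrincipalArn": break_glass_role_arn}}


def replica_table_policy(table_arn, break_glass_role_arn):
    """Resource-based policy for the replica table (requires a Principal)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyReplicaTableDestruction",
            "Effect": "Deny",
            "Principal": "*",
            "Action": TABLE_ACTIONS,
            "Resource": table_arn,
            "Condition": _unless(break_glass_role_arn),
        }],
    }


def replica_scp(break_glass_role_arn):
    """Service control policy for the replica account's OU (no Principal)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyReplicaDestruction",
            "Effect": "Deny",
            "Action": ALL_ACTIONS,
            "Resource": "*",
            "Condition": _unless(break_glass_role_arn),
        }],
    }


if __name__ == "__main__":
    # Placeholder ARNs: substitute your own values.
    print(json.dumps(replica_table_policy(
        "arn:aws:s3tables:us-west-2:<ACCOUNT_ID>:bucket/demo-dest-tablebucket/table/<TABLE_ID>",
        "arn:aws:iam::<ACCOUNT_ID>:role/<BREAK_GLASS_ROLE>"), indent=2))
```

The two documents differ in one structural way: a resource-based policy names a Principal, while an SCP does not and applies to every principal in the accounts it is attached to.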
A note on Resource Control Policies (RCPs): RCPs are the natural organization-wide control on the resource side, but as of the current RCP supported-services list, S3 Tables is not included. Use SCPs for the organization-wide layer until RCP coverage is extended.
Recommendations
- Set longer snapshot retention on the replica than the source. For example, 30 days on source and 90 days on replica. Snapshots expired at the source, including ones you manually expired, remain queryable on the replica. This gives you a wider window of historical states to create writable copies from.
- Configure S3 Intelligent-Tiering on the destination bucket before enabling replication. Storage class is set at table bucket creation and cannot be changed later. Intelligent-Tiering is a natural fit for replicas that are read infrequently and retain snapshots for long periods. For details on how Intelligent-Tiering interacts with compaction and snapshot management, see Optimize data management on S3 Tables with Intelligent-Tiering.
- Place the replica in a separate account. Cross-account replication provides independent credentials and IAM blast radius.
- Test CTAS with representative data volumes. CTAS is a full data copy. Time to create the writable table scales with table size.
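For the retention recommendation above, snapshot age is expressed in hours, so 30 days on the source corresponds to 720 hours and 90 days on the replica to 2,160 hours. The sketch below does that conversion and wraps it in a settings dictionary. The dictionary shape (icebergSnapshotManagement with minSnapshotsToKeep and maxSnapshotAgeHours) is my understanding of the snapshot management configuration and should be verified against the API reference; the function names are hypothetical, and how the setting is applied to a replica is not covered here.

```python
HOURS_PER_DAY = 24


def snapshot_retention(days, min_snapshots_to_keep=1):
    """Build snapshot-management settings for a retention period in days.

    The key names are assumptions to verify against the S3 Tables
    maintenance configuration reference.
    """
    if days <= 0 or min_snapshots_to_keep < 1:
        raise ValueError("days and min_snapshots_to_keep must be positive")
    return {
        "status": "enabled",
        "settings": {
            "icebergSnapshotManagement": {
                "minSnapshotsToKeep": min_snapshots_to_keep,
                "maxSnapshotAgeHours": days * HOURS_PER_DAY,
            }
        },
    }


def replica_outlives_source(source_days, replica_days):
    """True when the replica keeps snapshots longer than the source."""
    return replica_days > source_days


if __name__ == "__main__":
    source = snapshot_retention(30)
    replica = snapshot_retention(90)
    print(source["settings"]["icebergSnapshotManagement"]["maxSnapshotAgeHours"])
    print(replica["settings"]["icebergSnapshotManagement"]["maxSnapshotAgeHours"])
```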
Cleaning up
Delete replication configurations, tables, namespaces, and table buckets in each Region, then delete the IAM roles. For more information, see Deleting an S3 table and Managing S3 Tables replication.
Conclusion
In this post, we demonstrated how to create a writable table in Amazon S3 Tables from a read-only replica using Amazon Athena CTAS. We showed that replicas persist independently after source deletion, retain full snapshot history for time-travel queries, and can serve as the source for a new writable table in a single statement. Combined with Iceberg time travel, you can target any snapshot within the replica's retention window.