Data movement from prod to dev with anonymization of PII
Hi all, are there best practices and recommended tools for moving a copy of production data (in RDS/DynamoDB/S3) to dev, with the ability to detect and anonymize personally identifiable information (PII) along the way?
You can use Amazon Macie to identify PII data that is stored in S3: https://docs.aws.amazon.com/macie/latest/userguide/macie-classify-data.html
Macie only scans objects in S3, so for data in RDS/DynamoDB you would first have to export it to S3 and then run Macie against the export.
Once you have identified the PII, you can implement your own logic (perhaps using AWS Glue) to anonymize the data and then load it into the dev environment.
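As a rough illustration of the "implement your own logic" step, here is a minimal Python sketch that replaces PII fields with keyed hashes. It is not a Macie or Glue API; the field names in `PII_FIELDS`, the helper names, and the key are all hypothetical, and in practice the field list would come from whatever your Macie findings report. Similar logic could be placed inside a Glue job, but that wiring is not shown here.

```python
import hashlib
import hmac

# Hypothetical list of columns flagged as PII (e.g. based on Macie findings).
PII_FIELDS = {"email", "name", "phone"}


def pseudonymize(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest (truncated) in place of the raw value.

    The same input and key always give the same token, so joins between
    tables still line up in dev, but the original value is not stored.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def anonymize_record(record: dict, key: bytes, pii_fields=PII_FIELDS) -> dict:
    """Return a copy of the record with PII fields replaced by tokens."""
    return {
        field: pseudonymize(str(value), key)
        if field in pii_fields and value is not None
        else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    # Assumption: in a real setup the key would come from a secrets store.
    key = b"example-key-not-for-real-use"
    row = {"id": 42, "email": "jane@example.com", "plan": "pro"}
    print(anonymize_record(row, key))
```

Keyed hashing is only one option; it keeps values consistent across tables, whereas random replacement or nulling the column is simpler if you do not need that consistency.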