Test Data Management tool for file Anonymization in AWS


Hi All,

I'm looking for a Test Data Management (TDM) tool on AWS that can meet the following requirements:

  1. Connect to the production S3 bucket, extract files for anonymization, and load the results into the test S3 bucket.

  2. Run as a daily scheduled job that anonymizes files from the prod S3 bucket and stores them in the test S3 bucket.

  3. Identify PII columns in the S3 files and anonymize them; the files are later loaded into a Redshift database.

  4. Maintain data integrity between the files and the database. For example, incremental daily data should match the already-masked PII values in the Redshift database.

Kindly let me know how I can achieve the above requirements using AWS services.

Thanks & Regards, Aflah

2 Answers

AWS Glue DataBrew could help you achieve your objective. It offers several data-masking functions, and it addresses each of your requirements:

  1. It can connect to Amazon S3 and many other data stores, including Amazon Redshift, both to read source data and to write output data.
  2. DataBrew jobs can be scheduled and managed by the service itself, or integrated with AWS Step Functions and other workflow services.
  3. It includes a PII detection feature.
  4. Some of its data-masking techniques produce repeatable output, so the same input value is masked the same way on every run.
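
To illustrate what "repeatable output" means for requirement 4, here is a minimal plain-Python sketch. It does not use DataBrew and is not necessarily how DataBrew implements its masking; it assumes a keyed hash (HMAC-SHA256) as the masking technique, and the names `pseudonymize`, `mask_row`, and the key are hypothetical. The point is only that the same input and the same secret key always give the same token, so daily incremental files line up with values already loaded in Redshift.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes, length: int = 16) -> str:
    """Return a stable token: same value and key always give the same token."""
    # Normalize so that trivial formatting differences do not change the token.
    normalized = value.strip().lower()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def mask_row(row: dict, pii_columns: list, secret_key: bytes) -> dict:
    """Mask only the listed PII columns; leave every other column unchanged."""
    masked = {}
    for column, value in row.items():
        if column in pii_columns and value is not None:
            masked[column] = pseudonymize(str(value), secret_key)
        else:
            masked[column] = value
    return masked


if __name__ == "__main__":
    key = b"example-key-keep-the-real-one-in-a-secret-store"  # hypothetical key
    day1 = mask_row({"email": "alice@example.com", "amount": 10}, ["email"], key)
    day2 = mask_row({"email": "alice@example.com", "amount": 25}, ["email"], key)
    # The masked email is identical on both days, so joins still work.
    print(day1["email"] == day2["email"])
```

The key has to stay the same across runs and should be kept out of the test environment, otherwise tokens either stop matching or can be recomputed from known inputs.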

You can read more in this blog post.

Hope this helps.

AWS
Expert
answered 2 years ago

Hi, good question.

You could use Amazon Comprehend to detect PII and replace it. An example workshop is available at https://github.com/karchit/s3-object-lambda-workshop/tree/Lab2
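
As a rough sketch of the idea (not the workshop's code), the snippet below calls the Comprehend `detect_pii_entities` API through boto3 and replaces each detected span with its entity type. The helper names `redact` and `detect_and_redact` and the `min_score` threshold are assumptions made for illustration; running `detect_and_redact` requires AWS credentials and Comprehend permissions.

```python
def redact(text: str, entities: list, min_score: float = 0.5) -> str:
    """Replace each detected PII span with its entity type in brackets."""
    # Process spans from the end of the text so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        if entity["Score"] >= min_score:
            label = "[" + entity["Type"] + "]"
            text = text[: entity["BeginOffset"]] + label + text[entity["EndOffset"]:]
    return text


def detect_and_redact(text: str, language_code: str = "en") -> str:
    """Detect PII with Amazon Comprehend, then redact it locally."""
    # Imported here so redact() can be tried without boto3 or AWS credentials.
    import boto3

    client = boto3.client("comprehend")
    response = client.detect_pii_entities(Text=text, LanguageCode=language_code)
    return redact(text, response["Entities"])


if __name__ == "__main__":
    # Hand-written entities in the shape Comprehend returns, for a local dry run.
    sample = "Contact Jane Doe at jane@example.com"
    fake_entities = [
        {"Type": "NAME", "BeginOffset": 8, "EndOffset": 16, "Score": 0.99},
        {"Type": "EMAIL", "BeginOffset": 20, "EndOffset": 36, "Score": 0.99},
    ]
    print(redact(sample, fake_entities))
```

Note that replacing values with type labels does not preserve matching between files and the database, so on its own this would not cover requirement 4.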

Sri
answered 2 years ago
  • The solution provided is functionally correct, but it offers limited redaction capabilities and might not fully address the use case in the question.
