How to make AWS Lambda logs GDPR compliant?


If an AWS Lambda function logs information in order to debug activities performed by a user, and that user then submits a Right to Erasure request, there does not seem to be a way to carve out the relevant log entries, even if each AWS Lambda function invocation only performs operations for a single user. This is because AWS Lambda functions write to common log streams, which may only be deleted as a whole.

This effectively makes the logging functionality of AWS Lambda functions non-GDPR compliant. At most, it can be used to store logs that do not help in tracking and debugging user-related activities, which rules out cases such as a user wishing to trace why certain changes took place on their account.

Is there any advice on an alternative way to perform user-related logging in AWS Lambda functions so that user-related logs can later be deleted on demand?

Update - clarification on GDPR requirements

The Right to Erasure under GDPR requires that compliant software allow all personally identifiable information relating to a particular person to be removed. If an AWS Lambda function performs user-driven operations such as data retrieval or modification, any logs kept for later debugging or as an audit trail cannot meet this requirement, because AWS Lambda writes them to log streams shared with other requests handled by the same AWS Lambda execution environment.

Potential Solution: As there is no way to excise the log entries of a single AWS Lambda function invocation, the only alternative that seems to remain compliant is pseudonymization: log a pseudonym instead of the user identifier, and on erasure delete the mapping between the user identifier and the pseudonym. The logs would stay, but the entries related to the erased user could no longer be traced back to that user.

  • Hi NicM.

    What would be the requirements to make the log GDPR compliant? I'm not familiar with the details so if you could list what the expectation is it would be easier to formulate a possible solution.

  • @Jose Guay - please see update, thanks!

NicM
asked 10 months ago, 323 views
2 Answers
Accepted Answer

NicM,

Without knowing how your application works and how your Lambda function logs information, I would think the following might be worth looking into:

  • Check CloudWatch Log Streams and Log Groups and how to programmatically add logs to CloudWatch Logs so you have full control of what gets logged.
  • Try to tie some sort of identifier to the user and add it to each and every log entry related to that user. It could very well be the username or another type of ID. If no such identifier exists, you could probably use a DynamoDB table to store one for quick reference.
  • In CloudWatch Logs you can set a retention period if you don't need the logs after some time so they get discarded automatically.
  • Identifying each log entry with a link to the user will greatly simplify the identification and removal of the log entries when needed.
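The points above could be sketched as follows, assuming each entry is written as one JSON line carrying a user reference. The field name `user_ref` and the helpers `format_entry` and `log_user_event` are illustrative, not part of any AWS API; the sketch relies only on the fact that Lambda forwards standard output to CloudWatch Logs.

```python
import json
import logging
import sys

logger = logging.getLogger("user_activity")
logger.setLevel(logging.INFO)
logger.propagate = False  # avoid duplicate lines via the root logger's handlers
if not logger.handlers:
    logger.addHandler(logging.StreamHandler(sys.stdout))


def format_entry(user_ref: str, action: str, **details) -> str:
    """Build one JSON log line in which user_ref is the only user-linked field."""
    entry = {"user_ref": user_ref, "action": action, "details": details}
    return json.dumps(entry, sort_keys=True)


def log_user_event(user_ref: str, action: str, **details) -> None:
    """Write a user-related event as a single structured log line."""
    logger.info(format_entry(user_ref, action, **details))
```

With entries in this shape, a CloudWatch Logs JSON filter pattern such as `{ $.user_ref = "u-123" }` can locate everything logged for one user. Locating entries this way does not by itself make them individually deletable.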

I hope this helps.

AWS
Expert
answered 10 months ago

This issue isn't specific to Lambda; it applies to anything that logs to CloudWatch log groups, which is where Lambda stores its logs. I have seen other services output data to CloudWatch Logs as well.

I guess the best option you have is to ensure your application does not write personal information to logs, omitting this data when logging. If users need to appear in logs, capture masked identifiers such as GUIDs, which can be looked up if required. The application could also encrypt or mask the data stored in the logs. I would focus more on the reason why PII is being captured in logs in the first place.
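One way to omit such data is to redact it before the line is written. The sketch below assumes the application uses Python's standard `logging` module; the two regular expressions are deliberately simple illustrations that catch IPv4 addresses and common email shapes only, not a complete PII detector.

```python
import logging
import re

# Simplified, illustrative patterns; real PII detection needs broader coverage.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text: str) -> str:
    """Replace email addresses and IPv4 addresses with fixed placeholders."""
    text = EMAIL_RE.sub("[email]", text)
    return IPV4_RE.sub("[ip]", text)


class RedactingFilter(logging.Filter):
    """Logging filter that redacts the fully formatted message of each record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = ()  # message is already formatted, so drop the arguments
        return True
```

Attaching it with `logger.addFilter(RedactingFilter())` applies the redaction to every record that logger handles.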

Another option is to set a short retention period for the logs so that the data is purged. You could also have a specific KMS key for the really sensitive information and encrypt these CloudWatch logs with that KMS key, with access limited to a small number of IAM users or services.
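Both settings can be applied programmatically. The sketch below assumes a boto3 CloudWatch Logs client is passed in and uses its `put_retention_policy` and `associate_kms_key` operations; the wrapper name `apply_log_group_policy` and its parameters are hypothetical.

```python
def apply_log_group_policy(logs_client, log_group_name, retention_days, kms_key_arn=None):
    """Set a retention period on a log group and optionally encrypt it with a KMS key.

    logs_client is expected to be a boto3 CloudWatch Logs client.
    retention_days must be a value CloudWatch Logs accepts (for example 1, 7, or 30).
    """
    logs_client.put_retention_policy(
        logGroupName=log_group_name,
        retentionInDays=retention_days,
    )
    if kms_key_arn is not None:
        logs_client.associate_kms_key(
            logGroupName=log_group_name,
            kmsKeyId=kms_key_arn,
        )
```

A call might look like `apply_log_group_policy(boto3.client("logs"), "/aws/lambda/<function-name>", 30)`, where the log group name follows Lambda's default naming and the function name is a placeholder.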

Expert
answered 10 months ago
Expert
reviewed 10 months ago
  • Data must be stored in logs for audit trail purposes and to assist with R&D debugging of certain behaviors related to a user's activity. This includes data such as IP addresses and user identifiers. If 'masked' information is stored, the masking must be done in such a way that the masked identifiers cannot be traced back to the real identifiers, for example through a separate service that stores the references.
