Thank you for your question about securing a SageMaker Ground Truth labeling job inside a VPC. I'll clarify some points and provide guidance on how to secure your labeling job effectively.
First, it's important to understand that there are two main scenarios for using Ground Truth in a VPC:
- Running a labeling job in a VPC
- Using Amazon VPC mode from a private worker portal
For your specific case, where workers access the labeling job from their own laptops outside the VPC, you're correct that adding "aws:SourceVpc": "myVpcId" to the bucket policy would block access, because their requests never originate from your VPC. That condition is only appropriate when your private workforce reaches the worker portal from inside your VPC.
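To illustrate why that condition locks out external workers, here is a minimal sketch of one common shape such a bucket policy statement can take. The bucket name, VPC ID, and helper name are hypothetical placeholders, and your actual policy may be written differently (for example as an Allow with StringEquals). The aws:SourceVpc key is only present on requests that arrive through a VPC endpoint, so a request from a worker's laptop over the public internet does not carry it and falls under the Deny.

```python
import json
from typing import Any, Dict


def build_vpc_only_statement(bucket_name: str, vpc_id: str) -> Dict[str, Any]:
    """Build a Deny statement for reads that do not come from the given VPC.

    Hypothetical helper for illustration; not taken from the original question.
    """
    return {
        "Sid": "DenyReadsFromOutsideVpc",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket_name}/*",
        # A negated operator also matches when the key is absent, so requests
        # that do not pass through a VPC endpoint are denied as well.
        "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
    }


def build_bucket_policy(bucket_name: str, vpc_id: str) -> str:
    """Wrap the statement in a policy document and serialize it to JSON."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [build_vpc_only_statement(bucket_name, vpc_id)],
    }
    return json.dumps(policy, indent=2)
```

Calling build_bucket_policy("my-labeling-bucket", "vpc-0123456789abcdef0") returns a JSON document you could inspect before deciding whether this restriction fits your workforce.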
To secure your labeling job for workers accessing it from outside your VPC, consider the following approaches:
- Use IP-restricted presigned URLs: This is a new feature that adds an extra layer of security by restricting access to presigned URLs based on the worker's IP address. You can enable this by specifying a WorkerAccessConfiguration object when creating or updating a workteam. This allows you to lock down access to specific IP addresses, enhancing data privacy and reducing the risk of unauthorized access.
- Encrypt your output data and storage volumes: Use AWS Key Management Service (KMS) to encrypt your output data and storage volumes. This ensures that your data is protected at rest.
- Configure workforce authentication and restrictions: Implement strong authentication methods for your workforce, such as email, SMS, or phone call verification. You can also restrict access to specific IP addresses or ranges to control who can access the labeling service.
- Use IAM policies and roles: Ensure that you have proper IAM permissions set up for users creating labeling jobs and for the execution role used to start the labeling job. Use the principle of least privilege when assigning permissions.
- Enable VPC endpoints for Ground Truth: Even if your workers are not in the VPC, you can still use VPC endpoints to ensure that communication between Ground Truth and other AWS services (like S3) stays within the AWS network, adding an extra layer of security.
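As a rough sketch of the first and third items above, the following Python builds the request parameters for the SageMaker UpdateWorkteam and UpdateWorkforce operations. As I understand the API, the SourceIp setting in WorkerAccessConfiguration is only an Enabled/Disabled toggle that ties each presigned URL to the worker's IP address at the time the task is rendered, while allowed CIDR ranges are set separately on the workforce through SourceIpConfig. The workteam name, workforce name, CIDR values, and helper names are hypothetical, and you should check the parameter shapes against the current boto3 documentation before relying on them.

```python
import ipaddress
from typing import Any, Dict, List


def build_workteam_update(workteam_name: str, restrict_to_worker_ip: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for update_workteam that toggle IP-restricted presigned URLs."""
    return {
        "WorkteamName": workteam_name,
        "WorkerAccessConfiguration": {
            "S3Presign": {
                "IamPolicyConstraints": {
                    # Toggle only: no IP range is supplied here.
                    "SourceIp": "Enabled" if restrict_to_worker_ip else "Disabled",
                }
            }
        },
    }


def build_workforce_update(workforce_name: str, cidrs: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for update_workforce with an allow list of CIDR ranges."""
    # ip_network raises ValueError for malformed input or set host bits,
    # so mistakes surface before any request is sent.
    validated = [str(ipaddress.ip_network(cidr)) for cidr in cidrs]
    return {
        "WorkforceName": workforce_name,
        "SourceIpConfig": {"Cidrs": validated},
    }


def apply_updates(workteam_kwargs: Dict[str, Any], workforce_kwargs: Dict[str, Any]) -> None:
    """Send both updates to SageMaker (requires AWS credentials and permissions)."""
    import boto3  # imported lazily so the builders work without AWS access

    client = boto3.client("sagemaker")
    client.update_workforce(**workforce_kwargs)
    client.update_workteam(**workteam_kwargs)
```

For example, apply_updates(build_workteam_update("my-private-team"), build_workforce_update("default", ["203.0.113.0/24"])) would limit portal access to that range and bind presigned URLs to each worker's IP, assuming those resource names exist in your account.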
Regarding adding the Ground Truth endpoint to the bucket policy, this is not typically necessary for extra security. Instead, focus on properly configuring your VPC endpoints and ensuring that your S3 bucket policy allows access from the Ground Truth service.
Remember, while you can't force external workers to access the labeling job from within your VPC, you can implement these security measures to protect your data and control access to your labeling tasks.
Sources
Using Amazon SageMaker Ground Truth in an Amazon Virtual Private Cloud - Amazon SageMaker
Use Amazon VPC Mode from a Private Worker Portal - Amazon SageMaker
Ground Truth Security and Permissions - Amazon SageMaker
Use IP-restricted presigned URLs to enhance security in Amazon SageMaker Ground Truth | AWS Machine Learning Blog
answered 2 years ago

Where should I specify the workers' source IP range? The accepted values for SourceIp are Enabled/Disabled, so where should I specify my IP range x.x.x.x/cidr?
{
  "S3Presign": {
    "IamPolicyConstraints": {
      "SourceIp": "Enabled"|"Disabled",
      "VpcSourceIp": "Enabled"|"Disabled"
    }
  }
}