NoCredentialsError: Unable to locate credentials when using dask_ml inside sagemaker
I'm trying to use Dask in SageMaker because I have more than 1 billion rows in a single dataset. Creating a dask.dataframe works fine, and creating a client through Dask also works:
from dask.distributed import Client
client = Client(n_workers=6, threads_per_worker=20)
client
However, when I try to use dask_ml.preprocessing.Categorizer, I get the error NoCredentialsError: Unable to locate credentials.
I understand the issue might be that the Dask distributed client doesn't have authorization to the SageMaker client. It's confusing, but how do I get them to connect? Code below:
import dask.dataframe as dd
from dask_ml.preprocessing import Categorizer

df3 = dd.read_parquet('s3://bucket/parquetfiles2/data_*.parquet',
                      storage_options={'token': 'anon'})

if __name__ == "__main__":
    obj_df = df3.select_dtypes(include=['object', 'datetime64'])
    num_df = df3.select_dtypes(exclude=['object', 'datetime64'])
    ce = Categorizer()
    obj_df_1 = ce.fit_transform(obj_df)
Hello,
Thank you for contacting us and for using Amazon Sagemaker.
I understand that you encountered a "NoCredentialsError: Unable to locate credentials" when trying to use dask_ml.preprocessing.Categorizer.
It looks like you're running the code locally on your machine. Please feel free to correct me if I have misunderstood anything here. The error is usually seen when you don’t have AWS credentials correctly configured on your machine.
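As a quick way to check this, here is a minimal sketch (not an official AWS tool) that reports whether static credentials are visible in the two most common places: the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables and the shared credentials file that aws configure writes. The function name find_static_credentials and its parameters are made up for illustration, and the sketch deliberately ignores other sources such as an IAM role attached to the instance or container.

```python
import configparser
import os
from pathlib import Path


def find_static_credentials(env=None, credentials_file=None, profile="default"):
    """Return 'env', 'shared-file', or None depending on where keys are found.

    Hypothetical helper: it only inspects two of the locations that the AWS
    SDK searches, so None does not rule out credentials from an IAM role.
    """
    env = os.environ if env is None else env

    # 1) Static keys exported as environment variables.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "env"

    # 2) The shared credentials file (default: ~/.aws/credentials).
    path = Path(
        credentials_file
        or env.get("AWS_SHARED_CREDENTIALS_FILE")
        or Path.home() / ".aws" / "credentials"
    )
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently skipped
    if parser.has_section(profile) and parser.has_option(profile, "aws_access_key_id"):
        return "shared-file"

    return None


if __name__ == "__main__":
    print("Static credentials found in:", find_static_credentials())
```

If this prints None in the process that raises the error and no IAM role is attached, that would be consistent with the NoCredentialsError. With a Dask client, the same check can probably be run on each worker with client.run(find_static_credentials), assuming the function can be serialized to the workers.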
That said, I see you're using the Dask Distributed client with Amazon SageMaker Processing. Please have a look at this example, where we build a Dask-enabled container for SageMaker Processing: https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_processing/feature_transformation_with_sagemaker_processing_dask/feature_transformation_with_sagemaker_processing_dask.html#Build-a-Dask-container-for-running-the-preprocessing-job
You might need to run aws configure to set up IAM credentials (user) on your machine (see reference 1 below). However, to replicate and investigate this further, we'd need your IAM role ARN and other details. For that reason, I recommend opening a support case and providing more detail about your account information and script/config. For security reasons, we cannot discuss account-specific issues in public posts.
Please open a support case with AWS using the link:
https://console.aws.amazon.com/support/home?#/case/create
Thank you
References:
1. https://aws.amazon.com/premiumsupport/knowledge-center/s3-locate-credentials-error/
2. https://youtu.be/UMUQs2PojdE