When trying to create my own language model, I get the following error: "The URI that you provided doesn't point to an S3 object. Make sure that the object exists and try your request again." I am completely sure that the URI I provided exists because I can get the object through the "Browse S3 URI" option on the console.
I have spent many hours, both previously and now, trying to create a language model. Out of roughly 50 attempts with different settings, one eventually succeeded, for seemingly no reason. So I did manage to train a model once, but I can't get it to work again now.
My configuration/things I've tried:
- DataAccessRole: The service role I created for the Transcribe service has full access to S3.
- My S3 bucket has "block all public access" turned off (but it worked once before without this)
- I've ensured that my bucket region and model region are the same (eu-west-1)
- Tried to create the same model through the boto3 API as well as the console. Same error.
- The bucket is not new, i.e. I've tried creating the model several hours after the bucket was created and populated
- My S3 bucket ("my-bucket") has a single folder object ("my-prefix") which contains thousands of .txt files.
- Tried different language model configurations. Same error.
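For reference, a minimal sketch of what the boto3 attempt might look like. The model name, role ARN, language code, and base model below are placeholders (assumptions for illustration, not the exact values used), and pointing `S3Uri` at the prefix with a trailing slash is also an assumption about how the training data is referenced.

```python
def build_clm_request(bucket, prefix, role_arn, model_name):
    """Build keyword arguments for Transcribe's create_language_model call.

    LanguageCode and BaseModelName are illustrative placeholders.
    """
    return {
        "LanguageCode": "en-US",        # assumption: placeholder language
        "BaseModelName": "WideBand",    # assumption: placeholder base model
        "ModelName": model_name,
        "InputDataConfig": {
            # Points at the prefix ("folder") that holds the .txt files.
            "S3Uri": f"s3://{bucket}/{prefix.strip('/')}/",
            "DataAccessRoleArn": role_arn,
        },
    }


def create_clm(request, region="eu-west-1"):
    """Send the request; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("transcribe", region_name=region)
    return client.create_language_model(**request)


# Example usage (hypothetical role ARN and model name):
# request = build_clm_request(
#     "my-bucket", "my-prefix",
#     "arn:aws:iam::123456789012:role/example-transcribe-role",
#     "example-model",
# )
# create_clm(request)
```

Keeping the client in the same region as the bucket (eu-west-1 here) matches the configuration described above.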
S3 Bucket Policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "*",
            "Resource": [
                "arn:aws:s3:::my-bucket",
                "arn:aws:s3:::my-bucket/*",
                "arn:aws:s3:::my-bucket/my-prefix/*"
            ]
        }
    ]
}
Update: 17th April 2023
Combining all the .txt files into one fixes the issue, but I am not sure why, since I can't find anything in the documentation saying that the training data has to be a single file. And even if that were a requirement, the error message is clearly incorrect.
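In case it helps anyone else, the workaround can be sketched roughly as follows. This assumes the .txt files have already been downloaded to a local directory and that the combined file is uploaded back to the bucket separately; the function name and paths are hypothetical.

```python
from pathlib import Path


def combine_txt_files(source_dir, output_file):
    """Concatenate every .txt file in source_dir into a single output file.

    Returns the number of files that were merged.
    """
    source = Path(source_dir)
    output = Path(output_file)
    merged = 0
    with output.open("w", encoding="utf-8") as out:
        for path in sorted(source.glob("*.txt")):
            # Skip the output file itself if it lives in the same directory.
            if path.resolve() == output.resolve():
                continue
            text = path.read_text(encoding="utf-8")
            out.write(text)
            # Make sure each file ends on its own line before the next begins.
            if not text.endswith("\n"):
                out.write("\n")
            merged += 1
    return merged


# Example usage (hypothetical paths):
# combine_txt_files("training-data", "combined.txt")
```

The files are merged in sorted filename order so that the result is reproducible between runs.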