I created a proof-of-concept (POC) S3 Batch Operations job that should invoke a Lambda function.
The Lambda gets the file, transforms it, copies the transformed file into a new bucket, and deletes the file from the old bucket upon completion.
The Batch job fails before invoking my Lambda, and the report CSV shows the following errors:
eceeecom-5732-poc-old,emailstore/0079d564-dccc-4066-a42d-8d9113097d02,,failed,400,InvalidRequest,Task failed due to missing VersionId
eceeecom-5732-poc-old,emailstore/00975dec-f64b-4932-a1e8-9ec1284f76bb,,failed,400,InvalidRequest,Task failed due to missing VersionId
My manifest.json:
{
"sourceBucket" : "poc-old",
"destinationBucket" : "arn:aws:s3:::poc-new",
"version" : "2016-11-30",
"creationTimestamp" : "1656633600000",
"fileFormat" : "CSV",
"fileSchema" : "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, Size, LastModifiedDate, ETag, StorageClass, IsMultipartUploaded, ReplicationStatus, EncryptionStatus, ObjectLockRetainUntilDate, ObjectLockMode, ObjectLockLegalHoldStatus, IntelligentTieringAccessTier, BucketKeyStatus, ChecksumAlgorithm",
"files" : [ {
"key" : "emailstore/eceeecom-5732-poc-old/emailstore-inventory-config/data/b84c7842-bc58-40b9-afb0-622060853c8a.csv.gz",
"size" : 623,
"MD5checksum" : "XYZ"
} ]
}
When I unzip the csv.gz file referenced above, I see:
"eceeecom-5732-poc-old","emailstore/0079d564-dccc-4066-a42d-8d9113097d02","","true","false","150223","2022-06-30T21:22:06.000Z","60d3815bd0e85e3b689139b6938362b4","STANDARD","false","","SSE-S3","","","","","DISABLED",""
"eceeecom-5732-poc-old","emailstore/00975dec-f64b-4932-a1e8-9ec1284f76bb","","true","false","46054","2022-06-30T21:22:06.000Z","214c15f193c58defbbf938318e103aed","STANDARD","false","","SSE-S3","","","","","DISABLED",""
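To make the empty column easier to spot, the rows can be labelled with the column names from fileSchema. This is only a minimal sketch: the helper names `label_rows` and `rows_missing_version_id` are made up for illustration, and it assumes the inventory file has already been unzipped into a string.

```python
import csv
import io

# fileSchema string copied from the manifest.json above.
FILE_SCHEMA = (
    "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, Size, LastModifiedDate, "
    "ETag, StorageClass, IsMultipartUploaded, ReplicationStatus, EncryptionStatus, "
    "ObjectLockRetainUntilDate, ObjectLockMode, ObjectLockLegalHoldStatus, "
    "IntelligentTieringAccessTier, BucketKeyStatus, ChecksumAlgorithm"
)


def label_rows(file_schema, csv_text):
    """Pair each inventory row with the column names from fileSchema."""
    columns = [name.strip() for name in file_schema.split(",")]
    reader = csv.reader(io.StringIO(csv_text))
    # Skip blank lines; zip() stops at the shorter of columns/row.
    return [dict(zip(columns, row)) for row in reader if row]


def rows_missing_version_id(rows):
    """Return the keys of rows whose VersionId column is empty or absent."""
    return [row["Key"] for row in rows if not row.get("VersionId")]
```

Running `rows_missing_version_id(label_rows(FILE_SCHEMA, text))` on the unzipped file lists every key whose third column is empty, which here is every row.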
Clearly there is no version ID, and that is the culprit. How can I configure the inventory so that the VersionId field is not included in the manifest?
When I was reading about the inventory list, the documentation said that the Version ID field is not included if the list covers only the current version of objects: https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory.html
It blows my mind that we are using a built-in S3 feature for inventory, and a job created from that inventory, and then it fails because of something like this. Then there is an entire document about performing ETL on your inventory files, INSTEAD OF JUST FIXING THE PROBLEM WITH THE SERVICE. Just handle the case where version IDs are missing!
This is very disappointing. I expected more from AWS, which prides itself on the compatibility of its different services.
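For anyone who lands here with the same error: one possible workaround, short of a full ETL pipeline, is to give Batch Operations a plain CSV manifest that contains only the Bucket and Key columns instead of the inventory manifest.json. Batch Operations accepts CSV manifests with or without a VersionId column. The sketch below is a rough illustration, not a tested fix for this exact job. The function name `inventory_to_batch_manifest` is hypothetical, and it assumes the inventory keys are already URL-encoded in the form the CSV manifest expects, so they are passed through unchanged.

```python
import csv
import gzip
import io


def inventory_to_batch_manifest(inventory_gz: bytes) -> str:
    """Reduce a gzipped inventory CSV to 'Bucket,Key' rows.

    Assumption: Bucket and Key are the first two columns, as in the
    fileSchema shown in the question.
    """
    text = gzip.decompress(inventory_gz).decode("utf-8")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2:
            # Drop VersionId and every later column.
            writer.writerow(row[:2])
    return out.getvalue()
```

The resulting text can be uploaded as a .csv object and selected as a CSV manifest when creating the job. Uploading it and creating the job are left out here on purpose.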