Based on the AWS Glue documentation [1][5], the native Salesforce connector for AWS Glue does not provide a mechanism to retrieve the Salesforce Bulk API Job ID after job completion. This is a known limitation of the current connector implementation, not a configuration issue. The connector handles the bulk data operations internally and abstracts away the underlying Salesforce Bulk API details, including the Job ID. This means you cannot directly access the Salesforce Bulk API Job ID through the connector to call Salesforce's success/failure result APIs.
The AWS Glue Salesforce connector also has several other known limitations [5], including:
- Lack of support for job bookmarks
- The requirement to use Spark SQL rather than Salesforce's SOQL
These constraints are part of the current connector design. The AWS Glue service team is already aware of these limitations, and a feature request is already in place to address the Job ID limitation.
To work around this limitation, consider the following recommendations and apply the ones that fit your use case:
=> Enable "FAIL_ON_FIRST_ERROR": "true": In your current script this option is set to "false". Setting it to "true" [1] fails the Glue job as soon as the first error is encountered on the Salesforce side, which helps minimize partial writes and reduces the scope of unknown failures. However, please note that some records may still be written before the first error occurs, so partial data in Salesforce is still possible.
=> Error handling in AWS Glue: While the connector doesn't return the Salesforce Job ID, you can implement error handling within your Glue job to capture and log any failures that occur during the write operation. Monitor the job execution logs in Amazon CloudWatch [6] for any errors related to the Salesforce write operation.
=> Retrieve Job ID from logs and use Salesforce APIs externally: The Salesforce Bulk API V2 Job ID does appear in the execution logs. You can extract this Job ID from the logs and then call Salesforce's native Bulk API V2 endpoints to retrieve the results of the job. However, please note that this cannot be done within the Glue Spark script itself.
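As a rough illustration of the log-extraction step, here is a minimal Python sketch that scans log lines (for example, lines you have already exported from CloudWatch) for candidate Job IDs. It relies on assumptions that are not confirmed by the Glue documentation: the exact wording of the connector's log messages is unknown, so the sketch only looks for tokens that resemble a Salesforce job ID (a 15- or 18-character alphanumeric ID starting with the "750" key prefix). The function name `extract_job_ids` and the sample log lines are hypothetical.

```python
import re

# Assumption: Bulk API job IDs are Salesforce record IDs with the "750" key
# prefix, 15 or 18 alphanumeric characters long. Verify against your own logs.
JOB_ID_PATTERN = re.compile(r"\b750[A-Za-z0-9]{12}(?:[A-Za-z0-9]{3})?\b")


def extract_job_ids(log_lines):
    """Return candidate job IDs in first-seen order, without duplicates."""
    seen = {}
    for line in log_lines:
        for candidate in JOB_ID_PATTERN.findall(line):
            # dict preserves insertion order, so this de-duplicates in order
            seen.setdefault(candidate, None)
    return list(seen)


if __name__ == "__main__":
    # Hypothetical log lines; the real connector log format may differ.
    sample = [
        "INFO created ingest job 750xx0000000001AAA",
        "INFO job 750xx0000000001AAA state=JobComplete",
    ]
    print(extract_job_ids(sample))
```

Because the pattern matches on shape alone, treat the output as candidates and confirm each ID against Salesforce before relying on it.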
The following three Salesforce Bulk API V2 endpoints (available in API version 41.0 and later) can be used to retrieve results:
Get Job Failed Record Results [2]: GET /services/data/vXX.X/jobs/ingest/{jobId}/failedResults/ — Returns a CSV file containing all records that encountered an error during processing, including the error code and message (sf__Error) and the record ID (sf__Id).
Get Job Successful Record Results [3]: GET /services/data/vXX.X/jobs/ingest/{jobId}/successfulResults/ — Returns a CSV file containing all records that were successfully processed, including whether the record was created (sf__Created) and the record ID (sf__Id).
Get Job Unprocessed Record Results [4]: GET /services/data/vXX.X/jobs/ingest/{jobId}/unprocessedrecords/ — Returns a CSV file containing all records that were not processed by the job. This applies to jobs that were interrupted or otherwise failed to complete. Note that unprocessed records are not the same as failed records — failed records were processed but encountered an error during processing, while unprocessed records were not processed.
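For reference, calling these endpoints from outside the Glue job could look like the minimal Python sketch below, which uses only the standard library. Several things here are assumptions rather than details from your setup: the instance URL, the API version string, and the OAuth access token must come from your own Salesforce org, and the helper names (`result_url`, `parse_results`, `fetch_results`) are hypothetical. The sketch builds the endpoint URL, sends a GET request with a bearer token, and parses the returned CSV into dictionaries.

```python
import csv
import io
import urllib.request

# Path suffixes of the three Bulk API V2 result endpoints listed above.
RESULT_PATHS = {
    "failed": "failedResults/",
    "successful": "successfulResults/",
    "unprocessed": "unprocessedrecords/",
}


def result_url(instance_url, api_version, job_id, kind):
    """Build the result endpoint URL for one ingest job."""
    base = instance_url.rstrip("/")
    return f"{base}/services/data/v{api_version}/jobs/ingest/{job_id}/{RESULT_PATHS[kind]}"


def parse_results(csv_text):
    """Parse the CSV response body into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def fetch_results(instance_url, api_version, job_id, kind, access_token):
    """GET one result set. Assumes access_token is a valid OAuth token you obtained separately."""
    request = urllib.request.Request(
        result_url(instance_url, api_version, job_id, kind),
        headers={"Authorization": f"Bearer {access_token}", "Accept": "text/csv"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_results(response.read().decode("utf-8"))
```

A call such as `fetch_results(instance_url, "58.0", job_id, "failed", token)` would then return one dictionary per failed record, with the `sf__Error` and `sf__Id` columns described above available as keys. The version "58.0" is only a placeholder; use the API version your org supports.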
I hope the above information is useful and clarifies why the AWS Glue Salesforce connector does not return the Bulk API V2 Job ID.
Please note that there is already an existing feature request with the AWS Glue service team to support returning the Salesforce Bulk API V2 Job ID and response data through the native connector. As a Support Engineer, I cannot provide an ETA for when this will be implemented, but the service team is aware of this limitation.
I recommend monitoring the AWS blogs and announcements pages linked below for updates on new features and releases. [+] https://aws.amazon.com/new/ [+] https://aws.amazon.com/blogs/aws/
===Reference(s)===
[1] https://docs.aws.amazon.com/glue/latest/dg/salesforce-writing-to.html
[5] https://docs.aws.amazon.com/glue/latest/dg/salesforce-connector-limitations.html
