2 Answers
0
Thanks for the answer. So, in terms of maintainability it would be best to have one for each, but for cost saving parallel tasks would be better, right?
answered 7 months ago
0
Yes. If you call forEachLoop/await in a thread in your code, you can start multiple streaming queries in the same cluster (Glue streaming job), for instance with a ThreadPool and tasks if you are using PySpark.
This complicates monitoring, tuning, and operations in general, but it will save you significant cost.
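As a rough illustration of the thread-per-query pattern, here is a minimal sketch in plain Python. The stream names, `run_stream`, and `run_all` are hypothetical, and `run_stream` is only a stand-in for a blocking streaming query so that the example runs without Spark. In a real Glue job, the body of that function is where you would start the query and block on it.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical stream names, used only for illustration.
STREAM_NAMES = ["orders", "clicks", "payments"]


def run_stream(name, stop_event):
    """Stand-in for one blocking streaming query (hypothetical helper).

    In a real PySpark job this is where the query would be started and
    blocked on, for example with StreamingQuery.awaitTermination().
    """
    batches = 0
    while True:
        batches += 1  # pretend to process one micro-batch
        # Wait briefly; exit the loop once a stop has been requested.
        if stop_event.wait(0.01):
            break
    return name, batches


def run_all(names, run_seconds=0.1):
    """Run one stand-in query per name, each in its own thread."""
    stop_event = threading.Event()
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        futures = [pool.submit(run_stream, n, stop_event) for n in names]
        # Let the queries run side by side for a while, then stop them all.
        stop_event.wait(run_seconds)
        stop_event.set()
        return dict(f.result() for f in futures)


results = run_all(STREAM_NAMES)
```

Each query occupies one worker thread for its whole lifetime, so the pool should be at least as large as the number of queries. All queries share the same cluster resources, which is where the cost saving comes from and also why tuning gets harder.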
Correct, parallel tasks can be challenging if you don't have prior experience maintaining streams.