2023-12-05T09:10:52.553+00:00 ======== Autoscaler status: 2023-12-05 09:10:52.397427 ========
2023-12-05T09:10:52.553+00:00 Node status
2023-12-05T09:10:52.553+00:00 ---------------------------------------------------------------
2023-12-05T09:10:52.553+00:00 Healthy: 1 ray.head.default 4 ray.worker.default
2023-12-05T09:10:52.553+00:00 Pending: (no pending nodes)
2023-12-05T09:10:52.553+00:00 Recent failures: (no failures)
2023-12-05T09:10:52.553+00:00 Resources
2023-12-05T09:10:52.553+00:00 ---------------------------------------------------------------
2023-12-05T09:10:52.553+00:00 Usage: 40.0/40.0 CPU 0B/212.62GiB memory 562.14KiB/93.80GiB object_store_memory
2023-12-05T09:10:52.553+00:00 Demands: {'CPU': 1.0}: 312+ pending tasks/actors
2023-12-05T09:10:52.553+00:00 2023-12-05 09:10:52,418 INFO autoscaler.py:1370 -- StandardAutoscaler: Queue 5 new nodes for launch
2023-12-05T09:10:52.553+00:00 2023-12-05 09:10:52,418 INFO autoscaler.py:466 -- The autoscaler took 0.098 seconds to complete the update iteration.
2023-12-05T09:10:52.553+00:00 2023-12-05 09:10:52,418 INFO node_launcher.py:166 -- NodeLauncher1: Got 5 nodes to launch.
2023-12-05T09:10:52.553+00:00 2023-12-05 09:10:52,418 INFO monitor.py:429 -- :event_summary:Adding 5 node(s) of type ray.worker.default.
2023-12-05T09:10:52.553+00:00 2023-12-05 09:10:52,419 INFO manta_cluster_manager.py:89 -- Creating nodes with config {'ExecutorSizeInDpu': 1}, ...
2023-12-05T09:10:52.553+00:00 2023-12-05 09:10:52,498 WARNING manta_cluster_manager.py:128 -- Create node failed as compute resource limits were reached
2023-12-05T09:10:52.803+00:00 2023-12-05 09:10:52,499 WARNING manta_cluster_manager.py:128 -- Create node failed as compute resource limits were reached
2023-12-05T09:10:52.803+00:00 2023-12-05 09:10:52,581 WARNING manta_cluster_manager.py:128 -- Create node failed as compute resource limits were reached
2023-12-05T09:10:52.803+00:00 2023-12-05 09:10:52,583 WARNING manta_cluster_manager.py:128 -- Create node failed as compute resource limits were reached
2023-12-05T09:10:52.803+00:00 2023-12-05 09:10:52,583 INFO manta_cluster_manager.py:131 -- Successfully created 5 executors
Hi Bob, thanks for your feedback. It seems the job is hitting some account limits. We are checking. Thanks!
Godspeed, Henry! I've seen the same limits on two separate AWS accounts. I did wonder if there was some undocumented soft limit to stop new users accidentally deploying ALL the clusters, but there are no other Glue jobs running concurrently.
Hey @Bob, I have reviewed your account limits in the eu-west-1 Region. Could you please rerun the Glue for Ray jobs using the AWS account you used earlier? Thanks
Hi @henry, on which account? I have a multi-account setup. I have tested with both my sandbox account and our datalake account in my company's org. Both continue to hit the same resource limit. This Builder ID is linked to my personal AWS org. I haven't run Glue for Ray jobs on any other accounts, I don't think.
Is it possible to speak in a less async, or less public, way?
@henry, thanks for the fix. I just want you to know that you're a beautiful person.
Hi Bob. This has been fixed since 12/8 for your sandbox and datalake accounts in your company's org. We got the account IDs via AWS Support tickets. (On 12/4, the fix had not captured the correct scope, which is why you experienced the issue again.) We've provided more details via your account team. Thanks!