Does Systems Manager limit resource allocation?

Scenario:

I have a .bat script on a Windows machine with 32 cores. It does a bunch of context setup and then calls a Python script that runs 24 parallel threads (worker processes) via the multiprocessing library. Each thread moves some data around a network and then does some calculations in a 3rd-party program by calling win32com.client.

When I log in to the machine and run this script, via CMD or PowerShell, it does exactly what is expected: I see 24 Python instances spin up (in Task Manager) and eventually 24 instances of the 3rd-party software. CPU usage trends towards 100% for a little while. RAM goes towards 25-30 GB. Eventually, it finishes and everything looks good. (I've also tested with different instance sizes and different numbers of threads, etc. The point is that the code runs as expected.)
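For readers who want something concrete, the structure described above might look roughly like the sketch below. It is an illustration under assumptions, not the actual script: `PROG_ID`, `worker`, and `run_all` are hypothetical names, the data movement and calculation are omitted, and the explicit `pythoncom.CoInitialize()` call is an assumption about how each worker prepares COM.

```python
import multiprocessing as mp

try:
    import pythoncom  # part of pywin32; only available on Windows
    import win32com.client
except ImportError:
    pythoncom = None  # lets the sketch be imported where pywin32 is missing

PROG_ID = "ThirdParty.Application"  # hypothetical ProgID of the 3rd-party program


def worker(job_id):
    """Run one job: move data, then drive the 3rd-party program via COM."""
    if pythoncom is None:
        return (job_id, "skipped: pywin32 not installed")
    pythoncom.CoInitialize()  # assumption: COM is initialised explicitly per worker
    try:
        app = win32com.client.Dispatch(PROG_ID)
        # ... move data and run the calculation through `app` (omitted) ...
        return (job_id, "ok")
    except pythoncom.com_error as exc:
        # Report the COM failure instead of crashing the whole pool.
        return (job_id, f"COM error: {exc}")
    finally:
        pythoncom.CoUninitialize()


def run_all(n_workers=24):
    """Start n_workers processes and collect one status tuple per job."""
    with mp.Pool(processes=n_workers) as pool:
        return pool.map(worker, range(n_workers))
```

On Windows, `run_all()` needs to be called from under an `if __name__ == "__main__":` guard, because multiprocessing starts each worker by re-importing the main module.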

If I run the exact same script via an SSM "AWS-RunPowerShellScript" Run Command, I get different behavior: the context still gets set up and the Python code still runs. I still see 24 Python instances in Task Manager, but never more than 10-12 instances of the 3rd-party software. The other threads get errors like this:

  File "C:\Users\Administrator\Anaconda3\envs\python_3x\lib\site-packages\win32com\client\dynamic.py", line 86, in _GetGoodDispatch
    IDispatch = pythoncom.connect(IDispatch)
pywintypes.com_error: (-2147221008, 'CoInitialize has not been called.', None, None)

or this:

  File "C:\Users\Administrator\Anaconda3\envs\python_3x\lib\site-packages\win32com\client\dynamic.py", line 86, in _GetGoodDispatch
    IDispatch = pythoncom.connect(IDispatch)
pywintypes.com_error: (-2147221021, 'Operation unavailable', None, None)

or this:

  File "C:\Users\Administrator\Anaconda3\envs\python_3x\lib\site-packages\win32com\client\dynamic.py", line 368, in ApplyTypes
    result = self.oleobj.InvokeTypes(
pywintypes.com_error: (-2147352567, 'Exception occurred.', (0, 'thirdparty app', "Access violation at address 0000000000625F98 in module 'thirdparty.exe'. Read of address 0000000000000460", None, 0, -2147418113), None)

So what's different about running via Systems Manager? The user is different. The domain is different. There is not necessarily an active login (though I get the same behavior whether or not I have an active RDP window to the machine).

But none of those explain why some threads would be able to get a COM connection and some would not.
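One way to pin down the differences listed above would be to log the same snapshot from both kinds of run and compare them. The following is a minimal sketch, assuming the standard Windows environment variables `USERNAME`, `USERDOMAIN`, and `SESSIONNAME`; the function name `session_context` is hypothetical, and any variable that is not set simply comes back as `None`.

```python
import os


def session_context():
    """Collect details that may differ between an interactive run and an SSM run."""
    return {
        "user": os.environ.get("USERNAME"),
        "domain": os.environ.get("USERDOMAIN"),
        # Typically "Console" or "RDP-Tcp#N" in an interactive session;
        # may be unset when there is no interactive login.
        "session_name": os.environ.get("SESSIONNAME"),
        "cwd": os.getcwd(),
        "pid": os.getpid(),
    }


if __name__ == "__main__":
    for key, value in session_context().items():
        print(f"{key}: {value}")
```

Calling this at the top of each worker and writing the result to a log would show whether the failing and succeeding threads actually see the same context.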

Does anyone know anything that could be helpful here?

  • One clarification: If I wrap the call to win32com in a try/except, sleep a random number of seconds (to add some jitter in case it's a race condition), and then retry, subsequent calls do succeed, but only after other threads have finished first. It's as though there's a limit on the number of COM objects that the SSM user can open.

  • Another clarification: If I scale up the EC2 instance (64 CPUs, 128 CPUs), the behavior does not change. I always get 10-12 good threads before the COM errors start via SSM.
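The retry described in the first clarification can be sketched roughly as follows. This is a generic illustration rather than the code actually used: `call_with_retry` and its parameters are hypothetical, and `connect` stands for whatever zero-argument callable opens the COM connection (for example, a wrapper around the win32com.client call).

```python
import random
import time


def call_with_retry(connect, attempts=5, max_delay=10.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call connect(); on failure, sleep a random interval and try again.

    Re-raises the last error if every attempt fails.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Random delay so that workers do not all retry at once.
                sleep(random.uniform(0, max_delay))
    raise last_error
```

In the real script, `retry_on` would presumably be narrowed to `pywintypes.com_error` so that unrelated bugs are not silently retried.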

Zack
asked a year ago, 65 views