Tuning AWS Lambda for sklearn machine learning training

I am trying to build a Lambda and Step Functions pipeline for tuning the parameters of Python sklearn machine learning models. Several Lambda functions run in parallel, each with its own set of model parameters. Each Lambda function performs 5- or 10-fold cross-validation using its input parameters. The sklearn call uses cross_val_score and looks something like:

cv_results = cross_val_score(model, X_train, y_train, cv=kf, n_jobs=-1, scoring=scoring)
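For context, here is a minimal, self-contained sketch of what one such Lambda function might look like around that call. The handler name, the event keys ("params", "n_splits", "n_jobs"), the RandomForestClassifier model, the "accuracy" scoring, and the synthetic data are all assumptions made for illustration; they are not taken from the real pipeline.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score


def handler(event, context=None):
    """Hypothetical Lambda entry point: one parameter set in, CV scores out."""
    params = event.get("params", {})      # assumed event key
    n_splits = event.get("n_splits", 5)   # assumed event key
    n_jobs = event.get("n_jobs", -1)      # assumed event key

    # Synthetic stand-in for the real training data (assumption).
    X_train, y_train = make_classification(
        n_samples=500, n_features=20, random_state=0
    )

    model = RandomForestClassifier(random_state=0, **params)
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)

    # Time only the cross-validation so runs can be compared across machines.
    start = time.perf_counter()
    cv_results = cross_val_score(
        model, X_train, y_train, cv=kf, n_jobs=n_jobs, scoring="accuracy"
    )
    elapsed = time.perf_counter() - start

    return {
        "params": params,
        "scores": cv_results.tolist(),
        "mean_score": float(cv_results.mean()),
        "seconds": elapsed,
    }
```

Returning the elapsed time alongside the scores makes it possible to compare the same parameter set on the desktop and in Lambda directly from the Step Functions output.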

I expected parallelising the best-parameter search to reduce overall training time. To some extent it does. However, compared with my 4-core desktop, some individual Lambda functions take up to twice as long to run the same sklearn code. I have set the memory to the 10 GB maximum in the hope of getting the 6 cores documented by AWS, and I assumed n_jobs=-1 in sklearn would use all 6 of them. I tried the arm64 architecture, but it was slower than x86. I am deploying as a Docker container. I also tried a very simple Python function that multiplies large arrays, and it too took almost twice as long in Lambda as on my desktop.
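One way to narrow this down is to log how many CPUs the Python process can actually see inside the function, and to time a fixed single-threaded loop in both environments. The sketch below uses only the standard library. The function names visible_cpus and time_single_core are hypothetical, and the loop is only a rough per-core speed probe, not a proper benchmark.

```python
import os
import time


def visible_cpus():
    """Report the CPU counts the current Python process can see."""
    info = {"os_cpu_count": os.cpu_count()}
    # sched_getaffinity exists only on some platforms (e.g. Linux); it lists
    # the CPUs this process is allowed to run on, which can be fewer than
    # os.cpu_count() reports.
    if hasattr(os, "sched_getaffinity"):
        info["affinity_count"] = len(os.sched_getaffinity(0))
    return info


def time_single_core(n=2_000_000):
    """Time a fixed pure-Python loop as a rough single-core speed probe."""
    start = time.perf_counter()
    total = 0
    for i in range(n):
        total += i * i
    elapsed = time.perf_counter() - start
    return elapsed, total


if __name__ == "__main__":
    print(visible_cpus())
    seconds, _ = time_single_core()
    print(f"single-core loop: {seconds:.3f} s")
```

If the reported counts match on both machines but the single-core loop is still much slower in Lambda, the gap is more likely per-core speed than unused cores. If the counts are lower than expected, that would suggest n_jobs=-1 has fewer workers to use than assumed.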

My question is: why is it so slow, and what can I do to speed it up?

Asked 2 years ago · 85 views