Setting MKL_NUM_THREADS to be more than 16 for m5 instances

Hey, I have a 32-core EC2 Linux m5 instance. My Python is installed via Anaconda. I notice that NumPy cannot use more than 16 cores.

It looks like my NumPy uses libmkl_rt.so:

In [2]: np.show_config()
blas_mkl_info:                                                                                          
    libraries = ['mkl_rt', 'pthread']                                                                   
    library_dirs = ['/home/ec2-user/anaconda3/lib']                                                     
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/ec2-user/anaconda3/include']
blas_opt_info:                                                                                          
    libraries = ['mkl_rt', 'pthread']                                                                   
    library_dirs = ['/home/ec2-user/anaconda3/lib']                                                     
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/ec2-user/anaconda3/include']
lapack_mkl_info:                                                                                        
    libraries = ['mkl_rt', 'pthread']                                                                   
    library_dirs = ['/home/ec2-user/anaconda3/lib']                                                     
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/ec2-user/anaconda3/include']
lapack_opt_info:                                                                                        
    libraries = ['mkl_rt', 'pthread']                                                                   
    library_dirs = ['/home/ec2-user/anaconda3/lib']                                                     
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/ec2-user/anaconda3/include']

When I set MKL_NUM_THREADS below 16, it works:

(base) ec2-user@ip-172-31-18-3:~$ export MKL_NUM_THREADS=12 && python -c "import ctypes; mkl_rt = ctypes.CDLL('libmkl_rt.so'); print (mkl_rt.mkl_get_max_threads())"                                             
12

When I set it to 24, it caps at 16:

(base) ec2-user@ip-172-31-18-3:~$ export MKL_NUM_THREADS=24 && python -c "import ctypes; mkl_rt = ctypes.CDLL('libmkl_rt.so'); print (mkl_rt.mkl_get_max_threads())"                                             
16                                                                                                    

But I do have 32 cores:

In [2]: os.cpu_count()
Out[2]: 32

Are there any other settings I need to check?

Thanks, Bill

3 Answers
Accepted Answer

If you're using an m5.8xlarge, it has 32 vCPUs, but that is with hyperthreading enabled by default, which corresponds to 16 physical cores. I've seen some discussion (https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-threading-behavior-on-Hyper-Threading-systems/td-p/896702) suggesting that MKL caps out at the number of physical cores when hyperthreading is enabled, unless you set MKL_DYNAMIC=FALSE.
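
One way to check the vCPU versus physical core difference on a Linux instance is to count the distinct (physical id, core id) pairs in /proc/cpuinfo and compare the result with os.cpu_count(). The sketch below is only an illustration: the helper name count_physical_cores is made up here, and it assumes the usual x86 /proc/cpuinfo layout, where each processor block lists "physical id" before "core id".

```python
def count_physical_cores(cpuinfo_text):
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo-style text.

    Assumption: x86 layout, one block per logical CPU, with a
    "physical id" line appearing before the "core id" line.
    """
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "processor":
            # A new logical CPU block starts; forget the previous socket id.
            physical_id = None
        elif key == "physical id":
            physical_id = value.strip()
        elif key == "core id":
            cores.add((physical_id, value.strip()))
    return len(cores)


def read_physical_cores(path="/proc/cpuinfo"):
    """Read the given file (Linux only) and count physical cores."""
    with open(path) as f:
        return count_physical_cores(f.read())
```

If read_physical_cores() returns half of os.cpu_count(), hyperthreading is on and each physical core is exposed as two vCPUs.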

answered 2 months ago
  • This is it! Exporting MKL_DYNAMIC=FALSE and setting MKL_NUM_THREADS did the magic for me.
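
For anyone who prefers to apply this from inside Python rather than the shell, here is a minimal sketch. It assumes the variables have to be in the environment before MKL is first loaded (so before importing numpy), and that the runtime library can be found as libmkl_rt.so, as in the question; newer installs may use a versioned file name. The helper names mkl_env and mkl_max_threads are hypothetical.

```python
import ctypes
import os


def mkl_env(num_threads):
    """Build the two environment settings discussed in this thread."""
    return {"MKL_DYNAMIC": "FALSE", "MKL_NUM_THREADS": str(num_threads)}


def mkl_max_threads(lib_name="libmkl_rt.so"):
    """Return mkl_get_max_threads(), or None if the library cannot be loaded."""
    try:
        mkl_rt = ctypes.CDLL(lib_name)
    except OSError:
        # MKL is not installed or not on the loader path.
        return None
    return mkl_rt.mkl_get_max_threads()


if __name__ == "__main__":
    # Set the variables first, then load MKL and ask what it will use.
    os.environ.update(mkl_env(24))
    print(mkl_max_threads())
```

The shell equivalent is the one from the question with one extra export: export MKL_DYNAMIC=FALSE MKL_NUM_THREADS=24.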

FYI, I suspect that MKL needs to be updated, since I changed my m5 instance from 16 cores to 32 cores. I'm not sure whether this would help. I am looking at this link: https://docs.anaconda.com/mkl-optimizations/index.html

answered 3 months ago

No luck. I plan to create a new m5 instance with 32 cores and set up MKL there.

answered 3 months ago
