Setting MKL_NUM_THREADS to be more than 16 for m5 instances

0

Hey, I have a 32-core EC2 Linux m5 instance. Python is installed via Anaconda. I notice that NumPy cannot use more than 16 cores.

It looks like my NumPy uses libmkl_rt.so:

In [2]: np.show_config()
blas_mkl_info:                                                                                          
    libraries = ['mkl_rt', 'pthread']                                                                   
    library_dirs = ['/home/ec2-user/anaconda3/lib']                                                     
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/ec2-user/anaconda3/include']
blas_opt_info:                                                                                          
    libraries = ['mkl_rt', 'pthread']                                                                   
    library_dirs = ['/home/ec2-user/anaconda3/lib']                                                     
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/ec2-user/anaconda3/include']
lapack_mkl_info:                                                                                        
    libraries = ['mkl_rt', 'pthread']                                                                   
    library_dirs = ['/home/ec2-user/anaconda3/lib']                                                     
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/ec2-user/anaconda3/include']
lapack_opt_info:                                                                                        
    libraries = ['mkl_rt', 'pthread']                                                                   
    library_dirs = ['/home/ec2-user/anaconda3/lib']                                                     
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/ec2-user/anaconda3/include']

When I set MKL_NUM_THREADS below 16, it works:

(base) ec2-user@ip-172-31-18-3:~$ export MKL_NUM_THREADS=12 && python -c "import ctypes; mkl_rt = ctypes.CDLL('libmkl_rt.so'); print (mkl_rt.mkl_get_max_threads())"                                             
12

When I set it to 24, it stops at 16:

(base) ec2-user@ip-172-31-18-3:~$ export MKL_NUM_THREADS=24 && python -c "import ctypes; mkl_rt = ctypes.CDLL('libmkl_rt.so'); print (mkl_rt.mkl_get_max_threads())"                                             
16                                                                                                    

But I do have 32 cores:

In [2]: os.cpu_count()
Out[2]: 32

Are there any other settings I need to check?

Thanks, Bill

3 Answers
1
Accepted Answer

If you're using an m5.8xlarge, it has 32 vCPUs, but that is with hyperthreading enabled by default, which corresponds to 16 physical cores. I've seen some discussion (https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-threading-behavior-on-Hyper-Threading-systems/td-p/896702) suggesting that MKL caps the thread count at the number of physical cores when hyperthreading is enabled, unless you set MKL_DYNAMIC=FALSE.
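To check the vCPU versus physical-core distinction on a given instance, here is a minimal sketch that assumes a Linux system exposing CPU topology under /sys/devices/system/cpu. The helper name count_physical_cores is hypothetical, not part of any library. It groups logical CPUs by their hyperthread sibling list and counts the distinct groups.

```python
import glob
import os


def count_physical_cores(sysfs_root="/sys/devices/system/cpu"):
    """Count physical cores by grouping logical CPUs that share a core.

    Hypothetical helper for illustration. Returns None when the topology
    files are unavailable (for example, on a non-Linux system).
    """
    pattern = os.path.join(
        sysfs_root, "cpu[0-9]*", "topology", "thread_siblings_list"
    )
    sibling_sets = set()
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                # Hyperthreads on the same core report the same sibling
                # list (for example "0,16"), so distinct values = cores.
                sibling_sets.add(f.read().strip())
        except OSError:
            continue
    return len(sibling_sets) or None


logical = os.cpu_count()            # logical CPUs (vCPUs)
physical = count_physical_cores()   # distinct physical cores, or None
print(f"logical CPUs: {logical}, physical cores: {physical}")
```

If the physical count comes out as half of os.cpu_count(), hyperthreading is on, which would be consistent with the 16-thread cap described above.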

AWS
answered 2 years ago
  • This is it! Exporting MKL_DYNAMIC=FALSE and setting MKL_NUM_THREADS did the magic for me.
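For reference, the same fix can be applied from inside Python rather than the shell. This is a sketch under a few assumptions: the variables must be set before MKL (or numpy) is first loaded in the process, the library is reachable as libmkl_rt.so as in the question (newer MKL builds may use a versioned file name), and the helper names mkl_thread_env and mkl_max_threads are hypothetical.

```python
import ctypes
import os


def mkl_thread_env(num_threads, dynamic=False):
    """Build the MKL threading variables as a dict of strings."""
    return {
        "MKL_NUM_THREADS": str(num_threads),
        "MKL_DYNAMIC": "TRUE" if dynamic else "FALSE",
    }


def mkl_max_threads(library="libmkl_rt.so"):
    """Return mkl_get_max_threads(), or None if MKL cannot be loaded."""
    try:
        mkl_rt = ctypes.CDLL(library)
        return mkl_rt.mkl_get_max_threads()
    except (OSError, AttributeError):
        # Library not found on the loader path, or symbol missing.
        return None


# Assumption: nothing in this process has loaded MKL yet, so the
# library reads these values when it initializes.
os.environ.update(mkl_thread_env(24))
print("MKL max threads:", mkl_max_threads())
```

The shell equivalent is export MKL_DYNAMIC=FALSE together with export MKL_NUM_THREADS=24 before starting Python. Whether more threads than physical cores actually speeds up a given workload is a separate question worth benchmarking.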

0

FYI, I suspect that MKL needs to be updated, since I upgraded my m5 instance from 16 cores to 32 cores. Not sure if this will help. I am looking at this link: https://docs.anaconda.com/mkl-optimizations/index.html

answered 2 years ago
0

No luck. I plan to create a new m5 instance with 32 cores and set up MKL there.

answered 2 years ago
