Questions tagged with Machine Learning & AI

What is a practical Inferentia limit to model size?

I am trying to test a model compiled for Inferentia on an `inf1.2xlarge`, but when loading the model I receive the following error messages:

```
2022-Sep-15 22:10:01.0152 3802:3802 ERROR TDRV:dmem_alloc Failed to alloc DEVICE memory: 1073741824
2022-Sep-15 22:10:01.0152 3802:3802 ERROR TDRV:dma_ring_alloc Failed to allocate TX ring
2022-Sep-15 22:10:01.0172 3802:3802 ERROR TDRV:io_create_rings Failed to allocate io ring for queue qPoolOut0_0
2022-Sep-15 22:10:01.0172 3802:3802 ERROR TDRV:kbl_model_add create_io_rings() error
2022-Sep-15 22:10:01.0182 3802:3802 ERROR NMGR:dlr_kelf_stage Failed to load subgraph
2022-Sep-15 22:10:01.0182 3802:3802 ERROR NMGR:stage_kelf_models Failed to stage graph: kelf-a.json to NeuronCore
2022-Sep-15 22:10:01.0184 3802:3802 ERROR NMGR:kmgr_load_nn_post_metrics Failed to load NN:, err: 4
```

These are wrapped into a Python runtime exception:

```
RuntimeError: Could not load the model status=4 message=Allocation Failure
```

I presume that this is because the model is on the large side. The `.neff` file is 373MB and takes ~4 hours to compile for a batch size of 1.

This particular model is compiled for a single Neuron core. I am now trying to compile with `--neuroncore-pipeline-cores 4` to spread the model across multiple cores. However, this gives me the following log message:

```
INFO: The requested number of neuroncore-pipeline-cores (4) may not be suitable for this network, and may lead to sub-optimal performance. Recommended neuroncore-pipeline-cores for this network is 1.
```

(I can't find any technical details on how much memory an Inferentia chip has, although I'm guessing that, due to the Inferentia architecture, "memory" is not used in the same way as it might be on a CPU or GPU.)

So, what is a practical size limit for an Inferentia model, and what can I do about running this model on Inf1?
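For reference, the size of the failed request can be read straight from the first error line. The snippet below is only a rough sketch for doing that: the helper name `failed_alloc_bytes` is my own and not part of any Neuron tooling, and I am assuming the number in the log is a byte count. It shows how large the failed allocation was, not how much device memory is available.

```python
import re

# First error line, copied from the runtime log in the question.
LOG_LINE = (
    "2022-Sep-15 22:10:01.0152 3802:3802 ERROR TDRV:dmem_alloc "
    "Failed to alloc DEVICE memory: 1073741824"
)


def failed_alloc_bytes(line):
    """Return the size from a 'Failed to alloc DEVICE memory' line, or None.

    Hypothetical helper for illustration; assumes the trailing number is bytes.
    """
    match = re.search(r"Failed to alloc DEVICE memory:\s*(\d+)", line)
    return int(match.group(1)) if match else None


requested = failed_alloc_bytes(LOG_LINE)
requested_gib = requested / 2**30  # convert bytes to GiB
print(f"failed allocation: {requested} bytes = {requested_gib:.2f} GiB")
```

If the number is indeed bytes, the single allocation that fails is exactly 1 GiB, which is noticeably larger than the 373MB `.neff` file itself.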
asked 2 months ago