Issue with loading neuron model

I am trying to load a Neuron-compiled model generated as described in https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html . I am still a newbie, so please excuse my mistakes. This is my code for loading the Neuron-compiled model. It is almost entirely based on the code on the page referred to above.

from transformers import pipeline
import tensorflow as tf
import tensorflow.neuron as tfn

class TFBertForSequenceClassificationDictIO(tf.keras.Model):
  def __init__(self, model_wrapped):
    super().__init__()
    self.model_wrapped = model_wrapped
    self.aws_neuron_function = model_wrapped.aws_neuron_function
  def call(self, inputs):
    input_ids = inputs['input_ids']
    attention_mask = inputs['attention_mask']
    logits = self.model_wrapped([input_ids, attention_mask])
    return [logits]

class TFBertForSequenceClassificationFlatIO(tf.keras.Model):
  def __init__(self, model):
    super().__init__()
    self.model = model
  def call(self, inputs):
    input_ids, attention_mask = inputs
    output = self.model({'input_ids': input_ids, 'attention_mask': attention_mask})
    return output['logits']

string_inputs = [
  'I love to eat pizza!',
  'I am sorry. I really want to like it, but I just can not stand sushi.',
  'I really do not want to type out 128 strings to create batch 128 data.',
  'Ah! Multiplying this list by 32 would be a great solution!',
]
string_inputs = string_inputs * 32
model_name = 'distilbert-base-uncased-finetuned-sst-2-english'

neuron_pipe = pipeline('sentiment-analysis', model=model_name, framework='tf')
example_inputs = neuron_pipe.tokenizer(string_inputs)
pipe = pipeline('sentiment-analysis', model=model_name, framework='tf')
reloaded_model = tf.keras.models.load_model('./distilbert_b128_2')
model_wrapped = TFBertForSequenceClassificationFlatIO(pipe.model)
example_inputs_list = [example_inputs['input_ids'], example_inputs['attention_mask']]
model_wrapped_traced = tfn.trace(model_wrapped, example_inputs_list)
rewrapped_model = TFBertForSequenceClassificationDictIO(model_wrapped_traced)

This is the stack trace:

2022-11-05 02:46:55.553817: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.

All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.
Some layers from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing TFDistilBertForSequenceClassification: ['dropout_19']

  • This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).

  • This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english and are newly initialized: ['dropout_39']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
WARNING:tensorflow:No training configuration found in save file, so the model was not compiled. Compile it manually.
Traceback (most recent call last):
  File "inferencesmall.py", line 40, in <module>
    model_wrapped_traced = tfn.trace(model_wrapped, example_inputs_list)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow_neuron/python/_trace.py", line 167, in trace
    func = func.get_concrete_function(*example_inputs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1264, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1244, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 786, in _initialize
    *args, **kwds))
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2983, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3292, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3140, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1161, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 677, in wrapped_fn
    out = weak_wrapped_fn().wrapped(*args, **kwds)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
tensorflow.python.autograph.impl.api.StagingError: in user code:

    File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/keras/engine/base_layer.py", line 987, in error_handler  *
        return fn(*args, **kwargs)

    StagingError: Exception encountered when calling layer "tf_bert_for_sequence_classification_flat_io" (type TFBertForSequenceClassificationFlatIO).

    in user code:

      File "inferencesmall.py", line 22, in call  *
          output = self.model({'input_ids': input_ids, 'attention_mask': attention_mask})
      File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
          raise e.with_traceback(filtered_tb) from None
    
      StagingError: Exception encountered when calling layer "tf_distil_bert_for_sequence_classification_1" (type TFDistilBertForSequenceClassification).
      
      in user code:
      
          File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/transformers/models/distilbert/modeling_tf_distilbert.py", line 798, in call  *
              distilbert_output = self.distilbert(
          File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
              raise e.with_traceback(filtered_tb) from None
      
          StagingError: Exception encountered when calling layer "distilbert" (type TFDistilBertMainLayer).
          
          in user code:
          
              File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/transformers/models/distilbert/modeling_tf_distilbert.py", line 423, in call  *
                  inputs = input_processing(
              File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/transformers/modeling_tf_utils.py", line 372, in input_processing  *
                  output[parameter_names[i]] = input
          
              IndexError: list index out of range
    

Thanks in advance,
Ajay

asked a month ago, 73 views
1 Answer

Hi Ajay,

Thank you for your interest in Neuron. This response assumes that you followed the tutorial you linked, compiled the Neuron model in the underlying HuggingFace pipeline, and saved it to disk, and that the code in your post is trying to reload that compiled model and run inference on it.

If you followed the tutorial then these two lines of code were executed:

#compile the wrapped model and save it to disk
model_wrapped_traced = tfn.trace(model_wrapped, example_inputs_list)
model_wrapped_traced.save('./distilbert_b128')

This saves an already traced (compiled) model to disk. The saved model accepts its input as a list (remember, this is required in order to use the neuron_model.save() function), whereas HuggingFace tokenizers take string inputs and convert them into dictionaries. In your code, it looks like you reload the model with reloaded_model = tf.keras.models.load_model('./distilbert_b128_2'). If this model was already compiled and then saved (see the two lines of code above), it does not need to be recompiled. However, it looks like you then attempt to compile another model with this line: model_wrapped_traced = tfn.trace(model_wrapped, example_inputs_list). This isn't necessary, as you already have a compiled Neuron model saved on disk. One of the main benefits of saving Neuron models is avoiding the time-consuming recompilation step.
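To make the load-versus-recompile point concrete, here is a minimal sketch of the pattern. It is not code from the tutorial: load_or_compile, load_fn, and compile_fn are hypothetical names, and the sketch assumes the compiled model object has a save(path) method that writes a directory, as the traced model above does.

```python
import os

def load_or_compile(saved_dir, load_fn, compile_fn):
    """Load a saved model if saved_dir exists; otherwise compile and save one.

    load_fn and compile_fn are hypothetical callables supplied by the caller.
    """
    if os.path.isdir(saved_dir):
        # A compiled model is already on disk: reuse it, no tracing needed.
        return load_fn(saved_dir), 'loaded'
    model = compile_fn()   # the slow step, run only when nothing is saved yet
    model.save(saved_dir)  # later runs take the branch above instead
    return model, 'compiled'

# Possible use in your script (assumption, not executed here):
# model, how = load_or_compile(
#     './distilbert_b128_2',
#     tf.keras.models.load_model,
#     lambda: tfn.trace(model_wrapped, example_inputs_list),
# )
```

With something like this, tfn.trace would only run on the first execution, and every later run would go straight to the load branch.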

Try something like this:

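#Imports and model name, as in the question's script
from transformers import pipeline
import tensorflow as tf
import tensorflow.neuron as tfn

model_name = 'distilbert-base-uncased-finetuned-sst-2-english'

#First we define a wrapper class that wraps a model that accepts list inputs
# (what we needed in order to originally save the model)
# into a model that accepts dictionary inputs
# (what the huggingface tokenizer outputs)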
class TFBertForSequenceClassificationDictIO(tf.keras.Model):
  def __init__(self, model_wrapped):
    super().__init__()
    self.model_wrapped = model_wrapped
    self.aws_neuron_function = model_wrapped.aws_neuron_function
  def call(self, inputs):
    input_ids = inputs['input_ids']
    attention_mask = inputs['attention_mask']
    logits = self.model_wrapped([input_ids, attention_mask])
    return [logits]

#Since we are in a new process we have to recreate our neuron pipeline
# with its special tokenizer that makes sure all the inputs have the same shape
neuron_pipe = pipeline('sentiment-analysis', model=model_name, framework='tf')
#the first step is to modify the underlying tokenizer to create a static
#input shape as inferentia does not work with dynamic input shapes
original_tokenizer = neuron_pipe.tokenizer

#you intercept the function call to the original tokenizer
#and inject our own code to modify the arguments
def wrapper_function(*args, **kwargs):
  kwargs['padding'] = 'max_length'
  #this is the key line here to set a static input shape
  #so that all inputs are set to a len of 128
  kwargs['max_length'] = 128
  kwargs['truncation'] = True
  kwargs['return_tensors'] = 'tf'
  return original_tokenizer(*args, **kwargs)

#insert our wrapper function as the new tokenizer as well
#as reinserting back some attribute information that was lost
#when you replaced the original tokenizer with our wrapper function
neuron_pipe.tokenizer = wrapper_function
neuron_pipe.tokenizer.decode = original_tokenizer.decode
neuron_pipe.tokenizer.mask_token_id = original_tokenizer.mask_token_id
neuron_pipe.tokenizer.pad_token_id = original_tokenizer.pad_token_id
neuron_pipe.tokenizer.convert_ids_to_tokens = original_tokenizer.convert_ids_to_tokens

#now wrap your neuron model that you reloaded with the wrapper class
#(no tfn.trace call here: the reloaded model is already compiled)
reloaded_model = tf.keras.models.load_model('./distilbert_b128_2')
rewrapped_model = TFBertForSequenceClassificationDictIO(reloaded_model)

#now you can reinsert our reloaded model back into our pipeline,
#keeping the config of the model the pipeline was created with
original_config = neuron_pipe.model.config
neuron_pipe.model = rewrapped_model
neuron_pipe.model.config = original_config

#Run inference on your reloaded model (that you didn't have to recompile)
#Our example data!
string_inputs = [
  'I love to eat pizza!',
  'I am sorry. I really want to like it, but I just can not stand sushi.',
  'I really do not want to type out 128 strings to create batch 128 data.',
  'Ah! Multiplying this list by 32 would be a great solution!',
]
string_inputs = string_inputs * 32
outputs = neuron_pipe(string_inputs)
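
If the tokenizer wrapper above is unclear, the following stand-alone sketch shows the same idea without HuggingFace or Neuron installed. fake_tokenizer and make_static_tokenizer are hypothetical names invented for this illustration, and fake_tokenizer only imitates the padding and truncation behaviour that matters here; it is not the real tokenizer.

```python
def fake_tokenizer(texts, padding=False, max_length=None, truncation=False):
    """Hypothetical stand-in for a tokenizer: one integer id per word."""
    rows = [[len(word) for word in text.split()] for text in texts]
    if truncation and max_length is not None:
        rows = [row[:max_length] for row in rows]
    if padding == 'max_length' and max_length is not None:
        rows = [row + [0] * (max_length - len(row)) for row in rows]
    return {'input_ids': rows}

def make_static_tokenizer(tokenizer, max_length):
    """Wrap tokenizer so every call pads and truncates to max_length."""
    def wrapper(*args, **kwargs):
        # Override whatever the caller passed so the shape is always fixed.
        kwargs['padding'] = 'max_length'
        kwargs['max_length'] = max_length
        kwargs['truncation'] = True
        return tokenizer(*args, **kwargs)
    return wrapper

static_tokenizer = make_static_tokenizer(fake_tokenizer, max_length=8)
batch = static_tokenizer(['I love to eat pizza!', 'Ah!'])
lengths = [len(row) for row in batch['input_ids']]
print(lengths)  # [8, 8]
```

Every row comes out with the same length no matter how long the input string is, which is the static input shape the compiled model expects.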
answered 25 days ago
  • Couldn't get it to work. After filling in the gaps I got this:

    StagingError: Exception encountered when calling layer "tf_bert_for_sequence_classification_flat_io" (type TFBertForSequenceClassificationFlatIO).

    in user code:

        File "answer.py", line 15, in call  *
            output = self.model({'input_ids': input_ids, 'attention_mask': attention_mask})
        File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
            raise e.with_traceback(filtered_tb) from None

        StagingError: Exception encountered when calling layer "tf_distil_bert_for_sequence_classification" (type TFDistilBertForSequenceClassification).

        in user code:

            File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/transformers/models/distilbert/modeling_tf_distilbert.py", line 798, in call  *
                distilbert_output = self.distilbert(
            File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
                raise e.with_traceback(filtered_tb) from None

            StagingError: Exception encountered when calling layer "distilbert" (type TFDistilBertMainLayer).
    
  • Could you please share the complete code with me? I think I might have made more mistakes when I added the code for the gaps in your answer.

  • Also, I need to Neuron-compile a Google BERT-large model with sequence length 512. Can I do this without referring to the HF code?
