Multi-model, Multi-container and Variants - what are the possible combinations?

This question is mostly for educational purposes, but the current SageMaker documentation does not describe whether these things are allowed or not.

Let's suppose I have:

  • an XGBoost_model_1 (which needs an XGBoost container)
  • a KMeans_model_1 and a KMeans_model_2 (both require a KMeans container)

1. Here's the first question - can I do the following:

  • create a Model with InferenceExecutionConfig.Mode=Direct and specify two containers (XGBoost, and KMeans with Mode: MultiModel)
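
To make the idea concrete, here is a rough sketch of the create_model request I have in mind. It only builds the keyword arguments and sends nothing. The model name, role ARN, image URIs and S3 locations are placeholders I made up for illustration; the field names are the ones boto3's create_model accepts as far as I know. Whether SageMaker accepts this combination is exactly what I'm asking.

```python
def build_direct_model_request(role_arn, xgb_image, xgb_model_url,
                               kmeans_image, kmeans_prefix):
    """Build kwargs for sagemaker.create_model (nothing is sent here).

    All argument values are hypothetical placeholders supplied by the caller.
    """
    return {
        "ModelName": "xgb-kmeans-direct",  # hypothetical name
        "ExecutionRoleArn": role_arn,
        # Direct mode: the client picks the container on each invocation.
        "InferenceExecutionConfig": {"Mode": "Direct"},
        "Containers": [
            {
                "ContainerHostname": "XGBoost",
                "Image": xgb_image,
                "ModelDataUrl": xgb_model_url,  # a single model archive
                "Mode": "SingleModel",
            },
            {
                "ContainerHostname": "KMeans",
                "Image": kmeans_image,
                "ModelDataUrl": kmeans_prefix,  # S3 prefix with several archives
                "Mode": "MultiModel",
            },
        ],
    }
```

The request would then be passed along as boto3.client("sagemaker").create_model(**request).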

That would enable the client:

  • to call invoke_endpoint(TargetContainer="XGBoost") to access the XGBoost_model_1
  • to call invoke_endpoint(TargetContainer="KMeans", TargetModel="KMeans_model_1") to access the KMeans_model_1
  • to call invoke_endpoint(TargetContainer="KMeans", TargetModel="KMeans_model_2") to access the KMeans_model_2
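
Spelled out against boto3, the three calls above would look roughly like the sketch below. Note that TargetContainer is my shorthand: to my knowledge the sagemaker-runtime invoke_endpoint parameter is actually named TargetContainerHostname, and TargetModel normally takes the archive path relative to the S3 prefix (for example "KMeans_model_1.tar.gz"). The endpoint name, payload and content type are placeholders, and the helper itself is hypothetical.

```python
def build_invoke_kwargs(endpoint_name, body, container=None, model=None,
                        variant=None, content_type="text/csv"):
    """Build kwargs for sagemaker-runtime invoke_endpoint (nothing is sent)."""
    kwargs = {
        "EndpointName": endpoint_name,
        "Body": body,
        "ContentType": content_type,
    }
    # Optional routing hints are only included when the caller sets them.
    if variant is not None:
        kwargs["TargetVariant"] = variant
    if container is not None:
        kwargs["TargetContainerHostname"] = container
    if model is not None:
        kwargs["TargetModel"] = model
    return kwargs


# The three calls from the list above, against a hypothetical endpoint.
xgb_call = build_invoke_kwargs("demo-endpoint", b"1,2,3", container="XGBoost")
km1_call = build_invoke_kwargs("demo-endpoint", b"1,2,3", container="KMeans",
                               model="KMeans_model_1.tar.gz")
km2_call = build_invoke_kwargs("demo-endpoint", b"1,2,3", container="KMeans",
                               model="KMeans_model_2.tar.gz")
```

Each dictionary would be passed as boto3.client("sagemaker-runtime").invoke_endpoint(**kwargs).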

The documentation doesn't say clearly whether Multi-Model containers can be combined with a Multi-container endpoint.

2. The second question - how does the above idea work with ProductionVariants? Can I create something like this:

  • Variant1 with XGBoost serving XGBoost_model_1, with a weight of 0.5
  • Variant2 with a Multi-container model holding both XGBoost and KMeans (the latter in MultiModel mode), with a weight of 0.5
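
As a sketch, the endpoint configuration for these two variants might be requested as shown below. Again this only builds the arguments for create_endpoint_config. The config name, model names and instance type are placeholders of mine, and I'm assuming both models were already created with create_model.

```python
def build_two_variant_config(config_name, xgb_model_name, combo_model_name,
                             instance_type="ml.m5.large"):
    """Build kwargs for sagemaker.create_endpoint_config (nothing is sent)."""

    def variant(name, model_name):
        # One production variant taking half of the traffic.
        return {
            "VariantName": name,
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": 0.5,
        }

    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            variant("Variant1", xgb_model_name),    # plain XGBoost model
            variant("Variant2", combo_model_name),  # multi-container model
        ],
    }
```

The result would be passed as boto3.client("sagemaker").create_endpoint_config(**request).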

So that the client could:

  • call invoke_endpoint(TargetVariant="Variant2", TargetContainer="KMeans", TargetModel="KMeans_model_1") to access the KMeans_model_1
  • call invoke_endpoint(TargetVariant="Variant2", TargetContainer="KMeans", TargetModel="KMeans_model_2") to access the KMeans_model_2
  • call invoke_endpoint(TargetVariant="Variant1") to access the XGBoost_model_1
  • call invoke_endpoint(TargetVariant="Variant2", TargetContainer="XGBoost") to access the XGBoost_model_1

Is that combination even possible?

If so, what happens when the client calls invoke_endpoint without specifying the variant? For example:

  • would invoke_endpoint(TargetContainer="KMeans", TargetModel="KMeans_model_2") fail 50% of the time? If it hits the right variant, it works just fine; if it hits the wrong one, it would most likely result in a 400/500 error ("incorrect payload").
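
The "50%" figure comes from how variant weights turn into traffic shares: as far as I understand it, each variant receives its weight divided by the sum of all weights. A tiny sketch of that arithmetic, using a hypothetical helper that is not part of any AWS SDK:

```python
def traffic_share(weights):
    """Map each variant name to its expected fraction of untargeted requests.

    Assumes the share is weight / sum(weights), which is my reading of how
    SageMaker splits traffic across production variants.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


shares = traffic_share({"Variant1": 0.5, "Variant2": 0.5})
# Requests meant for KMeans that land on Variant1 are the ones expected to fail.
expected_failure_rate = shares["Variant1"]
```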

asked 2 years ago · 711 views
1 Answer
Well, I've checked that myself.

Turns out NONE of these combinations are possible. :)

  1. Multi-model + Multi-container is NOT possible
  2. Variants + Multi-container is NOT possible
  3. Variants + Multi-model is NOT possible

In all cases, you get a corresponding error when calling create_endpoint_config:

  1. Direct InferenceExecutionMode is not supported when a Container uses MultiModel mode.
  2. Multiple ProductionVariants is currently not supported when a Model uses a Direct InferenceExecutionMode.
  3. MultiModel mode is not supported with the current model specification.
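
If you want to catch these cases before calling the API, the three restrictions can be written down as a small local check. This is only a sketch of what I observed, not an official validation: the helper and the shape of its input are made up, and I pair each message with the combination its wording points to.

```python
DIRECT_WITH_MULTIMODEL = ("Direct InferenceExecutionMode is not supported "
                          "when a Container uses MultiModel mode.")
VARIANTS_WITH_DIRECT = ("Multiple ProductionVariants is currently not supported "
                        "when a Model uses a Direct InferenceExecutionMode.")
VARIANTS_WITH_MULTIMODEL = ("MultiModel mode is not supported "
                            "with the current model specification.")


def preflight_errors(models, variant_model_names):
    """Return the error messages a configuration would be expected to trigger.

    models: {model_name: {"execution_mode": "Direct" or None,
                          "container_modes": ["SingleModel", "MultiModel", ...]}}
    variant_model_names: one model name per production variant.
    """
    used = [models[name] for name in variant_model_names]

    def is_direct(m):
        return m.get("execution_mode") == "Direct"

    def has_multimodel(m):
        return "MultiModel" in m.get("container_modes", [])

    multiple_variants = len(used) > 1
    errors = []
    if any(is_direct(m) and has_multimodel(m) for m in used):
        errors.append(DIRECT_WITH_MULTIMODEL)
    if multiple_variants and any(is_direct(m) for m in used):
        errors.append(VARIANTS_WITH_DIRECT)
    if multiple_variants and any(has_multimodel(m) for m in used):
        errors.append(VARIANTS_WITH_MULTIMODEL)
    return errors
```

For the setup from the second question (one plain XGBoost variant plus one Direct multi-container variant with a MultiModel KMeans container), this check flags all three restrictions at once.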
answered 2 years ago