Multi-model, Multi-container and Variants - what are the possible combinations?

This question is mostly for educational purposes, but the current SageMaker documentation does not describe whether these things are allowed or not.

Let's suppose I have:

  • an XGBoost_model_1 (that needs an XGBoost container)
  • a KMeans_model_1 and a KMeans_model_2 (both require a KMeans container)

1. Here's the first question - can I do the following:

  • create a Model with InferenceExecutionConfig.Mode=Direct and specify two containers (XGBoost, and KMeans with Mode: MultiModel)

That would enable the client:

  • to call invoke_endpoint(TargetContainer="XGBoost") to access the XGBoost_model_1
  • to call invoke_endpoint(TargetContainer="KMeans", TargetModel="KMeans_model_1") to access the KMeans_model_1
  • to call invoke_endpoint(TargetContainer="KMeans", TargetModel="KMeans_model_2") to access the KMeans_model_2

I can't find a clear answer in the documentation on whether Multi-Model containers can be combined with a Multi-container endpoint.
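For concreteness, the setup described in question 1 can be sketched as plain request dictionaries. This is only an illustration of what is being asked, not a configuration known to work (the answer further down reports that SageMaker rejects it). The model name, endpoint name, role ARN, image URIs and S3 paths are all placeholders. Note that in boto3 the `invoke_endpoint` parameter is `TargetContainerHostname` (written as `TargetContainer` above), and `TargetModel` is assumed here to be the artifact path relative to the container's S3 prefix.

```python
# Sketch only: builds the requests as plain dicts, no AWS call is made.
# Every name, ARN, image URI and S3 path below is a placeholder.

def build_model_request():
    """CreateModel-style request: two directly invokable containers."""
    return {
        "ModelName": "xgb-kmeans-direct",
        "ExecutionRoleArn": "arn:aws:iam::123456789012:role/ExampleRole",
        "InferenceExecutionConfig": {"Mode": "Direct"},
        "Containers": [
            {
                "ContainerHostname": "XGBoost",
                "Image": "<xgboost-image-uri>",
                "Mode": "SingleModel",
                "ModelDataUrl": "s3://example-bucket/xgboost/XGBoost_model_1.tar.gz",
            },
            {
                "ContainerHostname": "KMeans",
                "Image": "<kmeans-image-uri>",
                "Mode": "MultiModel",
                # For MultiModel, the URL is a prefix holding several artifacts.
                "ModelDataUrl": "s3://example-bucket/kmeans/",
            },
        ],
    }


def build_invoke_requests(endpoint_name="example-endpoint"):
    """InvokeEndpoint-style requests matching the three calls in the question."""
    common = {"EndpointName": endpoint_name, "ContentType": "text/csv", "Body": b"1.0,2.0"}
    return [
        {**common, "TargetContainerHostname": "XGBoost"},
        {**common, "TargetContainerHostname": "KMeans", "TargetModel": "KMeans_model_1.tar.gz"},
        {**common, "TargetContainerHostname": "KMeans", "TargetModel": "KMeans_model_2.tar.gz"},
    ]


model_request = build_model_request()
invoke_requests = build_invoke_requests()
# With boto3 these would be passed as, for example:
#   boto3.client("sagemaker").create_model(**model_request)
#   boto3.client("sagemaker-runtime").invoke_endpoint(**invoke_requests[0])
```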

2. The second question: how does the above idea work with ProductionVariants? Can I create something like this:

  • Variant1 with XGBoost serving XGBoost_model_1, with a weight of 0.5
  • Variant2 with a Multi-container model holding both XGBoost and KMeans (the latter in a MultiModel setup), with a weight of 0.5

So that the client could:

  • call invoke_endpoint(TargetVariant="Variant2", TargetContainer="KMeans", TargetModel="KMeans_model_1") to access the KMeans_model_1
  • call invoke_endpoint(TargetVariant="Variant2", TargetContainer="KMeans", TargetModel="KMeans_model_2") to access the KMeans_model_2
  • call invoke_endpoint(TargetVariant="Variant1") to access the XGBoost_model_1
  • call invoke_endpoint(TargetVariant="Variant2", TargetContainer="XGBoost") to access the XGBoost_model_1

Is that combination even possible?
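As a rough illustration of question 2, the endpoint configuration being asked about might look like the request below. Again this is a sketch of the question, not a working setup (the answer reports that it is rejected). The config name, model names and instance type are placeholders. The small helper shows how variant weights translate into traffic shares: each variant receives its weight divided by the sum of all weights.

```python
# Sketch only: a CreateEndpointConfig-style request built as a plain dict.
# Config name, model names and instance type are placeholders.

def build_endpoint_config_request():
    return {
        "EndpointConfigName": "example-two-variants",
        "ProductionVariants": [
            {
                "VariantName": "Variant1",
                "ModelName": "xgb-only",  # single-container XGBoost model
                "InstanceType": "ml.m5.large",
                "InitialInstanceCount": 1,
                "InitialVariantWeight": 0.5,
            },
            {
                "VariantName": "Variant2",
                "ModelName": "xgb-kmeans-direct",  # multi-container model
                "InstanceType": "ml.m5.large",
                "InitialInstanceCount": 1,
                "InitialVariantWeight": 0.5,
            },
        ],
    }


def traffic_shares(request):
    """Fraction of untargeted traffic each variant receives."""
    variants = request["ProductionVariants"]
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


config_request = build_endpoint_config_request()
shares = traffic_shares(config_request)
# With boto3 the request would be passed as:
#   boto3.client("sagemaker").create_endpoint_config(**config_request)
```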

If so, what happens when the client calls the invoke_endpoint without specifying the variant? For example:

  • would invoke_endpoint(TargetContainer="KMeans", TargetModel="KMeans_model_2") fail 50% of the time (working fine if it hits the right variant, and most likely returning a 400/500 error such as "incorrect payload" if it hits the wrong one)?
asked 2 years ago · 729 views
1 Answer

Well, I've checked that myself.

Turns out NONE of these combinations are possible. :)

  1. Multi-model + Multi-container is NOT possible
  2. Variants + Multi-container is NOT possible
  3. Variants + Multi-model is NOT possible

In all cases, you get the corresponding error when invoking create_endpoint_config (listed in the same order as above):

  1. Direct InferenceExecutionMode is not supported when a Container uses MultiModel mode.
  2. Multiple ProductionVariants is currently not supported when a Model uses a Direct InferenceExecutionMode.
  3. MultiModel mode is not supported with the current model specification.
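The three restrictions can be captured in a small local pre-flight check. This is a hypothetical helper, not part of boto3 or SageMaker: it only encodes the behaviour observed in this answer, using the error messages quoted above, and the real service may report a single error at a time or may have changed its rules since.

```python
# Hypothetical pre-flight check encoding the three restrictions reported above.
# It inspects CreateModel-style dicts locally and makes no AWS call.

DIRECT_WITH_MULTIMODEL = (
    "Direct InferenceExecutionMode is not supported when a Container uses MultiModel mode."
)
VARIANTS_WITH_DIRECT = (
    "Multiple ProductionVariants is currently not supported "
    "when a Model uses a Direct InferenceExecutionMode."
)
VARIANTS_WITH_MULTIMODEL = (
    "MultiModel mode is not supported with the current model specification."
)


def is_direct(model):
    return model.get("InferenceExecutionConfig", {}).get("Mode") == "Direct"


def uses_multi_model(model):
    return any(c.get("Mode") == "MultiModel" for c in model.get("Containers", []))


def check_endpoint_config(models, variants):
    """Return the reported errors that would apply to this combination.

    models:   dict mapping model name to a CreateModel-style dict
    variants: list of ProductionVariant-style dicts, each with "ModelName"
    """
    used = [models[v["ModelName"]] for v in variants]
    errors = []
    if any(is_direct(m) and uses_multi_model(m) for m in used):
        errors.append(DIRECT_WITH_MULTIMODEL)
    if len(variants) > 1 and any(is_direct(m) for m in used):
        errors.append(VARIANTS_WITH_DIRECT)
    if len(variants) > 1 and any(uses_multi_model(m) for m in used):
        errors.append(VARIANTS_WITH_MULTIMODEL)
    return errors


# Example: the two models from the question (names are placeholders).
models = {
    "xgb-only": {"Containers": [{"ContainerHostname": "XGBoost", "Mode": "SingleModel"}]},
    "combined": {
        "InferenceExecutionConfig": {"Mode": "Direct"},
        "Containers": [
            {"ContainerHostname": "XGBoost", "Mode": "SingleModel"},
            {"ContainerHostname": "KMeans", "Mode": "MultiModel"},
        ],
    },
}
single_variant = check_endpoint_config(models, [{"ModelName": "xgb-only"}])
two_variants = check_endpoint_config(
    models, [{"ModelName": "xgb-only"}, {"ModelName": "combined"}]
)
```

Here `single_variant` is empty, while `two_variants` collects all three messages, since the combined model mixes Direct mode, a MultiModel container and a second variant.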
answered 2 years ago
