I have an inference endpoint that returns an HTTP streaming response, and I would like to load test it.
Does ModelLatency in the recommender metrics refer to time to receive the first chunk, or time to receive all chunks?
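For reference, here is roughly how I am measuring things on the client side. This is a minimal sketch that assumes the endpoint supports response streaming via InvokeEndpointWithResponseStream; the endpoint name and payload are placeholders for my actual setup.

```python
import json
import time

import boto3

runtime = boto3.client("sagemaker-runtime")

payload = json.dumps({"inputs": "example prompt"})  # placeholder request body

start = time.perf_counter()
response = runtime.invoke_endpoint_with_response_stream(
    EndpointName="my-streaming-endpoint",  # placeholder endpoint name
    ContentType="application/json",
    Body=payload,
)

first_chunk_latency = None
for event in response["Body"]:  # EventStream of PayloadPart events
    if "PayloadPart" in event:
        if first_chunk_latency is None:
            # Time until the first streamed chunk arrives at the client
            first_chunk_latency = time.perf_counter() - start
        _ = event["PayloadPart"]["Bytes"]  # consume the chunk

# Time until the stream is fully consumed
total_latency = time.perf_counter() - start
print(f"time to first chunk: {first_chunk_latency:.3f}s, "
      f"time to all chunks: {total_latency:.3f}s")
```

I would like to know which of these two client-side measurements ModelLatency corresponds to.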
ModelLatency is defined in the Inference Recommender results documentation; see https://docs.aws.amazon.com/sagemaker/latest/dg/inference-recommender-interpret-results.html
The following links may help you understand ModelLatency in more detail: https://aws.amazon.com/blogs/machine-learning/best-practices-for-load-testing-amazon-sagemaker-real-time-inference-endpoints/ and https://repost.aws/knowledge-center/sagemaker-endpoint-latency. In particular, note how ModelLatency and OverheadLatency are defined.
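When comparing against client-side timings, it can help to pull the endpoint's metrics directly from CloudWatch. The sketch below is one way to do that, assuming a single production variant; the endpoint and variant names are placeholders, and both metrics are reported in microseconds.

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)
start = end - timedelta(minutes=30)  # look back over the load-test window

for metric in ("ModelLatency", "OverheadLatency"):
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/SageMaker",
        MetricName=metric,
        Dimensions=[
            {"Name": "EndpointName", "Value": "my-streaming-endpoint"},  # placeholder
            {"Name": "VariantName", "Value": "AllTraffic"},              # placeholder
        ],
        StartTime=start,
        EndTime=end,
        Period=300,
        Statistics=["Average", "Maximum"],
    )
    # Values are in microseconds
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(metric, point["Timestamp"], point["Average"], point["Maximum"])
```

Comparing these values against your client-observed time-to-first-chunk and time-to-last-chunk should make it clear which interval ModelLatency covers for your endpoint.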