1 Answer
Hi. It's hard to give a precise answer because latency depends on many factors — for example, the region where the model is deployed, the size and content of the prompt, and so on. I'd suggest measuring it yourself with a benchmarking tool available on GitHub. Here's the repo link:
https://github.com/aws-samples/foundation-model-benchmarking-tool
The quotas can be found here:
https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html