1 Answer
Hi,
You write "v2": did you try v2.0 or v2.1? I would suggest trying v2.1, as it is specifically adapted to large inputs (up to 200k tokens).
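As a minimal sketch of what switching to v2.1 looks like (the region, prompt text, and max_tokens_to_sample value here are placeholders, not recommendations):

```python
import json
import boto3

# Assumed region; use one where Claude v2.1 is enabled for your account.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "prompt": "\n\nHuman: Summarize this document...\n\nAssistant:",
    "max_tokens_to_sample": 2000,
})

response = client.invoke_model(
    modelId="anthropic.claude-v2:1",  # v2.1 supports a 200k-token context window
    contentType="application/json",
    accept="application/json",
    body=body,
)
print(json.loads(response["body"].read())["completion"])
```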
Also, you can try adjusting inference parameters such as temperature, top_k, and top_p to reduce Claude's creativity: that may reduce the work done during inference, so there is a better chance of not timing out.
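These parameters go in the same request body as the prompt. A sketch with illustrative low-creativity values (the exact numbers are assumptions to experiment with, not recommendations):

```python
import json

# Illustrative values only: lower temperature/top_p and a small top_k
# make sampling more deterministic.
body = json.dumps({
    "prompt": "\n\nHuman: ...\n\nAssistant:",
    "max_tokens_to_sample": 2000,
    "temperature": 0.1,  # near-greedy sampling
    "top_k": 50,         # consider only the 50 most likely tokens
    "top_p": 0.7,        # nucleus sampling cutoff
})
```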
Additionally, I'd start with fewer than 20k words in the prompt and increase the size incrementally to see where the limit is, as in the sketch below.
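A hypothetical probe loop along those lines; the document.txt input file, the 5k-word step size, and the small max_tokens_to_sample are all assumptions:

```python
import json
import boto3
from botocore.exceptions import ClientError, ReadTimeoutError

client = boto3.client("bedrock-runtime", region_name="us-east-1")
words = open("document.txt").read().split()  # placeholder input

# Grow the prompt in 5k-word steps until a call fails,
# to find the practical input limit.
for size in range(5_000, len(words) + 1, 5_000):
    chunk = " ".join(words[:size])
    body = json.dumps({
        "prompt": f"\n\nHuman: Summarize:\n{chunk}\n\nAssistant:",
        "max_tokens_to_sample": 500,
    })
    try:
        client.invoke_model(
            modelId="anthropic.claude-v2:1",
            contentType="application/json",
            accept="application/json",
            body=body,
        )
        print(f"{size} words: OK")
    except (ClientError, ReadTimeoutError) as err:
        print(f"{size} words: failed ({err})")
        break
```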
Best,
Didier
I am getting a response when using Claude v2 with relatively short inputs (let's say 12K words):

    body = json.dumps({
        "prompt": f"\n\nHuman: {usercontent}\n\nAssistant: {assistant}",
        "max_tokens_to_sample": 2000,
        "temperature": 0.3,
        "top_p": 1,
    })

    modelId = 'anthropic.claude-v2'
    contentType = 'application/json'
    accept = 'application/json'

However, if I replace the modelId with 'anthropic.claude-v2:1', I get an empty result for short inputs and a timeout for longer ones.
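For reference, a self-contained version of the snippet above. The extended read_timeout is an assumption about where the timeout happens: the SDK's default read timeout (60 seconds in botocore) is easy to hit on long generations, so raising it is one thing to rule out before blaming the model.

```python
import json
import boto3
from botocore.config import Config

# Placeholders for the original variables in the question.
usercontent = "..."
assistant = ""

# Assumption: the timeout may be the SDK's client-side read timeout
# rather than the model itself; raise it to rule that out.
client = boto3.client(
    "bedrock-runtime",
    config=Config(read_timeout=300, retries={"max_attempts": 2}),
)

body = json.dumps({
    "prompt": f"\n\nHuman: {usercontent}\n\nAssistant: {assistant}",
    "max_tokens_to_sample": 2000,
    "temperature": 0.3,
    "top_p": 1,
})

response = client.invoke_model(
    modelId="anthropic.claude-v2:1",
    contentType="application/json",
    accept="application/json",
    body=body,
)
print(json.loads(response["body"].read())["completion"])
```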