
Is the Messages API supported by fmeval's BedrockModelRunner?


I have the following code, but it does not work with model_id='anthropic.claude-3-sonnet-20240229-v1:0'.

import boto3
import json

from fmeval.constants import MIME_TYPE_JSONLINES
from fmeval.data_loaders.data_config import DataConfig
from fmeval.eval_algorithms.qa_accuracy import QAAccuracy, QAAccuracyConfig
from fmeval.model_runners.bedrock_model_runner import BedrockModelRunner

bedrock = boto3.client(service_name='bedrock')
bedrock_runtime = boto3.client(service_name='bedrock-runtime')

bedrock_model_runner = BedrockModelRunner(
    model_id='anthropic.claude-3-sonnet-20240229-v1:0',  # Does not work
    # model_id='anthropic.claude-v2',
    output='completion',
    content_template='{"prompt": $prompt, "max_tokens_to_sample": 500}'
)

config = DataConfig(
    dataset_name="qa_sample",
    dataset_uri="qa_sample.jsonl",
    dataset_mime_type=MIME_TYPE_JSONLINES,
    model_input_location="question",
    target_output_location="answers"
)

eval_algo = QAAccuracy(
    QAAccuracyConfig(
        target_output_delimiter="<OR>"
    )
)

eval_output = eval_algo.evaluate(
    model=bedrock_model_runner,
    dataset_config=config,
    prompt_template="Human: $model_input\n\nAssistant:\n",
    save=True
)

The error message is:

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: "claude-3-sonnet-20240229" is not supported on this API. Please use the Messages API instead.

How can I get fmeval to work with the Messages API? Do I need to change the prompt template?

1 Answer

Please take a look at this issue on the fmeval repository.

In brief, the error occurs because the content_template and output passed to BedrockModelRunner use the legacy Text Completions schema ("prompt" in the request, "completion" in the response), which Claude 3 models reject; they only accept the Messages API request format. You need to switch the content_template to the "messages" schema, point output at the Messages response field, and drop the "Human:/Assistant:" wrapper from the prompt_template. See here for details.
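A minimal sketch of what a Messages-API content_template could look like. This assumes fmeval substitutes $prompt with the JSON-escaped prompt string, as it does for the legacy template in your code; verify the exact behavior against your fmeval version and the linked issue. The snippet simulates that substitution locally to check the resulting body is valid Messages API JSON:

```python
import json
import string

# Messages API request body for Claude 3 models. The legacy
# "prompt"/"max_tokens_to_sample" schema used by claude-v2 is rejected
# by claude-3 models, which is the ValidationException you are seeing.
content_template = (
    '{"anthropic_version": "bedrock-2023-05-31", '
    '"max_tokens": 500, '
    '"messages": [{"role": "user", "content": $prompt}]}'
)

# fmeval replaces $prompt with the JSON-quoted prompt text; simulate
# that substitution here to confirm the template produces valid JSON.
body = string.Template(content_template).substitute(
    prompt=json.dumps("What is the capital of France?")
)
request = json.loads(body)
assert request["messages"][0]["role"] == "user"
```

With this template, the model's reply lives under `content[0].text` in the response rather than under `completion`, so you would also pass `output='content[0].text'` to BedrockModelRunner, and the prompt_template can become simply `"$model_input"`, since the Messages API supplies the conversation-turn structure itself.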

AWS
answered 2 years ago

