I was instruction finetuning on sagemaker jumpstart. I provided a template.json file, describing the template used, in the train directory alongside the dataset on S3 as directed in the documentation. However, when I run the training I get the error: Couldn't cast because columns don't match. Am I configuring the template.json wrong? Please help. Thanks
Exact error:
We encountered an error while training the model on your data. AlgorithmError: ExecuteUserScriptError:
ExitCode 1
ErrorMessage "raise ValueError(f"Couldn't cast\n{table.schema}\nto\n{features}\nbecause column names don't match")
ValueError: Couldn't cast
question: string
context: string
response: string
to
{'prompt': Value(dtype='string', id=None), 'completion': Value(dtype='string', id=None)}
because column names don't match
The above exception was the direct cause of the following exception
Traceback (most recent call last)
File "/opt/ml/code/llama_finetuning.py", line 301, in <module>
fire.Fire(main)
File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/opt/ml/code/llama_finetuning.py", line
Custom template
## template.json
{
"prompt": "Below is a dialog between three people. The first person is the system, the second person is the user and the third person is the assistant. The system starts the dialog with a context and none other input for the rest of the dialog. The user asks a question and the assistant responds. Write a response that appropriately completes the conversation.\n\n### Dialog:\n[\n\t{\n\t\t\"role\": \"system\",\n\t\t\"content\": \"{context}\"\n\t},\n\t{\n\t\t\"role\": \"user\",\n\t\t\"content\": \"{question}\"\n\t},\n\t{\n\t\t\"role\": \"assistant\",\n\t\t\"content\": \"{response}\"\n\t}\n]",
"completion": "{response}"
}
Dataset
{
"question": "What is the color of water",
"context": "",
"response": "Water is colorless"
}
{
"question": "What do humans inhale",
"context": "",
"response": "Humans inhale oxygen and exhale carbon dioxide"
}