I am trying to run a custom entity recognition job on a couple of text files. I trained the recognizer with PDF annotations, but I am sending .txt files to the job.
The job fails with this error: DOCUMENT_CORPUS_SIZE_LESS_THAN_MINIMUM: Document corpus size is less than the minimum requirement: 5120 bytes.
It is true that the text files in my input folder are only 1.2 KB, but I am not sure how to proceed from here. I tried changing the input format to "ONE_DOC_PER_LINE", but that produced another error saying this format is unsupported for semi-structured data.
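For reference, here is a minimal sketch of a local pre-check I could run before submitting the job. It assumes the 5120-byte minimum from the error message applies to the combined size of the .txt files in the input folder, which is my reading of the message and not something I have confirmed. The function names and the constant are hypothetical, and the check runs against a local copy of the folder rather than S3.

```python
from pathlib import Path

# Minimum quoted in the DOCUMENT_CORPUS_SIZE_LESS_THAN_MINIMUM error message.
MIN_CORPUS_BYTES = 5120


def corpus_size_bytes(folder: str) -> int:
    """Sum the sizes of all .txt files directly inside `folder`."""
    return sum(p.stat().st_size for p in Path(folder).glob("*.txt") if p.is_file())


def meets_minimum(folder: str, minimum: int = MIN_CORPUS_BYTES) -> bool:
    """Return True if the combined size reaches the assumed minimum."""
    return corpus_size_bytes(folder) >= minimum
```

Calling `meets_minimum("input")` on a local copy of the input folder (the path is a placeholder) would tell me whether the files are likely to clear the size check, under the assumption above.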
UPDATE: I added extra text to the file and the job went through, but the entity recognition is very poor. Before this, I was submitting PDFs to the analysis job, and it was working well. However, I need to submit text documents because I eventually want to create an endpoint for this recognizer. What should I do?