1 Answer
Hi Ayman,
Try increasing the amount of training data, or set the max_seq_len hyperparameter to a smaller value (for example, 128), and see whether the error persists.
Here is how the computation works: all of the text is processed, concatenated, and then split into samples, each of length equal to the max input length. The samples are then grouped into batches according to the batch size. If you are using an 8-GPU machine, you need at least 8 non-empty batches. In other words, you need one of the following: enough data to fill 8 batches, a smaller batch size, or a smaller max input length.
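To make the arithmetic concrete, here is a rough sketch of the batch count described above. It is not the actual training code: the function names are made up for illustration, and it assumes that a trailing partial sample is dropped, that a partially filled batch still counts as non-empty, and that the batch size is the one used to group samples before they are spread across GPUs. The real pipeline may differ on any of these points.

```python
import math


def count_non_empty_batches(total_tokens: int, max_seq_len: int, batch_size: int) -> int:
    """Estimate how many non-empty batches the data yields (hypothetical helper).

    Assumption: the text is concatenated into one token stream and cut into
    samples of exactly max_seq_len tokens, discarding any leftover tokens.
    """
    num_samples = total_tokens // max_seq_len
    # Assumption: a final batch with fewer than batch_size samples still counts.
    return math.ceil(num_samples / batch_size)


def has_enough_batches(total_tokens: int, max_seq_len: int, batch_size: int, num_gpus: int) -> bool:
    """Return True if there is at least one non-empty batch per GPU."""
    return count_non_empty_batches(total_tokens, max_seq_len, batch_size) >= num_gpus


# Illustrative numbers only: a small dataset of 20,000 tokens on 8 GPUs.
print(has_enough_batches(20_000, max_seq_len=1024, batch_size=4, num_gpus=8))  # too few batches
print(has_enough_batches(20_000, max_seq_len=128, batch_size=4, num_gpus=8))   # shorter samples
print(has_enough_batches(20_000, max_seq_len=1024, batch_size=1, num_gpus=8))  # smaller batches
```

With these made-up numbers, the first configuration yields only 5 batches, which is fewer than 8, while lowering max_seq_len to 128 or the batch size to 1 yields 39 and 19 batches respectively.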
Answered 4 months ago