Repeating a PipeModeDataset over epochs

I'm using a PipeModeDataset to read the data from an augmented manifest file. Right now, if I try something like this:

    ds = PipeModeDataset(channel=channel_name)
    for epoch in range(epochs):
        logging.info(f"Epoch: {epoch}")
        for image, bbox, labels in ds:
            logging.info("New iteration")
            step += 1

I get the output:

[test_dataset.py:64 - run ] Epoch: 0
[test_dataset.py:67 - run ] Step: 0
[test_dataset.py:71 - run ] New iteration
[test_dataset.py:67 - run ] Step: 1
[test_dataset.py:71 - run ] New iteration
[test_dataset.py:67 - run ] Step: 2
[test_dataset.py:71 - run ] New iteration
[test_dataset.py:67 - run ] Step: 3
[test_dataset.py:71 - run ] New iteration
[test_dataset.py:67 - run ] Step: 4
[test_dataset.py:71 - run ] New iteration
[test_dataset.py:67 - run ] Step: 5
[test_dataset.py:71 - run ] New iteration
[test_dataset.py:67 - run ] Step: 6
[test_dataset.py:71 - run ] New iteration
[test_dataset.py:67 - run ] Step: 7
[test_dataset.py:71 - run ] New iteration
[test_dataset.py:67 - run ] Step: 8
[test_dataset.py:64 - run ] Epoch: 1
[test_dataset.py:64 - run ] Epoch: 2

This means that after the first epoch the dataset is exhausted and won't yield any more data. Is there a way to "reset" the dataset so that I can fetch the data multiple times? I tried ds = ds.repeat(epochs), but then I only get one epoch, and the number of steps is multiplied by the number of epochs. That is, I don't get a signal that one epoch is over; the dataset is just repeated.

Thanks!

2 Answers

https://python.hotexamples.com/examples/sagemaker_tensorflow/PipeModeDataset/repeat/python-pipemodedataset-repeat-method-examples.html

Based on the Python examples I found, it seems like you're on the right track using the repeat() method to make a PipeModeDataset iterable over multiple epochs. The repeat() method is typically called on the dataset before starting the training loop. The argument to repeat() is the number of epochs, and if no argument is provided, the dataset will repeat indefinitely.

You mentioned that you only have one epoch and the number of steps is multiplied by the number of epochs when using ds.repeat(epochs). This is the expected behavior when using repeat() with TensorFlow datasets. When the dataset is repeated, it concatenates the copies of the dataset, making it one long continuous dataset without an explicit signal for the end of an epoch.
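The flattening described above can be illustrated without SageMaker at all. The sketch below is only an illustration: it uses a plain Python list as a stand-in for one pass over the data, and a hypothetical helper named repeat_stream that mimics the order of elements after repeat(count). It is not the PipeModeDataset implementation.

```python
import itertools

def repeat_stream(records, count):
    """Mimic the element order after repeat(count): back-to-back copies."""
    # Assumption: `records` is a re-iterable stand-in (a list),
    # not a real PipeModeDataset.
    return itertools.chain.from_iterable(itertools.repeat(records, count))

one_epoch = ["a", "b", "c"]  # hypothetical records for a single pass
flattened = list(repeat_stream(one_epoch, 2))
print(flattened)  # ['a', 'b', 'c', 'a', 'b', 'c']
```

Nothing in the flattened stream marks where the first copy ends and the second begins, which is why the loop in the question sees one long epoch.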

If you want to maintain the concept of an epoch in your training loop, you could manually keep track of the number of steps per epoch. For example, if you know the number of steps per epoch (which could be the total number of data samples divided by the batch size), you can use a counter in your training loop to track when an epoch ends.
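A minimal sketch of that counter idea, assuming the number of steps per epoch is known in advance. The function name run_epochs and the stand-in data are hypothetical; with a real dataset you would pass something like ds.repeat(epochs) as the first argument, assuming it can be iterated with a plain for loop as in the question.

```python
import itertools
import logging

def run_epochs(repeated_ds, epochs, steps_per_epoch):
    """Consume one long repeated stream while still reporting epoch boundaries."""
    stream = iter(repeated_ds)  # a single iterator shared across all epochs
    total_steps = 0
    for epoch in range(epochs):
        logging.info("Epoch: %d", epoch)
        # islice stops after steps_per_epoch items, so the outer loop
        # regains control at the point where the epoch ends.
        for record in itertools.islice(stream, steps_per_epoch):
            total_steps += 1  # a training step on `record` would go here
    return total_steps

# Stand-in for ds.repeat(3): three passes over four hypothetical records.
fake_repeated = list(range(4)) * 3
steps = run_epochs(fake_repeated, epochs=3, steps_per_epoch=4)
print(steps)  # 12
```

Because the iterator is created once and shared, each inner loop picks up where the previous one stopped instead of restarting the stream.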

If you were asking about something else related to repeating a PipeModeDataset over epochs, please provide additional details and I'll be glad to assist further.

EXPERT
answered 10 months ago

Hello,

I understand that you would like to know a way to "reset" the dataset so that you can fetch the data multiple times.

Yes, you're correct! You can use the repeat() method to make the dataset iterable over multiple epochs. The repeat() method takes the number of epochs as its argument, and repeats indefinitely if no argument is provided. It is typically called on the dataset before the training loop starts.

Please refer to the sample code below.

    num_epochs = 3  # Set the desired number of epochs

    dataset = dataset.repeat(num_epochs)

Here the repeat() method is called on the dataset object with the num_epochs variable as the argument. This repeats the dataset for the specified number of epochs. Inside the training loop, you can process the dataset by iterating over it with a for loop, and each iteration will provide a batch of data.

The behavior you observed is expected when applying repeat(). Repeating the dataset effectively concatenates multiple copies of it, resulting in a single extended dataset with no clear indication of where one epoch ends.

You may also consider using a num_epochs parameter. When the desired number of epochs is specified, the dataset automatically repeats for that number of epochs during training. If no value is provided for num_epochs, the dataset repeats indefinitely.

Also, as suggested in the other answer, to preserve the concept of an epoch within your training loop, you can manually track the number of steps per epoch. For instance, if you know the number of steps per epoch in advance (the total number of data samples divided by the batch size), you can use a counter in your training loop to detect when each epoch completes.
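The arithmetic behind that counter can be sketched as follows. The helper names and the sample numbers are hypothetical, and the rounding assumes that a final partial batch is kept; if your pipeline drops the remainder, floor division would be the matching choice.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches in one pass, counting a final partial batch."""
    return math.ceil(num_samples / batch_size)

def epoch_and_step(global_step, steps_in_epoch):
    """Map a 0-based global step to (epoch, step within that epoch)."""
    return divmod(global_step, steps_in_epoch)

# Hypothetical numbers: 9 samples read in batches of 2.
n = steps_per_epoch(num_samples=9, batch_size=2)
print(n)                     # 5
print(epoch_and_step(7, n))  # (1, 2)
```

With a single flat loop over the repeated dataset, a step whose position within the epoch comes back as 0 is the first step of a new epoch, which gives you the signal that the previous epoch has finished.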

I highly recommend going through the references below for a better understanding.

  1. https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-python-sdk/tensorflow_script_mode_pipe_mode/tensorflow_script_mode_pipe_mode.html
  2. https://aws.amazon.com/blogs/machine-learning/using-pipe-input-mode-for-amazon-sagemaker-algorithms/
  3. https://pypi.org/project/sagemaker-tensorflow/1.10.0.1.0.0/
  4. https://github.com/aws/amazon-sagemaker-examples/blob/main/advanced_functionality/pipe_bring_your_own/pipe_bring_your_own.ipynb
  5. https://python.hotexamples.com/examples/sagemaker_tensorflow/PipeModeDataset/repeat/python-pipemodedataset-repeat-method-examples.html

If you have any difficulty verifying any of the points above, or if you still run into issues, please reach out to AWS Support (SageMaker) with a detailed description of your issue or use case, and we would be happy to assist you further.

https://docs.aws.amazon.com/awssupport/latest/user/case-management.html#creating-a-support-case

Thank you.

AWS
answered 10 months ago
