Can we train Stable Diffusion on SageMaker using the DreamBooth approach?


We are fine-tuning a Stable Diffusion model on a custom dataset and are looking into the DreamBooth approach for training on SageMaker. Is this possible on SageMaker? If yes, can you give me some links or a SageMaker notebook?

5 Answers
  • It's not the DreamBooth approach


@dayanand The DreamBooth approach is used in the training scripts, but the scripts are customized so that they can be called through the SageMaker SDK.

If you want to walk through it in a notebook, you can use https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart_text_to_image/Amazon_JumpStart_Text_To_Image.ipynb

However, in both cases the training scripts are not shown. They are retrieved from a repository associated with the model as part of the JumpStart process. The training scripts are where the DreamBooth process happens. You can see that in the notebook where it says:

    # Retrieve the training script. This contains all the necessary files including data processing, model training etc.
    train_source_uri = script_uris.retrieve(
        model_id=train_model_id, model_version=train_model_version, script_scope=train_scope)

What is it that you are looking for in the fine tuning process that you are not seeing?

AWS
EXPERT
answered 9 months ago
  • I want to write a prompt for each image in a dataset and train the model based on that prompt. Is that possible with the SageMaker SDK?


@Dayanand

The way the DreamBooth fine-tuning process works is that you provide multiple images of an object or person that you want to add to the model. The foundation model may be able to generate pictures of a celebrity that it "knows" about, but not pictures of me (Burtoft). In the examples above, you need a directory with some photos cropped to the thing being trained (around 5 usually works), plus a JSON file that describes the thing.

In the fine-tuning process, there are multiple photo examples of a single text description (the instance) that is part of a class that already exists (e.g. humans or cats). If you mean a longer text description of a single photo, I haven't heard of anything that can do that. However, depending on the use case, you may be able to crop parts of your existing photos to create smaller training photos. If your description was "Burtoft on a model xyz chair in a type abc suit", you could train on Burtoft (human), model xyz (chair), and type abc (suit) if you can provide multiple photos of each.

I fine-tuned Stable Diffusion to generate pictures of me by using a JSON file that looked like this:

    {
        "instance_prompt": "a photo of a burtoft human",
        "class_prompt": "a photo of a human"
    }

(In my case, my name is pretty unique, so I could use it.) I included that JSON file in a directory that had (cropped) photos of me in it. You can also see an example in the notebook above, in the S3 bucket jumpstart-cache-prod-{aws_region}/training-datasets/dogs_sd_finetuning/
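
As a concrete sketch of that layout (plain Python; the directory name and the helper function are just illustrative, not part of the SageMaker SDK), the training folder is the instance images plus a dataset_info.json with the instance and class prompts:

```python
import json
from pathlib import Path


def build_dreambooth_dataset(train_dir, instance_prompt, class_prompt):
    """Create the dataset_info.json that sits next to the instance images."""
    train_dir = Path(train_dir)
    train_dir.mkdir(parents=True, exist_ok=True)
    dataset_info = {
        "instance_prompt": instance_prompt,
        "class_prompt": class_prompt,
    }
    (train_dir / "dataset_info.json").write_text(json.dumps(dataset_info))
    return dataset_info


# Same prompts as above; the cropped photos would be copied into
# train_dir alongside dataset_info.json before uploading the folder to S3.
info = build_dreambooth_dataset(
    "training-data/burtoft",
    "a photo of a burtoft human",
    "a photo of a human",
)
```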

If that doesn't answer your question, please provide some details about the photos and descriptions you are trying to fine-tune on.

AWS
EXPERT
answered 9 months ago
  • I want to train the Stable Diffusion model on a variety of images. For instance, I have multiple categories of images (Sofas, Beds, Tables, Chairs, and many more), and within each category I have specific images with a relevant caption for each. Within the sofas, for example, I have Italian sofas and modern sofas with specific names. Using DreamBooth, I would train on these varieties of images, providing a separate caption for each image, and then pass a prompt to generate a room image with an Italian sofa and some particular bed and table. Is there any way to train the Stable Diffusion model with all these parameters? Can the DreamBooth script be used here, or is there an alternative solution you can offer for this problem?


@Dayanand In your example, the sofas would be the class and the different names would each be an instance. Fine-tuning multiple instances of the same class can be challenging with the current SageMaker scripts. In this article, look at the section "Train on multiple datasets":

Unfortunately, JumpStart is currently limited to training on a single subject. 
You can’t fine-tune the model on multiple subjects at the same time. 
Furthermore, fine-tuning the model for different subjects sequentially 
results in the model forgetting the first subject if the subjects are similar.

If you look in that article at the section on cats and dogs, fine-tuning for a different type of dog "overwrites" the previous dog training, but training for a cat and a dog together can work. In your example, you may be able to train for a sofa, a table, and a lamp of specific types, but if you want to change the type of lamp you would need to train another model.
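
One way to organize data around that limitation is one training directory (and one fine-tuning run) per subject, with each subject in a different class. A minimal sketch, with hypothetical subject names and prompts:

```python
import json
from pathlib import Path

# Hypothetical subjects: one instance per class, since fine-tuning two
# instances of the same class (e.g. two sofas) overwrites earlier training.
subjects = {
    "italian-sofa": ("a photo of an xyzsofa sofa", "a photo of a sofa"),
    "oak-table": ("a photo of an abctable table", "a photo of a table"),
}


def prepare_datasets(root, subjects):
    """One training directory per subject -> one fine-tuning job per subject."""
    paths = []
    for name, (instance_prompt, class_prompt) in subjects.items():
        subject_dir = Path(root) / name
        subject_dir.mkdir(parents=True, exist_ok=True)
        (subject_dir / "dataset_info.json").write_text(json.dumps(
            {"instance_prompt": instance_prompt, "class_prompt": class_prompt}))
        paths.append(subject_dir)
    return paths


dirs = prepare_datasets("datasets", subjects)
```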

For a "universal" model that would let you generate photos of different combinations of lamps and tables, you would need to bring your own code and run a different type of training that isn't available in SageMaker JumpStart.

AWS
EXPERT
answered 9 months ago
  • Thanks, Jim. For SD model training, I would like to convert the function below for SageMaker. Can you guide me?

    # Notebook (IPython) function: the "!" line shells out to accelerate and
    # must be indented inside the def. extrnlcptn, ofstnse, captions_dir,
    # TexRes and txlr were undefined in the original snippet, so they have
    # been added to the signature here.
    def dump_only_textenc(trnonltxt, extrnlcptn, ofstnse, modelt_name,
                          instance_dir, output_dir, captions_dir, PT, Seed,
                          TexRes, precision, txlr, Training_Steps):
        !accelerate launch content/diffusers/examples/dreambooth/train_dreambooth.py \
            $trnonltxt \
            $extrnlcptn \
            $ofstnse \
            --image_captions_filename \
            --train_text_encoder \
            --dump_only_text_encoder \
            --pretrained_model_name_or_path="$modelt_name" \
            --instance_data_dir="$instance_dir" \
            --output_dir="$output_dir" \
            --captions_dir="$captions_dir" \
            --instance_prompt="$PT" \
            --seed=$Seed \
            --resolution=$TexRes \
            --mixed_precision=$precision \
            --train_batch_size=1 \
            --gradient_accumulation_steps=1 --gradient_checkpointing \
            --use_8bit_adam \
            --learning_rate=$txlr \
            --lr_scheduler="linear" \
            --lr_warmup_steps=0 \
            --max_train_steps=$Training_Steps
    

@Dayanand The SageMaker fine-tuning process has already converted those scripts.

If you take a look at the Hyperparameters section in the JumpStart blog, you will see many of the same parameters that appear in the script you shared.

You will have to check the parameters available in the SageMaker training scripts. If you don't see what you want in the article, you can check the metadata for the model. Go through the notebook and print these hyperparameters to see what else can be passed in:

    hyperparameters = hyperparameters.retrieve_default(
        model_id=train_model_id, model_version=train_model_version
    )

(Additional hyperparameters may have become available since the article was written.)
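
As a sketch of that pattern, with a plain dict standing in for the object returned by hyperparameters.retrieve_default (the keys and values shown are illustrative; the real defaults come from the model metadata and require the SageMaker SDK):

```python
# Stand-in for the dict returned by hyperparameters.retrieve_default(...);
# the keys below are illustrative placeholders, not the authoritative list.
default_hyperparameters = {
    "max_steps": "400",
    "batch_size": "1",
    "with_prior_preservation": "True",
}

# Print everything that can be passed in, then override only the keys you need
# before handing the dict to the training job.
for key, value in sorted(default_hyperparameters.items()):
    print(f"{key} = {value}")

hyperparameters = {**default_hyperparameters, "max_steps": "800"}
```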

You should walk through this notebook.

AWS
EXPERT
answered 9 months ago
