
Best practices for customizing Bedrock Data Automation for multiple clients and utilizing multiple blueprints


Hello,

I am working on a project where we have hundreds of clients and we are trying to extract data from their Purchase Order PDFs. We have samples from many different clients, and their PDFs are all slightly different. For example, certain clients have multiple "quantity ordered" columns and we only need data from one specific quantity column. We handle this by creating a custom blueprint for each client and specifying that the "quantity" value we're looking for comes from a specific column with a specific heading. There is roughly as much variability between these PDFs as there are clients. We do not have an issue with client or document classification: we know exactly who the data belongs to and exactly what type of document it is.

We are still in the early stages of development, but there appears to be a limit of 40 blueprints per project and 350 in total per account. That suggests we are approaching this problem incorrectly by creating one blueprint per client.

How are others dealing with this?

Thanks,

GP

1 Answer

You're correct that creating individual blueprints per client may not scale well given the limits of 40 document blueprints per project and 350 total blueprints per account. Here are some best practices to consider for your use case:

Consolidate Similar Document Types: Instead of creating one blueprint per client, consider grouping clients whose Purchase Order PDFs share similar structures and layouts. Create blueprints based on document format patterns rather than individual clients. For example, if 20 clients use similar PO templates with the same general structure, they could potentially share a single blueprint.
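As a rough sketch of that grouping step, the snippet below buckets clients by a layout "signature" so that clients with identical layouts share one blueprint. The client names, the layout attributes (`quantity_heading`, `quantity_columns`), and the `group_by_layout` helper are all hypothetical; in practice you would record whatever layout traits actually distinguish your PO templates.

```python
from collections import defaultdict

# Hypothetical per-client layout notes; names and attributes are illustrative only.
CLIENT_LAYOUTS = {
    "client_a": {"quantity_heading": "Qty Ordered", "quantity_columns": 1},
    "client_b": {"quantity_heading": "Qty Ordered", "quantity_columns": 1},
    "client_c": {"quantity_heading": "Order Qty", "quantity_columns": 2},
}


def group_by_layout(layouts):
    """Group clients whose layout attributes are identical into one family."""
    families = defaultdict(list)
    for client, layout in sorted(layouts.items()):
        # A hashable signature built from the layout attributes.
        signature = tuple(sorted(layout.items()))
        families[signature].append(client)
    return dict(families)


families = group_by_layout(CLIENT_LAYOUTS)
print(f"{len(families)} blueprints cover {len(CLIENT_LAYOUTS)} clients")
```

Each resulting family would map to one blueprint, so the blueprint count grows with the number of distinct layouts rather than the number of clients.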

Leverage Blueprint Matching: When you provide multiple blueprints to a project, Bedrock Data Automation automatically matches documents to the most appropriate blueprint based on the blueprint name, description, and fields. This matching capability is designed to handle different document types within a single batch. You can take advantage of this by creating blueprints that represent common PO format variations rather than client-specific versions.

Use Descriptive Blueprint Definitions: Make your blueprint names and descriptions explicit and detailed. This helps the matching algorithm select the best blueprint for each document. Include information about the specific layout characteristics, column structures, or vendor patterns in your blueprint descriptions.
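To illustrate, here is a minimal sketch that builds a blueprint definition as a plain Python dictionary, with the layout details spelled out in the description and the field instruction. The key names (`class`, `description`, `inferenceType`, `instruction`) reflect my understanding of the blueprint schema and should be checked against the current documentation. The `build_blueprint_schema` helper, the family name, and the single top-level `quantity` field are assumptions made for brevity; a real PO blueprint would likely model line items as a table.

```python
import json


def build_blueprint_schema(family_name, description, quantity_heading):
    """Return a blueprint-style schema dict (assumed shape, verify against the docs)."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "class": family_name,
        "description": description,
        "type": "object",
        "properties": {
            "quantity": {
                "type": "number",
                "inferenceType": "explicit",
                # Name the exact column heading so the right column is used.
                "instruction": (
                    "The ordered quantity, taken only from the column "
                    f"headed '{quantity_heading}'"
                ),
            }
        },
    }


schema = build_blueprint_schema(
    family_name="po-two-quantity-columns",
    description=(
        "Purchase order with two quantity columns; the ordered quantity "
        "is under the 'Order Qty' heading"
    ),
    quantity_heading="Order Qty",
)
print(json.dumps(schema, indent=2))
```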

Consider Preprocessing: Since you already have strong client and document classification capabilities, you could potentially route documents to different projects based on format families, allowing you to work within the 40-blueprint-per-project limit while still maintaining specificity.
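A minimal sketch of that routing idea follows, assuming each format family gets one blueprint and that the limits are the 40 per project and 350 per account mentioned in this thread. The project names, the family names, and both helper functions are hypothetical; the sketch only computes the assignment and does not call any AWS API.

```python
MAX_BLUEPRINTS_PER_PROJECT = 40   # per-project limit cited in this thread
MAX_BLUEPRINTS_PER_ACCOUNT = 350  # per-account limit cited in this thread


def assign_families_to_projects(family_names, limit=MAX_BLUEPRINTS_PER_PROJECT):
    """Spread format families (one blueprint each) across projects, at most `limit` each."""
    ordered = sorted(family_names)
    if len(ordered) > MAX_BLUEPRINTS_PER_ACCOUNT:
        raise ValueError("more format families than the account-level blueprint limit")
    return {
        family: f"po-project-{index // limit + 1}"  # hypothetical project names
        for index, family in enumerate(ordered)
    }


def route_client(client, client_to_family, family_to_project):
    """Look up which project a known client's documents should be sent to."""
    return family_to_project[client_to_family[client]]


# Example: 85 hypothetical format families need three projects.
family_to_project = assign_families_to_projects(
    [f"family-{n:03d}" for n in range(85)]
)
client_to_family = {"client_a": "family-000", "client_z": "family-084"}
print(route_client("client_a", client_to_family, family_to_project))
print(route_client("client_z", client_to_family, family_to_project))
```

Because the client is already known before processing, the lookup is a simple dictionary access, and adding a new client only requires mapping it to an existing family.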

The key insight is that blueprints should be organized around document structure patterns rather than client identity, allowing you to scale more effectively while still capturing the variations in your Purchase Order formats.
Sources
Custom output and blueprints - Amazon Bedrock
Bedrock Data Automation projects - Amazon Bedrock
Splitting documents while using projects - Amazon Bedrock

answered a month ago
EXPERT
reviewed 25 days ago
