Amazon Titan Multimodal Embeddings G1 is designed for generating embeddings rather than for interactive question answering, which explains why you don't see it in the Bedrock playground. This model can only be used programmatically through the Bedrock API.
The Titan Multimodal Embeddings G1 model translates text and image inputs into numerical representations (embeddings) that capture their meaning in a shared semantic space. It's specifically designed for use cases like searching images by text, searching for similar images by image, or searching with a combination of text and image inputs.
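For reference, here is a minimal sketch of calling the model programmatically with boto3. The model ID amazon.titan-embed-image-v1, the request fields (inputText, inputImage, embeddingConfig), and the sample file name are based on the documented request pattern for this model; verify them against the current Bedrock documentation for your region:

```python
import base64
import json

import boto3

# Bedrock Runtime client (the region must have the model enabled)
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Base64-encode the image to embed (hypothetical local file)
with open("graph.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Request body: text and/or image input plus the desired embedding length
body = {
    "inputText": "quarterly revenue line chart",
    "inputImage": image_b64,
    "embeddingConfig": {"outputEmbeddingLength": 1024},
}

response = bedrock_runtime.invoke_model(
    modelId="amazon.titan-embed-image-v1",
    body=json.dumps(body),
    contentType="application/json",
    accept="application/json",
)

# The response contains a single embedding vector, not a text answer
embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))  # 1024 floats for the combined text+image input
```

Note that the output is only a list of numbers; there is no field in the response that describes or answers questions about the image.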
Regarding your specific use case of analyzing graphs in images and answering questions about them:
The Titan Multimodal Embeddings G1 model is not suitable for this purpose. It doesn't provide direct question answering capabilities about image content. Instead, it generates vector embeddings that represent the semantic meaning of images and text, which are primarily useful for search and recommendation systems.
For your use case of analyzing a graph image and answering questions about it, you would need a generative multimodal model that can interpret visual content and generate text responses based on that interpretation. The Titan Multimodal Embeddings G1 model doesn't have this capability as it only produces embeddings, not text responses.
The model accepts text and image inputs and outputs vector embeddings of size 1,024 (default), 384, or 256 dimensions. These embeddings can be used for similarity matching but not for direct question answering about image content.
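If you do use the embeddings for search, similarity between a query embedding and stored image embeddings is typically measured with cosine similarity (in practice you would usually delegate this to a vector store such as Amazon OpenSearch Service rather than computing it by hand). A minimal sketch with toy vectors standing in for real Titan embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors; real Titan embeddings have 256, 384, or 1024 dimensions
query_embedding = [0.12, -0.03, 0.88, 0.41]
image_embedding = [0.10, -0.05, 0.79, 0.45]

print(cosine_similarity(query_embedding, image_embedding))  # close to 1.0 -> similar
```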
Sources
OfferUp improved local results by 54% and relevance recall by 27% with multimodal search on Amazon Bedrock and Amazon OpenSearch Service | AWS Machine Learning Blog
Amazon Titan Multimodal Embeddings G1 model - Amazon Bedrock
Amazon Titan Multimodal Embeddings G1 - Amazon Bedrock
