Understanding Language Support in AWS Personalize


Hello, we have recently decided to implement AWS Personalize for our e-commerce website. While going through the documentation, I noticed that only seven languages are listed as supported. I have a few questions regarding this:

When it mentions "support for language," does it imply that AWS Personalize has the capability to understand the content of the text? For example, would it relate products with different but synonymous descriptions together?

Or does it simply mean that only the text written in those supported languages is accepted? For instance, if I were to "Latinize" my data, would AWS Personalize still be able to establish relationships between products?

MJ2000
asked 10 months ago
296 views
1 Answer
Accepted Answer

Amazon Personalize's language support lets customers unlock the information trapped in their product descriptions, reviews, movie synopses, or other unstructured text to generate highly relevant recommendations for users. So, as you mentioned, it is about contextualizing the unstructured text in the dataset rather than just accepting text in those languages. You can see this in the reference documentation.
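To make "unstructured text in the dataset" concrete, here is a minimal sketch of an Items dataset schema with a free-text column flagged for text processing. It assumes, based on my reading of the Personalize schema documentation, that a string field is marked with `"textual": true`. The field names `CATEGORY` and `DESCRIPTION` and the helper `textual_fields` are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical Items schema. ITEM_ID is the standard key column; CATEGORY and
# DESCRIPTION are illustrative names, not taken from the original question.
items_schema = {
    "type": "record",
    "name": "Items",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "ITEM_ID", "type": "string"},
        {"name": "CATEGORY", "type": ["null", "string"], "categorical": True},
        # Assumption: "textual": true marks the column as unstructured text.
        {"name": "DESCRIPTION", "type": ["null", "string"], "textual": True},
    ],
    "version": "1.0",
}


def textual_fields(schema):
    """Return the names of the fields flagged as unstructured text."""
    return [f["name"] for f in schema["fields"] if f.get("textual")]


# Schemas are submitted as a JSON string, so serialize the dict.
schema_json = json.dumps(items_schema, indent=2)
print(textual_fields(items_schema))
```

The resulting JSON string is what you would supply as the schema when creating the dataset. Check the current documentation for the exact required fields before relying on this layout.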

AWS
answered 10 months ago
EXPERT
reviewed 10 months ago
EXPERT
reviewed 10 months ago
  • Thank you for the clarification. To confirm, does this mean that AWS Personalize is capable of extracting meaning from the provided unstructured text data? Additionally, since we do not currently have plans to translate our data into English, would it be advisable to simply Latinize the data and publish it in that form?

  • Yes, it uses NLP behind the scenes to extract key elements from the metadata. Can you elaborate on what you mean by Latinizing?

  • Apologies for any confusion caused by my previous posts. This is what I mean by Latinizing:

    In my language, the word for "sky" is represented by the characters "ცა." By Latinizing, I am referring to the process of replacing each of these characters with their Latin counterparts. In this particular example, the Latinized version of "ცა" would be "Tsa."

  • If your data includes any non-ASCII characters, your CSV file must be encoded in UTF-8: https://docs.aws.amazon.com/personalize/latest/dg/data-prep-formatting.html

    So I don't think you need to Latinize your data.
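As a rough illustration of the UTF-8 requirement above, the following sketch serializes item rows containing Georgian text to CSV bytes without any Latinization. The column names, the sample rows, and the helper `to_utf8_csv_bytes` are assumptions made for this example, not part of any Personalize API.

```python
import csv
import io

# Hypothetical item rows; the description stays in its original script.
rows = [
    {"ITEM_ID": "1", "DESCRIPTION": "ცა"},
    {"ITEM_ID": "2", "DESCRIPTION": "blue sky poster"},
]


def to_utf8_csv_bytes(rows, fieldnames):
    """Serialize rows to CSV text, then encode the result as UTF-8 bytes."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


data = to_utf8_csv_bytes(rows, ["ITEM_ID", "DESCRIPTION"])

# Round trip: decoding as UTF-8 recovers the original characters unchanged.
decoded = data.decode("utf-8")
print(decoded)
```

When writing straight to disk, opening the file with `open(path, "w", encoding="utf-8", newline="")` has the same effect.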
