Amazon Personalize Data Requirements


A customer wants to make video recommendations from a small dataset of just 150 videos, with metadata on each video. They do not yet have any user interaction data, because the videos haven't been released to their customers.

  • Can they still use Personalize to begin making initial recommendations, or do they need interaction data first (the minimum of 1,000 interactions from at least 25 users, with at least 2 interactions each)?
    • For example, when a user finishes watching a video, could they recommend the next video based on similar genres or other metadata of the videos themselves? They would then integrate the user interaction data generated over time to make stronger recommendations.

Ultimately, I'm wondering if this lack of user interaction data is a blocker to utilizing Personalize or if it simply will result in weaker recommendations. If it's a blocker to Personalize, is there a better solution for basic recommendations for a team without any ML experience?

AWS
asked 4 years ago, 294 views
1 Answer
Accepted Answer

I recently helped a customer in a very similar situation. They had a small set of interactions, but not nearly enough for Personalize. When we tried Personalize anyway, it was clear that it had fallen back to its Popularity-Count recipe (which happens when the other recipes can't provide good recommendations).
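To judge whether a dataset is "enough", a quick check against the minimums quoted in the question (1,000 interactions, 25 users, 2 interactions per user) can help. This is only a minimal sketch: the function name and the `(user_id, item_id)` input shape are my own assumptions, and the thresholds should be confirmed against the current Personalize documentation.

```python
from collections import Counter

# Thresholds as quoted in the question; treat them as assumptions and
# confirm them against the current Amazon Personalize documentation.
MIN_INTERACTIONS = 1000
MIN_USERS = 25
MIN_INTERACTIONS_PER_USER = 2


def meets_minimums(interactions):
    """Summarize an iterable of (user_id, item_id) pairs against the thresholds.

    Hypothetical helper for illustration; it is not part of any AWS SDK.
    """
    rows = list(interactions)
    per_user = Counter(user_id for user_id, _ in rows)
    # Only users with enough interactions count toward the user minimum.
    qualifying_users = sum(
        1 for count in per_user.values() if count >= MIN_INTERACTIONS_PER_USER
    )
    return {
        "total_interactions": len(rows),
        "qualifying_users": qualifying_users,
        "enough_interactions": len(rows) >= MIN_INTERACTIONS,
        "enough_users": qualifying_users >= MIN_USERS,
    }
```

Passing clearing these minimums does not guarantee good recommendations; it only indicates whether the dataset reaches the stated floor.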

The customer did have metadata available. It wasn't great quality, but it at least provided a bigger dataset than the interactions. I built a prototype with their small interaction dataset and their larger metadata dataset using the LightFM package, a hybrid matrix factorization model that can use item metadata alongside interactions. The results were promising, despite the bad data quality.

In the case of this customer, their business model meant they would never be able to gather enough interaction data for Personalize, so they are exploring the option of expanding upon the prototype I built. However, if your customer would be able to gather sufficient interaction data over time, then it is likely that Personalize would eventually provide better results. It could be worthwhile implementing an interim solution which places more emphasis on metadata, while gathering data to eventually move to Personalize.
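As a rough illustration of what a metadata-only interim solution could look like, here is a minimal sketch that ranks videos by genre overlap (Jaccard similarity) with the video just watched. This is not the LightFM prototype described above, just a simpler stand-in; the catalog, tag values, and function names are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_videos(video_id, catalog, top_k=3):
    """Return up to top_k other video IDs ranked by tag overlap.

    catalog maps video_id -> iterable of metadata tags (assumed shape).
    Videos with no tags in common are left out.
    """
    target = catalog[video_id]
    scored = [
        (jaccard(target, tags), other)
        for other, tags in catalog.items()
        if other != video_id
    ]
    # Highest similarity first; ties broken by video ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored[:top_k] if score > 0]


# Hypothetical catalog for illustration only.
catalog = {
    "v1": {"comedy", "short"},
    "v2": {"comedy", "short", "family"},
    "v3": {"drama"},
    "v4": {"comedy"},
}
next_up = similar_videos("v1", catalog)
```

With 150 videos this brute-force comparison is cheap, and it needs no interaction data at all, so it can run from day one while interactions are being collected.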

AWS
S_Moose
answered 4 years ago
