Contextual MetaData

We created an interactions dataset with an additional "TIME_OF_DAY" column. This column takes values "morning", "afternoon", "evening" or "night". The idea is to make recommendations time-period dependent by specifying the TIME_OF_DAY in the context.

The schema of the interactions dataset is below. From what I understand from the AWS documentation, such a context column is used by the "Top picks for you" and "Personalized-Ranking-v2" recommenders (https://docs.aws.amazon.com/personalize/latest/dg/contextual-metadata.html).

It seems to work for the "Top picks for you" recommender: The "Top picks for you" test page on the AWS console does not seem to allow specifying a context. However, if I request recommendations via the dotnet "AWSSDK", I do get different recommendations when I specify different values for the TIME_OF_DAY context.

It does not seem to work for "Personalized-Ranking-v2": The "Personalized-Ranking-v2" recommender (same dataset) allows specifying a context on the test page. However, the context value seems to be ignored because we always get the same result.

We tried different User IDs. Different rankings are returned for different User IDs, but specifying different Context values has no effect. Also when using the dotnet "AWSSDK" the TIME_OF_DAY context does not seem to have any effect for the "Personalized-Ranking-v2" recommender.

Questions: Is this the 'normal' way for making recommendations time-dependent? Is there something I'm doing wrong? Is there some way (e.g. logging) to find out whether a context parameter is used during recommendations (or why it is ignored)?

{
  "fields": [
    { "name": "USER_ID", "type": "string" },
    { "name": "ITEM_ID", "type": "string" },
    { "name": "TIMESTAMP", "type": "long" },
    { "name": "EVENT_TYPE", "type": "string" },
    { "name": "EVENT_VALUE", "type": [ "float", "null" ] },
    { "categorical": true, "name": "TIME_OF_DAY", "type": "string" }
  ],
  "name": "Interactions",
  "namespace": "com.amazonaws.personalize.schema",
  "type": "record",
  "version": "1.0"
}

asked a year ago, 75 views
1 Answer
Hi Eric,

Clarifying the Issue

The question is how contextual metadata, specifically the "TIME_OF_DAY" column, behaves in two different Amazon Personalize resources. Let’s take the questions in order:


1. Is this the 'normal' way for making recommendations time-dependent?

Yes, adding contextual metadata like "TIME_OF_DAY" to your interactions dataset is the recommended way to make recommendations context-aware in Amazon Personalize. Context columns let the model take additional factors into account, such as the time of day or the device a user is on.

However, contextual metadata works differently across recommenders:

  • "Top picks for you" (a VIDEO_ON_DEMAND domain recommender): Optimizes for item relevance based on user preferences and historical interactions. It can use contextual metadata at prediction time, as you’ve observed with the AWSSDK.

  • "Personalized-Ranking-v2" (strictly speaking a custom recipe served through a campaign, not a domain recommender): Re-ranks a given list of items for a specific user. It supports contextual metadata, but how much the context changes the ranking depends on how strongly the context correlates with item choice in the training interactions.

Thus, your approach is correct, but the effectiveness of the context column depends on the recommender type and your dataset's patterns.


2. Is there something I'm doing wrong?

It seems unlikely that your implementation is wrong, but here are areas to verify:

a. Context Column Training Verification

Ensure the context column (TIME_OF_DAY) was included during the model training. If the context data wasn’t properly incorporated, the model will ignore it during inference. Double-check the following:

  1. Schema Definition: Confirm that "TIME_OF_DAY" was defined as a categorical column in the schema, as you’ve shown.
  2. Data Completeness: Ensure the "TIME_OF_DAY" field has no missing or null values in the dataset. Missing data can reduce the impact of the context during training.
  3. Training Dataset Upload: Check that the training job used the latest schema and dataset with the "TIME_OF_DAY" column.
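
The completeness check in point 2 can be run on the interactions CSV before importing it. Below is a minimal sketch in Python, assuming the file has a header row with a TIME_OF_DAY column and that the four values from the question are the only valid ones. The function name and the in-memory sample are illustrative, not part of any AWS tooling.

```python
import csv
import io
from collections import Counter

# Values taken from the question; adjust if your dataset uses others.
EXPECTED = {"morning", "afternoon", "evening", "night"}

def summarize_context(csv_file, column="TIME_OF_DAY"):
    """Count values in the context column and flag missing or unexpected ones."""
    counts, missing, unexpected = Counter(), 0, Counter()
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if not value:
            missing += 1
        elif value in EXPECTED:
            counts[value] += 1
        else:
            unexpected[value] += 1
    return {"counts": dict(counts), "missing": missing, "unexpected": dict(unexpected)}

# Tiny in-memory sample standing in for the real interactions file.
sample = io.StringIO(
    "USER_ID,ITEM_ID,TIMESTAMP,EVENT_TYPE,TIME_OF_DAY\n"
    "u1,i1,1700000000,watch,morning\n"
    "u1,i2,1700000100,watch,\n"
    "u2,i1,1700000200,watch,Evening\n"
    "u2,i3,1700000300,watch,night\n"
)
report = summarize_context(sample)
print(report)
```

A non-zero "missing" count or entries under "unexpected" (such as a differently capitalized "Evening") would be worth cleaning up before retraining.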

b. Interaction Dataset Size and Patterns

Amazon Personalize models require sufficient interaction data to learn meaningful patterns. For "TIME_OF_DAY" to have an impact, ensure that:

  • There are enough interactions spread across all time periods (morning, afternoon, etc.).
  • The "TIME_OF_DAY" context influences item preferences in a way that the model can detect (e.g., users exhibit distinct item preferences during different time periods).
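
One rough way to check the second point is to compare the most popular items per time period. The sketch below is a heuristic of my own, not something Personalize computes: it takes (item, period) pairs, finds the top items for each period, and reports the Jaccard overlap between periods. Overlaps close to 1.0 suggest that TIME_OF_DAY carries little signal for the model to pick up.

```python
from collections import Counter, defaultdict
from itertools import combinations

def top_items_by_period(interactions, n=3):
    """interactions: iterable of (item_id, time_of_day). Returns {period: set of top-n items}."""
    per_period = defaultdict(Counter)
    for item_id, period in interactions:
        per_period[period][item_id] += 1
    return {p: {item for item, _ in c.most_common(n)} for p, c in per_period.items()}

def jaccard(a, b):
    """Overlap of two sets: 1.0 means identical, 0.0 means disjoint."""
    return len(a & b) / len(a | b) if a | b else 1.0

def period_overlaps(interactions, n=3):
    """Jaccard overlap of the top-n item sets for every pair of periods."""
    tops = top_items_by_period(interactions, n)
    return {(p, q): jaccard(tops[p], tops[q]) for p, q in combinations(sorted(tops), 2)}

# Illustrative sample: "news" is popular in both periods, the other items are not.
sample = [
    ("news", "morning"), ("news", "morning"), ("podcast", "morning"),
    ("movie", "evening"), ("movie", "evening"), ("news", "evening"),
]
overlaps = period_overlaps(sample, n=2)
print(overlaps)
```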

c. Context Inclusion in Inference

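When calling GetPersonalizedRanking, make sure the "TIME_OF_DAY" context is actually attached to the request's Context property, and that this request is the one you pass to GetPersonalizedRankingAsync. For example:

// campaignArn, userId and candidateItemIds are placeholders for your own values.
var request = new GetPersonalizedRankingRequest
{
    CampaignArn = campaignArn,
    UserId = userId,
    InputList = candidateItemIds,
    Context = new Dictionary<string, string> { { "TIME_OF_DAY", "evening" } }
};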
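
To test systematically whether the context changes anything, you can request a ranking for the same user and item list once per context value and group the values that return identical orderings. The sketch below uses Python rather than dotnet so it can run standalone. `compare_context_rankings` and the stand-in ranker are hypothetical helpers. `personalize_rank_fn` wraps boto3's `get_personalized_ranking` call and is not executed here because it needs AWS credentials and a real campaign ARN.

```python
def compare_context_rankings(rank_fn, user_id, item_ids, context_values, key="TIME_OF_DAY"):
    """Call rank_fn once per context value and group values that yield identical orderings."""
    groups = {}
    for value in context_values:
        ranking = tuple(rank_fn(user_id, item_ids, {key: value}))
        groups.setdefault(ranking, []).append(value)
    return list(groups.values())

def personalize_rank_fn(campaign_arn):
    """Build a rank_fn backed by boto3's personalize-runtime client (needs AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without AWS access
    client = boto3.client("personalize-runtime")

    def rank(user_id, item_ids, context):
        response = client.get_personalized_ranking(
            campaignArn=campaign_arn, userId=user_id, inputList=item_ids, context=context
        )
        return [item["itemId"] for item in response["personalizedRanking"]]

    return rank

# Stand-in ranker for illustration: it ignores context, like the behaviour in the question.
def context_blind_rank(user_id, item_ids, context):
    return sorted(item_ids)

groups = compare_context_rankings(
    context_blind_rank, "user-1", ["c", "a", "b"], ["morning", "afternoon", "evening", "night"]
)
print(groups)  # a single group holding all four values means context had no visible effect
```

Running this against the real campaign for a handful of users and candidate lists gives a quick picture of whether any context value ever changes the order.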

3. Is there a way to find out whether a context parameter is used (or why it is ignored)?

Unfortunately, as far as I know, Amazon Personalize does not provide direct logging that shows how context is used during recommendations. However, you can troubleshoot this using the following strategies:

a. Examine Model Metrics

  • Go to the Solution Version Metrics in the AWS console.
  • Check whether metrics like precision@K and normalized discounted cumulative gain (nDCG) improved after including the "TIME_OF_DAY" column. If there’s no improvement, the model may not be effectively leveraging the context.

b. Feature Importance Testing

One way to verify context usage is to retrain the model without the context column and compare its recommendations and metrics to the original model. If the removal significantly degrades performance, the context was likely being utilized.
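
The comparison can be as simple as subtracting the offline metrics of the two solution versions. A minimal sketch follows, assuming both models are custom solution versions so that boto3's `get_solution_metrics` applies. `fetch_metrics` is not called here, and the numbers are made up purely to show the shape of the output.

```python
def metric_deltas(with_context, without_context):
    """Return with_context minus without_context for every metric present in both dicts."""
    shared = with_context.keys() & without_context.keys()
    return {name: round(with_context[name] - without_context[name], 6) for name in sorted(shared)}

def fetch_metrics(solution_version_arn):
    """Fetch offline metrics via boto3 (needs AWS credentials; not called in this sketch)."""
    import boto3  # imported lazily so the comparison runs without AWS access
    client = boto3.client("personalize")
    return client.get_solution_metrics(solutionVersionArn=solution_version_arn)["metrics"]

# Made-up numbers, not real results.
with_ctx = {"precision_at_5": 0.12, "normalized_discounted_cumulative_gain_at_5": 0.20}
without_ctx = {"precision_at_5": 0.10, "normalized_discounted_cumulative_gain_at_5": 0.20}
deltas = metric_deltas(with_ctx, without_ctx)
print(deltas)
```

Small differences can be noise between training runs, so treat the deltas as a hint rather than proof that context is or is not being used.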

c. Test Synthetic Data

To isolate the issue, try creating a small synthetic dataset with clear "TIME_OF_DAY" patterns (e.g., certain items are highly relevant only in the evening). Train a model with this dataset and test its sensitivity to the "TIME_OF_DAY" context.
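
A synthetic dataset of this kind could be generated as follows. This is a sketch under stated assumptions: the columns mirror the schema in the question, each time period gets its own exclusive pool of five items, and the default of 50 users with 40 events each is chosen to stay above the documented minimums as I recall them (1,000 interactions from at least 25 users). Item names, user names and the base timestamp are invented.

```python
import csv
import io
import random

PERIODS = ["morning", "afternoon", "evening", "night"]
COLUMNS = ["USER_ID", "ITEM_ID", "TIMESTAMP", "EVENT_TYPE", "EVENT_VALUE", "TIME_OF_DAY"]

def synthetic_interactions(n_users=50, events_per_user=40, seed=0):
    """Yield rows in which every item is only ever interacted with in one period."""
    rng = random.Random(seed)
    base_ts = 1_700_000_000  # arbitrary starting timestamp
    for u in range(n_users):
        for e in range(events_per_user):
            period = rng.choice(PERIODS)
            yield {
                "USER_ID": f"user-{u}",
                "ITEM_ID": f"{period}-item-{rng.randrange(5)}",  # pool depends only on period
                "TIMESTAMP": base_ts + u * events_per_user + e,
                "EVENT_TYPE": "watch",
                "EVENT_VALUE": "",  # nullable in the schema, left empty here
                "TIME_OF_DAY": period,
            }

def to_csv(rows):
    """Serialize rows to CSV text with a header matching the schema columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

rows = list(synthetic_interactions())
csv_text = to_csv(rows)
print(len(rows), csv_text.splitlines()[0])
```

If a model trained on this data still returns the same ranking for every TIME_OF_DAY value, the problem is more likely in how the context is trained or passed than in the patterns of your real data.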


4. Why does the context work with "Top Picks for You" but not with "Personalized-Ranking-v2"?

The discrepancy likely arises because of how each recommender utilizes contextual metadata:

  • "Top Picks for You": Designed to predict item relevancy, so it uses the context column to identify time-dependent item preferences.

  • "Personalized-Ranking-v2": Re-ranks a given list of items for a user, so context can only reorder the candidates you pass in. If those candidates show little time-dependent preference in the training data, the ranking may barely change. This recipe primarily leverages the user-item interaction history, which could overshadow context effects.


Recommendations for Moving Forward

  1. Validate the Training Data: Ensure the context column was included and had sufficient variation during training.
  2. Test Synthetic Context Effects: Create a small, controlled dataset to test whether the recommender responds to context changes.
  3. Analyze Metrics: Check the model metrics to see if including "TIME_OF_DAY" improved results.
  4. Experiment with Alternate Context Columns: If "TIME_OF_DAY" is not impactful, consider combining it with other context features (e.g., DEVICE_TYPE).

Cheers, Aaron 🚀

answered a year ago
