SageMaker Feature Store Lost Updates?

With AWS SageMaker Feature Store, is there a risk of accidentally overwriting fields that were not meant to be changed? How is this typically prevented?

Example: A feature set stores data on cars. Request A intends to update the price for Honda Accord 2025s to $40,000. It fetches the latest version in the feature set for Honda Accord, where the price was $45,000 and the gas mileage was 40 mpg. At almost the same time, Request B starts. It intends to update the gas mileage to 45 mpg. It fetches the same latest record, with price $45,000 and mileage 40 mpg.

Request A writes first. Request B then writes, but it doesn't know that Request A set the price to $40k. It resets the price to $45k while updating the mileage to 45 mpg.

asked 2 years ago, 160 views
1 Answer

Each row in a feature store should have a unique identifier. In Feature Store, features are stored in a collection called a feature group. You can think of a feature group as a table in which each column is a feature and each row has a unique identifier.

The record that ends up as the latest in the online store after the requests in your example depends on the event time values supplied with Requests A and B. As you may already know, the online store retains only the record with the latest event time, and that is the record the GetRecord API returns. If Request B had a newer event time than Request A, Request B's record is the one Feature Store retains as the latest. If both requests had the same event time, it's a race condition and there's no telling which one wins.
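To make the event time rule concrete, here is a minimal in-memory sketch of the scenario from the question. It does not call the Feature Store API: `OnlineStoreSim`, its `put_record` and `get_record` methods, the record ID, and the integer event times are all hypothetical stand-ins, and it assumes each request writes back the full record it read.

```python
from dataclasses import dataclass, field


@dataclass
class OnlineStoreSim:
    """Hypothetical toy model of an online store: one record per identifier."""
    records: dict = field(default_factory=dict)

    def put_record(self, record_id, event_time, features):
        current = self.records.get(record_id)
        # Keep the incoming record only if its event time is strictly newer.
        # Equal event times are NOT modeled faithfully: this toy keeps the
        # existing record, whereas the real outcome is described as a race.
        if current is None or event_time > current["event_time"]:
            self.records[record_id] = {"event_time": event_time, **features}

    def get_record(self, record_id):
        return dict(self.records[record_id])


store = OnlineStoreSim()
store.put_record("honda-accord-2025", 1, {"price": 45000, "mpg": 40})

# Both requests read the same snapshot before either one writes.
snapshot_a = store.get_record("honda-accord-2025")
snapshot_b = store.get_record("honda-accord-2025")

# Request A changes only the price, Request B only the mileage,
# but each writes back the whole record it read earlier.
store.put_record("honda-accord-2025", 2, {"price": 40000, "mpg": snapshot_a["mpg"]})
store.put_record("honda-accord-2025", 3, {"price": snapshot_b["price"], "mpg": 45})

latest = store.get_record("honda-accord-2025")
print(latest)  # {'event_time': 3, 'price': 45000, 'mpg': 45}
```

In this sketch Request B has the newer event time, so its record wins and carries the stale $45,000 price with it. The event time decides which whole record is kept; it does not merge individual fields.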

The offline store is different. Every PutRecord request that you submitted successfully will eventually appear in the offline store, so the records for both Request A and Request B will be there when you query it later.
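The append-only behavior of the offline store can be pictured with a similar toy model. Again, this is only a sketch under assumptions: `offline_log`, the `put_record` helper, the record ID, and the integer event times are hypothetical and are not the Feature Store API or its actual storage layout.

```python
# Hypothetical stand-in for the offline store: every successful write is appended.
offline_log = []


def put_record(record_id, event_time, features):
    """Append the record; nothing is ever overwritten in this toy model."""
    offline_log.append({"id": record_id, "event_time": event_time, **features})


put_record("honda-accord-2025", 1, {"price": 45000, "mpg": 40})  # original record
put_record("honda-accord-2025", 2, {"price": 40000, "mpg": 40})  # Request A
put_record("honda-accord-2025", 3, {"price": 45000, "mpg": 45})  # Request B

# Querying later returns the full history, ordered here by event time.
history = sorted(
    (row for row in offline_log if row["id"] == "honda-accord-2025"),
    key=lambda row: row["event_time"],
)
for row in history:
    print(row)
```

Request A's $40,000 price is still visible in the history even though Request B's record is the newest, so the overwrite can at least be detected after the fact.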

AWS
answered 2 years ago
