As I understand it, you want Glue to reuse the same match_id values across subsequent FindIncrementalMatches runs on incremental data.
Please note that a match_id is an arbitrary identifier that labels your data: it marks the records that your ML transform predicts to be matches. With incremental data, the dataset changes over time as records are added or modified, so the machine learning model has more candidate rows to consider when deciding the pairing. Because labeling starts from scratch, a run can produce new match_id values, as in the example you provided. Unfortunately, Glue does not currently offer an option to enforce previously generated match_id values in subsequent FindIncrementalMatches runs on incremental data.
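Since a match_id is only a label, one possible workaround is to relabel the new output yourself after the run, outside of Glue. The following is a minimal sketch in plain Python, not a Glue feature: the function name `carry_over_match_ids`, the `record_id -> match_id` dictionaries, and the `new-` prefix are all assumptions made for illustration. It maps each new group to the old match_id that most of its already-known records had, and gives a fresh label to groups that contain only new records.

```python
from collections import Counter, defaultdict


def carry_over_match_ids(previous, current):
    """Relabel `current` so that groups reuse match_ids from `previous`.

    Both arguments are hypothetical dicts of record_id -> match_id (strings),
    e.g. built from the output of two runs. Returns record_id -> match_id.
    """
    # For every new group, count which old match_ids its known records had.
    votes = defaultdict(Counter)
    for record_id, new_id in current.items():
        old_id = previous.get(record_id)
        if old_id is not None:
            votes[new_id][old_id] += 1

    # Give each new group its most common old match_id, using each old id once.
    mapping = {}
    used = set()
    for new_id in sorted(votes):
        for old_id, _count in votes[new_id].most_common():
            if old_id not in used:
                mapping[new_id] = old_id
                used.add(old_id)
                break

    # Groups without a reusable old id keep their new id, marked with a prefix.
    return {
        record_id: mapping.get(new_id, f"new-{new_id}")
        for record_id, new_id in current.items()
    }


previous_run = {"a": "m1", "b": "m1", "c": "m2"}
current_run = {"a": "x7", "b": "x7", "c": "x8", "d": "x7", "e": "x9"}
stable = carry_over_match_ids(previous_run, current_run)
```

Here records `a`, `b`, and the new record `d` end up under the old label `m1`, `c` keeps `m2`, and `e`, which matches nothing from the earlier run, gets a fresh label. Note that if an old group splits into two new groups, only one of them can keep the old match_id, so this is a best-effort relabeling rather than a guarantee.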
Thank you for the response. I have a follow-up question: FindIncrementalMatches has a field with the key "enforcedMatches". What is the purpose of this field?
It is mentioned in https://docs.aws.amazon.com/glue/latest/dg/glue-etl-scala-apis-glue-ml-findincrementalmatches.html, but there is no implementation example for it.