Time series feature extraction in AWS

Hello everyone!

I'm seeking advice on designing an AWS architecture, specifically for the feature store process. I'm currently in the prototyping phase and using the tsfresh library for feature extraction, and my goal is to move this step into a deployment pipeline on AWS. If anyone has experience with tsfresh, I would greatly appreciate recommendations on the most suitable AWS services.

I've considered Lambda and Glue, but both have limitations that may make them a poor fit. Glue has a slow startup time and makes it difficult to install additional libraries, so I'm unsure it is the right choice for tsfresh. Lambda, on the other hand, has a payload size limit. I've attached the AWS architecture I'm planning to deploy. For now, I'm looking for an easy, fast way to deploy and validate the process, even if it is not the most optimal one.

Thank you in advance!

[Image: planned AWS architecture]

Ali
asked 10 months ago, 232 views
1 Answer

tsfresh is not a library built for Spark, so it won't distribute the processing, which defeats the point of using a Glue ETL cluster.
There is a middle option: with a Glue Python shell job you can run single-process, native Python libraries as you would in Lambda, but with more resources and without Lambda's short execution time limit.
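
To make the Python shell suggestion concrete, here is a minimal sketch of how such a job could be registered with boto3. Everything named here is an assumption for illustration: the helper names `build_job_definition` and `create_job`, the 120-minute timeout, and whatever job name, role ARN, and script location you pass in. It also assumes a Python 3.9 shell job, where the `--additional-python-modules` argument should let Glue install tsfresh with pip at startup; please check the current Glue documentation before relying on that.

```python
"""Sketch: register a Glue Python shell job that runs a tsfresh script.

All names and values are illustrative assumptions, not a tested deployment.
"""


def build_job_definition(name, role_arn, script_s3_uri, max_capacity=1.0):
    """Return keyword arguments for boto3's glue.create_job call."""
    # Python shell jobs run on a fraction of a DPU or on one full DPU.
    if max_capacity not in (0.0625, 1.0):
        raise ValueError("Python shell jobs accept 0.0625 or 1.0 DPU")
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "pythonshell",  # single-process job, no Spark cluster
            "ScriptLocation": script_s3_uri,
            "PythonVersion": "3.9",
        },
        "DefaultArguments": {
            # Assumption: pip-installs tsfresh when the job starts.
            "--additional-python-modules": "tsfresh",
        },
        "MaxCapacity": max_capacity,
        "Timeout": 120,  # minutes; an arbitrary value for this sketch
    }


def create_job(definition):
    """Create the job in Glue; needs AWS credentials and boto3 installed."""
    import boto3  # imported lazily so the builder works without boto3

    glue = boto3.client("glue")
    return glue.create_job(**definition)
```

The script at the given S3 location would then be an ordinary Python file that loads the time series into a pandas DataFrame and calls `tsfresh.extract_features` on it. Keeping the definition in a plain function makes it easy to inspect before anything is created in your account.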

AWS
EXPERT
answered 10 months ago
