Time series feature extraction in AWS

Hello everyone!

I'm seeking advice on architecture design in AWS, specifically regarding the feature store process. I'm currently in the prototyping phase and using the tsfresh library for feature extraction. My goal is to incorporate this step into a deployment pipeline on AWS. If anyone has experience with tsfresh, I would greatly appreciate your recommendations on the most suitable AWS services to use.

I've considered Lambda functions and Glue, but both have limitations that may make them a poor fit for my needs. Here's the AWS architecture I'm planning to deploy. I'm unsure whether Glue is the right choice for tsfresh because of its slow boot time and the difficulty of installing additional libraries. Lambda, on the other hand, has a payload size limit. For now, I'm looking for an easy and fast deployment option to validate the process, even if it is not the most optimal one.

Thank you in advance!

[Image: planned AWS architecture diagram]

Ali
asked 10 months ago · 220 views
1 Answer
tsfresh is not a library built for Spark, so it won't distribute the processing, which defeats the point of using a Glue ETL cluster.
There is a middle option: with a Glue Python shell job you can run single-process, native Python libraries much as you would in Lambda, but with more resources and without Lambda's 15-minute time limit.
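
As a rough illustration of the Python shell approach, the script below is a minimal sketch of what such a job could look like, not a tested deployment. It assumes that tsfresh, pandas, and boto3 are available in the job environment, that the input is a single CSV file in S3 in long format, and that the column names `id` and `time` and the S3 URIs are placeholders to replace with your own.

```python
"""Sketch of a script for a Glue Python shell job running tsfresh in one process."""
import io
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def run(input_uri, output_uri, id_col="id", time_col="time"):
    """Read a long-format CSV from S3, extract features, write them back as CSV.

    The column names are assumptions for illustration only.
    """
    # Imported here so the helper above can be used without these packages.
    import boto3
    import pandas as pd
    from tsfresh import extract_features

    s3 = boto3.client("s3")

    bucket, key = split_s3_uri(input_uri)
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    df = pd.read_csv(io.BytesIO(body))

    # Runs on a single machine; no Spark cluster is involved.
    features = extract_features(df, column_id=id_col, column_sort=time_col)

    out_bucket, out_key = split_s3_uri(output_uri)
    buffer = io.StringIO()
    features.to_csv(buffer)
    s3.put_object(
        Bucket=out_bucket,
        Key=out_key,
        Body=buffer.getvalue().encode("utf-8"),
    )
    return features.shape


if __name__ == "__main__":
    # Placeholder URIs; replace with real locations before running.
    run("s3://my-bucket/raw/series.csv", "s3://my-bucket/features/features.csv")
```

Because the data is read from and written to S3 rather than passed in the invocation, the Lambda payload limit does not come into play, although the whole dataset still has to fit in the job's memory.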

AWS
EXPERT
answered 10 months ago
