Time series feature extraction in AWS


Hello everyone!

I'm seeking advice on an architecture design in AWS, specifically regarding the feature store process. I'm currently in the prototyping phase and use the tsfresh library for feature extraction. My goal is to move this step into a deployment pipeline on AWS. If anyone has experience with tsfresh, I would greatly appreciate recommendations on the most suitable AWS services. I've considered Lambda and Glue, but both have limitations that may make them a poor fit: Glue has a slow startup time and makes it difficult to install additional libraries, while Lambda has a payload size limit. The AWS architecture I'm planning to deploy is shown in the image below. For now, I'm looking for an easy and fast way to deploy and validate the process, even if it is not the optimal one.

Thank you in advance!

[Image: planned AWS architecture diagram]

Ali
Asked 10 months ago · 232 views
1 Answer

tsfresh is not a library built for Spark, so it won't distribute the processing, which defeats the point of using a Glue ETL cluster.
There is a middle option: with a Glue Python shell job you can run single-process native Python libraries, as you would in Lambda, but with more resources and without Lambda's 15-minute execution limit.
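
As a rough illustration of that middle option, here is a minimal sketch of how such a job could be defined with boto3 and what the script body might look like. The job name, role ARN, script location, and the `id`/`time` column names are hypothetical placeholders, and installing tsfresh through `--additional-python-modules` assumes a Python 3.9 shell job; treat this as a starting point under those assumptions, not a verified setup.

```python
from typing import Any, Dict


def build_job_definition(name: str, role_arn: str, script_s3_uri: str) -> Dict[str, Any]:
    """Build keyword arguments for boto3's glue.create_job (Python shell job).

    All three parameters are placeholders supplied by the caller.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "pythonshell",  # Python shell job, not a Spark ETL job
            "ScriptLocation": script_s3_uri,
            "PythonVersion": "3.9",
        },
        "DefaultArguments": {
            # Assumption: tsfresh is installed from PyPI when the job starts
            "--additional-python-modules": "tsfresh",
        },
        # Python shell jobs accept 0.0625 or 1 DPU
        "MaxCapacity": 1.0,
    }


def create_job(definition: Dict[str, Any]) -> Dict[str, Any]:
    """Create the Glue job; boto3 is imported lazily so the builder has no AWS dependency."""
    import boto3

    return boto3.client("glue").create_job(**definition)


def extract(df):
    """Script-side sketch: run tsfresh on a long-format pandas DataFrame.

    Assumes hypothetical columns 'id' (series identifier) and 'time' (ordering).
    """
    from tsfresh import extract_features
    from tsfresh.feature_extraction import MinimalFCParameters

    return extract_features(
        df,
        column_id="id",
        column_sort="time",
        # Small feature set to keep a first validation run fast
        default_fc_parameters=MinimalFCParameters(),
    )
```

To try it, pass the result of `build_job_definition(...)` to `create_job(...)` with your own role and script location. `MinimalFCParameters` keeps the first run cheap; switching to the full feature set is a one-line change once the pipeline is validated.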

AWS
Expert
Answered 10 months ago
