Streaming data from S3 to containers launched from AWS Batch


For processing genomics datasets in containers launched from AWS Batch, is it possible to stream the data directly from S3, using an approach similar to SageMaker pipe mode (or fast file mode)?

Also, are there any hardware limitations that I should be aware of? For example, according to this article, the maximum transfer speed between EC2 and S3 is 100 Gbps (VPC endpoints and public IPs in the same region). A 2018 article stated the bandwidth between EC2 and S3 as 25 Gbps.

1 Answer

Hello,

Thanks for reaching out to AWS re:Post. I understand that you want to stream data directly from S3. This is possible by using S3 together with Amazon OpenSearch Service.

OpenSearch is great for real-time application monitoring, log analytics, and website search, and you can use it with S3 to stream data. The first link at the end of this response shows an example of this [1].

I also recommend looking at Amazon FSx, which is well suited to genomics workflows and offers the scalability and data availability of S3 [2]. AWS also provides a series on how to build genomics batch workflows on AWS; for more insight, I recommend reading it [3].
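For reference, a container can also read an S3 object as a stream through the AWS SDK instead of downloading it first. The following is only a minimal sketch and is not taken from the linked articles. It assumes boto3 is installed in the container image, that the job role can read the object, and that the input is an uncompressed FASTQ file with four lines per record. The helper names (`iter_lines`, `count_fastq_reads`, `s3_chunks`) are hypothetical.

```python
from typing import Iterable, Iterator


def iter_lines(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Reassemble newline-delimited lines from arbitrarily sized byte chunks."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; keep the tail for later.
        *complete, pending = pending.split(b"\n")
        yield from complete
    if pending:
        yield pending  # final line without a trailing newline


def count_fastq_reads(chunks: Iterable[bytes]) -> int:
    """Count FASTQ records (assumed 4 lines each) without buffering the whole file."""
    return sum(1 for _ in iter_lines(chunks)) // 4


def s3_chunks(bucket: str, key: str, chunk_size: int = 8 * 1024 * 1024) -> Iterator[bytes]:
    """Yield an S3 object's bytes chunk by chunk as they arrive."""
    import boto3  # assumption: boto3 is available in the container image

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    yield from body.iter_chunks(chunk_size=chunk_size)
```

With placeholder names, a job could call `count_fastq_reads(s3_chunks("my-genomics-bucket", "sample.fastq"))`. Keeping the parsing separate from the S3 call means the same code can be exercised against in-memory chunks before it runs in Batch.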

Please contact us if you have any further questions, and feel free to open a support case to discuss the specifics of your resources.

[1] Streaming Data Example - https://docs.aws.amazon.com/solutions/latest/text-analysis-with-amazon-opensearch-service-and-amazon-comprehend/example-of-streaming-data-from-s3.html

[2] Amazon FSx - https://aws.amazon.com/blogs/storage/using-amazon-fsx-for-lustre-for-genomics-workflows-on-aws/

[3] Genomics Batch Workflows on AWS - https://aws.amazon.com/blogs/compute/building-high-throughput-genomics-batch-workflows-on-aws-introduction-part-1-of-4/

answered 2 years ago
