Removing Hudi jars from EMR Serverless Image

Hi Team,

We are using the EMR Serverless image v7.0.0.0 for data ingestion activities. However, this image contains Hudi-related jars that have a few vulnerabilities. Is it possible to remove these jars from the image, and what would be the impact of removing them on normal Spark jobs? We are not using Hudi tables right now.

Asked a month ago, 45 views
1 Answer

Hello,

Thank you for writing on re:Post.

I see that you want to know about the vulnerabilities present in the Hudi jar that ships with the EMR Serverless 7.0 image. The EMR team is aware of these vulnerabilities and is working with the Hudi project to remove them in upcoming image releases.

However, since you do not use Hudi, you can remove the jars by using a custom image. Please follow the document [+] to set up a custom image, and remove the dependency in the Dockerfile, for example:

# Dockerfile
FROM public.ecr.aws/emr-serverless/spark/emr-7.0.0:latest

USER root

# Remove the unused Hudi jar (-f so the build does not fail if it is already absent)
RUN rm -f /usr/lib/hudi/hudi-aws-bundle-0.14.0-amzn-1.jar

# EMR Serverless runs the image as the hadoop user
USER hadoop:hadoop

[+] https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/application-custom-image.html

You can test this on your setup. Feel free to reach back for further queries.
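
As a rough sketch of one such check, the shell helpers below list any Hudi jars still present under a directory and report whether a specific jar is gone. The function names `find_hudi_jars` and `check_jar_removed` are hypothetical, and the default path /usr/lib/hudi is only assumed from the Dockerfile above, so please verify it against your own image.

```shell
#!/bin/sh
# Hypothetical helpers for checking a custom image; not part of any AWS tooling.

# List Hudi jars under a directory (default assumed from the Dockerfile above).
find_hudi_jars() {
    dir="${1:-/usr/lib/hudi}"
    # A missing directory simply means there is nothing to report.
    [ -d "$dir" ] || return 0
    find "$dir" -type f -name 'hudi-*.jar' | sort
}

# Return 0 if the named jar is absent from the directory, 1 if it is still there.
check_jar_removed() {
    dir="$1"
    jar="$2"
    if [ -e "$dir/$jar" ]; then
        echo "still present: $dir/$jar"
        return 1
    fi
    echo "removed: $dir/$jar"
}
```

You could source these functions in a shell started inside a container built from the custom image and call, for example, `check_jar_removed /usr/lib/hudi hudi-aws-bundle-0.14.0-amzn-1.jar`. Running one of your regular Spark jobs against the custom image is still the real test of whether anything depends on the removed jar.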

I hope I was able to address your query.

Thank you!

AWS
Support Engineer
Answered a month ago
