Upgrade AWS Glue to use Hudi 0.14

AWS Glue 4.0 supports Apache Hudi version 0.12.1. What steps can I follow to upgrade Hudi to version 0.14 in AWS Glue 4.0?

AWS
Asked 2 years ago · 676 views
2 Answers

To upgrade AWS Glue 4.0 to use Hudi 0.14, follow the steps below:

  1. Download the Hudi 0.14 Spark bundle JAR, hudi-spark3.3-bundle_2.12-0.14.0.jar, from the Maven repository.
  2. Download the Spark Avro JAR, spark-avro_2.12-3.3.0.jar, from the Maven repository. Use the Scala 2.12 build so that it matches the Hudi bundle and the Spark 3.3 runtime in Glue 4.0.
  3. Upload both JARs to an S3 bucket.
  4. In the AWS Glue console, select the ETL job and open Job details.
  5. Under Advanced properties -> Libraries -> Dependent JARs path, enter the S3 URIs of both JARs, comma separated, e.g. s3://<bucket_name>/hudi/hudi-0.14-jars/hudi-spark3.3-bundle_2.12-0.14.0.jar,s3://<bucket_name>/hudi/hudi-0.14-jars/spark-avro_2.12-3.3.0.jar
  6. Under Job parameters, add the key --extra-jars with the same comma-separated S3 URIs as its value (same as the step above).
  7. (Optional) If the job already has the key --datalake-formats with the value hudi under Job parameters, remove that parameter, since it conflicts with the Hudi JAR that you are passing explicitly in the Libraries section.
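
For jobs that are configured from a script rather than the console, steps 6 and 7 can be sketched as a small helper that rewrites a job's parameter dictionary. This is only an illustration: the function name `build_hudi_job_arguments` and the bucket name `my-bucket` are made up for the example, and actually applying the result to the job (for instance by passing it as `DefaultArguments` in a boto3 Glue `update_job` call) is left out.

```python
def build_hudi_job_arguments(existing_args, jar_uris):
    """Return a copy of a Glue job's parameters adjusted for custom Hudi JARs.

    existing_args: dict of the job's current parameters (key -> value).
    jar_uris: list of S3 URIs for the Hudi bundle and Spark Avro JARs.
    """
    args = dict(existing_args)  # work on a copy, leave the input untouched

    # Step 6: point --extra-jars at both JARs, comma separated.
    args["--extra-jars"] = ",".join(jar_uris)

    # Step 7: drop "hudi" from --datalake-formats so the built-in Hudi
    # does not conflict with the JAR passed explicitly. Other formats
    # listed in the same parameter, if any, are kept.
    formats = args.pop("--datalake-formats", "")
    remaining = [f.strip() for f in formats.split(",")
                 if f.strip() and f.strip() != "hudi"]
    if remaining:
        args["--datalake-formats"] = ",".join(remaining)

    return args


# Hypothetical bucket and prefix, for illustration only.
jars = [
    "s3://my-bucket/hudi/hudi-0.14-jars/hudi-spark3.3-bundle_2.12-0.14.0.jar",
    "s3://my-bucket/hudi/hudi-0.14-jars/spark-avro_2.12-3.3.0.jar",
]
current = {"--datalake-formats": "hudi", "--job-language": "python"}
print(build_hudi_job_arguments(current, jars))
```

Keeping the helper free of AWS calls makes it easy to check the resulting parameters before changing a real job.
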
AWS
Answered 2 years ago
AWS
Support Engineer
Reviewed 2 years ago

Hi,

You can follow the guidance in this very detailed video to upgrade the Hudi version: https://www.youtube.com/watch?v=HJ6QQN408AE

Best,

Didier

Expert
Answered 2 years ago
