Glue Studio Designer: use pyspark functions

0

I designed a Glue job using the Glue Studio visual canvas and am using a custom transform in it. However, I am struggling to use functions such as those from 'from pyspark.sql.functions import *' in the transform code, because I get the error "NameError: name xyz not defined".
How can I leverage these functions in the canvas tool?

  • So is it a best practice to run the imports within the function of the custom transform?

AWS
Marco
Asked 2 years ago · 635 views
2 Answers
1

Try importing the specific functions you need instead of using import *. For example, use "from pyspark.sql.functions import split" to import the split function.

I tried replicating your problem, and it failed with a complaint that import * can only be used at module level. When I changed it to import a specific function, it worked.
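The "module level" complaint is a rule of the Python language itself, so it can be reproduced without Glue or Spark. Here is a minimal sketch that uses the standard-library math module as a stand-in for pyspark.sql.functions (an assumption made only so the snippet runs anywhere); the my_transform name and the compiles helper are hypothetical:

```python
# A star import inside a function body is rejected when Python compiles the source.
star_import_src = '''
def my_transform():
    from math import *
    return sqrt(16)
'''

# Importing a named function inside a function body is allowed.
named_import_src = '''
def my_transform():
    from math import sqrt
    return sqrt(16)
'''


def compiles(src):
    """Return True if the source compiles, False if Python raises SyntaxError."""
    try:
        compile(src, "<custom_transform>", "exec")
        return True
    except SyntaxError as err:
        print("SyntaxError:", err.msg)
        return False


star_ok = compiles(star_import_src)    # False: star import is not at module level
named_ok = compiles(named_import_src)  # True: named import is fine in a function
```

The same rule applies to the body of a custom transform, which is why naming the functions explicitly gets past the error.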

Hope this helps.

AWS-TDN
Answered 2 years ago
AWS
Expert
Reviewed 2 years ago
0
Accepted Answer

Hi,

Yes, any library you need for your custom transform should be imported within the function.
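As a rough sketch of what that can look like, the custom transform below does its imports inside the function body. Importing the module under an alias (F here) is allowed inside a function and still gives access to every function in pyspark.sql.functions. The column names "full_name" and "name_parts" and the frame name "split_names" are hypothetical; adjust them to your data.

```python
def MyTransform(glueContext, dfc):
    # Imports live inside the function body, next to the code that uses them.
    from awsglue.dynamicframe import DynamicFrame, DynamicFrameCollection
    import pyspark.sql.functions as F  # module alias instead of "import *"

    # Take the first (and here, only) DynamicFrame in the incoming collection
    # and convert it to a Spark DataFrame.
    df = dfc.select(list(dfc.keys())[0]).toDF()

    # Hypothetical example: split a "full_name" column on spaces.
    df = df.withColumn("name_parts", F.split(F.col("full_name"), " "))

    # Wrap the result back into a DynamicFrameCollection for the next node.
    result = DynamicFrame.fromDF(df, glueContext, "split_names")
    return DynamicFrameCollection({"CustomTransform0": result}, glueContext)
```

If you only need one or two functions, "from pyspark.sql.functions import split, col" inside the function works the same way.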

Also, if you want to run Spark SQL, you could use the SQL transform instead.

Hope this helps,

AWS
Expert
Answered 2 years ago
