How to choose which Spark kernel to use in SageMaker Studio?


Available Amazon SageMaker Kernels include the following Spark kernels:

  • PySpark (SparkMagic) with Python 3.7
  • Spark (SparkMagic) with Python 3.7
  • Spark Analytics 1.0
  • Spark Analytics 2.0

At re:Invent 2022, AWS also announced that "SageMaker Studio now supports Glue Interactive Sessions," and that you can use "the built-in Glue PySpark or Glue Spark kernel for your Studio notebook to initialize interactive, serverless Spark sessions."

It seems like the benefits of using one of the Glue Spark kernels are that you can "quickly browse the Glue data catalog, run large queries, and interactively analyze and prepare data using Spark, right in your Studio notebook." But can't you already do all that with the existing two SageMaker kernels?

In other words, how do you choose whether to use one of the existing two SparkMagic kernels in SageMaker Studio notebooks or to use this new Glue Interactive Sessions feature?

  • I just looked up SparkMagic and it looks like it's "a set of tools for interactively working with remote Spark clusters in Jupyter notebooks" -- meaning it's for executing Spark on EMR from SageMaker? And this announcement now makes it possible to do the same, but with Glue?

AWS
asked a year ago · 537 views
1 Answer

The difference is that with SparkMagic you need to provision a Spark cluster yourself and connect to it through the SparkMagic configuration.
With Glue Interactive Sessions, all of that time-consuming work is taken care of for you: you can easily create and destroy Spark clusters as you need them.
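
To make the "link to it using SparkMagic configuration" step concrete, here is a minimal sketch of generating the kind of config that SparkMagic reads (normally from ~/.sparkmagic/config.json). It assumes your cluster exposes a Livy endpoint; the host name is hypothetical, and the helper function `sparkmagic_config` is only an illustration, not part of SparkMagic itself.

```python
import json


def sparkmagic_config(livy_url: str) -> dict:
    """Build a minimal SparkMagic config pointing both kernels at one Livy endpoint."""
    # Assumption: the Livy endpoint needs no authentication ("None").
    credentials = {"username": "", "password": "", "url": livy_url, "auth": "None"}
    return {
        "kernel_python_credentials": dict(credentials),  # used by the PySpark kernel
        "kernel_scala_credentials": dict(credentials),   # used by the Spark (Scala) kernel
    }


# Hypothetical address of the cluster's primary node; Livy listens on port 8998 by default.
config = sparkmagic_config("http://emr-primary.example.internal:8998")
print(json.dumps(config, indent=2))
```

With the Glue kernels there is no such file to maintain and no endpoint to look up. As far as I know, you size the session from the notebook itself with magics such as %number_of_workers and %idle_timeout, and the session is started when you run your first Spark cell.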

AWS
EXPERT
answered a year ago
