How to choose which Spark kernel to use in SageMaker Studio?

Available Amazon SageMaker kernels include the following Spark kernels:

  • PySpark (SparkMagic) with Python 3.7
  • Spark (SparkMagic) with Python 3.7
  • Spark Analytics 1.0
  • Spark Analytics 2.0

At re:Invent 2022, AWS also announced that "SageMaker Studio now supports Glue Interactive Sessions." The announcement describes using "the built-in Glue PySpark or Glue Spark kernel for your Studio notebook to initialize interactive, serverless Spark sessions."

It seems like the benefits of using one of the Glue Spark kernels are that you can "quickly browse the Glue data catalog, run large queries, and interactively analyze and prepare data using Spark, right in your Studio notebook." But can't you already do all that with the existing two SageMaker kernels?

In other words, how do you choose whether to use one of the existing two SparkMagic kernels in SageMaker Studio notebooks or to use this new Glue Interactive Sessions feature?

  • I just looked up SparkMagic, and it looks like it's "a set of tools for interactively working with remote Spark clusters in Jupyter notebooks" -- meaning it's for executing Spark on EMR from SageMaker? And this announcement now makes it possible to do the same, but with Glue?

AWS
Asked 1 year ago, 598 views
1 Answer

The difference is that with SparkMagic you need to provide a Spark cluster yourself and link to it through the SparkMagic configuration.
With Glue Interactive Sessions, all that time-consuming work is taken care of for you: you can easily create and destroy Spark clusters as you need them.
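
To make the "SparkMagic configuration" part concrete, here is a minimal sketch of what linking to your own cluster can involve. SparkMagic talks to a remote cluster through an Apache Livy endpoint and reads its settings from `~/.sparkmagic/config.json`. The host name below is hypothetical, the port assumes Livy's default of 8998, and authentication is assumed to be disabled; your cluster setup may differ.

```python
import json


def build_sparkmagic_config(livy_host: str, livy_port: int = 8998) -> dict:
    """Build a minimal SparkMagic config that points both kernels at one Livy endpoint."""
    livy_url = f"http://{livy_host}:{livy_port}"
    # Assumption: no authentication on the Livy endpoint ("auth": "None").
    credentials = {"username": "", "password": "", "url": livy_url, "auth": "None"}
    return {
        "kernel_python_credentials": dict(credentials),  # used by the PySpark kernel
        "kernel_scala_credentials": dict(credentials),   # used by the Spark (Scala) kernel
    }


# Hypothetical host name of the cluster's primary node; replace with your own.
config = build_sparkmagic_config("emr-master.example.internal")

# Print the JSON that would be saved as ~/.sparkmagic/config.json.
print(json.dumps(config, indent=2))
```

With Glue Interactive Sessions there is no endpoint like this to look up or maintain, because the session is started for you when the notebook runs its first Spark cell.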

AWS
Expert
Answered 1 year ago
