How to display pandas objects nicely in an Athena notebook


Hi. In Athena notebooks, with this code:

df = spark.createDataFrame([
    (2, "Alice"), (5, "Bob")], schema=["age", "name"])

t = df.collect()
%table t

the displayed table looks like a typical pandas dataframe in a Jupyter/Colab notebook.

But is there a way to display a pandas dataframe in the same style in Athena notebooks?

df = spark.createDataFrame([
    (2, "Alice"), (5, "Bob")], schema=["age", "name"])


prints the dataframe, but not in quite the same style.

df = spark.createDataFrame([
    (2, "Alice"), (5, "Bob")], schema=["age", "name"])

t = df.toPandas()
%table t

doesn't really work.

asked 2 months ago
287 views
1 Answer
Accepted Answer

The %table magic works only with Spark DataFrames. I have confirmed that it does not produce the same output for a pandas DataFrame. As a workaround, convert the pandas DataFrame to a Spark DataFrame, collect its rows, and display them with the %table magic. Sample code:

import pandas as pd

# initialize list of lists
data = [[2, 'Alice'], [5, 'Bob']]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Age','Name'])

# Create a PySpark dataframe
df_spark = spark.createDataFrame(df)

# Display as a table
t = df_spark.collect()
%table t

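Alternatively, I was able to format and display the pandas DataFrame directly, without converting it to Spark, by drawing it as a matplotlib table and rendering it with the %matplot magic. The code is listed below for reference: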

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# hide axes
ax.axis('off')

df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))

ax.table(cellText=df.values, colLabels=df.columns, loc='center')


%matplot plt
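
One practical wrinkle with the matplotlib approach is that raw floats (such as the random values above) are drawn with all their digits, which makes the cells hard to read. A minimal sketch of one way to tidy this up is to turn every cell into a short string before handing it to ax.table. The helper name format_cells and the default of two decimals are assumptions for illustration, not part of Athena or matplotlib:

```python
def format_cells(rows, decimals=2):
    """Return rows as lists of strings, rounding floats to `decimals` places."""
    formatted = []
    for row in rows:
        formatted.append([
            # Floats get a fixed number of decimals; everything else is str()'d.
            f"{value:.{decimals}f}" if isinstance(value, float) else str(value)
            for value in row
        ])
    return formatted


# Plain Python rows stand in for df.values.tolist() here.
sample_rows = [[2, "Alice", 0.123456], [5, "Bob", -1.5]]
cell_text = format_cells(sample_rows)
print(cell_text)  # [['2', 'Alice', '0.12'], ['5', 'Bob', '-1.50']]
```

In the notebook, the result would be passed in place of df.values, for example ax.table(cellText=format_cells(df.values.tolist()), colLabels=df.columns, loc='center').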
answered 2 months ago
