Hello,
Hope you are doing well, and thank you for contacting us for support.
Per your description, I understand that you are facing an issue when calling the "applymap" method on a DataFrame. Kindly see my explanation below:
dfs = spark.read.format("binaryFile").load("s3://mybucketname/*.tdms")
The above code returns a "pyspark.sql.dataframe.DataFrame" object, which, per [1], does not have an "applymap" method.
If you need the "applymap" function, you must first convert the DataFrame to a "pyspark.pandas.frame.DataFrame" object, which does provide it [2]. Sample conversion code is shown below [3]:
pd_dfs = dfs.to_pandas_on_spark()
You can also check the actual datatypes of the returned DataFrames, for example:
print(type(dfs))
print(type(pd_dfs))
Output:
<class 'pyspark.sql.dataframe.DataFrame'>
<class 'pyspark.pandas.frame.DataFrame'>
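Since pandas-on-Spark mirrors the pandas API, the "applymap" behavior can be illustrated with a minimal plain-pandas sketch (the sample data below is hypothetical and stands in for your TDMS-derived DataFrame; the same call works on a "pyspark.pandas.frame.DataFrame"):

```python
import pandas as pd

# Hypothetical sample data standing in for the DataFrame loaded from S3.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# applymap applies the given function to every element of the DataFrame.
doubled = df.applymap(lambda x: x * 2)
print(doubled)
#    a   b
# 0  2   8
# 1  4  10
# 2  6  12
```

Note that in recent pandas releases "applymap" has been renamed to "DataFrame.map"; "applymap" still works but may emit a deprecation warning.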
Regarding your further question about the correct way to process the data using Lambda, we would need to look at your use case to understand the context before we can give a correct answer. I would therefore request that you create a support case, so we can go deeper into your use case with you and provide an appropriate solution.
Thank you, and have a great day ahead.
===============
Reference:
[1] - https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.DataFrame.html