It turns out the documentation for Athena is either incorrect or, at best, misleading. The excellent answer by Alexandre explains it best: https://stackoverflow.com/questions/52564194/athena-unable-to-parse-date-using-opencsvserde
Basically, you need to store the date or timestamp as UNIX epoch time. You wouldn't know that, given all the emphasis on the format of the time. I tried storing it as a timestamp, and that is why I got this error. As soon as I stored it as UNIX time, I got somewhere. However, the unix_timestamp() function only returns time in seconds (long), and timestamp wants time in milliseconds (double). So I simply multiplied by 1000:
from pyspark.sql import functions as f
df = df.withColumn("time", f.unix_timestamp("time", 'dd-MM-yyyy HH:mm:ss') * 1000)
After doing this, you will have a 13-digit value, and Athena will properly produce a timestamp from it if you have selected Timestamp as the column's data type.
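If it helps to sanity-check the seconds-to-milliseconds step without a Spark session, here is a minimal plain-Python sketch of the same arithmetic. It is not the Spark code above: `to_epoch_millis` is a hypothetical helper name, the strptime pattern is my translation of 'dd-MM-yyyy HH:mm:ss', and it assumes the input strings are in UTC (Spark would use the session time zone instead).

```python
from datetime import datetime, timezone


def to_epoch_millis(text, fmt="%d-%m-%Y %H:%M:%S"):
    """Hypothetical helper: parse a day-month-year string and return epoch milliseconds."""
    # Assumption: the string is interpreted as UTC.
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    # Whole seconds since the epoch, like unix_timestamp() returns.
    seconds = int(parsed.timestamp())
    # Scale seconds to milliseconds.
    return seconds * 1000


millis = to_epoch_millis("28-09-2018 12:30:00")
print(millis)            # 1538137800000
print(len(str(millis)))  # 13
```

For present-day dates the result should come out at 13 digits; a 10-digit value usually means the multiplication by 1000 was skipped.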