GENERIC_INTERNAL_ERROR: integer overflow


I'm receiving this error after crawling my table, but the message gives no further detail. How can I get more information about the issue? I'm crawling Parquet files in S3.

This query ran against the "searchdb" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 5f2017d7-637f-441f-b063-2b9dfc93fb29

Asked a year ago, 312 views
1 Answer

Hello,

The "integer overflow" error occurs when a numeric value is larger than the range of an integer. The maximum value allowed for an integer is 2147483647, as mentioned here: https://docs.aws.amazon.com/athena/latest/ug/data-types.html

As you may know, Athena uses Presto as its query engine in the backend. When Presto reads a Parquet file, it attempts to get the chunk size as an integer. If the total chunk size in bytes is greater than the maximum value for an integer, Presto returns an integer overflow error.
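To make the limit concrete, here is a minimal Python sketch of the arithmetic. The helper name fits_in_int32 is only illustrative, and the wrap-around shown with ctypes demonstrates how a signed 32-bit value behaves in general; it is not a reproduction of Presto's internal code.

```python
import ctypes

INT32_MAX = 2_147_483_647   # 2**31 - 1, the maximum quoted above
INT32_MIN = -2_147_483_648  # -(2**31)


def fits_in_int32(value):
    """Return True if value can be stored in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


# A chunk of exactly 2 GiB is already one byte past the limit.
two_gib = 2 * 1024 ** 3
print(fits_in_int32(two_gib - 1))      # True
print(fits_in_int32(two_gib))          # False

# Forcing the value into 32 bits wraps it around to a negative number.
print(ctypes.c_int32(two_gib).value)   # -2147483648
```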

There are a few options to narrow down which column could be causing the issue:

  1. Run a simple SELECT query on each individual column and determine which succeed and which fail with the "GENERIC_INTERNAL_ERROR: integer overflow" error, or
  2. Inspect the metadata of the Parquet file using a tool or library such as PyArrow or parquet-tools.
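For the second option, a rough PyArrow sketch could look like the following. It assumes the Parquet file has been copied locally, and the function names iter_chunk_sizes and find_oversized_chunks are made up for this example. It lists the size of every column chunk in every row group and flags those above the 32-bit limit.

```python
INT32_MAX = 2_147_483_647  # largest value of a signed 32-bit integer


def iter_chunk_sizes(path):
    """Yield (row_group, column_path, uncompressed_bytes, compressed_bytes)."""
    # Imported here so the filtering helper below works without PyArrow.
    import pyarrow.parquet as pq

    metadata = pq.ParquetFile(path).metadata
    for rg_index in range(metadata.num_row_groups):
        row_group = metadata.row_group(rg_index)
        for col_index in range(row_group.num_columns):
            chunk = row_group.column(col_index)
            yield (
                rg_index,
                chunk.path_in_schema,
                chunk.total_uncompressed_size,
                chunk.total_compressed_size,
            )


def find_oversized_chunks(chunks, limit=INT32_MAX):
    """Keep the chunks whose compressed or uncompressed size exceeds limit."""
    return [c for c in chunks if max(c[2], c[3]) > limit]
```

Calling find_oversized_chunks(iter_chunk_sizes("part-00000.parquet")) (the file name is a placeholder) returns the row group and column of each suspicious chunk. An empty result does not rule out other causes of the error.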



As a workaround, it is suggested to use a smaller block size for the Parquet files, depending on how you are generating the Parquet data. In Spark, you can try setting "parquet.block.size" and "dfs.blocksize". Please find a third-party guide below:
http://what-when-how.com/Tutorial/topic-2059e313/Hadoop-The-Definitive-Guide-457.html
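As an illustration, one way to pass these two settings in PySpark is through the "spark.hadoop." prefix, which Spark copies into the Hadoop configuration. The 64 MB value, the application name, and the helper names below are assumptions for the sketch, not recommended values. The data would need to be rewritten with the new session for the setting to take effect.

```python
def parquet_block_size_conf(block_size_mb=64):
    """Build Spark config entries for a smaller Parquet/HDFS block size.

    The 64 MB default is only an example value.
    """
    block_size_bytes = block_size_mb * 1024 * 1024
    return {
        "spark.hadoop.parquet.block.size": str(block_size_bytes),
        "spark.hadoop.dfs.blocksize": str(block_size_bytes),
    }


def build_session(app_name="rewrite-parquet", block_size_mb=64):
    """Create a SparkSession with the block-size settings applied."""
    # Imported here so the config helper can be used without PySpark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in parquet_block_size_conf(block_size_mb).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With a session from build_session(), reading the original files and writing them back with df.write.parquet(...) to a new location produces files with the smaller block size. Whether that removes the error depends on the data.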


I hope the above information helps!

Thank you!

AWS
Support Engineer
Answered a year ago
