Why do I get zero records when I query my Amazon Athena table?

4 minutos de lectura
0

I ran a CREATE TABLE statement in Amazon Athena with expected columns and their data types. When I run the query SELECT * FROM table-name, the output is "Zero records returned."

Resolution

Here are some common reasons why the query might return zero records.

File selected in crawler settings

If you're using a crawler, be sure that the crawler is pointing to the Amazon Simple Storage Service (Amazon S3) bucket rather than to a file.

Incorrect LOCATION path

Verify the Amazon S3 LOCATION path for the input data. If the input LOCATION path is incorrect, then Athena returns zero records.

Double slash in LOCATION path

Athena doesn't support table location paths that include a double slash (//). For example, the following LOCATION path returns empty results:

s3://doc-example-bucket/myprefix//input//

To resolve this issue, copy the files to a location that doesn't have double slashes. Here is an example AWS Command Line Interface (AWS CLI) command to do so:

aws s3 cp s3://doc-example-bucket/myprefix//input// s3://doc-example-bucket/myprefix/input/ --recursive

Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.

Data for multiple tables stored in the same S3 prefix

Glue crawlers create separate tables for data that's stored in the same S3 prefix. However, when you query those tables in Athena, you get zero records.

For example, your Athena query returns zero records if your table location is similar to the following:

  • s3://doc-example-bucket/table1.csv
  • s3://doc-example-bucket/table2.csv

To resolve this issue, create individual S3 prefixes for each table similar to the following:

  • s3://doc-example-bucket/table1/table1.csv
  • s3://doc-example-bucket/table2/table2.csv

Then, run a query similar to the following to update the location for your table table1:

ALTER TABLE table1 SET LOCATION 's3://doc-example-bucket/table1';

Partitions not yet loaded

Athena creates metadata only when a table is created. The data is parsed only when you run the query. If your table has defined partitions, the partitions might not yet be loaded into the AWS Glue Data Catalog or the internal Athena data catalog. Use MSCK REPAIR TABLE or ALTER TABLE ADD PARTITION to load the partition information into the catalog.

MSCK REPAIR TABLE: If the partitions are stored in a format that Athena supports, run MSCK REPAIR TABLE to load a partition's metadata into the catalog. For example, if you have a table that is partitioned on Year, then Athena expects to find the data at Amazon S3 paths similar to the following:

  • s3://doc-example-bucket/athena/inputdata/year=2020/data.csv
  • s3://doc-example-bucket/athena/inputdata/year=2019/data.csv
  • s3://doc-example-bucket/athena/inputdata/year=2018/data.csv

If the data is located at the Amazon S3 paths that Athena expects, then repair the table by running a command similar to the following:

CREATE EXTERNAL TABLE Employee (
    Id INT,
    Name STRING,
    Address STRING
) PARTITIONED BY (year INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://doc-example-bucket/athena/inputdata/';

After the table is created, load the partition information:

MSCK REPAIR TABLE Employee;

After the data is loaded, run the following query again:

SELECT * FROM Employee;

ALTER TABLE ADD PARTITION: If the partitions aren't stored in a format that Athena supports, or are located at different Amazon S3 paths, run ALTER TABLE ADD PARTITION for each partition. For example, suppose that your data is located at the following Amazon S3 paths:

  • s3://doc-example-bucket/athena/inputdata/2020/data.csv
  • s3://doc-example-bucket/athena/inputdata/2019/data.csv
  • s3://doc-example-bucket/athena/inputdata/2018/data.csv

Given these paths, run a command similar to the following:

ALTER TABLE Employee ADD
    PARTITION (year=2020) LOCATION 's3://doc-example-bucket/athena/inputdata/2020/'
    PARTITION (year=2019) LOCATION 's3://doc-example-bucket/athena/inputdata/2019/'
    PARTITION (year=2018) LOCATION 's3://doc-example-bucket/athena/inputdata/2018/'

After the data is loaded, run the following query again:

SELECT * FROM Employee;

Hive hidden files

Verify that your file names don't start with an underscore (_) or a dot (.).

Example:

  • s3://doc-example-bucket/athena/inputdata/_file1
  • s3://doc-example-bucket/athena/inputdata/.file2

If the files in your S3 path have names that start with an underscore or a dot, then Athena considers these files as placeholders. Athena ignores these files when processing a query. For more information, see Athena cannot read hidden files. If all the files in your S3 path have names that start with an underscore or a dot, then you get zero records.

Note: If your S3 path includes placeholders along with files whose names start with different characters, then Athena ignores only the placeholders and queries the other files. Therefore, you might get one or more records.


Related information

Creating tables in Athena

Using AWS Glue crawlers

OFICIAL DE AWS
OFICIAL DE AWSActualizada hace 2 años