Newbie: Unable to use string with "like" in a query

0

I have a database of large.csv files which I am trying to query but it does not fetch any results when I use the string variable with "like" function. Here is my query: abc is declared as INT pc is declared as STRING

-- View Example CREATE OR REPLACE VIEW view_name AS SELECT abc FROM "test_log" WHERE pc like 'IG%';

  • Can you please verify your select statement and see whether you are able to see any data or not. For example :-

    1. Test the query - select * from test_log limit 100;
    2. Test the query - select * from test_log where pc like 'IG%';

    If the above are working then you should create the view and see the result.

asked a year ago236 views
1 Answer
0

We need more information to diagnose.

AWS
Yota_H
answered a year ago
  • Thanks Yota for the quick response. Here are my comments: Yes the query returns data if I remove the WHERE statement Yes the value for pc must start with 'IG'

  • I tried on my end but couldn't reproduce the issue. LIKE clause works as expected. If you're okay would you share minimal DDL and CSV content here for reproduction? (please deduct identifiable info) Alternatively you can contact AWS support for private support.

  • Sure Yota,

    DDL: CREATE EXTERNAL TABLE IF NOT EXISTS xxxxxx.xxxxxx_xxx ( priceper double, year int, dateoftransfer date, propertytype string, duration string, price int, postcode string, lad21cd string, transactionid string, id int, tfarea string, numberrooms string, classt string, CURRENT_ENERGY_EFFICIENCY string, POTENTIAL_ENERGY_EFFICIENCY string, CONSTRUCTION_AGE_BAND string ) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ('field.delim' = ',') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3://xxxxxx/xxxxxx/' TBLPROPERTIES ('classification' = 'csv');

    -- View Example CREATE OR REPLACE VIEW view_name AS SELECT "year" FROM "xxxxxxx_xxx" ;

    CSV: priceper year dateoftransfer propertytype duration price postcode lad21cd transactionid id tfarea numberrooms classt CURRENT_ENERGY_EFFICIENCY POTENTIAL_ENERGY_EFFICIENCY CONSTRUCTION_AGE_BAND 4069.767442 2020 22/05/2020 F L 350000 IG11 0AB E09000002 {BA558B33-27A3-76EF-E053-6B04A8C0B4B7} 18447501 86 NA 11 83 83 NO DATA! 4818.181818 2020 06/08/2020 F L 265000 IG11 0AB E09000002 {BA558B33-2824-76EF-E053-6B04A8C0B4B7} 18504360 55 NA 11 83 83 NO DATA! 3837.209302 2020 27/03/2020 F L 330000 IG11 0AB E09000002 {CB0035E6-46A2-58AE-E053-6B04A8C091AF} 18466498 86 NA 11 84 84 NO DATA! 4050.632911 2021 29/01/2021 F L 320000 I

  • Tested after updating your DDL with WITH SERDEPROPERTIES ('field.delim' = '\t') and TBLPROPERTIES ('classification' = 'csv', 'skip.header.line.count'='1'). It's working fine, I can filter by WHERE postcode like 'IG%' for both table and view. Sorry I don't have further guidance on your issue - please wait for other comments or contact AWS support.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions