Athena - Removing quote characters and skipping the first line in GZIP-compressed CSV files


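Hello,

I am trying to run the DDL statement below in Athena to query GZIP-compressed CSV files.

For some reason, it neither skips the first line nor removes the quote characters (") from the output.

I found a related thread on the AWS forums where others ran into the same problem: https://forums.aws.amazon.com/thread.jspa?messageID=755357&threadID=244207&tstart=0

I would love to hear your thoughts on this.

DDL statement: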

CREATE EXTERNAL TABLE IF NOT EXISTS table_name_here (
  eventID STRING,
  userID STRING,
  sessionID STRING,
  eventDate STRING,
  eventTimestamp STRING,
  eventName STRING,
  eventLevel INT,
  gaUserStartDate STRING,
  gaUserGender STRING,
  gaUserAgeGroup STRING,
  gaUserCountry STRING,
  gaUserAcquisitionChannel STRING,
  msSinceLastEvent STRING,
  browserName STRING,
  browserVersion STRING,
  campaign STRING,
  clientVersion STRING,
  collectInsertedTimestamp STRING,
  convertedProductAmount STRING,
  externalUserID STRING,
  mainEventID STRING,
  network STRING,
  operatingSystem STRING,
  operatingSystemVersion STRING,
  parentEventID STRING,
  platform STRING,
  productAmount STRING,
  productCategory STRING,
  productID STRING,
  productName STRING,
  productType STRING,
  realCurrencyAmount INT,
  realCurrencyType STRING,
  revenueValidated INT,
  signupSource STRING,
  transactionID STRING,
  transactionName STRING,
  transactionSourceId INT,
  transactionSourceName STRING,
  transactionStatus STRING,
  transactionType STRING,
  transactionVector STRING,
  userLevel INT,
  userType STRING,
  virtualCurrencyAmount INT,
  virtualCurrencyName STRING,
  virtualCurrencyType STRING,
  visitSource STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  ESCAPED BY '\\'
  LINES TERMINATED BY '\n'
LOCATION 's3://path/to/bucket'
TBLPROPERTIES (
  "skip.header.line.count"="1",
  "quoteChar"='"'
);

1 Answer

Athena does not currently support skip.header.line.count or quoteChar.

In particular, quoteChar has no effect with LazySimpleSerde (ROW FORMAT DELIMITED). It is a property of OpenCSVSerde, which Athena does not support yet.
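
If the table cannot skip the header or strip quotes itself, one possible workaround is to rewrite the files before they land in the S3 location. The following is only a minimal sketch, assuming Python 3 and its standard gzip and csv modules; the function name strip_header_and_quotes and the file paths are hypothetical and do not come from this thread. It drops the header row, removes the surrounding quote characters, and backslash-escapes any commas inside a field so that the output lines up with the ESCAPED BY '\\' clause in the DDL above.

```python
import csv
import gzip


def strip_header_and_quotes(src_path, dst_path, header_lines=1):
    """Rewrite a gzipped CSV without its header row(s) or quote characters.

    Hypothetical helper for illustration only; adjust the encoding and
    paths to match your own data.
    """
    with gzip.open(src_path, "rt", newline="", encoding="utf-8") as src, \
         gzip.open(dst_path, "wt", newline="", encoding="utf-8") as dst:
        # csv.reader parses "quoted" fields and returns them without quotes.
        reader = csv.reader(src)
        # QUOTE_NONE writes bare fields; delimiters inside a field are
        # prefixed with the escape character instead of being quoted.
        writer = csv.writer(
            dst,
            quoting=csv.QUOTE_NONE,
            escapechar="\\",
            lineterminator="\n",
        )
        for index, row in enumerate(reader):
            if index < header_lines:
                continue  # skip the header row(s)
            writer.writerow(row)


if __name__ == "__main__":
    # Hypothetical local file names; upload the result to the table's
    # S3 location afterwards.
    strip_header_and_quotes("events.csv.gz", "events_clean.csv.gz")
```

With files rewritten this way, the skip.header.line.count and quoteChar table properties would no longer be needed, since the data contains neither a header nor quote characters.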
