Redshift as a Data Source for a REST API for Analytics Queries

I have a customer who is considering using Redshift as the data source for a REST API that will give external entities access to analytical queries over a "huge" data set (let's assume hundreds of TBs). The API will be used by third parties, so we should expect some unpredictability in the workload, with peaks and dips. The REST API will support read operations only.

The main questions I have are:

  • What are the best practices for this type of Redshift use case?
  • Do we have any customer references that have used Redshift this way?
  • Is Redshift the best service for this use case? At what point would you recommend Athena, Aurora, or Elasticsearch (ES) instead?

Performance is their key priority, and they want to keep query latency under 10 seconds per API call.

AWS
Manos_S
asked 5 years ago, 595 views
1 Answer
Accepted Answer

It depends on the query pattern and the SLA they want to offer for this API:

  • DynamoDB storing pre-calculated metrics is a good way to provide fast response times and high availability, but the update logic is tricky to implement.
  • Elasticsearch can support updates and live aggregations with fast response times and high availability, but only for simple metrics, and it is easy to overload an Elasticsearch cluster.
  • Redshift/Athena can be an option if query customization needs to be offered, but Redshift isn't highly available, and unpredictable queries are dangerous for concurrency. Compared to Athena, Redshift can provide faster response times and an SLA on query execution (you avoid the risk of contention on Athena clusters).
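
To make the trade-off above a bit more concrete, here is a minimal sketch of an API layer that serves pre-calculated metrics first and only falls back to the warehouse under a concurrency cap and a timeout. Everything in it is an assumption for illustration: `AnalyticsGateway`, `handle`, and the status codes are hypothetical names and choices, the plain dict stands in for a DynamoDB table, and `run_warehouse_query` stands in for whatever client actually runs the Redshift or Athena query. The 10-second default comes from the latency target in the question.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


class AnalyticsGateway:
    """Hypothetical read-only API layer (illustration only, not an AWS API)."""

    def __init__(self, precomputed, run_warehouse_query,
                 max_concurrent=5, timeout_s=10.0):
        # Assumption: a dict stands in for a key-value store such as DynamoDB.
        self._precomputed = precomputed
        # Assumption: a callable stands in for the Redshift/Athena query client.
        self._run = run_warehouse_query
        # Cap on in-flight warehouse queries to protect cluster concurrency.
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._pool = ThreadPoolExecutor(max_workers=max_concurrent)
        self._timeout_s = timeout_s

    def handle(self, metric_key, sql=None):
        # Fast path: pre-calculated metric, no warehouse query at all.
        if metric_key in self._precomputed:
            return {"status": 200, "source": "precomputed",
                    "data": self._precomputed[metric_key]}
        if sql is None:
            return {"status": 404, "source": None, "data": None}
        # Shed load instead of queueing when all warehouse slots are busy.
        if not self._slots.acquire(blocking=False):
            return {"status": 429, "source": "warehouse", "data": None}
        future = self._pool.submit(self._run_and_release, sql)
        try:
            data = future.result(timeout=self._timeout_s)
        except FutureTimeout:
            # The caller gets a timeout, but the query keeps running and
            # keeps its slot until it finishes; a real client would also
            # need to cancel the statement on the warehouse side.
            return {"status": 504, "source": "warehouse", "data": None}
        return {"status": 200, "source": "warehouse", "data": data}

    def _run_and_release(self, sql):
        try:
            return self._run(sql)
        finally:
            self._slots.release()


if __name__ == "__main__":
    gateway = AnalyticsGateway(
        precomputed={"daily_sales": 42},
        run_warehouse_query=lambda sql: [("row",)],
    )
    print(gateway.handle("daily_sales"))
    print(gateway.handle("custom_report", sql="SELECT 1"))
```

Rejecting requests when the cap is reached, rather than queueing them, is one possible design choice: it keeps unpredictable third-party traffic from piling up on the cluster, at the cost of pushing retries to the caller.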
AWS
answered 5 years ago
