Amazon Athena

Amazon Athena is an interactive query service for analysing data directly in Amazon S3 using standard SQL. There is no infrastructure to manage: you point Athena at your data, define the schema, and start querying.

Key Features

Serverless: No clusters or infrastructure to provision
Standard SQL: Runs on a Presto/Trino-based query engine
Schema-on-read: Define schemas without transforming source data
Multiple formats: Query CSV, JSON, ORC, Avro, and Parquet
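
As a rough illustration of schema-on-read, the statement below defines a table over CSV files that already sit in S3, without moving or rewriting them. It is a minimal sketch: the bucket name, path, column types, and the presence of a header row are all assumptions, and the columns are chosen only to match the example query later on this page. The backticks around date are there because DATE is a reserved word in Athena DDL.

```sql
-- Hypothetical table definition: bucket, path, and column layout are assumptions.
-- The data files are not modified; Athena applies this schema when a query reads them.
CREATE EXTERNAL TABLE IF NOT EXISTS access_logs (
  `date`        string,
  status_code   int,
  response_time double
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://example-bucket/access-logs/'
TBLPROPERTIES ('skip.header.line.count' = '1');
```

Dropping an external table like this removes only the table definition; the files in S3 stay where they are.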

Common Use Cases

Log analysis: Query application logs, CloudTrail, VPC Flow Logs
Ad-hoc analytics: Explore data without setting up a data warehouse
Data lake queries: Analyse large datasets stored in S3
Cost analysis: Query AWS Cost and Usage Reports

Example Query

SELECT
  date,
  COUNT(*) AS requests,
  AVG(response_time) AS avg_response
FROM access_logs
WHERE status_code = 200
GROUP BY date
ORDER BY date DESC
LIMIT 30;

Performance Tips

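Use columnar formats: Parquet and ORC let Athena read only the columns a query references, which reduces both cost and query time
Partition your data: Organise by date, region, or other common filters so queries can skip partitions they don't need
Compress data: Use Snappy or GZIP compression to reduce the number of bytes scanned
Use AWS Glue: Automate schema discovery with Glue crawlers, which populate the Glue Data Catalog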
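
The first three tips can be combined in one CREATE TABLE AS SELECT (CTAS) statement, which rewrites an existing table as partitioned, compressed Parquet. This is a sketch under assumptions rather than a recommended pipeline: the target table name and S3 location are hypothetical, the column names are taken from the example query above, and partitioning by date assumes that is the filter your queries use most.

```sql
-- Sketch only: the target table name and S3 location are hypothetical.
-- Rewrites access_logs as Snappy-compressed Parquet, partitioned by date.
-- Partition columns must come last in the SELECT list.
CREATE TABLE access_logs_parquet
WITH (
  format = 'PARQUET',
  write_compression = 'SNAPPY',
  external_location = 's3://example-bucket/access-logs-parquet/',
  partitioned_by = ARRAY['date']
) AS
SELECT
  status_code,
  response_time,
  date
FROM access_logs;
```

Queries that filter on date against the new table can then skip whole partitions. The target location should be empty before running the statement, and a single CTAS query can only write a limited number of partitions, so a large backfill may need to be split into several statements.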

What We Like

Pay per query: Only pay for data scanned
Zero maintenance: No servers, no tuning, no upgrades
Instant queries: No waiting for cluster spin-up
Integration: Works seamlessly with AWS Glue, QuickSight, and other services

What We Don't Like

Cost unpredictability: Poorly optimised queries on large datasets can be expensive
No indexes: Relies entirely on partitioning and columnar formats for performance
Concurrency limits: Default limits can be restrictive for multi-user workloads
