AWS Athena
Serverless interactive SQL query service that executes queries directly against data in S3 using Presto/Trino, charging per TB of data scanned.
Score Breakdown
🔒 Security
All Athena API calls are over TLS. IAM policies can restrict query access to specific databases/tables. AWS Lake Formation enables column-level and row-level security. Query results in S3 should be encrypted at rest using SSE-S3 or SSE-KMS.
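As a minimal sketch of the encryption advice above, the helper below builds the `ResultConfiguration` argument that boto3's `start_query_execution` accepts. The helper name `result_configuration` is hypothetical, and the bucket path and KMS key ARN in the usage note are placeholders.

```python
from typing import Optional


def result_configuration(output_location: str, kms_key_arn: Optional[str] = None) -> dict:
    """Build a ResultConfiguration dict that encrypts query results at rest.

    With no KMS key, results use SSE-S3; with a key, they use SSE-KMS.
    """
    if kms_key_arn:
        encryption = {"EncryptionOption": "SSE_KMS", "KmsKey": kms_key_arn}
    else:
        encryption = {"EncryptionOption": "SSE_S3"}
    return {
        "OutputLocation": output_location,
        "EncryptionConfiguration": encryption,
    }
```

A caller would pass the result as `ResultConfiguration=result_configuration("s3://example-bucket/athena-results/")` when submitting a query. A workgroup can also enforce its own result settings, in which case the workgroup configuration takes precedence.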
Best When
Your data already lives in S3 in columnar format (Parquet/ORC) with well-defined partitions, and you need occasional SQL analytics without managing a persistent cluster.
Avoid When
Queries touch unpartitioned text files or scan full tables frequently, as costs compound quickly and query times degrade compared to a properly partitioned columnar layout.
Use Cases
- Query partitioned Parquet or ORC tables in S3 data lakes with standard SQL without managing any infrastructure
- Run ad-hoc cost investigations by querying AWS Cost and Usage Reports stored in S3 using Athena's native CUR integration
- Automate async query pipelines: submit via StartQueryExecution, poll GetQueryExecution for SUCCEEDED status, fetch results from S3 output location
- Create federated queries across RDS, DynamoDB, and S3 in a single SQL statement using Athena Federated Query connectors
- Generate audit reports from CloudTrail logs stored in S3 by querying the Athena-managed CloudTrail table in the default Glue catalog
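The async pipeline in the list above can be sketched as a submit-then-poll helper. This is an illustration rather than a reference implementation: `run_query` is a hypothetical name, `client` is assumed to behave like `boto3.client("athena")`, and the polling interval and poll limit are arbitrary defaults.

```python
import time

# States after which Athena will not change the query status again.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_query(client, sql, database, output_location, poll_seconds=1.0, max_polls=600):
    """Submit a query, then poll until it reaches a terminal state.

    Returns (query_execution_id, final_state, s3_result_path).
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        execution = client.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state in TERMINAL_STATES:
            # OutputLocation here is the S3 path of this query's result file.
            return query_id, state, execution["ResultConfiguration"].get("OutputLocation")
        time.sleep(poll_seconds)

    raise TimeoutError(f"query {query_id} still running after {max_polls} polls")
```

The caller is expected to check the returned state before reading results; on FAILED or CANCELLED, `Status.StateChangeReason` in the `get_query_execution` response usually explains why.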
Not For
- Low-latency OLTP queries requiring sub-second response: Athena query startup alone takes 1-3 seconds minimum
- Transactional workloads with INSERT/UPDATE/DELETE patterns (use Aurora or DynamoDB instead)
- Frequent repeated queries on the same data without caching: each scan costs $5/TB and identical queries are billed repeatedly unless results reuse is enabled
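To make the per-scan cost concrete, here is a rough estimator based on the $5/TB figure above. It is a sketch under stated assumptions: it treats 1 TB as 2^40 bytes, applies Athena's 10 MB per-query minimum, and ignores rounding to the nearest megabyte. The function name `estimate_scan_cost` is hypothetical; the byte count would come from `Statistics.DataScannedInBytes` in a `GetQueryExecution` response.

```python
def estimate_scan_cost(bytes_scanned, price_per_tb=5.0, min_bytes=10 * 1024**2):
    """Estimate the charge in USD for one query from its scanned byte count.

    Assumes 1 TB = 1024**4 bytes and a 10 MB billing minimum per query.
    """
    billed_bytes = max(bytes_scanned, min_bytes)
    return billed_bytes / 1024**4 * price_per_tb
```

Under these assumptions a query that scans 1 TiB is estimated at $5.00, so running it 100 times without result reuse would come to roughly $500.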
Interface
Authentication
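Standard AWS IAM authentication. Requires athena:StartQueryExecution, athena:GetQueryExecution, and athena:GetQueryResults, s3:GetObject and s3:PutObject on the query results bucket (Athena writes result files there), read access to the source data in S3, and Glue catalog read permissions. Fine-grained column-level security via Lake Formation.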
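The permissions above could be expressed as an IAM policy document along the following lines. This is an illustrative starting point, not a vetted policy: the function name, bucket name, and prefix are placeholders. It also includes athena:GetQueryExecution (needed for polling) and s3:PutObject (Athena writes result files to the output location). A real policy would also grant read access to the source data and narrow the `Resource` fields.

```python
import json


def athena_query_policy(results_bucket, results_prefix="athena-results/"):
    """Return a minimal IAM policy document for running Athena queries."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Submit queries, poll their status, and page through results.
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                ],
                "Resource": "*",
            },
            {
                # Write and read result files in the output location.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{results_bucket}/{results_prefix}*",
            },
            {
                # Read table metadata from the Glue Data Catalog.
                "Effect": "Allow",
                "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetPartitions"],
                "Resource": "*",
            },
        ],
    }


print(json.dumps(athena_query_policy("example-results-bucket"), indent=2))
```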
Pricing
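Billing is per TB scanned, so costs add up quickly on unoptimized queries. Partition pruning and columnar formats (Parquet) can reduce scan costs by 10-100x. Results reuse serves identical queries from cached results at no scan charge; the maximum age is configurable up to 7 days and defaults to 60 minutes.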
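Results reuse is requested per query. The sketch below builds the `ResultReuseConfiguration` argument for boto3's `start_query_execution`; the helper name `reuse_configuration` is hypothetical, and it assumes the workgroup runs an engine version that supports result reuse.

```python
MAX_REUSE_MINUTES = 7 * 24 * 60  # 10080 minutes, i.e. 7 days


def reuse_configuration(max_age_minutes=60):
    """Build a ResultReuseConfiguration dict that enables reuse by age."""
    if not 0 <= max_age_minutes <= MAX_REUSE_MINUTES:
        raise ValueError(f"max age must be between 0 and {MAX_REUSE_MINUTES} minutes")
    return {
        "ResultReuseByAgeConfiguration": {
            "Enabled": True,
            "MaxAgeInMinutes": max_age_minutes,
        }
    }
```

It would be passed as `ResultReuseConfiguration=reuse_configuration(120)` alongside the usual `QueryString` and `ResultConfiguration` arguments. A shorter maximum age trades fewer cache hits for fresher results.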
Agent Metadata
Known Gotchas
- ⚠ Query execution is asynchronous: StartQueryExecution returns immediately, and agents must poll GetQueryExecution until State is SUCCEEDED, FAILED, or CANCELLED rather than blocking on the initial call
- ⚠ Results are stored in S3 at a configurable output location; GetQueryResults returns at most 1000 rows per call, so agents must paginate with NextToken or read the result file directly from S3 for large result sets
- ⚠ Partition pruning only applies when the WHERE clause filters directly on the partition columns defined in the Glue catalog, using matching types. Filtering on a non-partition column instead, or a type mismatch that forces a cast on the partition column, can cause a full table scan and unexpected costs (a misspelled column name fails the query rather than scanning)
- ⚠ DDL statements (CREATE TABLE, MSCK REPAIR TABLE) run as separate query executions; each StartQueryExecution call accepts a single statement, so DDL and DML cannot be combined in one call
- ⚠ The default DML query timeout is 30 minutes; long-running queries that exceed it are terminated without notifying the caller, so agents must check for the CANCELLED state in addition to FAILED and read Status.StateChangeReason for the cause
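The pagination gotcha above can be handled with a small loop over `NextToken`. This is a minimal sketch: `fetch_all_rows` is a hypothetical helper, and `client` is assumed to behave like `boto3.client("athena")`. For SELECT statements the first row returned is typically the column header, so callers may want to drop it.

```python
def fetch_all_rows(client, query_execution_id, page_size=1000):
    """Collect every result row by following NextToken across pages.

    Each row is returned as a list of strings (None for NULL cells).
    """
    rows = []
    kwargs = {"QueryExecutionId": query_execution_id, "MaxResults": page_size}
    while True:
        page = client.get_query_results(**kwargs)
        for row in page["ResultSet"]["Rows"]:
            # A cell holding NULL has no VarCharValue key.
            rows.append([cell.get("VarCharValue") for cell in row["Data"]])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return rows
```

This holds the whole result set in memory, so for very large outputs reading the result file from the S3 output location is likely the better choice.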
Scores are editorial opinions as of 2026-03-06.