AWS Lake Formation offers several key features that help manage and secure data lakes at scale:
- Storage API: This allows you to scan and filter data using Lake Formation policies. It's charged based on the number of bytes scanned (rounded up to the next megabyte, with a 10 MB minimum). To reduce filtering costs, you can store data in columnar file formats such as Parquet and ORC.
- Governed Tables: These are AWS-managed table types on Amazon S3. They're charged based on the amount of metadata (measured in the number of files tracked), the API calls that retrieve or manipulate that metadata, and the number of bytes processed by the storage optimizer.
- Storage Optimizer: This runs in the background to optimize the layout of Governed Tables, improving query performance. You can manage storage optimization for governed tables using the Storage APIs.
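To make the Storage API rounding rule concrete, here is a minimal sketch of how the billable volume could be worked out. The function name `billable_megabytes` is hypothetical, and two things are assumptions rather than facts from the pricing page: that a megabyte means 1,048,576 bytes, and that the rounding and minimum apply to each scan individually. No price is included because the rate depends on the Region.

```python
import math

# Assumption: 1 MB = 1024 * 1024 bytes. The pricing page may use a different definition.
BYTES_PER_MB = 1024 * 1024

# From the answer above: scans are billed with a 10 MB minimum.
MINIMUM_MB = 10


def billable_megabytes(bytes_scanned: int) -> int:
    """Return the billable size of one scan in whole megabytes.

    Rounds up to the next megabyte, then applies the 10 MB minimum.
    Assumes the rule is applied per scan (not confirmed by the source).
    """
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    rounded_up = math.ceil(bytes_scanned / BYTES_PER_MB)
    return max(MINIMUM_MB, rounded_up)


if __name__ == "__main__":
    for size in (1, 10 * BYTES_PER_MB + 1, 25 * BYTES_PER_MB):
        print(size, "bytes ->", billable_megabytes(size), "MB billed")
```

Under these assumptions, many very small scans are each billed at the 10 MB minimum, which is one reason columnar formats and fewer, larger scans tend to cost less.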
These features are available in specific AWS regions, including US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Mumbai, Seoul, Singapore, Sydney, Tokyo), Europe (Frankfurt, Ireland, London, Stockholm), Canada (Central), and South America (São Paulo).
For more information, you can explore the AWS Lake Formation documentation, particularly the sections on Storage APIs and governed tables. The Storage APIs documentation describes the StorageOptimizer structure and operations like ListTableStorageOptimizers and UpdateTableStorageOptimizer that allow you to manage storage optimization on governed tables.
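As a rough illustration of those two operations, the sketch below calls them through the boto3 `lakeformation` client. The helper names (`make_client`, `list_optimizers`, `update_optimizer`), the Region, and the database and table names are placeholders. The exact keys accepted inside the optimizer configuration map are an assumption here, so check the Storage APIs documentation before relying on them. Only the first page of results is read.

```python
from typing import Any, Dict, List


def make_client(region_name: str = "us-east-1") -> Any:
    """Create a Lake Formation client (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers below work with any client object

    return boto3.client("lakeformation", region_name=region_name)


def list_optimizers(client: Any, database: str, table: str,
                    optimizer_type: str = "ALL") -> List[Dict[str, Any]]:
    """Return the storage optimizers reported for a governed table.

    Reads only the first page; a NextToken in the response is ignored.
    """
    response = client.list_table_storage_optimizers(
        DatabaseName=database,
        TableName=table,
        StorageOptimizerType=optimizer_type,
    )
    return response.get("StorageOptimizerList", [])


def update_optimizer(client: Any, database: str, table: str,
                     config: Dict[str, Dict[str, str]]) -> str:
    """Apply an optimizer configuration and return the service's result message."""
    response = client.update_table_storage_optimizer(
        DatabaseName=database,
        TableName=table,
        StorageOptimizerConfig=config,
    )
    return response.get("Result", "")


if __name__ == "__main__":
    lf = make_client()
    # "sales_db" and "orders" are placeholder names.
    for optimizer in list_optimizers(lf, "sales_db", "orders"):
        print(optimizer.get("StorageOptimizerType"), optimizer.get("Config"))
    # Assumption: the config key and the "is_enabled" setting may differ; verify in the docs.
    print(update_optimizer(lf, "sales_db", "orders",
                           {"COMPACTION": {"is_enabled": "true"}}))
```

Passing the client in as an argument keeps the helpers easy to exercise with a stub, without touching a real account.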
Lake Formation also offers broader data governance features like centralized data permissions, simplified security management, data sharing capabilities, and compliance monitoring.
Sources
AWS Lake Formation Pricing - Amazon Web Services
Storage APIs - AWS Lake Formation
Data Lake Governance – AWS Lake Formation Features – AWS
Supported Regions - AWS Lake Formation
