Bottom Line Up Front (BLUF):
The course introduces the database options available on AWS and shows how to use the AWS Well-Architected Framework to identify the best fit for different customer needs. It also gives brief overviews of the tools used to build fully managed workflows around AWS databases.
AWS Well-Architected Framework
The course began with a brief overview of the AWS Well-Architected Framework's principles and its 6 pillars. These are used to ask direct questions that help guide customers to the right database solution.
The 6 Pillars
- Operational Excellence – running and monitoring systems for continual improvement
- Reliability – workloads performing their intended functions and how to recover quickly from failures
- Performance Efficiency – structured and streamlined allocation of IT and computing resources
- Cost Optimization – avoiding unnecessary costs
- Sustainability – minimizing the environmental impacts of running cloud solutions
- Security – protecting information and systems
AWS database services offer a wide range of solutions with varying degrees of reliability, performance, and security, so customers can weigh those qualities against the cost of the service.
Databases
When selecting a database service, it is important to understand what data is stored and how that data is accessed. Database services meant for large volumes and fast, reliable access generally cost more for that reliability.
Relational databases (Amazon Relational Database Service (Amazon RDS), Amazon Aurora)
- Store data with a defined schema
- Commonly used for transactional and traditional applications
- AWS enables managed setup, operation, and scaling capabilities
- RDS – Compatible with PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server
- Aurora – Part of RDS; compatible with MySQL and PostgreSQL, with a serverless option
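To make "defined schema" and "transactional" concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a local stand-in (RDS and Aurora run engines such as PostgreSQL or MySQL, not SQLite), and the accounts table and transfer helper are hypothetical names made up for illustration.

```python
import sqlite3

# sqlite3 is only a local stand-in; Amazon RDS and Aurora run engines
# such as PostgreSQL or MySQL. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # schema enforces rules
)
conn.executemany(
    "INSERT INTO accounts (id, owner, balance) VALUES (?, ?, ?)",
    [(1, "alice", 100), (2, "bob", 50)],
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically; return True on success, False if rolled back."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; both updates are undone.
        return False


def balances(conn):
    """Return {owner: balance} for every account."""
    return dict(conn.execute("SELECT owner, balance FROM accounts ORDER BY id"))
```

The point of the sketch is that the schema rejects invalid data and that a failed transfer leaves both rows unchanged, which is the behavior transactional applications usually rely on.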
Key-Value (DynamoDB, Amazon Keyspaces)
- Optimized to store and retrieve key-value pairs in large volumes
- Serverless
- Keyspaces – Specifically for Apache Cassandra
In-memory (ElastiCache, Amazon MemoryDB)
- For read-heavy and compute-intensive applications
- Serverless
- AWS offers these as managed, Redis-compatible solutions for data-intensive applications (ElastiCache also supports Memcached)
Document (DocumentDB)
- Store data as JSON documents
- Storage and compute are decoupled to scale independently
Graph (Amazon Neptune)
- Optimized for querying and navigating relationships between highly connected datasets
- Option for serverless
- Uses replicas for high availability
Ledger (Amazon Quantum Ledger Database (Amazon QLDB))
- Provides transparent and immutable logs
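One common way to make a log tamper-evident is to chain entries together with hashes, so that changing an old entry invalidates everything after it. The sketch below is a toy illustration of that general idea only; it does not use or reproduce the QLDB API or journal format, and ToyLedger and entry_hash are hypothetical names.

```python
import hashlib
import json


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


class ToyLedger:
    """Append-only log in which each entry commits to the one before it."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    def append(self, payload):
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append(
            {"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)}
        )

    def verify(self):
        """Recompute the chain; return False if any entry was altered."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != entry_hash(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True
```

Appending a few entries and then editing an earlier payload in place makes verify() return False, which is the property "immutable logs" refers to.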
Time-series (Amazon Timestream)
- Specializes in data that is tracked and aggregated over time
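The categories in this section can be summarized as a simple lookup from data and access traits to candidate services. This is a rough sketch, not an official selection tool: the service lists come from the notes above, while the trait names and the suggest_services helper are hypothetical and only meant to show the kind of question-driven matching the course describes.

```python
# Categories and services as listed in the notes above.
DATABASE_CATEGORIES = {
    "relational": ["Amazon RDS", "Amazon Aurora"],
    "key-value": ["Amazon DynamoDB", "Amazon Keyspaces"],
    "in-memory": ["Amazon ElastiCache", "Amazon MemoryDB"],
    "document": ["Amazon DocumentDB"],
    "graph": ["Amazon Neptune"],
    "ledger": ["Amazon QLDB"],
    "time-series": ["Amazon Timestream"],
}

# Hypothetical trait names; a real conversation with a customer would be
# far more nuanced than a single keyword per category.
TRAIT_TO_CATEGORY = {
    "defined_schema": "relational",
    "key_lookup_at_scale": "key-value",
    "read_heavy_low_latency": "in-memory",
    "json_documents": "document",
    "highly_connected_data": "graph",
    "immutable_history": "ledger",
    "tracked_over_time": "time-series",
}


def suggest_services(traits):
    """Map each recognized trait to its category and candidate services."""
    suggestions = {}
    for trait in traits:
        category = TRAIT_TO_CATEGORY.get(trait)
        if category is not None:  # silently skip traits we do not know
            suggestions[category] = DATABASE_CATEGORIES[category]
    return suggestions
```

For example, a workload described as ["defined_schema", "tracked_over_time"] would surface both the relational and the time-series options, which is a reminder that one application can reasonably use more than one database service.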
Data Workflow
Amazon provides many services to create a managed data workflow. Here are a few of those services and how they can work together.
- Amazon Kinesis Data Firehose – captures, transforms, and loads data into storage services
- AWS Lambda – run code without provisioning or managing servers, can execute code in response to events
- Amazon S3 data lake – storage
- Amazon Redshift – data warehouse service for scalable analytics
- Amazon Redshift Spectrum – queries and retrieves structured and semi-structured data from files in Amazon S3 without having to load that data into Amazon Redshift tables
- Amazon QuickSight – reports and dashboards in real time
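As an illustration of how two of these services can work together, here is a minimal sketch of a Lambda function used as a Firehose data transformation step. The event and response shapes (records with recordId, a result status, and base64-encoded data) follow the Firehose transformation contract; the "source" field and the lowercasing rule are made-up examples of a transformation, not something from the course.

```python
import base64
import json


def lambda_handler(event, context):
    """Normalize each incoming record and hand it back to Firehose."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
            # Hypothetical transformation: normalize a "source" field.
            payload["source"] = str(payload.get("source", "unknown")).lower()
            # Newline-delimited JSON is convenient for files landing in S3.
            encoded = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": encoded}
            )
        except ValueError:
            # Bad base64 or bad JSON: return the record unchanged and flag it.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

In a workflow like the one listed above, the transformed records would typically be delivered to S3, where they can be queried in place with Redshift Spectrum or loaded into Redshift and visualized in QuickSight.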
Takeaways
Each database has its advantages and disadvantages. For teams focused on data ingestion, a managed service can help lower the cost of maintaining an infrastructure team. For teams with analytic use cases, Redshift, QuickSight, and other transformation tools can be useful, but they can more easily be replaced with open source tools when reliability and transaction logs are not a concern.