Introduction to Building with AWS Databases

By Lauren Clerkin

Bottom Line Up Front (BLUF):

This course introduces the database options available on AWS and shows how to use the AWS Well-Architected Framework to identify the best fit for different customer needs. It also gives brief overviews of the tools used to build fully managed data workflows around AWS databases.

AWS Well-Architected Framework

The course began with a brief overview of the AWS Well-Architected Framework principles and its six pillars. These are used to ask direct questions that help guide customers to the right database solution.

The 6 Pillars of the AWS Well-Architected Framework

Operational Excellence – running and monitoring systems for continual improvement
Reliability – workloads performing their intended functions and recovering quickly from failures
Performance Efficiency – structured and streamlined allocation of IT and computing resources
Cost Optimization – avoiding unnecessary costs
Sustainability – minimizing the environmental impacts of running cloud solutions
Security – protecting information and systems

AWS database services offer varying degrees of reliability, performance, and security, so customers can weigh those qualities against the cost of the service.

Databases

When selecting a database service, it is important to understand what data is stored and how that data is accessed. Database services built for large volumes and fast, reliable access cost more in exchange for that reliability.

Relational databases (Amazon Relational Database Service (Amazon RDS), Amazon Aurora)

Store data with a defined schema
Commonly used for transactional and traditional applications
AWS enables managed setup, operation, and scaling capabilities
RDS – Compatible with PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server
Aurora – Part of RDS, compatible with MySQL and PostgreSQL, with a serverless option
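
To make "defined schema" and "transactional" concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a local stand-in for a managed engine on RDS or Aurora. The accounts table, its columns, and the transfer helper are hypothetical examples, not material from the course.

```python
import sqlite3

# In-memory SQLite database, used only as a local stand-in for a managed engine.
conn = sqlite3.connect(":memory:")

# A defined schema: column types and constraints are declared up front.
conn.execute(
    """
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        owner   TEXT    NOT NULL,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO accounts (id, owner, balance) VALUES (1, 'alice', 100)")
conn.execute("INSERT INTO accounts (id, owner, balance) VALUES (2, 'bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically; returns True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back together with it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {owner: balance} for every account."""
    return dict(conn.execute("SELECT owner, balance FROM accounts ORDER BY id"))
```

A transfer that would overdraw the source account fails as a whole, which is the behavior transactional applications rely on.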

Key-Value (DynamoDB, Amazon Keyspaces)

Optimized to store and retrieve key-value pairs in large volumes
Serverless
Keyspaces – Specifically for Apache Cassandra

In-memory (ElastiCache, Amazon MemoryDB)

For read-heavy and compute-intensive applications
Serverless
AWS offers these as managed Redis solutions for data-intensive applications
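
For read-heavy applications, a common way to use an in-memory store is the cache-aside pattern: check the cache first and only query the slower database on a miss. The sketch below uses a plain dictionary as a stand-in for the cache; the CacheAside class and the load_from_db callback are hypothetical names, and a real setup would talk to the cache through a Redis client instead.

```python
import time


class CacheAside:
    """Cache-aside helper: check the cache first, fall back to the source."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db  # hypothetical slow lookup, e.g. a SQL query
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}           # key -> (value, expires_at); stand-in for the cache
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._store.get(key)
        now = self._clock()
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Miss or expired entry: read from the source and repopulate the cache.
        self.misses += 1
        value = self._load(key)
        self._store[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying record changes."""
        self._store.pop(key, None)
```

The time-to-live bounds how stale a cached value can get, which is the usual trade-off against database load.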

Document (DocumentDB)

Store data as JSON documents
Storage and compute are decoupled to scale independently

Graph (Amazon Neptune)

Optimized for querying and navigating relationships between highly connected datasets
Option for serverless
Uses replicas for high availability
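
As an illustration of what "navigating relationships" means, the sketch below finds the shortest chain of connections between two people with a breadth-first search. It runs on a plain Python adjacency list with made-up names; it is not Neptune's API, where the same question would be expressed in a graph query language.

```python
from collections import deque

# Hypothetical "follows" relationships, stored as an adjacency list.
FOLLOWS = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["dee"],
    "dee": ["eli"],
    "eli": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest list of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

Answering the same question in a relational database would take one self-join per hop, which is the kind of query graph databases are optimized to avoid.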

Ledger (Amazon Quantum Ledger Database (Amazon QLDB))

Provides transparent and immutable logs
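
One general way to make a log tamper-evident is to chain entries together with cryptographic hashes, so that changing any past entry invalidates everything after it. The sketch below shows that idea in plain Python; the Ledger class is hypothetical and is not QLDB's API or its internal journal format.

```python
import hashlib
import json


class Ledger:
    """Append-only log where each entry's hash covers the previous hash."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(data, prev_hash):
        payload = json.dumps({"data": data, "prev_hash": prev_hash}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, data):
        """Add an entry linked to the current tail and return its hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {
            "data": data,
            "prev_hash": prev_hash,
            "hash": self._digest(data, prev_hash),
        }
        self.entries.append(entry)
        return entry["hash"]

    def verify(self):
        """Recompute every hash; False means some entry was altered."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != self._digest(entry["data"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True
```

Editing the data of an earlier entry makes verify() return False, which is what lets an auditor trust the history.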

Time-series (Amazon Timestream)

Specializes in data that is tracked and aggregated over time
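
"Aggregated over time" usually means grouping timestamped readings into fixed windows and summarizing each window. The sketch below does this in plain Python for illustration; the function name and the sample readings are hypothetical, and a time-series database would run the equivalent as a query.

```python
from collections import defaultdict
from datetime import datetime, timezone


def average_per_bucket(points, bucket_seconds):
    """Group (timestamp, value) points into fixed windows and average each."""
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [running total, count]
    for ts, value in points:
        epoch = int(ts.timestamp())
        bucket_start = epoch - (epoch % bucket_seconds)
        sums[bucket_start][0] += value
        sums[bucket_start][1] += 1
    # Return the buckets in chronological order, keyed by window start (UTC).
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): total / count
        for start, (total, count) in sorted(sums.items())
    }


# Hypothetical temperature readings from a single sensor.
readings = [
    (datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc), 20.0),
    (datetime(2024, 1, 1, 12, 0, 40, tzinfo=timezone.utc), 22.0),
    (datetime(2024, 1, 1, 12, 1, 10, tzinfo=timezone.utc), 30.0),
]
per_minute = average_per_bucket(readings, 60)
```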

Data Workflow

Amazon provides many services to create a managed data workflow. Here are a few of those services and how they can work together.

Amazon Kinesis Data Firehose – captures, transforms, and loads data into storage services
AWS Lambda – run code without provisioning or managing servers, can execute code in response to events
Amazon S3 data lake – storage
Amazon Redshift – data warehousing service for scalable analytics
Amazon Redshift Spectrum – queries and retrieves structured and semi-structured data from files in Amazon S3 without having to load that data into Amazon Redshift tables
Amazon QuickSight – reports and dashboards in real-time
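
As one example of these services working together, Firehose can invoke a Lambda function to transform records before delivering them to S3. The sketch below follows the request and response shape Firehose uses for transformation functions (a recordId, a result of Ok, Dropped, or ProcessingFailed, and base64-encoded data). The filtering rule and the level and message fields are hypothetical.

```python
import base64
import json


def handler(event, context):
    """Firehose transformation: drop DEBUG records, normalize the rest."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Unparseable input: return the record unchanged and flag it.
            output.append(
                {"recordId": record_id, "result": "ProcessingFailed", "data": record["data"]}
            )
            continue

        if payload.get("level") == "DEBUG":
            # Hypothetical rule: debug noise is not worth storing in the data lake.
            output.append(
                {"recordId": record_id, "result": "Dropped", "data": record["data"]}
            )
            continue

        payload["message"] = str(payload.get("message", "")).strip()
        # Newline-delimited JSON keeps the delivered S3 objects easy to query.
        line = json.dumps(payload, sort_keys=True) + "\n"
        output.append(
            {
                "recordId": record_id,
                "result": "Ok",
                "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
            }
        )
    return {"records": output}
```

Every incoming recordId appears exactly once in the response, so the delivery stream can account for each record.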

AWS Databases Takeaways

Each database has its advantages and disadvantages. For teams focused on data ingestion, a managed service can help lower the cost of maintaining an infrastructure team. For teams with analytic use cases, Redshift, QuickSight, and other transformation tools can be useful, but they can more easily be replaced with open-source tools when reliability and transaction logs are not a concern.