AWS Infrastructure Design
Scalable AWS architectures with EC2, ECS, RDS, and Lambda: best practices and cost optimization for enterprise cloud infrastructures.
🟠 AWS Fundamentals
Amazon Web Services (AWS) is the world's leading cloud computing platform, with over 200 services. A well-designed AWS architecture enables scalability, high availability, and cost efficiency.
🌍 Global
33 regions, 105 Availability Zones
⚡ Scalable
Elastic resource scaling
💰 Pay-as-you-go
Pay only for the resources you use
🌐 VPC & Networking
VPC Design Example
# VPC with Terraform
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name        = "production-vpc"
    Environment = "prod"
  }
}

# Public subnets for the load balancer (10.0.1.0/24, 10.0.2.0/24, ...)
resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.${count.index + 1}.0/24"
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "public-subnet-${count.index + 1}"
    Type = "Public"
  }
}

# Private subnets for the application layer (10.0.10.0/24, 10.0.11.0/24, ...)
resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 10}.0/24"
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "private-subnet-${count.index + 1}"
    Type = "Private"
  }
}

# Database subnets (10.0.20.0/24, 10.0.21.0/24, ...)
resource "aws_subnet" "database" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 20}.0/24"
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "database-subnet-${count.index + 1}"
    Type = "Database"
  }
}
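The subnet resources above iterate over var.availability_zones, which the example never declares. A minimal sketch of such a declaration follows; the default zone names are an assumption (any region works), and the upper bound in the validation is derived from the CIDR offsets used above: with ten or more zones, the tenth public subnet (10.0.10.0/24) would collide with the first private subnet.

```hcl
# Hypothetical declaration of the variable the subnet resources iterate over.
variable "availability_zones" {
  description = "Availability Zones to spread the subnets across"
  type        = list(string)

  # Assumed region; replace with the zones of your own region.
  default = ["eu-central-1a", "eu-central-1b", "eu-central-1c"]

  validation {
    # At least 2 zones for Multi-AZ; at most 9 so the public range
    # (10.0.1.0/24 to 10.0.9.0/24) stays below the private range (10.0.10.0/24 and up).
    condition     = length(var.availability_zones) >= 2 && length(var.availability_zones) <= 9
    error_message = "Provide between 2 and 9 Availability Zones."
  }
}
```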
Networking Components
- Internet Gateway - internet access for public subnets
- NAT Gateway - outbound internet access for private subnets
- Route Tables - traffic routing rules
- Security Groups - instance-level firewall
- NACLs - subnet-level access control
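The components listed above could be wired to the example VPC roughly as follows. This is a sketch under assumptions, not part of the original setup: the resource names are hypothetical, and a single NAT gateway is used to keep the example short, even though one NAT gateway per Availability Zone would be more resilient.

```hcl
# Internet gateway: gives public subnets a route to the internet
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

# Elastic IP for the NAT gateway ("domain" requires AWS provider v5+;
# older provider versions use "vpc = true" instead)
resource "aws_eip" "nat" {
  domain = "vpc"
}

# Single NAT gateway in the first public subnet (assumption: one is enough here)
resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id
  depends_on    = [aws_internet_gateway.main]
}

# Public route table: default route via the internet gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

# Private route table: default route via the NAT gateway (outbound only)
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }
}

# Attach each subnet to its route table
resource "aws_route_table_association" "public" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}
```

The database subnets are deliberately left on the VPC's main route table, which only contains the local route, so they have no path to the internet in this sketch.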
Best Practices
- Multi-AZ design - high availability
- Subnet segmentation - public/private/database tiers
- CIDR planning - sufficiently large IP ranges
- Security Groups - principle of least privilege
- VPC Endpoints - private access to AWS services
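As an illustration of the VPC Endpoints item, a gateway endpoint lets instances in private subnets reach S3 without going through a NAT gateway. The sketch below assumes a private route table named aws_route_table.private exists elsewhere in the configuration (a hypothetical name), and reuses var.aws_region from the ECS example.

```hcl
# Gateway endpoint for S3: traffic to S3 stays on the AWS network
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"

  # Assumption: a route table for the private subnets is defined elsewhere
  route_table_ids = [aws_route_table.private.id]

  tags = {
    Name = "s3-gateway-endpoint"
  }
}
```

Gateway endpoints exist for S3 and DynamoDB; other services such as ECR or Secrets Manager use interface endpoints, which are billed per hour and per GB processed.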
🖥️ EC2 Instances
Auto Scaling Group Setup
# Launch Template
resource "aws_launch_template" "web_server" {
  name_prefix            = "web-server-"
  image_id               = data.aws_ami.amazon_linux.id
  instance_type          = "t3.medium"
  vpc_security_group_ids = [aws_security_group.web_server.id]

  # Render the bootstrap script relative to this module, then base64-encode it
  user_data = base64encode(templatefile("${path.module}/userdata.sh", {
    app_version = var.app_version
  }))

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name        = "web-server"
      Environment = var.environment
    }
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Auto Scaling Group
resource "aws_autoscaling_group" "web_servers" {
  name                      = "web-servers-asg"
  vpc_zone_identifier       = aws_subnet.private[*].id
  target_group_arns         = [aws_lb_target_group.web_servers.arn]
  health_check_type         = "ELB"
  health_check_grace_period = 300

  min_size         = 2
  max_size         = 10
  desired_capacity = 3

  launch_template {
    id      = aws_launch_template.web_server.id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "web-server-asg"
    propagate_at_launch = true
  }
}
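The Auto Scaling group above defines capacity limits but no rule that actually changes the instance count. One common option is a target tracking policy, sketched below; the policy name and the 60% CPU target are assumptions chosen for illustration.

```hcl
# Target tracking: add or remove instances to keep average CPU near the target
resource "aws_autoscaling_policy" "web_servers_cpu" {
  name                   = "web-servers-cpu-target" # hypothetical name
  autoscaling_group_name = aws_autoscaling_group.web_servers.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }

    # Assumed target; tune it to the workload
    target_value = 60.0
  }
}
```

Once a scaling policy manages capacity, Terraform may try to reset desired_capacity to 3 on every apply. Adding ignore_changes = [desired_capacity] to a lifecycle block on the Auto Scaling group is a frequently used way to avoid that.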
Instance Types
- t3/t4g - burstable performance
- m5/m6i - general purpose
- c5/c6i - compute optimized
- r5/r6i - memory optimized
Cost Optimization
- Reserved Instances - 1- or 3-year commitment
- Spot Instances - up to 90% savings
- Savings Plans - flexible commitments
- Right Sizing - choosing the optimal instance size
Monitoring
- CloudWatch - metrics & alarms
- SSM Agent - patch management
- Inspector - security assessment
- Systems Manager - operations
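To make the CloudWatch item concrete, the sketch below raises an alarm when the average CPU utilization of the Auto Scaling group stays above 80% for ten minutes. The threshold, the alarm name, and the SNS topic aws_sns_topic.alerts are assumptions, not part of the original configuration.

```hcl
# Alarm on sustained high CPU across the web server Auto Scaling group
resource "aws_cloudwatch_metric_alarm" "web_servers_cpu_high" {
  alarm_name          = "web-servers-cpu-high" # hypothetical name
  alarm_description   = "Average CPU above 80% for 10 minutes"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  comparison_operator = "GreaterThanThreshold"
  threshold           = 80
  period              = 300 # seconds per evaluation period
  evaluation_periods  = 2   # two consecutive periods, i.e. 10 minutes

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web_servers.name
  }

  # Assumption: an SNS topic for notifications is defined elsewhere
  alarm_actions = [aws_sns_topic.alerts.arn]
}
```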
🐳 ECS & Fargate
ECS Service with Fargate
# ECS Cluster
resource "aws_ecs_cluster" "main" {
  name = "production-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

# Task Definition
resource "aws_ecs_task_definition" "web_app" {
  family                   = "web-app"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = "512"
  memory                   = "1024"
  execution_role_arn       = aws_iam_role.ecs_execution.arn
  task_role_arn            = aws_iam_role.ecs_task.arn

  container_definitions = jsonencode([
    {
      name  = "web-app"
      image = "${var.ecr_repository_url}:latest"
      portMappings = [
        {
          containerPort = 3000
          protocol      = "tcp"
        }
      ]
      environment = [
        {
          name  = "NODE_ENV"
          value = "production"
        }
      ]
      # Injected at container start by the execution role, not stored in the task definition
      secrets = [
        {
          name      = "DATABASE_URL"
          valueFrom = aws_ssm_parameter.database_url.arn
        }
      ]
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          awslogs-group         = aws_cloudwatch_log_group.app.name
          awslogs-region        = var.aws_region
          awslogs-stream-prefix = "ecs"
        }
      }
      # Requires curl to be present in the container image
      healthCheck = {
        command     = ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
        interval    = 30
        timeout     = 5
        retries     = 3
        startPeriod = 60
      }
    }
  ])
}

# ECS Service
resource "aws_ecs_service" "web_app" {
  name            = "web-app-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.web_app.arn
  desired_count   = 3
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = aws_subnet.private[*].id
    security_groups  = [aws_security_group.ecs_service.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.web_app.arn
    container_name   = "web-app"
    container_port   = 3000
  }

  # Rolling deployment: start new tasks before stopping old ones
  deployment_maximum_percent         = 200
  deployment_minimum_healthy_percent = 100

  depends_on = [aws_lb_listener.web_app]
}
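The service above runs a fixed number of tasks. Scaling it with load is handled by Application Auto Scaling rather than by the service itself; the following sketch assumes a range of 3 to 10 tasks and a 60% CPU target, both chosen only for illustration.

```hcl
# Register the ECS service's desired count as a scalable target
resource "aws_appautoscaling_target" "web_app" {
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.web_app.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  min_capacity       = 3  # assumed lower bound, matches desired_count above
  max_capacity       = 10 # assumed upper bound
}

# Target tracking on the service's average CPU utilization
resource "aws_appautoscaling_policy" "web_app_cpu" {
  name               = "web-app-cpu-target" # hypothetical name
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.web_app.service_namespace
  resource_id        = aws_appautoscaling_target.web_app.resource_id
  scalable_dimension = aws_appautoscaling_target.web_app.scalable_dimension

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }

    target_value = 60 # assumed target; tune it to the workload
  }
}
```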
🗄️ RDS & Aurora
Aurora PostgreSQL Cluster
# Aurora Subnet Group
resource "aws_db_subnet_group" "aurora" {
  name       = "aurora-subnet-group"
  subnet_ids = aws_subnet.database[*].id

  tags = {
    Name = "Aurora DB subnet group"
  }
}

# Aurora Cluster
resource "aws_rds_cluster" "aurora_postgresql" {
  cluster_identifier = "aurora-postgresql-cluster"
  engine             = "aurora-postgresql"
  engine_version     = "13.7"
  database_name      = var.database_name
  master_username    = var.database_username
  # Sensitive: supply via a secret store or TF_VAR_ environment variable, never commit it
  master_password = var.database_password

  vpc_security_group_ids = [aws_security_group.aurora.id]
  db_subnet_group_name   = aws_db_subnet_group.aurora.name

  # Backup configuration (retention in days; also the point-in-time recovery window)
  backup_retention_period = 30
  preferred_backup_window = "03:00-04:00"

  # Maintenance
  preferred_maintenance_window = "sun:04:00-sun:05:00"

  # Encryption
  storage_encrypted = true
  kms_key_id        = aws_kms_key.aurora.arn

  # Export PostgreSQL logs to CloudWatch Logs
  enabled_cloudwatch_logs_exports = ["postgresql"]

  # Snapshot tagging and protection against accidental deletion
  copy_tags_to_snapshot = true
  deletion_protection   = true

  tags = {
    Name        = "Aurora PostgreSQL Cluster"
    Environment = var.environment
  }
}

# Aurora Instances
resource "aws_rds_cluster_instance" "aurora_instances" {
  count              = 2
  identifier         = "aurora-instance-${count.index + 1}"
  cluster_identifier = aws_rds_cluster.aurora_postgresql.id
  instance_class     = "db.r6g.large"
  engine             = aws_rds_cluster.aurora_postgresql.engine
  engine_version     = aws_rds_cluster.aurora_postgresql.engine_version

  # Performance Insights and Enhanced Monitoring (60-second granularity)
  performance_insights_enabled = true
  monitoring_interval          = 60
  monitoring_role_arn          = aws_iam_role.rds_monitoring.arn

  tags = {
    Name = "Aurora Instance ${count.index + 1}"
  }
}
✅ Aurora Advantages
Performance
Up to 3x the throughput of standard PostgreSQL, according to AWS
Scaling
Automatic storage scaling up to 128 TiB
Availability
99.99% SLA with Multi-AZ deployment
Backup
Automatic, continuous backups
🔒 Security & IAM
IAM Roles for ECS Tasks
# ECS Task Execution Role (used by ECS to pull images, write logs, and fetch secrets)
resource "aws_iam_role" "ecs_execution" {
  name = "ecs-execution-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ecs-tasks.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "ecs_execution" {
  role       = aws_iam_role.ecs_execution.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

# Custom policy for secrets access
resource "aws_iam_role_policy" "ecs_secrets" {
  name = "ecs-secrets-policy"
  role = aws_iam_role.ecs_execution.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ssm:GetParameters",
          "secretsmanager:GetSecretValue",
          "kms:Decrypt"
        ]
        # Limited to the application's own parameters, secrets, and KMS key
        Resource = [
          "arn:aws:ssm:*:*:parameter/myapp/*",
          "arn:aws:secretsmanager:*:*:secret:myapp/*",
          aws_kms_key.app.arn
        ]
      }
    ]
  })
}

# ECS Task Role (runtime permissions of the application itself)
resource "aws_iam_role" "ecs_task" {
  name = "ecs-task-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ecs-tasks.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "ecs_task_s3" {
  name = "ecs-task-s3-policy"
  role = aws_iam_role.ecs_task.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:PutObject",
          "s3:DeleteObject"
        ]
        # Object-level access to the uploads bucket only
        Resource = [
          "${aws_s3_bucket.app_uploads.arn}/*"
        ]
      }
    ]
  })
}
🛡️ Security Best Practices
- Least Privilege - only the minimum required permissions
- Role-based Access - IAM roles instead of access keys
- Secrets Management - AWS Secrets Manager/SSM Parameter Store
- Encryption - encryption at rest and in transit
- VPC Security - private subnets, Security Groups, NACLs
- CloudTrail - API logging for compliance
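Least privilege applies to network access as well as IAM. The Aurora cluster references aws_security_group.aurora without showing it; a possible definition, sketched here as an assumption, admits PostgreSQL traffic only from the ECS service's security group and nothing else.

```hcl
# Hypothetical definition of the security group referenced by the Aurora cluster
resource "aws_security_group" "aurora" {
  name_prefix = "aurora-"
  description = "PostgreSQL access from the ECS service only"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "PostgreSQL from ECS tasks"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.ecs_service.id]
  }

  # No egress block: Terraform removes the default allow-all egress rule,
  # so the database cannot initiate outbound connections.

  lifecycle {
    create_before_destroy = true
  }
}
```

Referencing the source security group instead of a CIDR range means the rule keeps working when task IP addresses change.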
💰 Cost Optimization
Cost Optimization Strategies
- Reserved Instances - 1- or 3-year commitment for savings of up to 72%
- Spot Instances - up to 90% cheaper for flexible workloads
- Auto Scaling - automatic adjustment to demand
- S3 Intelligent-Tiering - automatic storage optimization
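For the S3 Intelligent-Tiering item, a lifecycle rule can move objects into that storage class automatically. The sketch below applies it to the aws_s3_bucket.app_uploads bucket referenced in the IAM example; the rule ID and the 30-day delay are assumptions.

```hcl
# Move all objects in the uploads bucket to Intelligent-Tiering
resource "aws_s3_bucket_lifecycle_configuration" "app_uploads" {
  bucket = aws_s3_bucket.app_uploads.id

  rule {
    id     = "move-to-intelligent-tiering" # hypothetical rule ID
    status = "Enabled"

    # Empty filter: the rule applies to every object in the bucket
    filter {}

    transition {
      days          = 30 # assumed delay; adjust to the access pattern
      storage_class = "INTELLIGENT_TIERING"
    }
  }
}
```

Intelligent-Tiering charges a small monitoring fee per object, so it tends to pay off for larger objects with unpredictable access patterns rather than for many tiny files.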
Monitoring & Alerting
- AWS Cost Explorer - detailed cost analysis
- Budgets & Alerts - proactive cost control
- Trusted Advisor - cost optimization recommendations
- Resource Tagging - cost allocation by project/team
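Consistent tagging is easiest to enforce at the provider level. The sketch below uses the AWS provider's default_tags block so that every taggable resource inherits the same cost allocation tags; the Project and Team values are placeholders.

```hcl
provider "aws" {
  region = var.aws_region

  # Applied to every taggable resource created through this provider
  default_tags {
    tags = {
      Project     = "myapp"    # placeholder value
      Team        = "platform" # placeholder value
      Environment = var.environment
    }
  }
}
```

Tags only show up in Cost Explorer after they have been activated as cost allocation tags in the Billing console.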
Cost Budget with Terraform
resource "aws_budgets_budget" "monthly_cost" {
  name         = "monthly-cost-budget"
  budget_type  = "COST"
  limit_amount = "1000"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  # Restrict the budget to EC2 compute costs
  # (AWS provider v5+ uses cost_filter blocks; older versions used a cost_filters map)
  cost_filter {
    name   = "Service"
    values = ["Amazon Elastic Compute Cloud - Compute"]
  }

  # Notify when actual spend exceeds 80% of the limit
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = [var.admin_email]
  }

  # Notify when forecasted spend exceeds 100% of the limit
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 100
    threshold_type             = "PERCENTAGE"
    notification_type          = "FORECASTED"
    subscriber_email_addresses = [var.admin_email]
  }
}
🎯 Summary
A well-designed AWS infrastructure is the foundation for scalable, secure, and cost-efficient cloud applications. With the right strategies, you achieve:
Goals Achieved
- ✅ Highly available Multi-AZ architecture
- ✅ Automatic scaling and recovery
- ✅ Enterprise-grade security
- ✅ Cost-optimized resource usage
Next Steps
- 🔄 CI/CD pipeline integration
- 📊 Advanced monitoring & alerting
- 🏗️ Infrastructure as Code (Terraform)
- 🔧 Performance optimization