EC2

Instance
  • on launch:
    • can enable termination protection
    • can enable detailed monitoring (1-minute)
    • can join to a directory (Windows instances only)
    • can enable Elastic GPU (Windows instances only)
    • can attach AmazonEc2RoleforSSM role for connection with Systems Manager (Session Manager)
  • charged by the hour
  • on Linux, charged by the second
  • traditional EC2 instance types provide fixed CPU resources
  • instance types:
    • General Purpose
    • Compute Optimized: high performance processors
    • Memory Optimized: large data sets
    • Accelerated Computing: calculations, graphics processing, data pattern matching
    • Storage Optimized: low latency, high random I/O performance
EC2 burst balances
  • provide baseline level of CPU utilization with ability to burst CPU utilization above
  • pay only for baseline CPU plus any additional burst
    • lower compute costs
  • burstable performance instances use credits for CPU usage
  • earn credits when below CPU baseline
    • no credits earned when at the CPU baseline
  • spend credits when higher than CPU baseline
  • accrued credits: earned credits
      • can be used later to burst above the baseline
  • when more credits are spent than earned:
    • standard mode: instance uses accrued credits
      • if no accrued credits, instance slow to baseline CPU
    • unlimited mode: instance uses accrued credits
      • if no accrued credits, instance spends surplus credits
      • credits earned later pay back the surplus (a sort of debt)
  • T2 Unlimited allows applications to burst past the CPU performance baseline
    • can configure on instance launch
    • free for 12 months for new accounts
  • chargeable
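  • a minimal boto3 sketch of watching the credit mechanics above via CloudWatch (the instance ID is a placeholder):

    from datetime import datetime, timedelta, timezone

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    now = datetime.now(timezone.utc)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUCreditBalance",          # accrued credits of a burstable instance
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=now - timedelta(hours=1),
        EndTime=now,
        Period=300,                             # 5-minute datapoints
        Statistics=["Average"],
    )
    for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], point["Average"])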
Change physical host
  • can stop and start EC2 instance to move it to a different physical host if EC2 status checks are failing or there is planned maintenance on current physical host
  • when stopped can modify
    • instance type
    • user data
    • kernel
    • RAM disk
IP address
  • public: changes when the instance is stopped and started
  • private: assigned automatically to the primary network interface (eth0) of all instances
  • Elastic IP: static public IP address (charged if unused)
    • max 5 per Region (can increase by contacting AWS Support)
    • region-specific
    • DNS records for Elastic IPs can be configured by filling a form
    • can associate single private IPv4 with single Elastic IP and vice versa
  • BYOIP: bring part or all publicly routable IPv4/IPv6 address range from on-premises network to AWS
    • not available in all Regions and for all resources
Bastion host
  • access VPC instances for management (SSH or RDP)
  • need security group with the relevant permissions
  • can restrict the CIDR ranges that can access the bastion host
  • autoscaling groups for HA
  • BEST PRACTICE: deploy Linux bastion hosts in two AZs
    • use auto-scaling and Elastic IP
  • can use Systems Manager Session Manager instead
Tag
  • can enforce standardized tagging via AWS Config or custom script
    • e.g. EC2 instances not properly tagged are stopped or terminated daily
  • up to 50 tags
Resource group
  • mapping of assets defined by tags
    • for metrics, alarms, config details, etc.
Instance store
  • provides temporary (non-persistent) block-level storage for instance
  • differs from EBS which provides persistent storage and is also a block storage service that can be root or additional volume
  • located on disks that are physically attached to the host computer
  • ideal for temporary storage of information that changes frequently
    • buffers, caches, scratch data, other temporary content
  • can specify instance store volumes for an instance only on launch
  • cannot move to another instance
  • instance type determines the size and the type of hardware of storage
  • some instance types use NVMe or SATA-based SSD to deliver high random I/O performance
  • good option when need storage with very low latency but don’t need data to persist when instance is terminated
  • good for distributed or replicated databases that need high I/O
  • included as part of instance’s usage cost
    • can be more cost-effective than EBS Provisioned IOPS
Monitor and logging
  • status checks: 1-minute checks that return pass or fail status
    • if all checks pass, status of instance is OK
    • otherwise is impaired
    • StatusCheckFailed_System: problems with instance that require AWS involvement
      • loss of network connectivity
      • loss of system power
      • software issues on physical host
      • hardware issues on physical host that impact network reachability
      • can recover, stop, terminate or reboot the instance
    • StatusCheckFailed_Instance: problems with instance that require user involvement
      • failed system status checks
      • incorrect networking or startup configuration
      • exhausted memory
      • corrupted file system
      • incompatible kernel
      • can stop, terminate or reboot the instance
  • unified CloudWatch Agent: see related section of CloudWatch
  • integrated with CloudTrail
    • can create a trail to enable continuous delivery of CloudTrail events to S3 bucket, including events for EC2 and EBS
    • a trail enables storing records indefinitely
    • if no trail is configured, can view the past 90 days of events in Event history
    • can use CloudTrail info to determine the request, the IP address, who made the request, when it was made, and additional details
  • detailed monitoring is chargeable
Regional Data Transfer
  • data between instances in different regions is charged
  • rates apply if:
    • the other instance is in a different AZ, regardless of the type of address used
    • public or Elastic IP addresses are used, regardless of the AZ of the other instance
On-demand instance
  • ideal for unpredictable workloads (dev/test)
    • max On-Demand instances running across instance family: 20
Spot instance
  • take advantage of the unused capacity in the cloud
  • up to a 90% discount
  • can use for stateless, fault-tolerant or flexible applications such as
    • big data
    • containerized workloads
    • CI/CD
    • web servers
    • high-performance computing
  • pricing is determined by long term trends in supply and demand for EC2 spare capacity:
    • don’t have to bid, just pay for the current hour
    • two-minute interruption notice when instances are about to be reclaimed by EC2 because EC2 needs capacity back
    • not interrupted because of higher competing bids
  • BEST PRACTICE: diversify to reduce impact of interruptions
    • can use RequestSpotFleet API to launch thousands of Spot Instances and diversify
    • can set up Spot Instances and Spot Fleets to respond to an interruption notice by stopping rather than terminating
  • each combination of instance family, instance size and AZ in every Region is a separate Spot pool
  • dynamic spot limit per Region
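  • a minimal boto3 sketch of launching a one-time Spot Instance through the regular RunInstances API (AMI ID and instance type are placeholders):

    import boto3

    ec2 = boto3.client("ec2")

    resp = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",       # placeholder AMI
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
        InstanceMarketOptions={
            "MarketType": "spot",
            "SpotOptions": {
                "SpotInstanceType": "one-time",
                # "stop" is also possible for persistent requests
                "InstanceInterruptionBehavior": "terminate",
            },
        },
    )
    print(resp["Instances"][0]["InstanceId"])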
Reserved instance (RI)
  • when a running instance matches an RI, the discount is applied automatically
  • Standard: commitment of 1 or 3 years, charged whether it’s on or off
    • 40-60% discount
    • can change AZ, instance size, networking type with the ModifyReservedInstances API
    • cannot change instance family, OS, tenancy, payment options
  • Scheduled: reserved for specific periods of time
    • charged hourly, billed in monthly increments over the term (1 year)
    • match capacity reservation to a predictable recurring schedule
  • Convertible: commitment of 1 or 3 years, charged whether it’s on or off
    • 31-54% discount
    • can change AZ, instance size, networking type with ExchangeReservedInstance API
    • can change instance family, OS, tenancy and payment options
  • discount applies to the selected AZ
  • if no AZ is specified no reservation is created but discount applies to any instance in the family in the Region (Regional RI)
  • max Reserved Instances purchase: 20
Dedicated host
  • physical servers dedicated to your use
  • available as On-Demand or Dedicated Host Reservation
  • useful for server-bound software licenses that use metrics like per-core, per-socket or per-VM
  • each dedicated host can only run 1 EC2 instance size and type
  • good for regulatory compliance or licensing requirements
  • complete isolation and predictable performance
  • most expensive option
  • billing is per host
Dedicated instance
  • virtualized instances on dedicated hardware
  • available as On-Demand, Reserved Instances or Spot Instances
  • also uses physically dedicated EC2 servers
  • does not provide the additional visibility and control of Dedicated Hosts
  • billing per instance
  • may share hardware with other instances from the same account that are not Dedicated Instances
  • costs an additional $2 per hour per Region

EBS

Persistence
  • persistent
  • can attach multiple volumes to an instance
  • a volume is attached to a single instance at a time for most use cases (Multi-Attach is limited to io1/io2)
  • replicated across multiple servers in an AZ
  • annual failure rate (AFR) of 0.1%-0.2%
  • can enable termination protection
  • root devices created under “/dev/sda1” or “/dev/xvda”
Volume
  • AZ specific
Volume type
  • SSD General Purpose (gp2/gp3)
    • balance price/performance
    • for boot volumes, low-latency interactive apps, dev and test
    • 1 GiB - 16 TiB
    • gp2
      • min 100 IOPS (at 33.33 GiB and below)
      • maximum of 16000 IOPS (at 5,334 GiB)
  • SSD Provisioned IOPS (io1/io2)
    • highest performance for latency-sensitive transactional workloads
    • for I/O intensive NoSQL and relational databases, boot volumes
    • 4 GiB - 16 TiB
    • IOPS to volume size (GiB) ratio of 50:1
      • e.g. 100 GiB, max 5,000 IOPS
      • e.g. 200 GiB, max 10,000 IOPS
    • max IOPS: 64,000
  • HDD Throughput Optimized (st1)
    • low cost for frequently accessed, throughput-intensive workloads
    • for big data, warehouses, log processing
    • 125 GiB - 16 TiB
    • max IOPS: 500
  • HDD Cold (sc1)
    • lowest cost for less frequently accessed
    • colder data
    • 125 GiB - 16 TiB
    • max IOPS: 250
Snapshot
  • can use to convert unencrypted volume to encrypted
  • no granular backup
  • saved incrementally
  • only accessed through EC2 APIs
  • Region-specific (volumes are AZ-specific)
  • can be taken of a non-root volume while running
  • a consistent snapshot needs writes to be paused
  • deletion process deletes only data not needed by other snapshots
  • can use to resize volume
  • can use to copy between Regions (create an AMI image)
  • create volume from snapshot, choosing AZ
  • charged for data traffic to S3 and S3 storage cost (only changed blocks)
Encryption
  • can encrypt both the boot and data volumes
    • data at rest inside the volume
    • all data moving between the volume and the instance
    • all snapshots created from the volume
    • all volumes created from those snapshots
  • same IOPS performance
  • uses AES-256 data key stored on disk after being encrypted with CMK
    • never appears on disk in plaintext
    • same data key shared by volumes created from snapshot
  • can check encryption status of volumes with AWS Config
    • no direct way to change state
  • cannot share encrypted volumes or snapshots created using the default CMK
  • can share snapshots with other accounts by keeping them private and selecting the accounts to share with
    • encrypted snapshots require a non-default CMK and cross-account permissions configured on the key
    • cannot make encrypted snapshots public
    • recommended that the receiving account re-encrypts the shared snapshot with its own CMK
RAID
  • can be used to increase IOPS
  • RAID 0 = striping: data written across multiple disks
    • increased performance
    • no added redundancy
  • RAID 1 = mirroring: 2 copies of the data
    • same performance
    • increased redundancy
  • RAID 10 = combination of RAID 1 and RAID 0
    • increased performance
    • increased redundancy
    • costs additional disks
  • can configure multiple striped gp2 or standard volumes (RAID 0)
  • can configure multiple striped PIOPS volumes (RAID 0)
  • configured through the guest OS
  • EBS-optimized EC2 instances also increase performance
  • not recommended for root/boot volumes
Monitoring and reporting
  • CloudWatch monitoring:
    • Basic: data available in 5-minute period
      • automatically sent by General Purpose (gp2), Throughput Optimized (st1) and Cold (sc1)
      • not charged
    • Detailed: automatically send 1-minute metrics
      • automatically sent by Provisioned IOPS SSD (io1)
      • charged
  • log to CloudTrail
  • Status checks:
    • ok:
      • enabled: I/O enabled
      • I/O performance status (only Provisioned IOPS volumes):
        • Normal
    • warning:
      • enabled: I/O enabled
      • disabled: volume is offline and pending recovery or waiting for user to enable I/O
      • I/O performance status (only Provisioned IOPS volumes)
        • Degraded: performance is below expectations
        • Severely Degraded: performance well below expectations
    • impaired:
      • enabled: I/O enabled
      • disabled: volume is offline and pending recovery or waiting for user to enable I/O
      • I/O performance status (only Provisioned IOPS volumes):
        • Stalled: performance severely impacted
        • Not Available: unable to determine I/O performance
    • insufficient-data

Elastic Load Balancer

Automatically distributes incoming traffic across multiple targets.

Security
  • ELB node within a subnet
    • ensure at least a /27 subnet with at least 8 free IP addresses
  • for ALB at least 2 subnets, for NLB at least 1 (2 recommended)
  • uses a DNS record with a 60-second TTL
  • automatically distributes traffic
    • supports only valid requests so DDoS attacks (UDP and SYN floods) are not able to reach EC2 instances
  • can add a WAF (AWS Web Application Firewall) to the ALB
Classic Load Balancer
  • (Layer 4/7): legacy, no longer recommended
  • protocols: all
Application Load Balancer
  • (Layer 7): routes based on content of requests
  • target: IP, instance, Lambda, containers
  • protocols: HTTP, HTTPS
  • support SSL certificates through AWS Certificate Manager
  • support SNI (Server Name Indication)
    • allows multiple websites to use a single secure listener
  • automatically scales its request handling capacity
  • host-based routing and path-based routing (like nginx)
  • microservices: load balancing to multiple ports on a single EC2 instance
  • integration with Cognito (user pools, SAML IdPs) for authentication, or with an OIDC IdP
  • health checks
  • sticky sessions (enabled at target group level): cookies to ensure client is bound to an individual back-end instance
    • duration-based cookies (AWSALB)
    • application-based cookies: custom cookie name per target group; WebSocket connections are sticky to follow the upgrade process
  • ECS: can use a single ALB; integrates with ECS containers using service load balancing
Network Load Balancer
  • (Layer 4): routes based on IP data
  • target: IP, instance, ALB
  • protocols: TCP, UDP, TLS
  • handle millions of requests per second, low latency
  • supports long-running/lived connections (ideal WebSocket)
Gateway Load Balancer
  • (Layer 3/4): firewalls, IDS/IPS systems
  • target: IP, instance
  • protocols: IP
Listener
  • define protocol and port to listen on
  • cannot have multiple listeners on the same port
  • up to 10
Listener rule
  • priority
  • one or more actions (one target group per action)
  • optional host/path condition
    • different target groups based on the content of the request
  • default rule directs to default target group
  • rules are evaluated in priority order; requests are forwarded to targets using round robin (see the sketch below)
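  • a minimal boto3 sketch of a path-based listener rule (listener and target group ARNs are placeholders):

    import boto3

    elbv2 = boto3.client("elbv2")

    elbv2.create_rule(
        ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/my-alb/...",
        Priority=10,                                   # lower number = evaluated first
        Conditions=[{"Field": "path-pattern", "Values": ["/api/*"]}],
        Actions=[{
            "Type": "forward",
            "TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/api/...",
        }],
    )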
Target group
  • logical grouping of targets
  • each listener contains a default rule, then can contain other rules that route requests to different target groups
  • an Auto Scaling Group can scale each target group individually
X-forwarded headers
  • the LB intercepts traffic between clients and servers, so servers would otherwise see only the IP address and protocol of the load balancer
  • non-standard HTTP headers with “X-Forwarded” prefix
  • X-Forwarded-For: help identify IP address of a client when using LB
    • use the attribute routing.http.xff_header_processing.mode to append, preserve or remove the X-Forwarded-For header
    • append mode (default): appends the client IP address to the header
    • preserve mode: keeps the header unmodified before sending it to the target
    • remove mode: removes the header before sending to the target
  • X-Forwarded-Proto: help identify protocol that client used (HTTP/S)
    • can be used by application to render response redirecting to appropriate URL
  • X-Forwarded-Port: help identify destination port client uses to connect
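  • a minimal sketch of recovering the client IP from X-Forwarded-For behind a load balancer in append mode (assuming a plain dict of request headers):

    def client_ip(headers):
        """Return the left-most X-Forwarded-For entry (the original client)."""
        xff = headers.get("X-Forwarded-For")
        if not xff:
            return None
        # header format in append mode: "client, proxy1, proxy2"
        return xff.split(",")[0].strip()

    print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.12"}))  # -> 203.0.113.7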
Monitoring
  • CloudWatch (1min, only when requests active)
    • can trigger SNS notification
  • Access Logs: requester info, time of the request, client’s IP, request paths, etc.
    • default disabled
    • storable in S3
  • CloudTrail (capture API calls, storable in S3)

Auto Scaling

Adjust capacity to maintain steady, predictable performance at lowest cost.

Provides horizontal scaling (scale-out).

Auto Scaling Group
  • can merge multiple ASGs (e.g. single-AZ groups into one multi-AZ group)
  • no additional cost, pay for the resources
Launch Configuration
  • template used to create new EC2 instances
Launch Template
  • same information of Launch Configuration
  • multi-versions of a template
Integration with ELB
  • can attach one or more ELB
  • can attach one or more Target groups
Lifecycle hooks
  • the ASG pauses the instance in a wait state to allow custom actions to be performed
    • waits until the lifecycle action is completed (or times out)
  • can configure to send SNS notification when instance is launched/terminated/fails to launch/terminate
Security and HA
  • HA is when instances are launched in at least 2 AZ
  • cannot provide HA across multiple Regions
  • support IAM policies: uses service-linked roles
    • default is AWSServiceRoleForAutoScaling
  • no support resource-based policies and ACLs
Cooldown period
  • setting that ensures Auto Scaling does not launch or terminate instances before the previous scaling activity takes effect
Warm-up period
  • period during which a new EC2 instance (with a Step Scaling Policy) is not counted toward the group's metrics
Scaling options
  • Maintain: keep specific or minimum number of instances
  • Manual: maximum, minimum or specific number of instances
  • Scheduled: increase, decrease based on schedule
  • Dynamic: scale based on real-time metrics
  • Predictive: anticipates approaching traffic changes
    • predicts future traffic including regularly occurring spikes using machine learning algorithms
    • Target Tracking Policy: scales to keep metric at specific target value
      • e.g. want to keep CPU usage at 70%
    • Step Scaling Policy: scales based on set of scaling adjustments aka step adjustments
      • e.g. vary adjustments based on size of the alarm breach
  • Scaling based on SQS: custom CloudWatch metric that measures the number of messages in the queue per EC2 instance in the ASG (backlog per instance)
    • also tracks the number of messages available for retrieval
    • can base adjustments on the SQS metric ApproximateNumberOfMessagesVisible
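  • a minimal boto3 sketch of the Target Tracking Policy example above (keep average CPU at 70%; the ASG name is a placeholder):

    import boto3

    autoscaling = boto3.client("autoscaling")

    autoscaling.put_scaling_policy(
        AutoScalingGroupName="my-asg",
        PolicyName="keep-cpu-at-70",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": 70.0,
        },
    )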
Termination policies
  • control instances that are terminated first when scale-in event occurs
  • default ensures EC2 instances span AZ evenly for HA
  • can enable Instance Protection to prevent Auto Scaling from scaling in and terminating EC2 instances
  • health checks grace period of 300 seconds
    • in this period can use CLI set-instance-health to change instance to “healthy”
  • can use Console or CLI to manually detach instances from AS group
  • can suspend/resume any scaling process to investigate issues or make changes without scaling
  • instances can be put in standby (still managed by AS and charged)
    • can be used for performing updates/changes/troubleshooting without health checks invoking replacement
Monitoring, reporting and logging
  • examples of metrics sent every 1 min
    • GroupInServiceInstances
    • GroupStandbyInstances
    • GroupTerminatingInstances: used to wait volume attachment in ACT migration
    • GroupTotalInstances
  • by default AS uses EC2 status checks
  • support ELB health checks
    • instance is unhealthy if ELB reports OutOfService
    • BEST PRACTICE: enable ELB health checks, otherwise, in case of unhealthy ELB, an instance will be removed from service by ELB but not terminated by AS
  • CloudTrail captures API calls for Auto Scaling as events
    • if no trail is configured, can still view the most recent (up to 90 days) events in the Event History
  • no additional charges for default CloudWatch metrics

ECS

Highly scalable, high performance container management service that supports Docker.

Security
  • EC2 instances use IAM role to access ECS
  • can use IAM to control access at the container level
  • the container agent calls the ECS API through the applied IAM role
    • the role must be applied before launch
  • need to assign extra permissions to tasks through separate IAM roles (IAM Roles for Tasks)
    • used to access services and resources
  • Compute SLA: guarantees a Monthly Uptime Percentage of at least 99.99%
X-Ray
  • as a daemon: X-Ray agent runs in a daemon container
  • as a “sidecar”: X-Ray agent runs alongside each container
    • the only option on Fargate
  • task definition:
    • X-Ray runs on port 2000 UDP
    • must specify the daemon address
    • must link containers together
  • docker image can be deployed alongside application:
    • docker pull amazon/aws-xray-daemon
Elastic Beanstalk
  • Single Container Docker: described in Dockerfile or Dockerrun.aws.json definition
  • Multicontainer Docker: same set of containers in each environment defined in a Dockerrun.aws.json
    • use when multiple Docker containers are needed on each instance
Elastic Container Registry
  • through IAM
Application Load Balancer
  • ALB supports target group that contains a set of instance ports
  • can specify a dynamic port in task definition which gives the container an unused port when it is scheduled on the EC2 instance
Cluster
  • Container instance
  • Task
Service
  • Service scheduler: schedules ECS tasks
    • ensures the specified number of tasks is constantly running and reschedules them when they fail
    • can ensure tasks are registered against an ELB
  • Custom scheduler: schedules ECS tasks
    • custom scheduler to meet business needs
    • leverages third party schedulers (Blox)
    • leverages same cluster state information provided by ECS API to make appropriate placement decisions
  • Task
Task definition
  • bind mounts: temporary storage
  • docker volumes: created in /var/lib/docker/volumes in EC2 instance
ECS container agent
  • allows container instances to connect to the cluster
  • runs on each infrastructure resource
  • included in ECS optimized AMI (can be installed)
  • manually install on non-AWS Linux instances
Launch types
  • Fargate Launch Type:
    • serverless managed by AWS
    • only supports images hosted on ECR or Docker Hub
    • no support for docker volumes
    • recently added support for EFS and EBS
    • charged for running tasks
  • EC2 Launch Type:
    • EC2 instances
    • only supports private repositories
    • EFS and EBS integration
    • charged for running EC2 instance
Auto Scaling
  • increase/decrease task count automatically
  • leverage Application Auto Scaling service
  • can use ECS CloudWatch metrics to deal with high demand peaks or low utilization
    • average CPU and memory usage
  • Target tracking scaling policies
    • based on a target value for a specific metric
    • like a thermostat
    • scaling is not performed when there is insufficient data
    • Application Auto Scaling rounds actual metric data points
    • fast scale out, normal scale in
    • during a deployment, scale out continues but scale in is suspended
    • ALBRequestCountPerTarget metric for target tracking scaling policies not supported for the blue/green deployment type
  • Step scaling policies
    • based on step adjustments based on size of alarm breach
    • can scale out when CPU utilization reaches a certain level, create an alarm using the CPUUtilization metric provided
  • Scheduled scaling
    • based on date and time
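  • a minimal boto3 sketch of target tracking on an ECS service via Application Auto Scaling (cluster and service names are placeholders):

    import boto3

    aas = boto3.client("application-autoscaling")

    aas.register_scalable_target(
        ServiceNamespace="ecs",
        ResourceId="service/my-cluster/my-service",
        ScalableDimension="ecs:service:DesiredCount",
        MinCapacity=1,
        MaxCapacity=10,
    )

    aas.put_scaling_policy(
        PolicyName="ecs-cpu-target-tracking",
        ServiceNamespace="ecs",
        ResourceId="service/my-cluster/my-service",
        ScalableDimension="ecs:service:DesiredCount",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": 70.0,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization",
            },
        },
    )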
Cluster Auto Scaling
  • Capacity Provider: can be associated with an EC2 Auto Scaling Group
    • managed scaling: automatically creates a scaling policy on the ASG based on a new CapacityProviderReservation metric
    • managed instance termination protection: container-aware termination of instances in the ASG on scale-in (instances running tasks are protected)

EKS

Managed service for running Kubernetes on AWS and on-premises.

  • Can run on EC2 or Fargate
  • integrates with ALB, IAM for RBAC, and VPC
  • provides a scalable and highly available Kubernetes control plane running across multiple AZs
  • automatically manages availability, scalability of Kubernetes API servers and etcd persistence layer
  • 3 AZs to ensure high availability (automatically detects and replaces unhealthy control plane nodes)
  • use when organization needs a consistent control plane for managing containers across hybrid clouds and multi cloud environments

Lambda

Elastic Beanstalk

Quickly deploy and manage applications in AWS Cloud. PaaS (Platform as a Service). Relies on CloudFormation.

EB CLI
  • can use from local repository
Application
  • collection of environments
  • environment configurations
  • application versions
Application version
  • refers to a specific, labeled iteration of deployable code
  • points to an S3 bucket containing the code (source bundle)
    • the bucket can be protected to prevent data loss
  • can apply to any environment
  • retention by lifecycle policies:
    • time-based: max age
    • count-based: max number to retain
  • cannot be deleted while in use
Environment
  • provisioning of all resources deployed by a version
  • can be dev, prod, uat, …
Environment tiers
  • web servers
    • standard app
    • HTTP requests over port 80
  • workers
    • specialized apps
    • background processing tasks
    • long-running tasks: listen for messages on an SQS queue
      • can decouple application tiers (more environments for more workers)
      • can define periodic tasks (cron.yaml)
Environment configurations
  • collection of parameters and settings of env
  • configuration template
Deployment options
  • Single instance: for dev
  • High availability with load balancer: for prod
Deployment policies
  • All at once: deploy all instances simultaneously
    • fastest (good for quick iterations)
    • outage (not ideal for critical systems)
    • on failure, need to manually redeploy the original version
    • no additional cost
  • Rolling: update few instances at time (batch)
    • each batch is out of service while it is being updated
    • running both versions simultaneously
    • environment capacity reduced by the batch size (number of instances)
    • long deployment time
    • reduced performance (not ideal for performance-sensitive systems)
    • on failure, need an additional rolling update to roll back
    • no additional cost
  • Rolling with additional batch: update new instances
    • full availability
    • running both versions simultaneously
    • additional batch removed in the end
    • set batch size according to the desired speed of deployment
    • good for production
    • small additional cost
  • Immutable: update new instances in new ASG swapping traffic once healthy
    • zero downtime
    • longest deployment
    • great for production
    • on failure quick rollback
    • high cost (double instances)
  • support Blue/green deployment: create new “staging” environment (green)
    • green env validated independently (can rollback)
    • Route 53 can use weighted policies to redirect % of traffic to green
    • “swap URLs” when done
    • zero downtime
    • longest deployment
    • on failure: swap URLs back
    • extra cost
.ebextensions
  • folder containing config files (YAML or JSON ending with .config)
  • .ebextensions/<filename>.config
  • in the application source code root
  • allows:
    • package to install
    • Linux user and groups
    • shell commands
    • services (RDS, ElastiCache, DynamoDB, etc)
    • load balancer
  • can modify default settings (option_settings)
  • resources added by it are deleted if the environment is terminated
RDS
  • only for development environments
  • data is lost when the environment is terminated
  • for production create RDS database outside Elastic Beanstalk
  • migration from environment to standalone
    • snapshot db
    • enable deletion protection on db
    • create new environment without RDS
    • make the application point to the existing db
    • perform blue/green deployment
    • swap environments
    • terminate old environment (db won’t be deleted)
  • connection env properties: RDS_HOSTNAME, RDS_PORT, etc.
SSL certificate
  • provisioned using ACM or CLI
    • onto load balancer from Console
    • code (.ebextensions/securelistener-alb.config)
  • redirecting HTTP to HTTPS
    • by application (e.g. nginx)
    • by ALB rule
    • ensure health checks are not redirected

S3

Object storage; durable, highly available, infinitely scalable data storage, low cost.

Redundancy, HA, scalability
  • redundantly stored across multiple AZs
  • automatically scales to high request rates (can offload using CloudFront edge locations)
Limits
  • can only store files
  • alternatively, can use S3 Object Tagging across bucket and/or prefixes
  • max file size: 5TB
Eventual consistency
  • eventual consistency for overwrite PUTs and DELETEs
  • atomic updates: either get the new object or the old one (when reading updated object), never get partial or corrupt data
REST web services interface
Websites
  • hosting static websites
  • returns an HTML document
  • cannot serve server-side dynamic content such as PHP
  • automatically scales
  • can use custom domain with a Route 53 alias
  • bucket name must be the same as the domain
  • can enable redirection (object/bucket level)
  • URL: <bucketname>.s3-website.amazonaws.com
  • doesn’t support HTTPS/SSL
  • supports only the GET and HEAD S3 APIs
  • needs an index document (default web page) and optionally an error document
  • access control is only public
User policies
  • IAM policies (programmatic access or Console)
  • granting permissions for all S3 operations
  • managing permission for users in account
  • granting object permissions to user within the account
Buckets
  • flat, unlimited, region-specific container of objects
  • doesn’t provide hierarchy of objects, object key name (prefix) mimics folders
  • ownership not transferable, the account owner is the owner rather than the IAM user
  • names cannot be changed, they are part of the URL
  • can backup into another bucket in another account
  • all private by default
  • bucket sub-resources: configuration containers associated with a bucket
    • lifecycle rules
    • Versioning
    • ACLs
    • Bucket policies
    • CORS: allow requests to a different origin using Access-Control-Allow-Origin header
Bucket lifecycle rules
  • set of rules that define actions applied to a group of objects
  • Transition actions: when objects transition to another storage class
    • billing changes do not occur until the object has actually transitioned (if there is a delay)
  • Expiration actions: when objects expire
    • costs depend on when objects expire
    • won't be charged for storage after the expiration time (even if deletion is delayed)
  • use examples:
    • expiration of logs
    • documents frequently accessed only for a certain period of time
    • upload some data for archival purposes, retained for regulatory compliance
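  • a minimal boto3 sketch of the log-expiration example above (bucket name and prefix are placeholders): transition to Glacier after 30 days, expire after 365 days

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_lifecycle_configuration(
        Bucket="my-bucket",
        LifecycleConfiguration={
            "Rules": [{
                "ID": "archive-then-expire-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }],
        },
    )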
Versioning
  • stores versions of an object protecting against accidental deletion or overwrites
  • on delete, a delete marker is placed on the object
    • can delete the marker to make the object available again
  • enables roll-back and undelete
  • can be used for archive
  • old versions are billable
  • objects existing before versioning is enabled are not versioned retroactively (they have a NULL version ID)
  • can integrate with lifecycle rules
  • can enable MFA for delete and changing versioning settings
  • reverting is not replicated
  • when enabled, only owner can permanently delete objects
  • cannot be disabled, only suspended
Bucket ACL (Access Control Lists)
  • pre-defined groups:
    • authenticated user group
    • all users group
    • Log Delivery Group: enable S3 to write server access logs (need WRITE permissions)
  • permissions:
    • READ: list the objects in the bucket
    • WRITE: create, overwrite, delete any object in the bucket (recommended only for S3 Log Delivery Group)
    • READ_ACP: read ACL
    • WRITE_ACP: write ACL
    • FULL_CONTROL: all above permissions
Bucket policies
  • use AWS Policy Generator to create resource-based access policies
  • can grant permission to the bucket and objects:
    • individual users
    • AWS accounts
    • everyone (public/anonymous)
    • all authenticated AWS users
  • allow/deny based on elements:
    • requester
    • S3 actions
    • resources
    • aspects or conditions of the request (e.g. IP address, headers)
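  • a minimal boto3 sketch of a bucket policy combining requester, action, resource and condition (bucket name and CIDR are placeholders):

    import json

    import boto3

    s3 = boto3.client("s3")

    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicReadFromOffice",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/public/*",
            # condition: only requests coming from this IP range
            "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
        }],
    }

    s3.put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(policy))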
Objects
  • unique key (ID or name)
  • identified through service endpoint, bucket name, key and (optionally) version
  • metadata
  • tags: can use for IAM policies, lifecycle policies and customized metrics
  • object sub-resources: configuration containers associated with an object
    • ACLs
    • restore: restoring an archive
Object ACL (Access Control Lists)
  • by default grants resource owner full control
  • pre-defined groups:
    • authenticated user group: all AWS accounts
      • all requests must be signed
      • any authenticated user
    • all users group: anyone in the world
      • request can be unsigned (anonymous)
      • BEST PRACTICE: never grant WRITE or FULL_CONTROL
  • permissions:
    • READ: read object data and metadata
    • READ_ACP: read ACL
    • WRITE_ACP: write ACL
    • FULL_CONTROL: all above permissions
Storage classes
  • S3 Standard: durable, immediately available, frequently accessed
  • S3 Intelligent-Tiering: moves data to the most cost-effective tier
  • S3 Standard-IA: durable, immediately available, infrequently accessed
  • S3 One Zone-IA: lower cost, infrequently accessed data with less resilience; single AZ
  • S3 Glacier: archiving storage for infrequently accessed data
Glacier
  • must complete a retrieval job before getting output
  • cannot be set as storage class at creation of object
  • encrypts data at rest using AES256 symmetric keys
  • supports transfer over SSL
  • multiple AZs (resilient to one AZ destruction)
  • PUT API for direct upload, cannot use Console
  • S3 lifecycle management for automatic migration
  • object metadata is not archived
  • up to 40TB
  • can use multipart upload
  • synchronous upload and asynchronous download
  • content not editable, data not available for real time access
  • if there are lots of small objects, aggregating them into an archive is preferred
    • archives support a description but no metadata
  • each upload has a unique archive ID
  • retrieval:
    • Expedited: 1-5 minutes, expensive
    • Standard: 3-5 hours, cheaper, 10GB free per month
    • Bulk: 5-12 hours, cheapest, use for large quantity of data
  • can retrieve part of archive
  • can send SNS notification when retrieval job completed
  • can retrieve specific objects in archive specifying byte range in HTTP GET (need to maintain a DB of byte ranges)
  • no charges for data transfer between EC2 and Glacier in the same Region; charged if data is deleted within 90 days
  • when restoring, pay for the Glacier archive, the requests and the restored data in S3
  • storage tiers:
    • Instant Retrieval:
      • retrieval in milliseconds
      • same performance as S3 standard
    • Flexible Retrieval, ideal for backup, disaster recovery with large sets of data:
      • retrieval in minutes to hours (free bulk retrievals)
    • Deep Archive, lowest cost for long-term retention (7-10 years)
      • ideal alternative to magnetic tape libraries
      • retrieval within 12 hours
Server-side encryption
  • protects data at rest encrypting objects with a unique key
  • encrypt the key itself with regularly-rotated master key using AES-256
  • need a bucket policy to enforce encryption of all objects in a bucket
  • when uploading/downloading need kms:Decrypt along with kms:ReEncrypt, kms:GenerateDataKey and kms:DescribeKey
  • encryption options:
    • SSE-S3: S3 managed keys
      • header x-amz-server-side-encryption=AES256
    • SSE-KMS: CMKs (Customer Master Keys) stored in KMS
      • additional benefits and charges
      • provides audit trails on who/when CMK was used
      • can use CMK or can select own key
        • an envelope key protects custom keys
      • header x-amz-server-side-encryption=aws:kms
    • SSE-C: client provided keys
      • the client manages the keys; S3 manages encryption/decryption
      • keys are not stored by AWS
      • if key lost, data cannot be decrypted
      • provide key with following request headers:
        • x-amz-server-side-encryption-customer-algorithm: must be AES256
        • x-amz-server-side-encryption-customer-key: 256-bit, base64-encoded key
        • x-amz-server-side-encryption-customer-key-MD5: digest of the encryption key (RFC1321) used for message integrity check
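  • a minimal boto3 sketch of an SSE-KMS upload; S3 adds the x-amz-server-side-encryption=aws:kms header on your behalf (bucket, key and KMS key alias are placeholders):

    import boto3

    s3 = boto3.client("s3")

    s3.put_object(
        Bucket="my-bucket",
        Key="reports/2024.csv",
        Body=b"col1,col2\n1,2\n",
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId="alias/my-app-key",   # omit to use the default aws/s3 key
    )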
Client-side encryption
  • encrypt before upload using local encryption process
  • using CMK:
    • send a request to KMS, which returns a plaintext version of the data key (used to encrypt the object) and a cipher blob of the data key to upload as S3 object metadata
    • when downloading, the encrypted object comes with the cipher blob of the data key
    • the client then sends the cipher blob to KMS to get the plaintext data key and decrypt the object
  • master key stored in application:
    • S3 encryption client generates a data key locally used to encrypt object (one for each object)
    • the client encrypts the data key with the provided master key and uploads it as metadata
    • the client uploads the encrypted data with the encrypted data key as metadata (x-amz-meta-x-amz-key)
    • when downloading, the material description metadata tells which master key to use to decrypt the downloaded data key, which in turn decrypts the object
Event notifications
  • in response to PUTs, POSTs, COPYs or DELETEs enables to run workflow, send alerts, perform actions
    • publishes notifications on new object creation, object removal, object restore, RRS (Reduced Redundancy Storage) object loss, and replication events
    • sends event notification messages to destinations: SNS, SQS, Lambda function invocation; needs the related permissions
CRR Cross Region Replication
  • automatically 1:1 replicate data across Regions
  • asynchronous copying of objects
  • configured at the bucket level; can define a destination bucket in a different Region (can use a different storage class)
  • versioning must be enabled on both source and destination buckets
  • can configure separate lifecycle rules
  • can replicate KMS-encrypted objects by providing a destination KMS key
  • can be cross account
  • can replicate all objects or a subset selected by key name prefix
  • data in transit is encrypted with SSL
  • the needed permissions to replicate must be granted
  • charged for both inter-Region transfer and upload requests
  • replication is triggered by object upload/delete/change (including metadata or ACL changes)
  • by default only objects encrypted with SSE-S3 are replicated
  • not replicated:
    • object existing before enabling replication
    • objects on which bucket owner has no permissions
    • bucket-level subresources
    • objects that are replicated from another source
    • objects encrypted with SSE-C or SSE-KMS
  • deleting by object version ID deletes object, delete marker is not replicated
SRR Same Region Replication
  • destination bucket within the same Region
    • can be different storage class
  • automatic and asynchronous
  • replicated object can be owned by different accounts
  • ACL and tags are replicated
Multipart Upload
  • for files larger than 100MB; uploads object in parts independently, in parallel or any order
  • can use for objects from 5MB to 5TB
  • mandatory for objects larger than 5GB
  • improves throughput
  • can be paused/resumed
  • can use Transfer Acceleration
Requester pays function
  • the requester pays (no anonymous access)
  • doesn't work with static website hosting
Pre-signed URLs
  • provide temporary access to specific object to those who don’t have AWS credentials
  • must configure expiration date and time
  • usable both for download/upload
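  • a minimal boto3 sketch of generating a pre-signed GET URL (bucket and key are placeholders):

    import boto3

    s3 = boto3.client("s3")

    url = s3.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": "my-bucket", "Key": "private/report.pdf"},
        ExpiresIn=3600,   # seconds until the URL expires
    )
    print(url)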
Copy
  • copies objects up to 5GB in a single atomic operation
  • objects are immutable, so copy is used to:
    • generate copies
    • rename objects
    • change the copy's storage class or at-rest encryption
    • move objects across locations/Regions
    • change object metadata
Transfer Acceleration
  • fast, easy and secure transfers of files over long distances between client and S3 bucket
  • leverages CloudFront’s globally distributed Edge Locations
    • charged only when there is a speed benefit
    • URL: <bucketname>.s3-accelerate.amazonaws.com
CloudWatch metrics
  • storage metrics are enabled by default and reported once per day
  • can enable 1-minute CloudWatch request metrics
  • can call S3 PUT Bucket Metrics API to enable and configure S3 storage metrics
  • can monitor:
    • S3 requests
    • bucket storage
    • bucket size
    • all requests
    • HTTP 4xx/5xx errors
S3 Server Access Logging
  • records actions taken by users, roles and services, providing log records for auditing and compliance
  • provides detailed records of requests
  • bucket being logged must not be the destination of logs (logging loop)
  • can use in combination with CloudTrail
  • CloudTrail is recommended for logging bucket and object-level actions

VPC

Logically isolated section of the AWS Cloud where resources are launched in a virtual network.

  • Region-wide; each Region has a default VPC (up to 5 VPCs per Region by default)
Dedicated tenancy
  • ensure dedicated hardware for instances in VPC
IP address
  • IP address space from master address range
  • CIDR block between /28 and /16 netmask
  • the first 4 and the last IP address of each subnet are reserved
Subnet
  • segment of VPC’s addresses range (cannot overlap each other), can place groups of isolated resources
  • map 1:1 with AZ, cannot span zones
  • public: traffic is routed to an Internet Gateway
    • “Auto-assign public IPv4 addresses” is true
    • route table must have a route for 0.0.0.0/0 to the Internet Gateway
  • private: no connection to internet
  • VPN-only: traffic routed to a private gateway for VPN connection
Internet Gateway
  • horizontally scaled, redundant, highly available VPC side of a connection to internet, attach to VPC
  • a VPC can have at most 1 IGW
NAT Gateway
  • highly available, managed by AWS, NAT to make resources in private subnet access internet
    • must live on a public subnet
    • uses Elastic IP for public IP
    • multi-AZ redundancy (HA)
    • up to 5 Gbps bandwidth that can scale up to 45 Gbps
    • cannot use for VPC peering, VPN, Direct Connect (they need specific routes)
    • not associated with any security group
    • no need to disable source/destination checks
    • no port forwarding
    • cannot use as bastion host
    • no metrics
NAT Instance
  • managed by user
    • must live on a single public subnet
    • must disable the source/destination check on the instance
    • need to be assigned to security groups
    • amount of traffic supported based on instance type
    • can lead to a bottleneck (not HA), but can be made HA with an ASG, multiple subnets in different AZs and a script to automate failover
    • scale up (instance type or enhanced networking)
    • scale out by using multiple NATs in multiple subnets
    • can use as bastion (jump) host
    • monitor traffic metrics
    • not supported for IPv6
Hardware VPN Connection
  • hardware-based VPN connection between VPC and corporate data center (site-to-site)
    • or home network, co-location facility etc.
VPN Connection
  • Virtual Private Gateway: VPC side of the VPN connection
  • Customer Gateway: your side of VPN connection
Router
  • directs traffic between gateways, subnets, AZs etc.
  • can redirect to external destinations
  • each subnet can have only one route table
    • one route table can be assigned to many subnets
  • the default rule allows VPC subnets to communicate with one another
Peering Connection
  • route traffic via private IP addresses between two peered VPC
VPC Endpoints
  • private connectivity to services
    • without gateway, NAT, VPN, firewall etc.
Egress-only Internet Gateway
  • stateful gateway that provides egress-only access for IPv6 traffic from the VPC to the internet
    • prevent inbound access to IPv6 instances
    • must create a custom route for ::/0
    • to be preferred to NAT Gateway for IPv6
Security groups
  • act like a firewall at instance level (network interface level)
  • second line of defense
  • implicit deny at the end; supports allow rules only
  • stateful
  • all outbound traffic allowed by default
  • cannot delete default SG of a VPC
  • a security group can be referenced as the source in rules of other security groups and in its own inbound rules
  • members of SG can be in any AZ or subnet of the VPC
  • changes take effect immediately
  • cannot block specific IP addresses; use NACLs for that (an allow rule is sketched below)
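  • a minimal boto3 sketch of adding an inbound allow rule restricted to a CIDR (group ID and CIDR are placeholders):

    import boto3

    ec2 = boto3.client("ec2")

    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 22,
            "ToPort": 22,
            "IpRanges": [{"CidrIp": "203.0.113.0/24", "Description": "office SSH"}],
        }],
    )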
Network ACLs
  • act like firewall at subnet level (subnets must be associated with a NACL)
  • first line of defense
  • supports allow and deny rules, evaluated in order from the lowest rule number; ends with a catch-all deny
  • stateless, responses are subject to rules for direction of traffic
  • not apply to traffic within subnet (do not filter traffic between instances in same subnet)
  • custom NACL deny all traffic by default
  • can block specific IP addresses or ranges
  • software firewall installed on instances is recommended
VPC Flow Logs
  • diagnosing overly restrictive security group rules
  • monitoring traffic that is reaching instance
  • determining direction of traffic to and from the network interfaces
S3 Object Ownership
  • bucket-level setting to control ownership of objects uploaded to bucket and enable/disable ACL
  • when ACL disabled bucket owner owns all objects in the bucket
    • manage access exclusively using access management policies
  • BEST PRACTICE: keep ACLs disabled except in unusual circumstances where access must be controlled per object; with ACLs disabled, policies control access to every object in the bucket regardless of who uploaded it
  • Object Ownership has three settings:
    • Bucket owner enforced (default, ACLs disabled): the bucket owner automatically owns and has full control over every object; ACLs no longer affect permissions; the bucket uses policies to define access control
    • Bucket owner preferred (ACLs enabled): the bucket owner owns and has full control over new objects that other accounts write with the bucket-owner-full-control canned ACL
    • Object writer (ACLs enabled): the account that uploads an object owns it, has full control over it, and can grant other users access through ACLs

CloudFront

Web service that distributes content with low latency and high data transfer speeds; distribution of frequently accessed static content (popular images, videos, media files, software downloads).

  • can also be used for dynamic, streaming and interactive content
  • ingress to upload, egress to distribute
PCI DSS
  • PCI DSS compliant
  • BEST PRACTICE: not to cache credit card information at edge locations
HA
  • HIPAA compliant
  • DDoS protection (distributes traffic across multiple locations)
Support
  • support Perfect Forward Secrecy
    • new private key for each SSL session
  • support wildcard SSL certificates
  • support Dedicated IP
  • support custom SSL and SNI Custom SSL (cheaper)
Edge Location
  • where content is cached
  • requests are automatically routed to the nearest edge location
  • not tied to AZs or Regions
  • PUT/POST/PATCH/OPTIONS/DELETE proxy methods and dynamic content go directly to web origin from edge location (not passing for Regional Edge caches)
  • can write content
  • can upload file
Regional Edge Caches
  • between origin web servers and global edge locations
  • have larger cache-width than any individual edge location
  • longer duration of caching
  • aim to get closer to user
Origins
  • origin of the files that CDN will distribute
    • S3 bucket
      • static website
    • EC2 instance
      • use an AMI that installs the web server software
      • use an ELB in front and specify its URL as the domain name of the origin server
    • ELB
    • Route 53
    • external non-AWS (must specify DNS name and ports)
Distributions
  • CDN configuration:
    • content origins
    • access (public/restricted)
    • security (HTTP/HTTPS)
    • cookie or query-string forwarding
    • geo-restrictions (restrict file access at country level, can use 3rd party geo-location service for finer granularity)
    • access logs
    • must be disabled before delete
  • Web Distribution:
    • static and dynamic content (html, css, php, graphics)
    • distributed over HTTP/HTTPS
    • add/update/delete objects
    • submit forms
    • live streaming (real time event)
  • RTMP: streaming media files using Adobe Flash Media Server's RTMP protocol
    • allows consuming the media file before the download has finished
    • file must be stored in S3
  • need both distributions to serve real time streaming with media player
Expiration
  • objects are cached for 24 hours by default
  • expiration time controlled through TTL
High Availability with Origin Failover
  • uses an origin group, when primary origin fails automatically switches to second origin
  • works with Lambda@Edge functions
Encryption
  • in-transit encryption with ACM
    • ensure request/import certificate in Region us-east-1
Signed URLs
  • additional information (e.g. expiration date)
  • key pair
    • the private key signs a portion of the URL
    • the public key is held by CloudFront
  • use when:
    • restrict access to individual files
    • client doesn’t support cookies
  • need a signer
    • can be a trusted key group in CloudFront (recommended)
    • can be an account that contains key pair
      • only root can generate CloudFront key pair
      • when using root can only have up to 2 active key pairs per account
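  • a minimal sketch of generating a signed URL with botocore's CloudFrontSigner (key ID, key path and distribution domain are placeholders; assumes the cryptography package is installed):

    from datetime import datetime, timedelta, timezone

    from botocore.signers import CloudFrontSigner
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding

    def rsa_signer(message):
        # load the private key of the CloudFront key pair / trusted key group
        with open("private_key.pem", "rb") as f:
            key = serialization.load_pem_private_key(f.read(), password=None)
        return key.sign(message, padding.PKCS1v15(), hashes.SHA1())

    signer = CloudFrontSigner("KXXXXXXXXXXXXX", rsa_signer)   # placeholder public key ID
    url = signer.generate_presigned_url(
        "https://d111111abcdef8.cloudfront.net/private/video.mp4",
        date_less_than=datetime.now(timezone.utc) + timedelta(hours=1),
    )
    print(url)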
Signed Cookies
  • allow control who can access content
  • provide access to multiple restricted files
    • e.g. files in subscriber area
  • application must authenticate and send three Set-Cookie headers
  • use when:
    • multiple restricted files
    • don’t want to change current URL
Origin Access Identity (OAI)
  • used in combination with signed URLs and signed cookies to restrict direct access to the S3 bucket
  • prevent bypassing CloudFront
  • consists of a special CloudFront user associated with the distribution; the S3 bucket restricts access to that user
    • if users request files directly from S3, they are denied
WAF
  • web app firewall that monitors HTTP/S requests to control access to content
  • can be based on conditions in a web ACL associated with distribution (e.g. Origin IP address, values in query strings)
  • can also deliver custom error page (403)
Domain names
  • <name>.cloudfront.net
  • can add domain name using Route 53 alias
  • for other service providers can use CNAME
    • support wildcard CNAME
  • can move subdomains
    • need AWS support for root domain
Charges
  • can use reserved capacity
    • 12 months
    • 10TB of data transfer in single region
  • pay for:
    • data transfer out to internet
    • data transfer out to origin
    • number of requests
    • invalidation requests
    • dedicated IP custom SSL
    • field level encryption requests
  • don’t pay for:
    • transfer between regions and CloudFront
    • Regional edge cache
    • ACM SSL/TLS certificates
    • shared CloudFront certificates
Monitoring and auditing
  • distributions can deliver access logs and cookie logs (of the requests made) to an S3 bucket
  • can analyze access logs with Athena
  • integrated with CloudTrail
    • which requests
    • source IP address
    • etc.

Route 53

Highly available Domain Name System (DNS) service. Offers domain name registry, DNS resolution, health checking of resources.

DNS service
  • worldwide distributed DNS service, located alongside all edge locations
  • can use to route Internet traffic for domain registered with another domain registrar
  • uses UDP port 53
  • 100% availability SLA
  • max domain names: 50
    • can increase by contacting support
  • Private DNS: authoritative DNS within VPC without exposing DNS records
Domain
  • can transfer domains to Route 53 only if TLD is supported
  • transfer domain to another registrar needs AWS support
  • pay for domain names
Hosted zones
  • collections of records for a specific domain (DNS zone file)
  • can be public or private for VPC
    • private needs the enableDnsHostnames and enableDnsSupport properties
    • private needs DHCP options set
      • cannot automatically register EC2 instances, need scripting
  • automatically creates NS and SOA records
    • 4 unique name servers
    • NS specified by FQDN
  • health checks pointed at endpoints
    • IP addresses or domain names
    • can check status of their health checks
    • can check status of CloudWatch alarm
    • additional charging
      • different prices for AWS vs non-AWS endpoints
  • pay per hosted zone per month
Records
  • A, AAAA for IPv6, MX, NS, SOA, TXT, Alias
Alias
  • map resource in hosted zone
  • works like a CNAME: points to the DNS name of the service
  • can be:
    • ELB
    • CloudFront distribution
    • Beanstalk environment
    • S3 bucket as website
    • another record of the hosted zone
  • alias record and its target must exist in Route 53
  • can map custom domain names (e.g. api.example.com) to:
    • API Gateway custom regional APIs
    • API Gateway edge-optimized API
    • VPC interface endpoints
  • can map one DNS name to another target DNS name
  • can be used for resolving apex/naked domain names
  • use it when possible
  • supports wildcard entries except NS records
  • no charge for alias queries (unlike CNAME records)
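  • a minimal boto3 sketch of an alias A record pointing the zone apex at an ALB (zone IDs, domain and ALB DNS name are placeholders):

    import boto3

    route53 = boto3.client("route53")

    route53.change_resource_record_sets(
        HostedZoneId="Z0123456789EXAMPLE",                    # placeholder hosted zone
        ChangeBatch={
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "example.com",
                    "Type": "A",
                    "AliasTarget": {
                        # placeholder: the canonical hosted-zone ID of the ALB's Region
                        "HostedZoneId": "ZXXXXXXXXXXXXX",
                        "DNSName": "my-alb-123456789.us-east-1.elb.amazonaws.com",
                        "EvaluateTargetHealth": False,
                    },
                },
            }],
        },
    )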
Routing policies
  • determine how Route 53 responds to queries
  • Simple Routing Policy
    • A record associated with one or more IP addresses
    • uses round robin
    • doesn't support health checks
  • Failover Routing Policy
    • fails over to a secondary IP address associated with a health check
      • routes to the secondary only when the primary resource is unhealthy
    • can be used with ELB
  • Geo-location Routing Policy
    • used for localizing content and presenting in the language of users
    • protects distribution rights
    • can be used for spread load evenly between Regions
    • if overlapping will route to the smallest geographic region
    • can create a default IP that doesn’t map geographic location
    • higher pricing
  • Geo-proximity Routing Policy
    • routing traffic based on the location of resources
    • shift traffic from resources in one location to resources in another
    • requires Traffic Flow
    • higher pricing
  • Latency Routing Policy
    • database of latency from different parts of the world
    • improve performance by routing to Region with lowest latency
    • can create latency records for resource in multiple EC2 locations
    • queries are more expensive
  • Multi-value Answer Routing Policy
    • responds to DNS queries with up to 8 healthy records selected at random
  • Weighted Routing Policy
    • like Simple but can specify a weight per IP address
      • each record has a relative weight
Traffic flow
  • provide Global Traffic Management (GTM)
  • create routing configurations for resources using routing types such as failover and geolocation
  • create policies that route traffic based on specific constraints:
    • latency, endpoint health, load, geo-proximity, geography
  • versioning feature allows maintaining a history of changes to routing policies
    • easily roll back to a previous policy version
  • additional charges
Route 53 Resolver
  • bi-directional querying between on-premises and AWS over private connections
  • used for enabling DNS resolution for hybrid clouds
  • Resolver Endpoints (inbound): inbound query capability
    • allow DNS queries that originate on-premises to resolve domains hosted in Route 53
    • connectivity between on-premises DNS infrastructure and AWS through Direct Connect (DX) or VPN
  • conditional forwarding rules (outbound): outbound DNS queries
    • a rule triggers when a query is made for one of the configured domains
    • the query is then forwarded to your on-premises DNS servers
    • requires DX or VPN

API Gateway

Fully managed service to publish, maintain, monitor and secure APIs at any scale

  • can scale to any level of traffic received by an API
Endpoints
  • hostname for an API deployed to a specific Region
    • <api-id>.execute-api.<region>.amazonaws.com
  • Edge-optimized Endpoint (default) around the world
    • ideal for geographically distributed clients
    • requests are routed to nearest CloudFront Point of Presence (POP)
    • capitalizes names of HTTP headers (e.g. Cookie)
    • any custom domain name applies across all Regions
  • Regional Endpoint same region
    • intended for clients in same Region
    • reduces connection overhead when a small number of clients have high demand or clients run on EC2 instances in the same Region
    • custom domain with Route 53 can perform tasks such as latency-based routing
    • passes header names through as-is
    • custom domain name is specific to the Region
    • the same custom domain name can be reused in each Region when deploying to multiple Regions
  • Private Endpoint same VPC
    • only accessed from VPC
    • use an interface VPC Endpoint (ENI)
    • passes header names through as-is
API
  • collection of HTTP resources and methods integrated with backend
    • HTTP endpoints
    • Lambda functions
    • other services
  • organized in a resource tree according to application logic
  • API Gateway WebSocketAPI: collection of WebSocket routes integrated with backend
    • HTTP endpoints
    • Lambda functions
    • other services
    • API method invoked through front-end WebSocket connections associated with a registered custom domain name
  • methods: HTTP methods associated with an API resource
    • each resource URL can be GET, PUT, POST, DELETE and ANY (catch-all)
  • pay when your APIs are in use
    • no minimum fees or upfront commitments
  • pay only for received calls and the amount of data transferred out
    • not for Private APIs
API throttling
  • maximum concurrent requests within an account: 5000
  • maximum requests per second: 10,000
  • if over maximum then 429 (Too Many Requests) is returned
  • client can resubmit the failed requests in a rate-limiting way
  • Server-side throttling limits: applied across all clients
    • prevents the API from being overwhelmed by too many requests
  • Per-client throttling limits: apply to clients that use an API key associated with a usage plan as the client identifier
  • applies settings in the following order:
    • per-client, per-method throttling limits set for an API stage in a usage plan
    • per-client throttling set in a usage plan
    • default per-method limits (individual) in stage settings
    • account-level throttling
Usage plans
  • specifies who can access API stages and methods, how much and how fast
  • can configure throttling and quota limits enforced on individual client API keys
  • can use together with Lambda authorizers to control access to APIs
  • can generate API keys or import from external source (as CSV file)
Deployments
  • snapshot of API resources and methods
  • must be associated with a stage for anyone to access API
  • stage: logical reference to a lifecycle state of REST/WebSocket API
    • e.g. dev, beta, prod, v2
    • identified by API ID and stage name
    • stage variables: environment-variable-like key-value pairs for a stage
      • ARN, HTTP endpoint, parameter mapping templates
      • can use to configure HTTP endpoint for stage
      • can use to configure parameters passed to Lambda through mapping templates
        • passed to the “context” object of Lambda
        • used with Lambda aliases
Mapping templates
  • map/modify request/response parameters, body content, headers
  • can map JSON to XML
  • uses VTL (Velocity Template Language)
  • filter output results
  • can be specified as Integration request/response
  • data is referenced at runtime as context and stage variables
Integration and Method
  • Integration request: internal interface of API
    • map body and parameters of request to the formats required by backend
  • Integration response: internal interface of API
    • map status codes, headers, payload from the backend to the response format
  • Method request: public interface of API
    • defines parameters and body that client must send in request to access backend through API
  • Method response: public interface of API
    • defines status codes, headers and body models that client should expect in response from API
Integration type
  • AWS Integration: exposes AWS service actions
    • must configure Integration request and response setting up mappings
  • AWS_PROXY Integration: Lambda function invocation action (flexible, versatile and streamlined integration setup)
    • direct interaction between client and Lambda function (Lambda proxy integration) so Integration request/response are not needed
      • the incoming request from the client is the input of the function
    • preferred integration type to call Lambda function
  • HTTP Integration: exposes HTTP endpoints (custom integration)
    • must configure Integration request and response setting up data mappings
  • HTTP_PROXY Integration: HTTP endpoints invocation (flexible, versatile and streamlined integration setup)
    • Integration request/response are not needed
  • MOCK Integration: returns a response without sending the request further
    • useful for testing purposes, simulations
    • enables collaborative development: work against another team’s API without waiting for it to be complete
    • can return CORS-related headers to permit CORS access
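  • example sketch of the AWS_PROXY contract (the resource path and field names are illustrative): with Lambda proxy integration the whole request event is passed to the function, which must return statusCode/headers/body itself
      import json

      def handler(event, context):
          # event carries the HTTP method, path, headers, query string and body from API Gateway
          name = (event.get("queryStringParameters") or {}).get("name", "world")
          return {
              "statusCode": 200,
              "headers": {"Content-Type": "application/json"},
              "body": json.dumps({"message": f"hello {name}"}),
          }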
Caching
  • can add cache specifying size in GB
  • allows caching of endpoint responses
  • reduce number of calls to backend (improve latency)
  • TTL (default 300 seconds)
  • defined per stage
    • caches are provisioned for a specific stage
  • can encrypt caches
  • capacity from 0.5 GB to 237 GB
  • can override cache settings for specific methods
  • can flush entire cache immediately
  • clients can invalidate with Cache-Control: max-age=0 header
  • data caching charged at hourly rate
    • varies based on cache size
Security
  • Same Origin Policy: prevents cross-site scripting attacks
  • Cross-Origin Resource Sharing (CORS): allows restricted resources (e.g. fonts) to be requested from a domain outside the one that served the page
    • can enable if using JavaScript / AJAX
  • IAM Resource-Based Policies: JSON policy document to attach to an API to control invoke permission
    • users from specified account
    • source IP address ranges / CIDR blocks
    • VPCs or VPC endpoints
    • usable for every type of endpoint
  • IAM Identity-Based Policies:
    • verifies IAM permissions passed by the caller
    • leverages sigv4 capability where IAM credentials are passed in headers
    • handles authentication and authorization
    • great for user/roles within account
  • Lambda Authorizer: uses Lambda to validate token in header
    • can cache result of authentication
    • need to implement a function
    • user controls authentication process
      • with Cognito, by contrast, the authentication process is managed by AWS
    • Lambda must return an IAM policy for the user
    • handle authentication and authorization
    • good for OAuth, SAML, 3rd party auth
    • pay per Lambda invocation
  • Cognito User Pools and Identity Pools (see related paragraphs)
    • can deliver temporary, limited-privilege credentials
    • support unauthenticated users
    • identities are not credentials
    • identities are exchanged for credentials using STS
  • SDK Generation of iOS, Android and Javascript
  • reduced latency and DDoS protection using CloudFront
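  • example sketch of a token-based Lambda authorizer (the token check is a hypothetical placeholder): the function must return a principal ID plus an IAM policy allowing or denying execute-api:Invoke on the called method
      def handler(event, context):
          token = event.get("authorizationToken", "")
          effect = "Allow" if token == "expected-token" else "Deny"   # replace with real validation (JWT, OAuth, ...)
          return {
              "principalId": "user|anonymous",
              "policyDocument": {
                  "Version": "2012-10-17",
                  "Statement": [{
                      "Action": "execute-api:Invoke",
                      "Effect": effect,
                      "Resource": event["methodArn"],
                  }],
              },
          }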
Logging and monitoring
  • log API calls, latency, error rates to CloudWatch
  • can monitor through the API Gateway dashboard (visually)
  • meter utilization by third-party developers
  • integrated with CloudTrail: full auditable history of changes to API
  • important metrics:
    • IntegrationLatency: measure backend responsiveness
    • Latency: measure overall responsiveness
    • CacheHitCount and CacheMissCount: optimize cache capacities to achieve a desired performance
Open API / Swagger
  • can import/export Swagger / Open API 3.0 definitions (YAML or JSON): API definition as code
  • can import through POST request
    • with Swagger definition in the payload
  • can update existing through PUT request
  • can use the mode query parameter in the request URL to specify update options

RDS

Online Transaction Processing (OLTP) type of database.

Maintenance window
  • maintenance window to allow DB instances modifications to take place (scaling, software patching, etc)
    • can be defined or AWS will schedule a 30-minute window
Encryption
  • encrypt instances and snapshot at rest using KMS
  • also encrypt backups and Read Replicas
  • cannot encrypt an existing DB: create a snapshot, copy it, encrypt the copy, then restore an encrypted DB from the encrypted snapshot (see the sketch below)
  • if the master and a Read Replica are in different Regions, encrypt using the KMS key of the destination Region
  • support SSL encryption between applications and DB instances
  • an SSL certificate is generated for each instance
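  • sketch of the snapshot-copy-restore flow with boto3 (identifiers and the KMS alias are hypothetical; each step must complete before the next starts)
      import boto3

      rds = boto3.client("rds")

      # 1. snapshot the existing unencrypted instance
      rds.create_db_snapshot(DBInstanceIdentifier="mydb",
                             DBSnapshotIdentifier="mydb-snap")

      # 2. copy the snapshot, encrypting the copy with a KMS key
      rds.copy_db_snapshot(SourceDBSnapshotIdentifier="mydb-snap",
                           TargetDBSnapshotIdentifier="mydb-snap-encrypted",
                           KmsKeyId="alias/aws/rds")

      # 3. restore a new, encrypted instance from the encrypted copy
      rds.restore_db_instance_from_db_snapshot(
          DBInstanceIdentifier="mydb-encrypted",
          DBSnapshotIdentifier="mydb-snap-encrypted")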
DB Subnet Groups
  • collection of subnets (private) designated for DB instances
  • should have subnets in at least 2 AZs
  • recommended to configure group with subnets in each AZ
Scalability
  • can only scale up:
    • compute (not MS SQL, need to recreate instance from snapshot)
    • storage (cannot decrease allocated storage)
  • scaling compute causes downtime
    • can be immediate or within maintenance window
  • max DB size: 64TB
  • max MS SQL DB size: 16TB
Storage type
  • use EBS volumes for DB and log storage
  • General Purpose (SSD) - gp2:
    • moderate I/O requirement
    • cost effective
    • 3 IOPS/GB
    • burst up to 3000 IOPS
  • Provisioned IOPS (SSD):
    • use for I/O intensive workloads
    • low latency and consistent I/O
  • Magnetic: not recommended anymore
Multi-AZ
  • replica in another AZ (cannot choose): standby DB
  • recommends Provisioned IOPS
  • failover on:
    • loss of primary AZ
    • loss of network connectivity on primary
    • compute unit failure on primary
    • storage unit failure on primary
    • the primary DB instance class is changed
    • patching of the OS on the primary DB instance
    • manual failover (reboot selecting the option to failover)
  • use the second node on failover
  • can take 1 to a few minutes
  • application must have connection retries and use endpoint rather than IP address
  • alert on failover
  • standby DB cannot be used as read node
  • snapshots and automatic backups are performed on the standby
  • Read Replica support Multi-AZ:
    • combining them build a resilient disaster recovery strategy
    • simplify database updates
    • Read Replica in a different Region can be used as standby and promoted to new production database in case of regional disruption
    • allow to scale reads whilst having multi-AZ
  • in Multi-AZ deployments, version upgrades cause an outage
    • both primary and standby are upgraded at the same time
  • ensure security groups and NACLs allow application to communicate with both primary and standby
Read Replicas
  • used for read-heavy DBs and replication is asynchronous
    • workload sharing and offloading
  • read-only
  • created from a snapshot of the master instance
  • automated backups must be enabled on the primary
  • asynchronous replication to update the read replica whenever there is a change to the source DB instance
  • cannot enable automated backups on PostgreSQL read replicas
  • up to 5 read replicas of a production DB
  • cannot have more than four instances in replication chain
  • can have read replicas of read replicas for MySQL and MariaDB (not PostgreSQL)
  • can specify the AZ of read replica
  • storage type of replicas can be different
  • replica compute should be at least as performant as the source
  • in multi-AZ failover read replicas are switched to the new primary
  • must be explicitly deleted
  • if only source DB is deleted, replica becomes standalone single AZ instance
  • can promote replica to primary (takes several minutes)
  • promoted replicas retain backup retention window, backup window and DB parameter group
  • each replica has its own DNS endpoint
  • can create replicas of multi-AZ source
  • can be in another Region
Snapshots
  • enable backup and restore of DB instances to a known state as frequently as wanted
  • cannot be used for point-in-time recovery
  • stored in S3
  • remain until manually deleted
  • backup taken within defined window
  • I/O is suspended briefly while backups are taken
    • may increase latency on single-AZ deployments
  • restored DB is always a new instance with new endpoint
  • automated backups allow point-in-time restore up to the last 5 minutes
  • only default DB parameters and security groups are restored
  • BEST PRACTICE: take final snapshot before deleting an RDS instance
  • snapshots can be shared across accounts
Pricing
  • DB instance/hours
  • storage GB/month
  • I/O requests/month (for magnetic storage)
  • provisioned IOPS/month (for RDS provisioned IOPS SSD)
  • egress data transfer
  • backup storage (free up to the provisioned EBS volume size)
  • multi-AZ charges for:
    • Multi-AZ DB/hours
    • provisioned storage
    • double write I/Os
  • data transfer during replication from primary to standby IS NOT charged
  • Oracle and Microsoft SQL Licenses included
  • can use on-demand and reserved instance pricing
    • reserved instances (not changeable) based on
      • DB engine
      • DB instance class
      • deployment type (standalone/multi-AZ)
      • license model
      • Region
    • reserved instances can be moved between AZs in the same Region
    • available for multi-AZ
    • scaling is achieved through changing instance class and modifying storage capacity (additional storage allocation)

DynamoDB

Fully managed NoSQL database service. Stores three geographically distributed replicas to enable high availability and data durability.

  • not ideal for traditional RDBMS-style apps (joins or complex transactions), BLOB data, or large data with a low I/O rate
Storage
  • ideal for session data storage
  • BEST PRACTICE: keep item size small
  • BEST PRACTICE: compress larger attribute values
Authentication and access control
  • managed by IAM
  • identity-based policies
    • permissions policy to user or group
    • permissions policy to a role
  • doesn’t support resource-based policies
  • can use special IAM condition to restrict user access to their own records
Security
  • VPC endpoints
  • encryption at rest
  • encryption in transit (SSL/TLS)
Integrations
  • ElastiCache can be used in front of DynamoDB
  • triggers integrate with Lambda
  • RedShift: advanced business intelligence, can perform complex data analysis queries including joins
  • Apache Hive on EMR: allow querying using SQL-like language (HiveQL)
    • copy data from table to S3 bucket (vice versa)
    • copy data from table into HDFS (vice versa)
    • perform JOIN operations on tables
  • BEST PRACTICE: store objects larger than 400KB in S3
    • use pointers (S3 Object ID)
TTL
  • automatically delete items after expiry date/time
    • items marked for deletion
  • allows to remove irrelevant or old data
    • session data
    • event logs
    • temporary
  • reduce storage and manage table size over time
  • enabled per table by designating a TTL attribute that holds each item’s expiry timestamp
  • deletes expired items within 48 hours of expiration
  • deletes items in LSI / GSI
  • no extra cost nor capacity use
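  • example sketch (boto3; table and attribute names are hypothetical): enabling TTL and writing an item that expires in one hour
      import time
      import boto3

      ddb = boto3.client("dynamodb")

      # designate the attribute that holds the expiry timestamp (epoch seconds)
      ddb.update_time_to_live(
          TableName="Sessions",
          TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
      )

      # items carrying the attribute are deleted some time after the timestamp passes
      ddb.put_item(
          TableName="Sessions",
          Item={"session_id": {"S": "s-123"},
                "expires_at": {"N": str(int(time.time()) + 3600)}},
      )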
Exponential Backoff
  • on network errors retries, with progressively longer waits between retries for improved flow control
  • if retries still fail after about 1 minute, the request rate may be exceeding the provisioned throughput
    • consider offloading using DAX or ElastiCache
    • consider increasing the WCUs
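  • minimal backoff sketch (plain Python; the AWS SDKs already implement retries like this automatically): retry a throttled call with exponentially growing, jittered waits
      import random
      import time

      def call_with_backoff(operation, max_attempts=5):
          for attempt in range(max_attempts):
              try:
                  return operation()          # e.g. a DynamoDB read/write call
              except Exception:               # e.g. ProvisionedThroughputExceededException
                  if attempt == max_attempts - 1:
                      raise                   # give up after the last attempt
                  # wait 1, 2, 4, 8... seconds plus random jitter, capped at 20 s
                  time.sleep(min(2 ** attempt, 20) + random.random())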
Optimistic locking
  • strategy to prevent writes from being overwritten by the writes of others
Table
  • items: size of items cannot exceed 400KB
    • attributes
  • BEST PRACTICE: when storing time-series data use separate tables per day, week or month
  • BEST PRACTICE: store more-frequently and less-frequently accessed data in separate tables
  • BEST PRACTICE: design tables in a way that can use Query, Get, BatchGetItem
Primary keys
  • partition key: unique, input to an internal hash function that determines the partition or physical location
    • best practices:
      • use high-cardinality attributes
      • use composite attributes
      • cache popular items (use DAX for caching reads)
      • add a random suffix from a predetermined range to the partition key for write-heavy use cases (write sharding)
  • composite key: partition key + sort key in combination
    • two items may have same partition key but different sort key
    • items in same partition are sorted according to the sort key
Index
  • data structure that allows fast queries on specific attributes of a table
  • run search on the index instead of the entire dataset
  • Local Secondary Index (LSI): provides an alternative range key for table, local to the hash key
    • can have up to 5 LSI per table
    • must be a scalar String, Number or Binary
    • must be created at table creation time
    • same partition key as original table (different sort key)
    • queries based on this sort key are much faster
    • can query on additional values other than partition key / sort key
  • Global Secondary Index (GSI): speed up queries on non-key attributes
    • can create at any time
    • different partition key and different sort key
    • different view of data
    • speeds up queries relating to this alternative partition and sort key
    • is effectively a new “table” onto which attributes are projected
      • partition key and sort key of original table are always projected (KEYS_ONLY)
      • can specify extra attributes to project (INCLUDE)
      • can use all attributes from main table (ALL)
    • must define RCU/WCU for the index: has to be at least the same or more as in main table to avoid throttling on main table
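  • example sketch (boto3; table, index and attribute names are hypothetical): querying a GSI through its alternative partition/sort key
      import boto3
      from boto3.dynamodb.conditions import Key

      table = boto3.resource("dynamodb").Table("Orders")

      # the GSI "status-date-index" uses status as partition key and order_date as sort key
      resp = table.query(
          IndexName="status-date-index",
          KeyConditionExpression=Key("status").eq("SHIPPED")
                                 & Key("order_date").begins_with("2024-01"),
      )
      items = resp["Items"]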
Transaction
  • make coordinated, all-or-nothing changes to multiple items
  • provide atomicity, consistency, isolation and durability (ACID)
  • can check pre-requisite conditions before writing to a table
  • write API can group multiple Put, Update, Delete and ConditionCheck actions
  • submit the actions as a single TransactWriteItems that succeeds or fails as a unit
  • performs two underlying reads or writes: one to prepare the transaction and one to commit
    • visible in CloudWatch metrics
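  • example sketch (boto3; table and attribute names are hypothetical): an all-or-nothing transfer between two items using TransactWriteItems
      import boto3

      ddb = boto3.client("dynamodb")

      ddb.transact_write_items(TransactItems=[
          {"Update": {                                  # debit account A only if it has enough balance
              "TableName": "Accounts",
              "Key": {"account_id": {"S": "A"}},
              "UpdateExpression": "SET balance = balance - :amt",
              "ConditionExpression": "balance >= :amt",
              "ExpressionAttributeValues": {":amt": {"N": "100"}},
          }},
          {"Update": {                                  # credit account B; both updates succeed or neither does
              "TableName": "Accounts",
              "Key": {"account_id": {"S": "B"}},
              "UpdateExpression": "SET balance = balance + :amt",
              "ExpressionAttributeValues": {":amt": {"N": "100"}},
          }},
      ])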
Scan
  • return one or more items by accessing every item in a table or a secondary index
  • returns max 1 MB of data per call
  • use a lot of RCUs
  • can use ProjectionExpression to only return some attributes
  • can provide a filter expression to refine results
  • scan operations proceed sequentially
  • can request a parallel Scan for faster performance
    • need to provide Segment and TotalSegments parameters
    • can configure a parallel scan by dividing a table/index into segments and scanning each segment in parallel
    • BEST PRACTICE: avoid parallel scans if the table/index is already incurring heavy read/write activity from other operations
  • eventually consistent reads
  • for consistent copy of data set ConsistentRead parameter
  • could use up the provisioned throughput in just a single operation (if large table)
  • BEST PRACTICE: avoid scan if possible
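  • parallel scan sketch (boto3; table name and segment count are hypothetical): each worker scans its own segment and follows LastEvaluatedKey to paginate
      import boto3
      from concurrent.futures import ThreadPoolExecutor

      ddb = boto3.client("dynamodb")
      TOTAL_SEGMENTS = 4

      def scan_segment(segment):
          items, start_key = [], None
          while True:
              kwargs = {"TableName": "Events",
                        "Segment": segment, "TotalSegments": TOTAL_SEGMENTS}
              if start_key:
                  kwargs["ExclusiveStartKey"] = start_key   # continue from the previous page
              page = ddb.scan(**kwargs)
              items.extend(page["Items"])
              start_key = page.get("LastEvaluatedKey")
              if not start_key:
                  return items

      with ThreadPoolExecutor(TOTAL_SEGMENTS) as pool:
          all_items = [item for part in pool.map(scan_segment, range(TOTAL_SEGMENTS)) for item in part]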
Query
  • finds items in table based on primary key and a distinct value to search for
  • can use ProjectionExpression to only return some attributes
  • eventually consistent reads
  • for consistent copy of data set ConsistentRead parameter
  • more efficient than scan, doesn’t deteriorate on growing tables
Pages
  • smaller page size reduces impact of a query or scan
Stream
  • captures a time-ordered sequence of item-level modifications in any table and stores the information in an encrypted log for up to 24 hours
  • enable or modify stream with CreateTable and UpdateTable
  • log can be accessed using a dedicated endpoint, near-real time
  • by default records only Primary key
  • can be an event source for Lambda which can, for example, write to CloudWatch Logs
  • see how the stream is configured: StreamSpecification parameter
    • StreamEnabled
    • StreamViewType: information that will be written, can be:
      • KEYS_ONLY (only key attributes)
      • NEW_IMAGE (entire item after modification)
      • OLD_IMAGE (entire item before modification)
      • NEW_AND_OLD_IMAGES (both new and old item)
  • stream read request unit: each GetRecords API call to Streams
    • return up to 1MB of data
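  • example sketch (boto3; table name is hypothetical): enabling a stream that records both old and new item images
      import boto3

      ddb = boto3.client("dynamodb")

      ddb.update_table(
          TableName="Orders",
          StreamSpecification={"StreamEnabled": True,
                               "StreamViewType": "NEW_AND_OLD_IMAGES"},
      )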
Partitions
  • allocation of storage for a table, automatically replicated across multiple AZs
  • handled entirely by DynamoDB, allocates sufficient partitions so can handle provisioned throughput requirements
  • additional partitions are allocated when:
    • increase of provisioned throughput
    • existing partition fills capacity
Provisioned Capacity
  • evenly distributes provisioned throughput among partitions
  • read capacity units (RCUs); 1 RCU can perform:
    • 1 strongly consistent read request / second for items up to 4KB
    • 2 eventually consistent read request / second for items up to 4KB
    • half transactional read request / second (need 2 RCUs to perform one) for items up to 4KB
  • write capacity units (WCUs); 1 WCU can perform:
    • 1 standard write request / second for items up to 1KB
    • half transactional write request / second (need 2 WCUs to perform one) for items up to 1KB
  • Replicated Write Capacity Unit (rWCU): for global tables
  • if access pattern exceeds 3000 RCU or 1000 WCU for a single partition, request might be throttled
  • throttling occurs when configured RCU/WCU are exceeded (ProvisionedThroughputExceededException)
    • use burst capacity effectively: DynamoDB retains up to 5 minutes of unused read and write capacity which can be consumed quickly
  • reading/writing above the limit can be caused by:
    • uneven distribution based on partition key
    • frequent access to same key in partition (most popular item, hot key)
    • request rate greater than provisioned throughput
  • BEST PRACTICE: a larger number of smaller operations allows other requests to succeed without throttling
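  • worked capacity example (the item size and request rates are hypothetical): sizing RCUs/WCUs for 6 KB items
      import math

      item_size_kb = 6
      reads_per_second = 10
      writes_per_second = 10

      # strongly consistent reads: 1 RCU per 4 KB read per second
      rcus_strong = reads_per_second * math.ceil(item_size_kb / 4)    # 10 * 2 = 20 RCUs
      rcus_eventual = rcus_strong / 2                                  # eventually consistent: 10 RCUs

      # writes: 1 WCU per 1 KB written per second
      wcus = writes_per_second * math.ceil(item_size_kb / 1)           # 10 * 6 = 60 WCUs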
On-Demand Capacity
  • don’t need to specify requirements; instantly scales up and down based on activity (useful for unpredictable / spiky workloads)
  • can switch between Provisioned Capacity / On-Demand Capacity once per day
  • pay for what you use
Consistency models
  • eventually consistent reads (default): response might not reflect the result of a recent write operation (stale data)
    • repeat read request after a short time
  • strongly consistent reads: response with the most up-to-date data
    • might not be available during a network delay or outage (HTTP 500)
    • higher latency
    • not supported on Global Secondary Indexes
    • use more throughput capacity
Application Auto Scaling
  • enables table or GSI to increase provisioned read/write capacity to handle increases in traffic without throttling
  • create a scaling policy for a table/GSI; specify:
    • read/write/both capacity, minimum and maximum provisioned capacity
    • target utilization: percentage of consumed provisioned throughput at a point in time
    • target tracking: algorithm to adjust the provisioned throughput so that the actual capacity utilization remains at target utilization
  • if table/GSI is created using Console, auto scaling is enabled by default
DAX (DynamoDB Accelerator)
  • fully managed, HA, in-memory cache that delivers up to 10x performance improvement
  • microseconds performance, even at millions of requests per second
  • improve only READ performance
  • ideal for read-heavy and bursty workloads
    • auction applications
    • gaming
    • retail sites
    • special sales
  • data is written to the cache and back-end store at the same time
  • returns data if the item is in the cache (cache hit)
  • if not in cache performs eventually consistent GetItem operation
  • reduces provisioned read capacity
  • differences with ElastiCache:
    • optimized for DynamoDB
    • doesn’t support lazy loading
    • less management overhead
    • don’t need to modify application
    • less datastores supported
  • pay for the capacity provisioned
    • runs on EC2 instances cluster, charged by the node
API
  • PutItem: create data or full replacement (consumes WCU)
  • UpdateItem: partial update of attributes
    • add new item if not exists
  • Conditional writes accept a write/update only if conditions are met
    • for example, to avoid overwriting data changed by a concurrent writer (optimistic locking)
  • GetItem
    • can use parameter ConsistentRead
  • DeleteItem
  • DeleteTable
  • BatchWriteItem: put or delete up to 25 items in one call
    • reduces the number of API calls and so the latency
    • operations are done in parallel
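  • example sketch (boto3; table and attribute names are hypothetical): a conditional PutItem that only creates the item if the key is not already present
      import boto3

      ddb = boto3.client("dynamodb")

      # raises ConditionalCheckFailedException instead of silently overwriting an existing item
      ddb.put_item(
          TableName="Users",
          Item={"user_id": {"S": "u-123"}, "name": {"S": "Alice"}},
          ConditionExpression="attribute_not_exists(user_id)",
      )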
Global tables (Cross Region Replication)
  • global tables provide fully managed solution for deploying a multi-region, multi-master database
  • create identical tables in these regions and propagate ongoing data changes
  • ideal for massively scaled applications (globally dispersed users)
  • global table: collection of one or more replica tables
    • replica table: single table, part of global table
      • can have one replica table per Region
  • strongly consistent reads/writes are not supported across Regions
    • if required must be on the same Region
  • ensure that each replica table and secondary index in global table has identical write capacity to ensure proper replication

ElastiCache

Fully managed implementations of in-memory data stores: Redis and Memcached.

  • Improve latency and throughput for many read-heavy/compute-intensive workloads
Compute nodes
  • a maintenance window can be defined for software patching
  • nodes cannot be accessed from the Internet or from EC2 instances in other VPCs
  • on-demand or reserved instances (not spot)
Use cases
  • offload reads from a database
  • improve latency and throughput for many read-heavy/compute-intensive workloads
  • store the results of computations and session state
  • streaming data dashboards: landing spot for streaming sensor data on the factory floor, live real-time dashboard displays
Memcached
  • ideal for database caching: use Memcached in front of RDS
    • cache popular queries to offload work
  • simple
  • scalable in/out (adding/removing nodes)
  • scalable up/down (changing node family type)
  • supports multi-threading (uses multiple cores)
  • caches objects like database queries
  • caches contents of a DB
  • caches data from dynamically generated web pages
  • ideal for transient session data
  • ideal for high frequency counters
  • max number of nodes per Region: 100
  • max number of nodes per cluster: 1-20 (soft limits)
  • can integrate SNS for node failure/recovery notification
  • doesn’t support multi-AZ failover or replication
  • doesn’t support snapshots
  • can place nodes in different AZs
  • each node represents a partition of data
Redis
  • open-source in-memory key-value store
  • ideal for load-balanced web servers, store web session information
    • if a web server is lost, session info is not lost and can be picked up by another server
  • ideal for leaderboards
    • can provide live leaderboard for millions of users of mobile app
  • supports encryption
  • supports HIPAA and HA replication
  • supports clustering
  • supports complex data types (sets and lists)
  • data is persistent (can be used as datastore)
  • not multi-threaded
  • master/slave replication and multi-AZ for cross-AZ redundancy
  • automatic failover and backup/restore (backup clusters and metadata)
    • can restore creating a new Redis cluster and populating from a backup
  • shard: a subset of the cluster’s keyspace
    • include a primary node and 0 or more read replicas
      • even across Regions
  • scales by adding shards
  • clustering mode:
    • disabled: can have only one shard
      • replication from primary node is asynchronous
    • enabled: can have up to 15 shards
      • recommended to take snapshots from read replicas
        • snapshotting can slow down nodes
  • automatic and manual snapshots (S3)
  • can only move snapshots between Regions by exporting them
  • Multi-AZ failover: failures are detected by ElastiCache then automatically promotes the replica that has lowest replica lag
    • DNS records remain the same but point to the IP of the new primary
    • by enabling cluster mode and Multi-AZ failover you have fully automated, fault tolerant Redis
Cluster
  • collection of one or more nodes using same caching engine
  • cannot move a cluster from outside VPC into VPC
  • need to configure subnet groups for VPC hosting EC2 instances and cluster
  • if not using VPC, can control access to cluster through Cache Security Groups
  • applications connect to the cluster using endpoints
  • no charge for data transfer between EC2 and ElastiCache within same AZ
Node
  • runs an instance of Memcached or Redis protocol-compliant service
  • has its own DNS name and port
  • failed nodes are automatically replaced
  • controlled by VPC Security Groups and Subnet groups
  • deployed in clusters
  • can span more than one subnet
  • pay per node/hour
Lazy Loading
  • loads data into the cache only when necessary (if cache miss occurs)
  • avoids filling up cache with not requested data
  • if data is not in the cache, returns null then app fetches data from database and writes it into the cache (available next time)
  • can produce stale data without further strategies
  • available only in ElastiCache, not in DAX
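  • cache-aside (lazy loading) sketch with redis-py (the cluster endpoint and the load_user_from_db helper are hypothetical)
      import json
      import redis

      cache = redis.Redis(host="my-redis.example.cache.amazonaws.com", port=6379)

      def get_user(user_id):
          key = f"user:{user_id}"
          cached = cache.get(key)                  # 1. try the cache first
          if cached is not None:
              return json.loads(cached)            #    cache hit
          user = load_user_from_db(user_id)        # 2. cache miss: read from the database (hypothetical helper)
          cache.setex(key, 300, json.dumps(user))  # 3. write back with a 300-second TTL
          return user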
Write Through
  • cache is updated whenever a new write or update is made to underlying database
  • cache data remains up to date
  • add wait time to write operations
  • without a TTL will end up with a lot of cached data never read
TTL
  • mitigates drawbacks of cache strategies
  • specifies number of seconds until the key (data) expires
  • when reading an expired key, application checks value in database
    • is a cache miss for Lazy Loading

CloudWatch

Monitoring service used to collect and track metrics, collect and monitor log files, and set alarms. Monitors operational health.

Metrics
  • system-wide visibility into resource utilization
  • monitor application performance
  • time-ordered set of data points
  • exist within a Region
  • defined by name, namespace and zero or more dimensions
  • metrics retention:
    • data points of < 60 seconds available for 3 hours (high resolution)
      • (higher charges on high resolution metric)
    • data points of 60 seconds available for 15 days
    • data points of 300 seconds (5 min) available for 63 days
    • data points for 3600 seconds (1 h) available for 15 months
  • can publish custom metrics using CLI or API
    • statistic set: publish an aggregated set of data points
  • resolution:
    • standard resolution: one-minute granularity
    • high resolution: one second granularity
      • immediate insight into sub-minute activity
      • can specify a high-resolution alarm with a period of 10 or 30 seconds
Namespace
  • container for metrics; metrics in different namespaces are isolated
    • e.g.: ApiGateway, EC2, Lambda, etc.
Dimensions
  • the --dimensions parameter clarifies what the metric is and what data it stores
    • e.g.: by AutoScaling Group or Per-Instance metrics
Statistics
  • metric data aggregations
  • Minimum: lowest value; can use to determine low volumes of activity
  • Maximum: highest value; can use to determine high volumes of activity
  • Sum: total volume of a metric
  • Average: Sum / SampleCount
    • can use to determine the full scope of a metric (how close average is to max and min)
    • helps to know when to increase or decrease resources
  • SampleCount: count (number) of data points
  • pNN.NN: value of specified percentile (not available for negative values)
Alarms
  • automatically initiate actions
  • action is a notification sent to SNS topic or an Auto Scaling policy
  • invokes actions for sustained state changes only
Event
  • delivers near real-time stream of events describing changes in resources
    • can use to schedule automated actions that self-trigger using cron or rate expressions
    • targets includes: EC2 instances, Lambda functions, streams, delivery streams, log groups, ECS tasks, pipelines, SNS topics, SQS queues, etc.
API
  • PutMetricData: can specify each dimension as MyName=MyValue
    • aws cloudwatch put-metric-data --metric-name Buffers --namespace MyNameSpace [...] --dimensions InstanceId=139391,InstanceType=m1.small
    • publishes a single metric data point
    • create specified metric if not exist
    • every PutMetricData API call for custom metric is charged
  • GetMetricStatistics: specify each dimension as Name=MyName, Value=MyValue
    • --namespace MyNameSpace [...] --dimensions Name=InstanceId,Value=139391 Name=InstanceType,Value=m1.small
      • is the same for “put-metric-alarm” API
    • must specify a value for every defined dimension
      • e.g.: metric BucketSizeBytes includes BucketName and StorageType
    • aggregates data points based on the length of the period specified
    • maximum number of data points: 1,440
  • GetMetricData: retrieve up to 500 different metrics in a single request
  • PutMetricAlarm: creates or updates an alarm and associates with specified metric, metric math expression or anomaly detection model (this one cannot have Auto Scaling actions)
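  • example sketch (boto3; the instance ID and SNS topic ARN are hypothetical): an alarm on sustained high CPU that notifies an SNS topic
      import boto3

      cw = boto3.client("cloudwatch")

      cw.put_metric_alarm(
          AlarmName="high-cpu",
          Namespace="AWS/EC2",
          MetricName="CPUUtilization",
          Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
          Statistic="Average",
          Period=300,                 # 5-minute periods
          EvaluationPeriods=2,        # sustained for two consecutive periods
          Threshold=80,
          ComparisonOperator="GreaterThanThreshold",
          AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
      )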
CloudWatch Logs
  • centralizes logs from systems, applications and services
  • help monitor and troubleshoot systems and apps using existing custom log files (log files from EC2, CloudTrail, Route 53)
  • Log retention: by default retained indefinitely
    • configurable from 1 day to 10 years
  • can use for real time application, system monitoring and long term log retention
  • CloudTrail logs can be sent to CloudWatch Logs for real-time monitoring
  • Log agent: install into EC2 instances to collect both logs and metrics (e.g. memory and disk utilization)
    • collects more system-level metrics from EC2 instances
    • collects system-level metrics from on-premises servers (hybrid environment or not managed by AWS)
    • retrieves custom metrics from apps and services using StatsD and collectd protocols
CloudWatch Logs Insights
  • interactively search and analyze log data in CloudWatch Logs
  • can perform queries (purpose-built query language)
    • include sample queries
  • can identify potential causes of issues and validate deployed fixes
  • discover fields in logs from services
    • Route 53
    • Lambda
    • CloudTrail
    • VPC
    • any application or custom log that emits log event as JSON
  • cannot access log events with timestamps that pre-date creation time of log group
CloudWatch Metric Filter
  • search and filter log data coming into CloudWatch Logs
  • define terms and patterns to look for in log data as it is sent
  • CloudWatch Logs uses these metric filters to turn log data into numerical metrics
    • can graph them
    • can set an alarm on them
  • can assign dimensions and a unit to the metric
    • changing the unit for the filter later will have no effect
  • supported only for log groups in the Standard log class
  • publishes metric data points only for events that happen after the filter was created
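  • example sketch (boto3; log group and namespace are hypothetical): counting "ERROR" occurrences in a log group as a custom metric
      import boto3

      logs = boto3.client("logs")

      logs.put_metric_filter(
          logGroupName="/my-app/production",
          filterName="ErrorCount",
          filterPattern="ERROR",                    # every matching log event adds 1 to the metric
          metricTransformations=[{
              "metricName": "ErrorCount",
              "metricNamespace": "MyApp",
              "metricValue": "1",
          }],
      )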

CloudTrail

A web service that records activity made on account, delivers log files to an S3 bucket.

Use case
  • enables governance, compliance and operational and risk auditing
  • visibility into user activity by recording actions
  • security analysis, resource change tracking and compliance auditing
Log
  • logs the history of API calls in the AWS account
  • logs:
    • identity of the API caller
    • time of the API call
    • source IP
    • request parameters
    • response
  • for an AWS account, trail can:
    • record events in all regions and deliver to an S3 bucket
    • records events in a single region and deliver to an S3 bucket
      • additional single trails can use the same or different bucket
  • can integrate with CloudWatch Logs to deliver data events through a CloudWatch Logs log stream
  • log file integrity validation feature: determines whether log file was unchanged, deleted or modified since delivered to S3 bucket
Events
  • data events: provide insight into the resource operations (data plane operations)
    • e.g.: S3 object-level API activity (GetObject, DeleteObject, PutObject API)
    • e.g.: Lambda function execution activity (Invoke API)
  • management events: provide insight into management operations (control plane operations)
    • include non-API events
    • e.g.: configuring security, registering devices, configuring rules for routing data
Encryption
  • log files are encrypted using S3 SSE
  • can enable KMS for additional security:
    • a single KMS key can be used to encrypt log files from all Regions
Multi account
  • can consolidate logs from multiple accounts using S3 bucket:
    • turn on CloudTrail in the paying account
    • create bucket policy that allows cross-account access
    • turn on CloudTrail in the other account and use the bucket of the paying account
Alarms
  • no native alarming

CloudFormation

Infrastructure as Code using a template (YAML or JSON). “Template-driven provisioning”.

  • best practices:
    • provides Python “helper scripts” which help install software and start services on EC2 instances
    • use Stack Policies to protect sensitive portions of stack
Infrastructure as Code
  • infrastructure is provisioned consistently
  • less time and effort than configure manually
  • can use version control and peer review templates
  • BEST PRACTICE: use a version control system
  • manage updates and dependencies
  • no charges, pay for resources
Template
  • can upload directly or use S3
  • read the template and makes API calls
  • resulting resources are the Stack
  • logical IDs to reference resources within the template
  • physical IDs to reference resources outside templates after being created
  • mandatory elements:
    • list of resources and config values
  • not mandatory elements:
    • template parameters (up to 60)
    • output values (up to 60)
    • list of data tables
Resources
  • mandatory
  • resources are declared and can reference each other
  • Type
  • Properties
Parameters
  • parameters
    • custom values as inputs, useful for template reuse
    • Type
    • Default
    • AllowedValues
    • Description
  • pseudo parameters
    • predefined parameters by CloudFormation
    • can use as argument of Ref function
    • AWS::AccountId
    • AWS::NotificationARNs: list of notification ARNs
    • AWS::Region
    • AWS::StackId
Mappings
  • matches key to corresponding set of named values
  • fixed variables good for differentiation between regions, environments, AMIs etc.
    • not user specific, use parameters
  • can set value based on Region
Outputs
  • output values that can be imported into other stacks (cross-stack references)
  • returned in response
  • cannot delete a stack if outputs are being referenced by another stack
  • can use Export Output Values to export the name of resource output for a cross-stack reference
    • export names must be unique within a Region
Conditions
  • statements that define the circumstances under which entities are created or configured (applies to resources and outputs):
  • Conditions:
    • CreateProdResources: !Equals [!Ref EnvType, prod]
  • intrinsic functions: If, Equals, Not
Transform
  • specifies one or more macros to process template
  • can reference additional code stored in S3
    • Lambda code or snippets of CloudFormation code
Intrinsic functions
  • Ref
  • GetAtt: value of an attribute from a resource
    • (YAML) !GetAtt logicalNameOfResource.attributeName
    • (JSON) { "Fn::GetAtt": ["logicalNameOfResource", "attributeName"] }
  • FindInMap: value corresponding to keys in a two-level map declared in Mappings section:
    • (YAML): !FindInMap [MapName, TopLevelKey, SecondLevelKey]
  • ImportValue: value of an output exported by another stack (cross-stack references)
  • Join: construct a string value
    • see example in the notebook
  • Sub: substitute variables in an input string with values specified
    • construct commands or outputs that include values that aren’t available until the stack is created or updated
Stack
  • entire environment; automatic rollback by default on error
  • updating stacks:
    • direct update
    • creating and executing ChangeSet
StackSet
  • create, update, delete stacks across multiple accounts and Regions with a single operation
    • can select target accounts
    • must set up a trust relationship between the administrator and target accounts
NestedStack
  • allow reuse code for common use cases
ChangeSet
  • summary of proposed changes to see how might impact existing resources
  • BEST PRACTICE: use to identify potential trouble spots in updates
Drift Detection
  • detects whether a stack’s actual configuration differs from its expected configuration
  • work on resources that support drift detection
    • resources not supporting are assigned with NOT_CHECKED
  • supports drift detection on private resource types whose provisioning type is FULLY_MUTABLE or IMMUTABLE
  • can perform drift detection on stacks with following statuses: CREATE_COMPLETE, UPDATE_COMPLETE, UPDATE_ROLLBACK_COMPLETE, and UPDATE_ROLLBACK_FAILED
  • does not detect drift on nested stacks that belong to the stack
    • can initiate a drift detection operation directly on a nested stack
Serverless Application Model
  • can use SAM to deploy serverless applications
  • extension to CloudFormation for serverless applications
  • simplified syntax for defining serverless resources: APIs, Lambda functions, DynamoDB tables, etc.
  • can use to package deployment code, upload it to S3 and deploy serverless application
  • AWS Serverless transform: takes an entire template in AWS SAM syntax and transforms and expands it into CloudFormation template
    • Transform: AWS::Serverless-2016-10-31
    • in Resources, set the type of a Lambda function to “AWS::Serverless::Function” and use the simplified syntax
  • resource examples:
    • AWS::Serverless::Function
    • AWS::Serverless::Api
    • AWS::Serverless::SimpleTable

Using CLI

Create stack
  • aws cloudformation create-stack
    • --stack-name
    • --template-body file:///filepath.yml
    • --parameters ParameterKey=Parm1,ParameterValue=test1 ParameterKey=Parm2,ParameterValue=test2
  • NoEcho doesn’t mask information stored in
    • Metadata template section
    • Outputs
    • Metadata of resource definition
    • do not use for sensitive information
Describing and listing stacks
  • aws cloudformation list-stacks:
    • get a list of any of the stacks created
    • --stack-status-filter
      • e.g. --stack-status-filter CREATE_COMPLETE
  • aws cloudformation describe-stacks
    • information on running stacks
    • --stack-name
View stack event history
  • aws cloudformation describe-stack-events
    • can track status of resources AWS CloudFormation is creating and deleting
    • --stack-name
    • --resource-status
List stack resources
  • aws cloudformation list-stack-resources
    • summary of each resource in the stack specified with the --stack-name parameter
Retrieve template
  • aws cloudformation get-template
    • --stack-name
Validate template
  • aws cloudformation validate-template
    • --template-body
    • --template-url
Upload local artifacts to S3 bucket
  • some resource properties require an S3 location (bucket and file name)
  • can specify local references instead (local artifacts)
    • e.g. instead of manually uploading Lambda function source code and specifying its S3 location, can reference the local path
  • can use package command to quickly upload them
    • uploads directly to S3
    • returns a copy of template replacing local references with S3 location
    • can use returned template to create or update a stack
  • local artifact: path to file or folder (like Lambda function code)
    • folder: command creates a .zip and upload it
Deploy template with transforms
  • uses a change set to deploy a template that includes transforms
  • can use aws cloudformation deploy
    • creates a change set
    • executes the change set
    • reduce number of required steps
    • --template /path/template.json
    • --stack-name my-new-stack
    • --parameter-overrides Key1=Value1 Key2=Value2

Kinesis

Collect, process and analyze real-time, streaming data (timely insight). Collection of services for processing streams of various data.

Security
  • control access / authorization using IAM policies
  • encryption in flight using HTTPS endpoints
  • encryption at rest using KMS
  • can encrypt data on the client side
  • VPC endpoints available
Differences with SQS
  • must provision throughput (not needed in SQS)
  • ordering at the shard level (in SQS no ordering guarantee except FIFO queues)
Differences with SNS
  • pull data (SNS push data to subscribers)
  • must provision throughput (not needed in SNS)
Stream
  • Shards: uniquely identified groups of data records in a stream; base throughput unit of a data stream
    • each shard ingests up to 1000 records/second
    • default limit of 500 shards, can be requested increase
    • data input max capacity: 1MB/sec
    • data output max capacity: 2MB/sec
    • max records/second per PUT: 1000
    • Record: data units stored in a Kinesis Stream
      • Partition Key: group data by shard
      • Sequence number
      • data blob (up to 1 MB, before base64 encoding)
  • Transient data store: retention from 24 hours (default) to 7 days
Kinesis Data Stream
  • real-time processing of streaming big data
  • useful for rapidly moving data off data producers, continuously processing the data
  • stores data for later processing by applications
  • use cases:
    • accelerated log data feed intake
    • real-time metrics and reporting
    • real time data analytics
    • complex stream processing
  • producers: continually push data to Kinesis Data Streams
    • creates the data that makes up the stream
    • can be used through:
      • Kinesis Stream API
      • Kinesis Producer Library (KPL)
      • Kinesis Agent
  • consumers: EC2 instances that analyze the data received from a stream (Kinesis Stream Applications)
    • process the data in real time
    • can store results using DynamoDB, Redshift or S3
  • resharding: adapt to changes in rate of data flow
    • shard split: divide one into two (increase cost)
    • shard merge: combine two into one
  • KMS master key for encryption
    • permissions to access the needed master key
  • replicates synchronously across 3 AZs
  • pay per shard
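  • producer sketch (boto3; stream name and payload are hypothetical): records with the same partition key always land on the same shard
      import json
      import boto3

      kinesis = boto3.client("kinesis")

      kinesis.put_record(
          StreamName="sensor-stream",
          Data=json.dumps({"sensor": "s-42", "temp": 21.5}).encode(),
          PartitionKey="s-42",          # hashed to pick the shard
      )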
Kinesis Data Firehose
  • easiest way to load streaming data into data stores and analytics tools
  • captures, transforms and loads streaming data
  • enables near real-time analytics with existing business intelligence tools and dashboards
  • Kinesis Data Stream can be the source
  • can configure to transform data before delivering it
    • can invoke Lambda function to transform
  • don’t need to write an application
  • can batch, compress and encrypt data before loading it
    • encrypt data with existing KMS key
  • replicates synchronously across 3 AZs
  • maximum size for a record (before Base64 encoding): 1000 KB
  • Source: where streaming data is continuously generated and captured
  • Delivery Stream: underlying entity of KDF
    • stores data records for up to 24 hours
  • Destination: data store where data will be delivered
    • S3
    • Redshift
    • Elasticsearch
    • Splunk
  • no shards, totally automated
Kinesis Data Analytics
  • process and analyze real-time streaming data
    • provides real-time analysis
  • can use SQL queries to process data streams
  • use cases:
    • generate time-series analytics
    • feed real-time dashboards
    • create real-time alert and notifications
  • can ingest data from Data Streams and Firehose
  • output to:
    • S3
    • Redshift
    • Elasticsearch
    • Kinesis Data Streams
  • input: streaming source for application
    • streaming data source: continuously generated data read into application
    • reference data source: static data that app uses to enrich data coming from streaming data sources
  • application code: SQL statements that process input and produce output
  • output: in-application streams to hold intermediate results
  • destinations: persist the result
    • Kinesis Data Streams
    • Kinesis Firehose
    • S3
    • Redshift
    • Elasticsearch
  • can use IAM to provide permission to read from source and write to destinations
Kinesis Client Library
  • Java library that helps read records from a Kinesis Stream with distributed applications sharing the read workload
  • provides a layer of abstraction specifically for processing data in a consumer role
  • intermediary between record processing and Data Streams
  • differs from Kinesis Data Streams API that helps creating streams, resharding, putting and getting records
  • worker (consumer)
    • can be:
      • EC2 instance
      • Beanstalk
      • on-premises servers
    • connect to the stream
    • enumerates the shards
    • coordinates shard association with other workers (if any)
    • instantiates a record processor for every shard
    • pull data from stream
    • pushes the records to the corresponding record processor
    • checkpoints processed records
    • balances shard-worker associations when worker instance count changes
    • balances shard-worker associations when shards are split or merged
  • each shard is processed by 1 worker
  • each shard has exactly 1 corresponding record processor
  • never need multiple instances to process 1 shard
  • 1 worker can process multiple shards
  • if there are 2 consumer instances, the load is balanced and half of the record processors run on each instance
  • scaling out consumers:
    • ensure number of instances not exceed number of shards
  • progress is checkpointed into DynamoDB
    • IAM access required
  • records are read in order at the shard level

OpenSearch (Elasticsearch)

Open source, distributed search and analytics suite based on Elasticsearch. Search and analytics engine built on Apache Lucene.

Use case
  • log analytics, full-text search, security intelligence, business analytics, operational intelligence
  • log analytics interactively, real-time application monitoring, website search, performance metric analysis
  • supports multiple query languages
    • DSL (Domain-Specific Language), SQL, PPL
  • integrates with Logstash, OpenTelemetry and ElasticSearch APIs
ELK stack
  • (Elasticsearch, Logstash, Kibana): aggregates logs from all systems and apps
  • analyzes these logs and creates visualizations
  • useful for infrastructure monitoring, troubleshooting, security analytics etc.
Cluster
  • need to specify number of instances, instance type and storage options
  • can perform upgrades without downtime
  • built-in monitoring and alerting with automatic notifications
VPC domain
  • domains can be launched into VPC
  • enable secure communication between other services in the VPC
  • extra layer of security
  • to be accessible from the internet, VPC domains require a VPN or proxy
  • the console displays less information for VPC domains
    • cluster health (not including shard information)
  • cannot apply IP-based access policies
  • cannot switch later to use public endpoints
  • cannot launch on VPC with dedicated tenancy
  • cannot change VPC
  • can change subnets and security groups settings
  • user must have access to the VPC to access Dashboards
Security
  • encryption of data at rest (AES-256)
  • uses KMS for storage and management encryption keys
  • can encrypt node to node communications using TLS 1.2
    • once enabled cannot be disabled
  • support access policies:
    • Resource-based policies
    • Identity-based policies
    • IP-based policies
  • fine-grained access control
    • role-based access control
    • security at the index, document and field level
    • multi-tenancy
    • HTTP basic authentication
  • supports authentication through SAML and Cognito

SNS

Fully managed messaging service for A2A (application-to-application) and A2P (application-to-person) communication.

  • Pub/sub provides messaging for high-throughput, push-based, many-to-many use cases
  • sending notification between distributed systems, microservices, event-driven serverless applications
  • can send:
    • SMS
    • email
    • SQS queues
    • trigger Lambda function
    • Kinesis Data Firehose
    • HTTP endpoint
    • platform application endpoint (mobile push)
  • inexpensive and based on a pay-as-you-go model
  • pub-sub model whereby users or applications subscribe to SNS topics:
    • “access point” allowing recipients to dynamically subscribe for identical copies of the same notification
    • stored redundantly across multiple AZs
    • instantaneous, push-based delivery
  • can fanout messages to many subscribers including SQS queues
    • SQS manages subscription and any necessary permissions
    • supported for A2A messaging
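  • publish sketch (boto3; the topic ARN is hypothetical): a single publish is fanned out to every subscriber of the topic
      import boto3

      sns = boto3.client("sns")

      sns.publish(
          TopicArn="arn:aws:sns:us-east-1:123456789012:order-events",
          Subject="OrderCreated",
          Message='{"order_id": "o-1001"}',
      )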

SQS

Distributed queue system that enables web service applications to quickly and reliably queue messages that one component in the application generates to be consumed by another component.

  • Send, store and receive messages between software components
  • temporary repository for messages awaiting processing
  • acts as a buffer between producer and receiver
  • resolves issues if the producer works faster than the consumer
  • allow decoupling / loose coupling
  • pull-based
  • guarantees that your messages will be processed at least once
Limits
  • messages up to 256 KB
  • messages can be kept in the queue from 1 minute to 14 days
    • default 4 days
CloudWatch integration
  • CloudWatch automatically collects SQS metrics every 5 minutes
  • considers a queue to be active for up to 6 hours if it contains any messages or if any API action accesses it
  • CloudTrail captures API calls from SQS and logs them to a specified S3 bucket
  • when using long-polling no charge in addition (no detailed monitoring)
Queue
  • queue name must be unique within a Region
  • queue policy
    • can specify permissions
    • finer grained control
    • control over the requests that come in
  • cannot change queue type after creation
Standard queue
  • default queue type
  • unlimited transactions/second (TPS)
  • guarantee message is delivered at least once
  • occasionally more than one copy of a message is delivered, possibly out of order
  • best-effort ordering: messages are generally delivered in the same order they are sent
  • can have different priorities
  • scaling is performed by creating more queues
  • data is stored within a single, highly available Region with multiple redundant AZs
First in First Out (FIFO) queue
  • exactly-once processing
  • order strictly preserved
  • message delivered once and remain available until consumer processes and deletes it
  • no duplicates
  • supports message groups: multiple ordered message streams within a single queue
  • max transactions/second (TPS): 300
    • otherwise offers the same capabilities as standard queues
  • deduplication
    • MessageDeduplicationId
    • deduplication interval of 5 minutes
    • content-based deduplication: the deduplication ID is generated as a SHA-256 hash of the message body
  • sequencing
    • MessageGroupId
    • strict ordering between messages
    • messages with different MessageGroupId may be received out of order
    • messages with same MessageGroupId delivered once
Visibility timeout
  • if the job is processed before the visibility timeout expires, the consumer deletes the message
  • if job not processed within visibility timeout, message become visible again and another reader will process it
  • could result in a message delivered twice
  • default timeout is 30 seconds
    • can be increased (max 12 hours)
Polling
  • short polling (default): returns immediately (even if queue is empty)
    • queries subset of available servers
    • ReceiveMessageWaitTime is set to 0
    • more requests, higher cost
  • long polling: doesn’t return until a message arrives in the queue or the long poll times out
    • can be enabled at queue level
    • can be enabled at API level using WaitTimeSeconds
    • eliminates false empty responses by querying all servers
    • waits until a message is available, before sending a response
    • requests contain at least one of the available messages up to the maximum number of messages specified in the ReceiveMessage action
    • should not be used if expect an immediate response
    • ReceiveMessageWaitTime set to non-zero value
      • up to 20 seconds
    • fewer requests and reduces cost
      • same charge per million of requests as short polling
SQS Delay Queues
  • postpones delivery of new messages to queue for a number of seconds
  • messages remain invisible over that period
  • default is 0 second, maximum 900 seconds (15 min)
  • when enabled, doesn’t affect delay of messages already in the Standard queue
  • when enabled, does affect the delay of messages already in a FIFO queue
  • use for:
    • large distributed applications that need to introduce a delay in processing
    • need to apply delay to an entire queue
    • update to sales or stock control databases before sending a notification to a customer confirming an online transaction
SQS Extended Client Library for Java
  • can use to manage large message payloads
  • uses S3 to store the message payloads
  • useful for storing and consuming messages up to 2 GB in size
  • can use for:
    • send message that references a single message object stored in an S3 bucket
    • get the corresponding message object from an S3 bucket
    • delete the corresponding message object from an S3 bucket
Security
  • can use IAM policies to control who can read/write messages
  • authentication can be used to secure messages in queues
  • in-flight security with HTTPS
  • can enable server-side encryption with KMS
    • encrypts only the message body, not the attributes
API
  • CreateQueue
  • DeleteQueue: requires the QueueUrl
  • PurgeQueue: deletes all messages in the specified QueueUrl
  • SendMessage
  • ReceiveMessage: retrieves up to 10 messages
    • can use WaitTimeSeconds to enable long-poll
  • DeleteMessage: ReceiptHandle to select the message
  • ChangeMessageVisibility: changes visibility timeout
  • manipulate up to 10 messages with a single action to reduce costs (see the batch sketch below):
    • SendMessageBatch
    • DeleteMessageBatch
    • ChangeMessageVisibilityBatch
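A batch sketch with boto3 (queue URL is a placeholder); up to 10 entries are sent in a single SendMessageBatch request, each with a caller-chosen Id:

    import boto3

    sqs = boto3.client("sqs")
    queue_url = "https://sqs.eu-west-1.amazonaws.com/123456789012/my-queue"  # placeholder

    # One API request instead of ten: fewer requests, lower cost
    sqs.send_message_batch(
        QueueUrl=queue_url,
        Entries=[{"Id": str(i), "MessageBody": f"message-{i}"} for i in range(10)],
    )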

Developer tools

CodeCommit

Fully managed source control service that hosts secure, highly scalable Git-based repositories.

  • can store anything from source code to binaries
  • repositories are private
  • scales seamlessly
  • integrates with Jenkins, CodeBuild and other CI/CD tools
  • can transfer files using HTTPS or SSH
  • encrypted repositories through KMS using customer-specific keys
  • monitor repositories via CloudTrail and CloudWatch
  • authentication:
    • Git credentials: username and password pair over HTTPS
    • SSH keys: public-private key pair
    • AWS access key
  • authorization:
    • user/roles with IAM
    • identity-based policies (not resource-based)
    • can attach tags or pass tags in a request; control access based on tags
  • notifications:
    • SNS, Lambda:
      • on deletion of branches
      • on pushes to the master branch
      • notify external build system
      • trigger Lambda function to perform codebase analysis
    • CloudWatch Event rules
      • on pull request updates (created, updated, deleted, commented)
      • on commit comment
      • event rules can target an SNS topic

CodeBuild

Fully managed CI service that compiles source code, runs tests and produces packages to deploy.

  • Scales continuously, multiple builds concurrently, builds are not left waiting in queue
  • alternative to other tools like Jenkins
  • can extend capabilities by custom Docker images
Build project
  • location of the source code
  • build environment to use
  • build commands to run and where to store the output
Build environment
  • operating system
  • programming language runtime
  • build tools (Maven, Gradle, npm, etc.)
  • Preconfigured: for Java, Python, Node.js, Ruby, Go, Android, .NET Core for Linux and Docker
  • Custom: custom environment
    • can package runtime and tools into a Docker image and upload to ECR, then specify the location of Docker image and CodeBuild will pull
Build specification
  • YAML file that describes collection of commands and settings to run a build
  • need a buildspec.yml at root of source code
  • can define specific commands such as installing tool packages, run unit tests and packaging code
  • has sample build specification files for common scenarios
    • Maven, Gradle or npm
  • can define environment variables:
    • plaintext variables
    • secure secrets using SSM parameter store
  • phases:
    • install: install dependencies
    • pre_build: final commands to run before the build
    • build
    • post_build: finishing touches (e.g. zip the output)
Build
  • source code from GitHub, CodeCommit, S3 etc.
  • queue timeout: a build that has not started after x minutes is removed from the queue
    • default timeout: 8 hours
    • can override timeout with value between 5 minutes and 8 hours
  • Artifacts: uploaded to S3
    • encrypted with KMS
  • IAM permissions for build
  • VPC for network security
  • pay based on time to complete builds
Cache
  • files to cache (usually dependencies) are stored in S3 for future builds
Monitoring and debugging
  • CloudTrail for logging API calls
  • CloudWatch alarms to detect failed builds
  • can run locally for deep troubleshooting using Docker
    • leverages CodeBuild agent

CodeDeploy

Deployment service that automates application deployments to EC2 instances, on-premises instances, serverless Lambda functions or Amazon ECS services.

  • Can deploy Lambda functions, configuration files, executables, packages, scripts, multimedia files, etc.
  • integrates with CI/CD tools (Jenkins, GitHub, Atlassian, CodePipeline)
  • fully managed
Application
  • defines what to deploy and how, on EC2/On-premises, Lambda or ECS
Deployment Group
  • set of target instances or environments (dev, test, prod, etc.)
  • deployment configuration: set of rules
    • success / failure conditions
  • notification configuration for deployment events
  • CloudWatch alarms to monitor a deployment
  • deployment rollback configuration
In-place deployment
  • only EC2/On-premises
  • application on each instance in the deployment group is stopped, latest application is installed, new version is started
  • can use a load balancer so that each instance is deregistered during its deployment and then restored after deployment is complete
Blue/green deployment on EC2
  • not available for on-premises instances
  • instances are provisioned for the replacement environment
  • latest application revision is installed on the replacement environment
  • optional wait time occurs for activities (testing, verification)
  • instances in the replacement environment are registered with the ELB, causing traffic to be rerouted to them
  • instances in the original environment are deregistered, can be terminated or kept running for other users
  • replacement:
    • can use Auto Scaling group as template for replacement environment (e.g. number of running instances)
    • can specify instances to be counted as replacement
Blue/green deployment on Lambda
  • traffic is shifted from current serverless environment to one with updated Lambda function versions
  • traffic shifting configured in deployment configuration:
    • linear: equal increments with an equal number of minutes between each increment
    • canary: two increments, from predefined canary options can specify percentage of traffic shifted in first increment and the interval before second increment
    • all-at-once: all traffic is shifted all at once
  • only way for Lambda compute platform deployments
  • do not need to specify deployment type
Blue/green deployment on ECS
  • traffic is shifted from the task set with the original version of the app to a replacement task set
  • traffic shifting configured in deployment configuration:
    • linear
    • canary
    • all-at-once
  • test listener can serve traffic to the replacement task while validation tests are run
Deployment on EC2
  • instances are identified by tags or Auto Scaling Group names
  • instances must have an IAM instance profile attached
  • CodeDeploy agent must be installed on each instance
  • instances are grouped by Deployment Group
appspec.yml
  • file must be at the root of the source code
    • can be appspec.yaml for ECS or Lambda
  • files: specifies how to source and copy from S3 / GitHub to filesystem
  • hooks: set of instructions to be run to deploy the new version
    • EC2 (examples):
      • ValidateService: last deployment lifecycle event
        • used to verify the deployment was completed successfully
      • AfterInstall: can use for tasks such as configuring application or changing file permissions
      • ApplicationStart: typically use to restart services that were stopped during ApplicationStop
      • AllowTraffic: during this deployment lifecycle event, internet traffic is allowed to access instances after a deployment
        • this event is reserved for the CodeDeploy agent and cannot be used to run scripts
    • ECS (examples):
      • BeforeInstall
      • AfterInstall
      • AfterAllowTestTraffic
      • BeforeAllowTraffic
      • AfterAllowTraffic
    • Lambda (examples):
      • BeforeAllowTraffic: specify task or functions to run before traffic is routed to new function
      • AfterAllowTraffic: specify the tasks or functions to run after the traffic has been routed to new function
  • for ECS:
    • must specify task definition ARN (TaskDefinition)
    • must specify where load balancer reroutes traffic during a deployment (LoadBalancerInfo)
  • for Lambda:
    • the Lambda function resource (e.g. myLambdaFunction)
    • CurrentVersion
    • TargetVersion
  • Revision: includes everything needed to deploy the new version
    • AppSpec file, application files, executables, config files

CodePipeline

Fully managed continuous delivery service: automate release pipelines (build, test, deploy) every time there is a code change. Enables to deliver features and updates rapidly and reliably.

  • Integrates with GitHub or custom plugins
  • structured in:
    • pipelines: workflow that describes how software changes go through release process
    • artifacts: files or changes worked on by stages
      • each pipeline stage can create artifacts
      • artifacts are passed, stored in S3 and then passed to the next stage
    • stages: can be build, deploy, test, load test, etc
      • can define manual approval
      • source: S3, CodeCommit, GitHub, ECR, Bitbucket Cloud
      • build: CodeBuild, Jenkins
      • deploy: CloudFormation, CodeDeploy, ECS, Elastic Beanstalk, Service Catalog, S3
    • actions: stages contain at least one action on artifacts as input, output or both
    • transitions: processing from one stage to another inside of a pipeline
  • code changes pushed to repository automatically enter workflow
  • can create SNS notification on state changes (CloudWatch Events)
    • failed pipelines
    • cancelled stages
  • can audit API calls with CloudTrail
  • need IAM service role attached to the pipeline with permissions
  • only pay for what you use; no upfront fees or long-term commitments

CodeStar

Unified user interface to easily manage software development activities. Fast setup of CD toolchain.

  • Makes it easy to work in a team, manage access and add owners, contributors and viewers
  • can easily track progress across development process
  • useful for unified development toolchain with collaboration between team members, synchronization, centralized management of CI/CD pipeline
  • templates:
    • project templates: websites, web applications, web services, Alexa skills, etc.
    • templates with code for getting started on supported programming languages: Java, JavaScript, PHP, Ruby, Python, etc.
  • support IDEs:
    • Cloud9 natively
    • Visual Studio, Eclipse, etc.
  • no additional charge

X-Ray

Helps to analyze, debug and trace distributed production applications, such as those built using a microservices architecture.

Capabilities
  • can understand how app and underlying services are performing to troubleshoot root cause of performance issues and errors
  • provide end-to-end view of requests
    • shows a map of the application’s underlying components
  • can analyze applications in development and in production, from simple three-tier apps to complex microservices apps
  • should not be used as an audit or compliance tool
    • doesn’t guarantee data completeness
  • can view and filter data by properties such as: annotation value, average latencies, HTTP response status, timestamp, database table used, etc.
Applications support
  • EC2 / On-premises
    • Linux system must run X-Ray daemon
    • EC2 need instance role
  • ECS/EKS/Fargate
    • create Docker image that runs daemon or use X-Ray Docker image
    • ensure port mapping and network settings and IAM task roles
  • Lambda
    • need X-Ray integration (active tracing) to be enabled in the function configuration
      • Lambda will run daemon
    • IAM role is the Lambda role
  • Elastic Beanstalk
    • set configuration in console or use .ebextensions/xray-daemon.config
SDK
  • captures metadata for requests made to MySQL and PostgreSQL databases (RDS, Aurora) and DynamoDB
  • captures metadata for requests to SQS and SNS
  • installed in the application and forwards to X-Ray Daemon which forwards to X-Ray API
  • interceptors: add them to code to trace incoming HTTP requests
  • client handler to instrument SDK client that app uses to call other AWS services
  • HTTP client to use to instrument calls to other internal and external HTTP web services
  • X-Ray console to visualize what is happening
X-Ray Agent
  • can assume a role to publish data into a different account
Trace
  • set of data points that share same trace ID
Segments
  • single component that encapsulates all data points
    • e.g. authorization services of distributed application
Subsegments
  • more granular timing information and details about downstream calls
  • additional details about a call to a service, external HTTP API or SQL database
  • can define arbitrary subsegments to instrument specific functions or lines of code
  • use for services that don’t send their own segments (DynamoDB):
    • subsegments generate inferred segments and downstream nodes on the service map
  • can see all downstream dependencies, even if they don’t support tracing or are external
Annotations
  • system-defined or user-defined data associated with a segment
  • system-defined: include data added to the segment by services
  • user-defined: metadata added to a segment by a developer
  • key/value pairs used to index traces and with filters
  • can use to record information on segments or subsegments (indexed for search)
Sampling
  • X-Ray (to be performant and cost-effective) doesn’t collect data for every request, but a statistically significant number of requests
Metadata
  • key/value pairs, not indexed and not used for searching
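A minimal instrumentation sketch with the X-Ray SDK for Python (aws-xray-sdk), assuming the X-Ray daemon is reachable and a hypothetical Orders DynamoDB table; it patches the AWS SDK, opens a segment and a subsegment, and records an annotation (indexed) and metadata (not indexed). In Lambda the segment is created automatically, so begin_segment is only needed in a standalone app:

    import boto3
    from aws_xray_sdk.core import xray_recorder, patch_all

    patch_all()  # automatically instrument supported libraries (boto3, requests, ...)

    xray_recorder.begin_segment("order-service")              # top-level segment
    subsegment = xray_recorder.begin_subsegment("process_order")
    subsegment.put_annotation("order_id", 42)                 # indexed: usable in filter expressions
    subsegment.put_metadata("debug", {"retries": 0})          # not indexed, not searchable
    boto3.client("dynamodb").get_item(                        # traced as a downstream subsegment
        TableName="Orders", Key={"id": {"S": "42"}}
    )
    xray_recorder.end_subsegment()
    xray_recorder.end_segment()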

IAM

Identity and Access Management Service, centralized control of account; enables shared access; not used for application-level authentication.

Infrastructure
  • global (not per region)
  • eventually consistent
  • replicate across multiple data centers
  • support PCI DSS compliance
CLI
  • by obtaining temporary security credentials from STS (aws sts get-session-token)
API
  • can request temporary security credentials, passing MFA parameters in the STS API request
  • can use IAM Query API to make direct calls
SDK
  • can use to make programmatic API calls
Root user
  • BEST PRACTICE: don’t use (except for billing)
  • BEST PRACTICE: don’t share root credentials
  • BEST PRACTICE: create IAM admin user with permissions instead
Principal
  • entity that can take action on resource
  • can be users or roles
  • first principal: administrative IAM user
  • support federated users
    • can configure Identity Federation allowing secure access to resources without creating an IAM user account
  • support programmatic access
User
  • entity representing a person or a service (service accounts)
  • up to 5000 users per account
  • unique ID
  • BEST PRACTICE: enable MFA
  • can define password policy enforcing password length, complexity, etc.
    • doesn’t apply to the root user
    • BEST PRACTICE: configure strong password policy
  • can allow/disallow to change password
  • groups: collection of users with policies attached
    • can use to assign permissions to users
    • cannot nest groups
Role
  • created and assumed by trusted entities
  • define a set of permissions for making service requests
  • can delegate permissions to resources for user and services without using permanent credentials
  • can be assigned to federated user who signs in using external IDP
  • EC2 instances:
    • applications retrieve temporary security credentials from instance metadata
    • BEST PRACTICE: use roles for applications on EC2 instances
    • instance profile: grants applications running on EC2 instances permission to make API requests
      • only one role can be attached to an instance at a time
      • CLI
        • aws iam create-instance-profile
        • aws iam add-role-to-instance-profile
        • aws iam list-instance-profiles(-for-role)
        • aws iam get-instance-profile
        • aws iam remove-role-from-instance-profile
        • aws iam delete-instance-profile
  • role delegation: trust between two accounts, one owning resource (trusting account), one containing users (trusted account)
    • can be
      • same account
      • separate accounts of same organization
      • different organizations accounts
    • permissions policy: grant user of role the required permissions on resource
    • trust policy: specify trusted account members that are allowed to assume role
  • BEST PRACTICE: use roles to delegate permissions
Requests
  • Action: operation that the principal wants to perform
    • defined by services
    • can be viewing, creating, editing, deleting, etc.
  • Resource: entity that exists within a service upon which actions are performed
  • Principal information
    • environment from which request is made
    • etc.
  • Request context
    • principal (requester)
    • aggregate permissions associated with principal
    • environment data: IP address, user agent, SSL status, etc.
    • resource data
Authentication
  • principal sending a request must be authenticated; methods:
    • console password
    • access key (for API and CLI): combination of access key and secret access key
      • can use for programmatic calls
      • can create, modify, view or rotate
      • secret access key is returned only at creation time (if lost requires new key)
      • must be stored securely
      • can disable a user access key (IAM Identity Center)
    • server certificate: SSL/TLS certificate to authenticate
      • use only when must support HTTPS in Region not supported by Certificate Manager
  • BEST PRACTICE: rotate keys and passwords regularly
Policies (authorization)
  • stored in JSON documents
  • values from the request context are matched against policies
  • can apply to users, groups and roles
  • most restrictive policy is applied
  • can use IAM policy simulator tool to understand effects
    • BEST PRACTICE: validate your policies
  • BEST PRACTICE: least-privilege principle
    • actions on specific resources at specific conditions
  • condition: element to apply further conditional logic
    • BEST PRACTICE: use policy conditions for extra security
  • User (identity) based policies
    • attached to identity (user, group or role)
    • specify what identity can do
    • IAM permissions boundaries: the maximum permissions an identity-based policy can grant to an entity
  • Resource-based policies
    • attached to resources (S3, SQS, VPC endpoints, KMS, etc.)
    • can specify who has access and what can perform on resource
    • only inline (not managed)
  • AWS Organizations service control policies (SCPs): specifies maximum permissions for an organization or OU
  • Session policies: parameters passed when programmatically create temporary session for a role or federated user
  • evaluation logic:
    • default: all requests denied
    • explicit allow overrides implicit deny
    • explicit deny overrides any explicit allow
  • Managed policy
    • created by AWS for common use cases
    • cannot change permissions
    • some policies are designed for specific job functions
      • Administrator
      • Billing
      • Database Administrator
      • Data Scientist
      • Power User
      • Network Administrator
      • Security Auditor
      • Support User
      • System Administrator
      • View-only User
  • Customer managed policy: standalone policy that administrator creates
  • Inline policy: 1:1 relationship between entity and policy
    • when deleting entity, inline policy is deleted
  • BEST PRACTICE: use managed policies instead of inline
  • the Billing managed policy alone is not enough: IAM access to Billing must also be activated for each user that needs it
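A least-privilege sketch with boto3 (policy name, bucket and role are hypothetical): a customer managed policy is created from a JSON document with a specific action, resource and condition, then attached to a role:

    import json
    import boto3

    iam = boto3.client("iam")

    policy_doc = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",                        # everything else stays implicitly denied
            "Action": ["s3:GetObject"],               # specific action...
            "Resource": "arn:aws:s3:::my-bucket/*",   # ...on a specific resource
            "Condition": {"Bool": {"aws:SecureTransport": "true"}},  # extra conditional logic
        }],
    }

    policy = iam.create_policy(
        PolicyName="ReadMyBucketOnly",
        PolicyDocument=json.dumps(policy_doc),
    )
    iam.attach_role_policy(RoleName="app-role", PolicyArn=policy["Policy"]["Arn"])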
Security Token Service
  • web service that enables to request temporary, limited-privilege credentials for IAM users or federated users
  • use cases:
    • identity federation
    • enterprise identity federation
      • authenticate users in organization’s network
      • without creating AWS identities
      • single sign-on approach to temporary access
      • support SAML 2.0
    • custom federation broker
    • web identity federation
      • let users sign in using a known third-party identity provider such as Amazon, Facebook, Google or any OIDC-compatible IDP
      • exchange credentials from provider for temporary permissions to resources
    • Cognito recommended for identity federation for mobile applications
      • support same identity providers as STS
      • support unauthenticated access
      • let migrate user data when user sign in
      • provide API operations for sync user data between devices
    • roles for cross-account access
      • use identities from a different account of the same organization
      • delegation approach to temporary access
    • roles for EC2
      • EC2 instances that need access to resources
      • temporary security credentials available to all apps in the instance
      • don’t need to store any long-term credentials
  • available as global service
  • by default, all STS requests go to a single global endpoint (sts.amazonaws.com)
  • can send STS requests to endpoints of any Region (reduce latency)
  • support CloudTrail which records calls and deliver log into S3 bucket
  • BEST PRACTICE: monitor activity
  • similar to long-term access key credentials, except:
    • temporary
    • can be configured to last anywhere from a few minutes to several hours
    • not stored with the user but generated dynamically
    • the user can request new credentials if still permitted to do so
  • temporary security credentials have limited lifetime
    • no need to rotate or revoke
  • cannot be reused after they expire
  • security credentials consist of:
    • access key: access key ID + secret access key
    • session token
    • expiration or duration of validity
  • can use the API to request a session token (see the sketch after this list):
    • AssumeRole: IAM users
    • AssumeRoleWithSAML: user passing SAML auth response that indicates auth from known trusted IDP
    • AssumeRoleWithWebIdentity: user passing web identity token from known trusted IDP
    • GetSessionToken: IAM user or root
    • GetFederationToken: IAM user or root (MFA not supported)
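An AssumeRole sketch with boto3 (role ARN and session name are placeholders): the temporary credentials returned by STS are used to build a client in the trusting account:

    import boto3

    sts = boto3.client("sts")
    resp = sts.assume_role(
        RoleArn="arn:aws:iam::111122223333:role/prod-readonly",  # placeholder role
        RoleSessionName="audit-session",
        DurationSeconds=3600,                  # temporary: credentials expire on their own
    )

    creds = resp["Credentials"]                # AccessKeyId, SecretAccessKey, SessionToken, Expiration
    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    print(s3.list_buckets()["Buckets"])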
Cross-account access
  • useful for separate AWS accounts
  • e.g. development and production resources
  • resource-based policy: needed permissions on resource in different accounts
  • identity-based policy: assuming a role with needed permissions within different account
IAM Access Analyzer
  • help identify resources that are shared with external entity
  • help identify unused access
  • BEST PRACTICE: remove unnecessary credentials
  • validate policies against policy grammar and best practices
  • custom policy checks: help validate policies against specified security standards
  • generate policies based on access activity in CloudTrail logs
Access Advisor
  • use data analysis to help set permission guardrails confidently
  • provide service last accessed information for
    • accounts
    • organizational units
    • organization managed by Organizations
  • permissions guardrails help control which services can be accessed by developers and applications
  • can use service control policies (SCP) to restrict access to service
  • can determine the services not used by IAM users and roles

Cognito

Identity broker that lets you add user sign-up, sign-in and access control to web and mobile apps quickly and easily. Provides authentication, authorization and user management.

Federation
  • work with external IDP that support SAML or OpenID connect
    • social identity providers
  • federation allows users to authenticate with a Web IDP
    • the user authenticates first with the Web IDP and receives an authentication token
    • then exchanged for temporary credentials to assume an IAM role allowing access to required resources
    • can integrate custom IDP
Temporary credentials
  • no need for the application to store credentials locally
  • provides temporary security credentials to access app’s backend resources in any service behind API Gateway
User pools
  • authentication
  • user directories that provide sign-up and sign-in options for application users
  • is an IDP
  • built-in, customizable web UI
  • social sign-in
  • MFA
  • checks for compromised credentials
  • account takeover protection
  • phone and email verification
  • customized workflow through Lambda triggers
    • pre sign-up Lambda trigger
    • post confirmation Lambda trigger
    • pre authentication Lambda trigger
    • post authentication Lambda trigger
    • pre token generation Lambda trigger
    • custom message Lambda trigger
    • migrate user Lambda trigger
      • invoked when user doesn’t exist in the user pool
        • after Lambda returns success, Cognito creates the user in the user pool
      • invoked in the forgot-password flow
    • challenge Lambda trigger: user pool custom authentication flow
      • define auth challenge
        • invoked to initiate custom auth flow
      • create auth challenge
        • invoked after “define auth challenge” to create custom challenge
      • verify auth challenge response
        • invoked to verify response from end user for custom challenge is valid or not
      • can incorporate new challenge types
        • e.g. include CAPTCHAs
        • e.g. include dynamic challenge questions
      • API:
        • InitiateAuth
        • RespondToAuthChallenge
  • after authentication, issue a JWT to secure access to APIs
  • can be seen as Active Directory
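A sign-in sketch against a user pool with boto3 (the app client ID, username and password are placeholders, and the USER_PASSWORD_AUTH flow is assumed to be enabled on the app client); if no extra challenge (e.g. MFA) is required, the response carries the JWTs used to secure API calls:

    import boto3

    idp = boto3.client("cognito-idp")

    resp = idp.initiate_auth(
        ClientId="example-app-client-id",      # placeholder app client
        AuthFlow="USER_PASSWORD_AUTH",
        AuthParameters={"USERNAME": "alice", "PASSWORD": "correct-horse-battery"},
    )

    tokens = resp["AuthenticationResult"]      # IdToken, AccessToken, RefreshToken
    id_token = tokens["IdToken"]               # JWT presented to the backend / API Gateway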
Identity pools
  • authorization
  • create unique identities for users and authenticate them with an IDP
  • can obtain temporary, limited-privilege credentials
  • uses Push Synchronization to push updates and synchronize user data across multiple devices
  • silent push notification using SNS to all devices
  • support following IDP
    • public providers: login with Amazon, Facebook, Google
    • Amazon Cognito User Pool
    • OpenId or SAML IDP
    • Developer Authenticated Identities
  • can be seen as an IAM role
Cognito Sync
  • client library that enables cross-device syncing of application-related user data
  • cache data locally so app can read/write data regardless of device connectivity status
  • similar to AppSync
    • can synchronize across devices (but not across users, as AppSync does)
  • push sync
  • Cognito Streams: allow pushing dataset changes to a Kinesis stream in real time
  • Cognito Events: allow executing a Lambda function in response to events in Cognito
    • the function can evaluate and manipulate data before it is synced to other devices
    • e.g. issuing an award when a player reaches a new level

KMS

Highly available key storage, management and auditing solution used to encrypt data.

Key
  • Alias: unique alias and description
  • Key material: used to encrypt and decrypt data
  • Metadata
    • key ID
    • creation date
    • description
    • key state
  • can generate keys in KMS, in CloudHSM cluster or import them
  • cannot export keys (CloudHSM allows this)
  • support for symmetric and asymmetric keys
  • encryption keys are Regional
  • can directly encrypt data up to 4 KB in size
  • BEST PRACTICE: recommended to delete keys no longer in use
AWS Managed Key
  • can only be used by service that created them in a particular Region
    • created on the first time encryption is implemented in the service
  • do not pay a monthly fee
  • can be subject to fees for use in excess of free tier
    • cost covered in some services
Customer Managed Key
  • greater flexibility
  • can perform rotation, governing access and key policy configuration
  • can be enabled and disabled when no longer required
  • monthly fee and a fee for use in excess of free tier
AWS Owned key
  • collection of keys that service owns and manage in multiple accounts
    • cannot view, use, track or audit them
  • no charge
Data key
  • key used to encrypt data, including large amounts of data
  • can use KMS keys to generate, encrypt and decrypt data keys
  • KMS doesn’t store, manage or track data keys
  • must use and manage outside KMS
  • if a service encrypts large data, it uses data keys protected by a master key
  • no limits on number of data keys
  • integrated with client-side toolkit that use a method known as envelope encryption to encrypt data
    • generates data keys used to encrypt data and are themselves encrypted using master keys
    • the KMS master key encrypts the data key (the envelope key)
    • the decrypted data key is then used to decrypt the data
Usage policies
  • determine which users can use keys to encrypt/decrypt data and under which conditions
Master key
  • protected by hardware security modules (HSMs) and are only ever used within those modules
  • can submit data directly to KMS to be encrypted/decrypted using master keys
  • service stores data key along with encrypted data
  • when service needs to decrypt data, it requests KMS to decrypt the data key using the master key
  • all request to use master keys are logged to CloudTrail
  • can schedule a deletion (7 to 30 days)
    • during the waiting period can verify the impact of deletion in applications
    • can cancel key deletion
  • max master keys per account per Region: 1000 (excluding managed key)
Custom key store
  • combines CloudHSM with KMS
  • can configure a custom CloudHSM cluster and authorize KMS to use it as a dedicated key store rather than the default key store
  • master keys generated in custom key store never leave the cluster in plaintext
  • all the operations that use those keys are only performed in HSMs
API
  • can use KMS APIs directly to encrypt and decrypt data using master keys
  • encrypt (aws kms encrypt)
    • encrypts plaintext into ciphertext by using master key
    • can use to move encrypted data from one Region to another
  • decrypt (aws kms decrypt)
  • re-encrypt (aws kms re-encrypt)
    • can use to change the customer master key
    • can use when you manually rotate
  • enable-key-rotation
    • automatic rotation of key material for specified symmetric customer master key
    • cannot perform on key in different account
  • GenerateDataKey (aws kms generate-data-key)
  • GenerateDataKeyWithoutPlaintext (aws kms generate-data-key-without-plaintext)
    • generates unique symmetric data key
    • returns data key encrypted under a customer master key
  • GenerateDataKeyPair requests an asymmetric data key pair
  • GenerateDataKeyPairWithoutPlaintext
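An envelope-encryption sketch with boto3 (the key alias is hypothetical): GenerateDataKey returns a plaintext data key, used locally to encrypt the data, and the same key encrypted under the master key, stored alongside the data:

    import boto3

    kms = boto3.client("kms")

    # 1. ask KMS for a data key protected by the master key
    resp = kms.generate_data_key(KeyId="alias/my-app-key", KeySpec="AES_256")
    plaintext_key = resp["Plaintext"]       # use locally to encrypt the data, then discard
    encrypted_key = resp["CiphertextBlob"]  # store alongside the encrypted data

    # ... encrypt the data locally with plaintext_key (e.g. using a client-side library) ...

    # 2. later, ask KMS to decrypt the stored data key to recover the plaintext key
    plaintext_key_again = kms.decrypt(CiphertextBlob=encrypted_key)["Plaintext"]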

Secrets Manager

Protect secrets needed to access applications, services and IT resources. Enables you to rotate, manage and retrieve database credentials, API keys and other secrets.

  • can configure VPC endpoints to keep traffic within the AWS network
  • can use client-side caching libraries to improve the availability and reduce the latency
Secret storage
  • key/value type: String or Binary (encrypted)
  • store API keys, OAuth tokens
  • charges apply per secret
Rotation
  • Rotation with built-in integration for RDS, Redshift and DocumentDB
  • can extend rotation to other secrets by modifying sample Lambda functions
  • automatic key rotation for some services (RDS)
    • for others use Lambda
Security
  • encrypts secrets at rest using KMS
  • transmits secret securely over TLS
  • fine-grained permissions
    • IAM
    • resource-based policies
Auditing and monitoring
  • integrates with CloudTrail and CloudWatch
  • can audit secret rotation and usage, including for third-party services and on-premises resources
SSM Parameter Store
  • No native key-rotation, can use custom Lambda
  • key/value type: String, StringList, SecureString (encrypted)
  • hierarchical keys
  • free for standard, charged for advanced
  • can store:
    • passwords
    • database strings
    • AMI IDs
    • license codes
    • parameter values
  • can store as plain text or encrypted data
  • can reference in scripts, commands, SSM documents and config automation workflows by using unique name of the parameter
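A retrieval sketch with boto3 for both services (secret and parameter names are placeholders); the SecureString parameter is decrypted on request:

    import json
    import boto3

    # Secrets Manager: encrypted secret value, optionally auto-rotated
    secrets = boto3.client("secretsmanager")
    secret = secrets.get_secret_value(SecretId="prod/db-credentials")   # placeholder name
    creds = json.loads(secret["SecretString"])

    # SSM Parameter Store: hierarchical keys, SecureString decrypted with WithDecryption
    ssm = boto3.client("ssm")
    param = ssm.get_parameter(Name="/prod/app/db-password", WithDecryption=True)
    password = param["Parameter"]["Value"]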

Other services

AWS Systems Manager
  • operations hub for AWS applications and resources
  • secure end-to-end management solution for hybrid and multi cloud environments
    • enables secure operations at scale
  • AppConfig: help create, manage, and deploy application config and feature flags
    • support controlled deployments to applications of any size
    • can use with applications hosted on:
      • EC2 instances
      • Lambda functions
      • containers
      • mobile applications
      • edge devices
    • include validators
AWS Systems Manager Session Manager
  • can manage EC2 instances, edge devices, on-premises servers and VMs
  • can use interactive one-click browser-based shell or CLI
  • secure and auditable node management without:
    • open inbound ports
    • maintain bastion hosts
    • manage SSH keys
  • useful for:
    • improve security and audit posture
    • reduce operational overhead by centralizing access control
    • reduce inbound node access
    • monitor and track managed node access and activity
    • close down inbound ports
    • allow connections to managed nodes that don’t have public IP
    • grant and revoke access from single location
    • one solution to users for Linux, macOS and Windows Server
    • want to connect to a managed node with just one click or from the CLI
      • no use of SSH keys
AWS Systems Manager Parameter Store: see the SSM Parameter Store notes under Secrets Manager.
Resource Access Manager (RAM)
  • enable to share resources easily and securely with any account of Organization
  • can share Subnets, License Manager configurations, Route 53 resolvers, etc.
  • eliminates the need to create duplicate resources in multiple accounts
  • create resource centrally in multi-account environment:
    • create a Resource Share
    • specify resources
    • specify accounts
  • Reduces Operational Overhead
  • Improves Security and Visibility: leverages existing policies and permissions (IAM)
    • comprehensive visibility into shared resources to set alarms and view logs through CloudWatch and CloudTrail
  • Optimize Costs: leverage licenses in multiple parts of company
  • no additional cost
Cloud Development Kit (CDK)
  • open-source software development framework for defining cloud infrastructure in code and provisioning it through CloudFormation
    • Infrastructure as Code (IaC)
  • CDK Construct Library: pre-written modular pieces of code (constructs)
    • can integrate to develop infrastructure quickly
    • reduce complexity required to define and integrate services together
  • CDK Toolkit: command line tool for interacting with CDK apps
    • create, manage and deploy CDK projects
  • can define constructs with programming languages:
    • support TypeScript, JavaScript, Python, Java, C#, .Net and Go
  • can compose constructs into stacks and apps
  • can deploy CDK apps to CloudFormation to provision/update resources
  • creating an app:
    • create app from template
    • initialize app cdk init
    • build the app (optional, to catch syntax and type errors)
    • BEST PRACTICE: synthesize one or more stacks (cdk synth) to create a CloudFormation template
      • catch logical errors
    • deploy stacks to account cdk deploy
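A minimal CDK v2 app sketch in Python (assuming aws-cdk-lib and constructs are installed; the stack and bucket names are arbitrary): cdk synth turns it into a CloudFormation template and cdk deploy provisions it:

    import aws_cdk as cdk
    from aws_cdk import aws_s3 as s3
    from constructs import Construct

    class MyStack(cdk.Stack):
        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)
            # a construct from the CDK Construct Library; synthesizes to an S3 bucket resource
            s3.Bucket(self, "DataBucket", versioned=True)

    app = cdk.App()
    MyStack(app, "MyStack")
    app.synth()  # emits the CloudFormation template (what cdk synth runs)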
AppSync
  • can synchronize mobile app data across devices and users
  • support for additional devices and data types
  • based on GraphQL
Serverless Application Repository
  • managed repository for serverless applications
  • store and share reusable applications
  • don’t need to clone, build, package or publish source code before deploying
  • can use pre-built applications in your serverless architectures
    • help reduce duplicated work
    • ensure organizational best practices
    • get to market faster
  • integration with IAM: resource-level control of each application
    • can publish built applications and share with specific accounts
    • publicly shared apps include link to source code
  • package applications with SAM template that defines resources used
  • no additional cost, pay for resources
Step Functions
  • coordinate the components of distributed applications as a series of steps in a visual workflow
  • define steps of workflow in the JSON-based Amazon States Language
  • visual console graphs each step
  • start an execution, the console highlights real-time status
  • managed workflow and orchestration platform
  • scalable and highly available
  • can create tasks, sequential steps, parallel steps, branching paths or timers
  • apps can interact with and update the workflow via the Step Functions API
  • built-in error handling: retry failed, timed-out tasks, catch specific errors, recover gracefully
  • automatic scaling: underlying compute automatically scales in response to changing workloads
  • execution event history: detailed event log (where and why)
  • high availability: built-in fault tolerance; multiple AZ
  • administrative security: IAM policies to control access
  • pay only for the transition from one step to the next (state transition)
  • metered by state transition regardless of how long each state persists (up to one year)
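A sketch with boto3 and the Amazon States Language (role ARN and Lambda ARN are placeholders): a two-state machine with built-in retry is created and then started:

    import json
    import boto3

    sfn = boto3.client("stepfunctions")

    definition = {
        "StartAt": "ProcessOrder",
        "States": {
            "ProcessOrder": {
                "Type": "Task",
                "Resource": "arn:aws:lambda:eu-west-1:111122223333:function:process",  # placeholder
                "Retry": [{"ErrorEquals": ["States.TaskFailed"], "MaxAttempts": 3}],
                "Next": "Done",
            },
            "Done": {"Type": "Succeed"},
        },
    }

    machine = sfn.create_state_machine(
        name="order-workflow",
        definition=json.dumps(definition),
        roleArn="arn:aws:iam::111122223333:role/sfn-role",  # placeholder
    )
    sfn.start_execution(stateMachineArn=machine["stateMachineArn"], input='{"orderId": 42}')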
Fault injection simulator
  • fully managed service for running fault injection experiments
  • improve application performance, observability and resiliency
  • used in chaos engineering, stressing an application in testing/production environment by creating disruptive events
  • helps create real-world conditions needed to uncover the hidden bugs, monitoring blind spots, performance bottlenecks in distributed systems
Trusted advisor
  • inspect environment and makes recommendations to:
    • save money
    • improve system availability and performance
    • help close security gaps
  • Basic or Developer Support plan
    • access all checks in the Service Limits category
    • access 6 checks in the Security category
  • Business, Enterprise On-Ramp or Enterprise Support plan
    • can use API (as well as console)
    • access all checks
    • can use CloudWatch Events to monitor status of checks
AWS Billing - Consolidated billing
  • use in AWS Organizations
  • billing and payment for multiple accounts
  • can track charges across multiple accounts
  • can download combined cost and usage data
  • can combine usage across accounts and share:
    • volume pricing discounts
    • RI discounts
    • Saving Plans
  • no extra charge
AWS Budgets
  • can use to track and take action on costs and usage
  • monitor aggregate utilization and coverage metrics for RI or Savings Plan
  • can enable simple-to-complex cost and usage tracking
    • setting monthly cost budget with fixed target amount to track all costs
      • alerts on actual and forecasted costs
    • setting monthly cost budget with variable target amount to track all costs
    • setting daily utilization to track RI or Savings Plan
  • update up to 3 times a day
  • updates occur 8-12 hours after previous
  • types of budgets:
    • cost budgets: how much to spend on a service
    • usage budgets: how much to use on services
    • RI utilization budgets: see if RI are unused or under-utilized
      • receive alert when RI usage falls below threshold
    • RI coverage budgets
    • Savings Plans utilization budgets
    • Savings Plans coverage budgets

Use cases table

Action → Service
Manage hybrid and multi-cloud environments AWS Systems Manager
Create, manage and deploy application config and feature flags AppConfig
Generate resource-based access policies AWS Policy Generator
Access VPC instances for management (SSH or RDP)
  • AWS System Manager Session Manager (recommended)
  • EC2 Bastion host
Speed up queries on DynamoDB table on non-key attributes DynamoDB Global Secondary Index
Node management without open inbound ports, maintain bastion host and manage SSH keys AWS Systems Manager Session Manager
Upload libraries to Lambda functions without including them in deployment package Lambda Layer
Compute unpredictable workloads (dev and test) EC2 On-Demand
Centralize access control AWS Systems Manager Session Manager
Define a rule on object transition to another storage class S3 Bucket Lifecycle rules
Create or update an alarm and associated with specified metric, math expression or anomaly detection model CloudWatch API
Share resources (Subnets, License Manager, configs, Route 53 resolvers) with accounts or Organizations AWS Resource Access Manager
Increase by 10 times the performance of DynamoDB DynamoDB Accelerator (DAX)
Create resource in multi-account environment AWS Resource Access Manager
Restrict user access to his records of a DynamoDB table IAM Condition
Map custom domain names to API Gateway custom regional API Route 53 Alias
Define cloud infrastructure in code using programming languages Cloud Development Kit (CDK)
Coordinate components of distributed applications as series of steps in visual workflow AWS Step Functions
Run fault injection experiments (improve performance, observability and resiliency) AWS Fault Injection Simulator
Accelerated log data feed intake Kinesis
Log API calls from SQS to S3 bucket CloudTrail
Avoid API to being overwhelmed by too many requests API Gateway Server-side throttling limits
Log calls from STS to S3 bucket CloudTrail
Real-time processing of streaming big data Kinesis Data Stream
Specify percentage of consumed provisioned throughput of DynamoDB at a point in time DynamoDB Target utilization
Real-time data analytics with SQL Kinesis Data Analytics
Log bucket and object-level actions CloudTrail
Enable table or GSI to increase provisioned read/write capacity to handle traffic variations without throttling DynamoDB Application Auto Scaling
Real-time analytics with existing business intelligence tools and dashboards Kinesis Firehose
Log actions taken by users, roles, services on S3 objects for auditing and compliance S3 Server Access Logging
Capture, transform and load streaming data into data store or analytics tools Kinesis Firehose
Measure backend responsiveness of API CloudWatch IntegrationLatency metric
Configure a fully automated, fault tolerant in-memory storage ElastiCache Redis Multi-AZ
Map custom domain names to VPC interface endpoints Route 53 Alias
Leverage CloudFront Edge Locations to transfer files over long distances between client and bucket S3 Transfer Acceleration
Monitor health of serverless app via execution status Lambda Destination
Enforce standardized tagging
  • AWS Config
  • EC2 custom scripts
Read stream records with distributed applications sharding workload Kinesis Client Library
Read-heavy database replication RDS Read Replicas
Create streams, reshard, put and get records in streams Kinesis Data Stream API
Enable SSL certificates on Application Load Balancer AWS Certificate Manager
Send, store and receive messages between software components SQS
Introduce a delay in processing of large distributed applications SQS Delay Queues
Provide temporary access to specific S3 object to those who don’t have AWS credentials S3 pre-signed URLs
Increase read performance of auction applications, gaming, retail sites or special sites DynamoDB Accelerator (DAX)
Rename an S3 object, or change its storage class or at-rest encryption S3 Copy
Update sale or stock control database before sending notification to confirm transaction SQS Delay Queues
Setup a global table with replicas in different Regions DynamoDB Cross Region Replication
Send, get or delete message that references message object stored in S3 bucket SQS Extended Client Library
Back-up or restore a database RDS Snapshot
Deny request with specific header or IP address to access S3 bucket S3 Bucket Policies
Store data from streams
  • DynamoDB
  • Redshift
  • S3
  • Elasticsearch
Automate release of Lambda function
  • CodePipeline
  • CodeDeploy
Checkpoint progress of stream DynamoDB
Prevent cross-site scripting attacks on APIs API Gateway Same Origin Policy
Customize CloudFront content, request and response at lowest network latency Lambda@Edge
Encrypt streams KMS
Bring publicly routable IPv4/IPv6 address range from on-premises to AWS EC2 BYOIP
Localize content and presenting in the language of users Route 53 Geo-location Routing Policy
Change S3 object metadata S3 Copy
In-transit message encryption SQS HTTPS
Support live streaming (real time event) CloudFront Web Distribution
Allow signed request to read object ACL S3 Object ACL
Allow authentication to APIs with OAuth, SAML or 3rd party auth Lambda Authorizer
Protect object against accidental deletion S3 Versioning
Server-side message encryption KMS
Debug and trace distributed applications using microservices X-Ray
Control who can read/write messages IAM Policies
Host server-bound software licenses that use metrics like per-core, per-socket or per-VM EC2 Dedicated host
Ensure client to be bound to an individual back-end instance (e.g. WebSocket) ALB sticky sessions
Cache in-memory with less management overhead DynamoDB Accelerator (DAX)
Protect distribution rights Route 53 Geo-location Routing Policy
Retrieve up to 500 metrics in a single request CloudWatch API
Control access to cache cluster without using VPC subnet groups ElastiCache Cache Security Groups
Delegate permissions for user/services without permanent credentials IAM Role
Copy an EBS volume EBS snapshot
Storage for frequently accessed big data at low cost EBS HDD Throughput Optimized
See the underlying reads or writes performed by a DynamoDB Transaction CloudWatch
Host static website S3 Bucket static website
Scale out ECS tasks using CPUUtilization metric ECS Step Scaling Policies
Measure overall responsiveness of API CloudWatch Latency metric
Define APIs as code
  • Swagger
  • Open API
Upload dependency to Lambda function larger than 50MB S3
Understand effects of IAM policies IAM Policy Simulator
Specify maximum permissions for an organization AWS Organizations service control policies (SCP)
Increase IOPS redundancy at same performance EBS RAID 1
Execute advanced business intelligence and perform complex data analysis queries RedShift
Setup direct interaction between client and Lambda function through an API API Gateway AWS_PROXY Integration
Storage with low-latency for I/O intensive databases or boot volumes EBS SSD Provisioned IOPS
Allow restricted resources (e.g. fonts) to be requested from another domain outside through an API API Gateway Cross-Origin Resource Sharing
Scale an ELB target group Auto Scaling Group
Send notification to SNS topic or invoke Auto Scaling policy action on metric sustained state change CloudWatch Alarms
Take advantage of unused capacity in the cloud EC2 Spot Instance
Move S3 object across location S3 Copy
Route traffic based on location of resources Route 53 Geo-proximity Routing Policy
Identify resources shared with external entity IAM Access Analyzer
Attach boot volume for low latency apps for dev and test EBS SSD General Purpose
Automatically delete items in DynamoDB table DynamoDB TTL
Store BLOB data with low I/O rate RDS
Validate policies (against syntax, best practices or custom checks) IAM Access Analyzer
Determine request, IP address, who made the request and when on EC2 instance CloudTrail
Generate policies based on access activity in CloudTrail logs IAM Access Analyzer
Storage with low latency but don’t need persistence on instance termination EC2 Instance Store
Configure throttling and quota limits enforced on individual client API keys API Gateway Usage plans
Enable SSL on Elastic Beanstalk serverless application
  • AWS Certificate Manager
  • Elastic Beanstalk CLI
Perform query on DynamoDB table primary key on different sort key DynamoDB Local Secondary Index
Load data in ElastiCache cache only when necessary ElastiCache Lazy Loading
Control which services can be accessed (permissions guardrails) Access Advisor
Define a scaling policy to scale basing on set of step adjustments Auto Scaling Step Scaling Policy
Store infrequently accessed data in a durable, immediately available class S3 Standard-IA
Upload files larger than 100MB S3 Multipart Upload
Update dashboard to least amount of delay from 1KB SQS messages sent seldom SQS Long polling
Allow consuming media files before file finished download (media streaming) CloudFront RTMP Distribution
Get last accessed information for accounts or organizations Access Advisor
Serve Web Socket APIs API Gateway WebSocket API
Audit history of changes to API CloudTrail
Allow any authenticated user to read object data and metadata S3 Object ACL
Monitor HTTP/HTTPS requests to control access to CloudFront content AWS WAF
Restrict access to service with service control policies (SCP) Access Advisor
Grant access to bucket and its objects to anyone on internet S3 Bucket Policies
Need database for massively scaled applications and globally dispersed users DynamoDB Global tables
Retrieve archived data in milliseconds S3 Glacier Instant Retrieval
Encrypt RDS instances and snapshots at rest KMS
Deliver real-time stream of events following changes in resources to EC2 instances, Lambda functions or streams CloudWatch Events
Allow secure access to resources without creating IAM user
  • Cognito User pools (recommended)
  • IAM Identity Federation
Authenticate with external or custom IDP (JWT) Cognito User pools
Define a scaling option for scale based on real-time metrics Auto Scaling Dynamic scaling option
Improve performance by routing to Region with lowest latency Route 53 Latency Routing Policy
Make coordinated, all-or-nothing changes to multiple items in a DynamoDB table DynamoDB Transaction
Execute joins or complex transactions on database RDS
Monitor request, source IP etc. to a CloudFront distribution CloudTrail
Increase IOPS performance and redundancy EBS RAID 10
Encrypt an EBS volume EBS snapshot
Keep ElastiCache cache always update at every database write ElastiCache Write Through
Perform SQL-like JOIN operations on DynamoDB tables Apache Hive on EMR
Protect from DDoS attacks
  • Elastic Load Balancer
  • CloudFront
Storage for less frequently accessed colder data at low cost EBS HDD Cold
Enable S3 to write server access logs (S3 Log Delivery Group) S3 Bucket ACL
Auto scale ECS tasks based on existing Auto Scaling group ECS Cluster Auto Scaling Capacity Provider
Encrypt S3 data providing audit trails on who/when used CMK S3 SSE-KMS
Control access to APIs with usage plans Lambda authorizers
Store web session information so if server is lost, session info can be recovered by next server ElastiCache Redis
Handle millions of requests/second at low latency on network Network Load Balancer
Store frequently accessed data in a durable, immediately available class S3 Standard
Restrict access to S3 bucket, prevent bypassing CloudFront CloudFront Origin Access Identity
Route to a CloudFront distribution or an Elastic Load Balancer Route 53 Alias
Limit specific client’s requests to an API
  • API Gateway Per-client throttling limits
  • API Gateway usage plans
Create unique identities for users and authenticate them with IDP Cognito Identity pools
Disable a user access key IAM Identity Center
Detect whether a stack’s actual configuration differs from expected CloudFormation Drift detector
Reduce number of calls to backend of APIs improving latency API Gateway Cache
Configure database caching in front of RDS ElastiCache Memcached
Deliver real-time stream of events following changes in resources to ECS tasks, pipelines, SNS topic or SQS queues CloudWatch Events
DNS resolution for hybrid clouds Route 53 Resolver
Host on virtualized instance EC2 Dedicated instance
Ensure specified number of tasks constantly running and reschedule them on fail ECS Service scheduler
Deploy a multi-region, multi-master database DynamoDB Global tables
Auto scale based on number of messages in a queue per EC2 instance Auto Scaling - Scaling based on SQS
Certificate on Regions not supporting AWS Certificate Manager IAM server certificates
Configure in-memory storage for leaderboards ElastiCache Redis
Interactively search and analyze log data in CloudWatch Logs CloudWatch Logs Insight
Configure long term log retention CloudWatch Logs
Setup direct interaction between client and HTTP endpoint through an API API Gateway HTTP_PROXY Integration
Verify IAM permissions passed by a caller on APIs IAM Identity-Based Policies
Mitigate the drawbacks of the ElastiCache cache strategies ElastiCache TTL
Serve APIs reducing connection overhead for small number of clients with high demand API Gateway Regional Endpoint
Request temporary limited-privilege credentials for IAM or federated users
  • Cognito Identity pools (recommended)
  • IAM Security Token Service
Route by specifying a weight per IP address Route 53 Weighted Routing Policy
Log API calls, latency and error rates CloudWatch
Request temporary security credentials to access backend resources behind API Gateway Cognito
Increase/decrease number of ECS tasks based on CloudWatch alarm ECS Step Scaling Policy
Dynamic temporary credentials
  • IAM Security Token Service
  • Long-term key credentials
Create policies that route traffic based on latency, load or geo-proximity Route 53 Traffic flow
Build a resilient disaster recovery strategy for database
  • RDS Multi-AZ
  • RDS Read Replicas
Identify unused access IAM Access Analyzer
Perform authoritative DNS within VPC without exposing DNS records Route 53 Private DNS
Serve real time streaming with a media player
  • CloudFront Web Distribution
  • CloudFront RTMP Distribution
Check status of IP address or domain names or CloudWatch alarm Route 53 Health Checks
Find items in DynamoDB table by primary key DynamoDB Query
Cache complex data types ElastiCache Redis
View resource utilization CloudWatch
Improve latency and throughput for read-heavy/compute-intensive workloads ElastiCache
Configure multi-thread or multi-core in-memory cache ElastiCache Memcached
Define rules when log expires or documents are frequently accessed on certain period S3 Bucket Lifecycle rules
Add domain name to a CloudFront distribution Route 53 Alias
Prevent Auto Scaling to scale-in and terminate EC2 instances Auto Scaling termination policy
Retries network requests on DynamoDB on network errors DynamoDB Exponential Backoff
Configure in-memory cache that can be encrypted ElastiCache Redis
Cache in-memory always strongly consistently and optimized for DynamoDB DynamoDB Accelerator (DAX)
Write DynamoDB Stream log to CloudWatch logs Lambda
Define a scaling policy to scale keeping specific target value Auto Scaling Target Tracking Policy
Monitor CloudTrail logs in real-time CloudWatch Logs
Retrieve archived data in minutes/hours for disaster recovery S3 Glacier Flexible Retrieval
Don’t want to specify provisioned capacity of DynamoDB DynamoDB On-Demand Capacity
Support connection of firewalls or IPS systems on Layer 3 and 4 ISO/OSI Gateway Load Balancer
Push updates and synchronize user data across multiple devices
  • Cognito Push Synchronization
  • Cognito Sync
Increase/decrease number of ECS tasks based on CloudWatch metric ECS Target Tracking Scaling Policy
Increase IOPS performance at same redundancy EBS RAID 0
Need up to 64000 IOPS for a volume storage EBS SSD Provisioned IOPS
Check encryption status of EBS volumes AWS Config
Cache objects like database queries ElastiCache Memcached
Search and filter log data coming into CloudWatch Logs CloudWatch Logs Metric filters
Temporary storage of information changing frequently (buffers, caches, scratch data, etc.) EC2 Instance Store
Notify to SNS, SQS, or Lambda an event on objects in S3 S3 Event notifications
Route to a DNS name Route 53 CNAME
Push updates and synchronize user data across multiple devices and users AppSync
Allow all authenticated users to list objects in a bucket S3 Bucket ACL
Ensure an instance is removed from load balancer when unhealthy instead of terminated by Auto Scaling Group Auto Scaling ELB health checks
Publish a single metric data point CloudWatch API
Analyze CloudFront access logs AWS Athena
Route to an S3 Bucket as website Route 53 Alias
Configure a landing spot for streaming sensor data on factory floor ElastiCache
Serve API endpoint for geographically distributed clients around the world API Gateway Edge-optimized Endpoint
Send SNS notification when Auto scaling event terminates Auto Scaling lifecycle hooks
Store and persist session data DynamoDB
Avoid to be charged after expiration of object storage S3 Bucket Lifecycle rules
DNS querying between on-premises and AWS over private connections Route 53 Resolver
Transfer domain from Route 53 to another registrar AWS Support
Centralize logs from systems, applications and services CloudWatch Logs
Push Cognito data change to Kinesis stream in real-time Cognito streams
Enable long-running/lived connections (for WebSocket) Network Load Balancer
Execute Lambda function in response of Cognito events before sync other devices Cognito events
Store infrequently accessed data in a less resilient, single-AZ class at lower cost S3 One Zone-IA
Accept a write/update to a DynamoDB table only if conditions are met DynamoDB API Conditional writes
Collect system-level metric from EC2 instance
  • CloudWatch Log Agent
  • X-Ray
Replicate bucket across Regions S3 Cross Region Replication
Serve APIs only from a VPC using ENI API Gateway Private Endpoint
Control permissions to invoke API from specific users, source IPs, VPC endpoint, etc. IAM Resource-Based Policies
Specify capacity of DynamoDB DynamoDB Provisioned Capacity
Cache data from dynamically generated web pages ElastiCache Memcached
Control CDN content expiration time CloudFront TTL
Resolve apex/naked domain names Route 53 Alias
Retrieve archived data within 12 hours S3 Glacier Deep Archive
Offload workload of a database RDS Read Replicas
Remove session data or event logs from DynamoDB table DynamoDB TTL
Manage repository for serverless applications Serverless Application Repository
Allow to encrypt all objects in a bucket S3 Bucket Policies
Configure live real-time dashboard displays ElastiCache
Use pre-built applications in serverless architectures Serverless Application Repository
Compute with discounts reserving 1 or 3 years of instance EC2 Reserved Instance
Route randomly responding to DNS queries with up to 8 healthy records Route 53 Multi-value Answer Routing Policy
Configure in-memory store for high frequency counters ElastiCache Memcached
Capture and log time-ordered sequence of item-level modifications in DynamoDB table DynamoDB Stream
Offload S3 request rate CloudFront Edge Location
Validate token in header of an API request Lambda Authorizer
