Guangning Yu's Blog
AWS Certified Solutions Architect Associate Notes
2019-12-05 17:59:42
# Compute

## EC2

- Billing for interrupted Spot Instances
  ![title](/api/file/getImage?fileId=5de62eb6494730054e000061)
  ![title](/api/file/getImage?fileId=5de62ed6494730054e000062)
- When you launch an instance from an AMI, it uses either paravirtual (PV) or hardware virtual machine (HVM) virtualization. HVM virtualization uses hardware-assist technology provided by the AWS platform.
- Information about the instance can be retrieved from:
  - http://169.254.169.254/latest/meta-data/
  - http://169.254.169.254/latest/user-data/
- The underlying hypervisors for EC2:
  - Xen
  - Nitro
- Standard Reserved Instances cannot be moved between regions. You can choose whether a Reserved Instance applies to a specific AZ or to an entire region, but you cannot change the region.
- About **EC2 Auto Scaling**
  - Can span multiple AZs
- About **Placement Group**
  - Three types of Placement Groups
    - Clustered Placement Group
      - Within a single AZ
      - Used for applications that need low network latency, high network throughput, or both
      - Only certain instances can be launched into a Clustered Placement Group
      - AWS recommends homogeneous instances within Clustered Placement Groups.
    - Spread Placement Group
      - Each instance is placed on distinct underlying hardware.
      - Used for applications that have a small number of critical instances that should be kept separate from each other.
      - Can span multiple AZs.
      - A Spread Placement Group can have only 7 running instances per AZ.
    - Partition Placement Group
      - Divides each group into logical segments called partitions. Each partition has its own set of racks. No two partitions share the same racks.
      - Used for applications like HDFS, HBase and Cassandra.
      - Can span multiple AZs.
  - The name for a placement group must be unique within your AWS account.
  - Only certain types of instances can be launched in a placement group (Compute/Memory/Storage Optimized, GPU).
  - You can't merge placement groups.
  - You can't move an existing instance into a placement group.
    You can create an AMI from your existing instance and then launch a new instance from the AMI into a placement group.
  - There is no charge for creating a placement group.

## Lambda

- An event-driven compute service
- Scales out (not up) automatically
- Supports synchronous and asynchronous invocation.
- Can do things globally. For example, you can use it to back up S3 buckets to other S3 buckets.
- **AWS X-Ray** allows you to debug Lambda functions.
- AWS Lambda automatically monitors functions on your behalf. You can check Lambda logs in **Amazon CloudWatch Logs**.
- AWS encrypts the environment variables in Lambda functions using KMS. When your Lambda function is invoked, those values are decrypted and made available to the Lambda code.
- About charges
  - Number of requests
  - Duration
  - (The price also differs for different memory sizes.)
- **Timeout**
  - The default timeout is 3 seconds
  - The maximum execution duration per request is 900 seconds (15 min)
- **Lambda@Edge**
  - Lambda@Edge lets you run Lambda functions to customize the content that CloudFront delivers, executing the functions in AWS locations closer to the viewer.

## ECS

- Amazon ECS enables you to inject sensitive data into your containers by storing it in either **AWS Secrets Manager** secrets or **AWS Systems Manager Parameter Store** parameters and then referencing them in your container definition.

# Storage

## S3

- Features
  - Tiered Storage Available
    ![title](/api/file/getImage?fileId=5de74dca494730054e00006e)
    - Retrieval time:
      - Glacier: Configurable retrieval times, from minutes to hours
      - Glacier Deep Archive: Within 12 hours
  - Lifecycle Management
    - ![title](/api/file/getImage?fileId=5de84b1e4947300553000016)
    - There is a constraint that objects must be stored for at least **30 days** in the current storage class before you can transition them to STANDARD_IA or ONEZONE_IA.
  - Versioning
    - Stores all versions of an object (including delete markers)
    - Once enabled, versioning cannot be disabled (but can be suspended)
  - Cross Region Replication
    - Versioning must be enabled on both the source and destination buckets
    - Files already in an existing bucket are not replicated automatically
    - Subsequently updated files are replicated automatically
    - Deletions of individual versions or delete markers are not replicated.
  - Encryption
    - Encryption in transit
      - SSL/TLS
    - Encryption at rest
      - Server side
        - S3 managed keys (SSE-S3)
        - KMS managed keys (SSE-KMS)
          - Provides you with an audit trail that shows when your CMK was used and by whom
        - Customer provided keys (SSE-C)
      - Client side
        - KMS managed master keys (CSE-KMS)
          - Need to send the key to AWS
        - Custom master key (CSE-C)
- About Access Control
  - Object ACL
  - Bucket ACL
  - Bucket Policies
  - IAM Policies
  - MFA delete
- About charges
  - ![title](/api/file/getImage?fileId=5de8875e494730055300001e)
  - Features that charge:
    - Storage
    - Requests
    - Storage Management
    - Data transfer
    - Transfer Acceleration
    - Cross Region Replication
- About object
  - Can be from 0 bytes to 5 TB
  - Successful uploads generate an HTTP 200 status code.
  - Read-after-write consistency for PUTs of new objects
  - Eventual consistency for overwrite PUTs and DELETEs (can take some time to propagate)
  - Until 2018 there was a hard limit of 100 PUTs per second. To get higher throughput, care had to be taken with the structure of the key name to ensure parallel processing. As of July 2018, the limit was raised to 3,500 PUTs per second per prefix and the need for careful key design was largely eliminated.
- About bucket
  - One account can have up to **100** S3 buckets by default.
  - Buckets share a universal namespace.
  - There is unlimited storage.
  - All newly created buckets are **private** by default.
  - Can be configured to create access logs which log all requests made to the S3 bucket. These logs can be sent to another bucket (even in another account).
- **Transfer Acceleration** utilizes the CloudFront Edge Network to accelerate uploads to S3
- **Expedited Retrievals (in Glacier)** allow you to quickly access your data when occasional urgent requests for a subset of archives are required (usually within 1 to 5 minutes).
- **Provisioned Capacity** ensures that retrieval capacity for Expedited Retrievals is available when you need it.
- **Multipart Upload API**
  - Upload parts in parallel to improve throughput.
  - Quick recovery from any network issues
  - For objects larger than 100 MB, customers should consider using the Multipart Upload capability. (The largest object that can be uploaded to S3 in **a single PUT** is **5 GB**.)
- Unlike EBS, S3 does not provide the lowest-latency access to data.

## EBS

- Amazon EBS volume data is automatically replicated across multiple servers in **an Availability Zone**.
- Compare EBS types
  ![title](/api/file/getImage?fileId=5de71284494730054e00006d)
- You can change EBS volumes on the fly, including the size and storage type.
- About snapshot
  - Snapshots exist on S3.
  - Snapshots are incremental.
  - To create a snapshot for Amazon EBS volumes that serve as root devices, you should stop the instance before taking the snapshot.
    - However, you can take a snapshot while the instance is running.
  - Can use **Amazon Data Lifecycle Manager (DLM)** to automate the backup of EBS volumes
- About AZ and region
  - Volumes will always be in the same AZ as the EC2 instance.
  - Amazon EBS provides the ability to create snapshots (backups) of any EBS volume and write a copy of the data in the volume to Amazon S3. These snapshots can be used to create multiple new EBS volumes or **move volumes across Availability Zones**.
  - To move an EC2 volume from one AZ to another, take a snapshot of it, create an AMI from the snapshot and then use the AMI to launch the EC2 instance in a new AZ.
  - To move an EC2 volume from one region to another, take a snapshot of it, create an AMI from the snapshot and then copy the AMI from one region to the other. Then use the copied AMI to launch the new EC2 instance in the new region.
- About encryption
  - Example
    ![title](/api/file/getImage?fileId=5de8858a494730055300001b)
  - Snapshots of encrypted volumes are encrypted automatically.
  - Volumes restored from encrypted snapshots are encrypted automatically.
  - You can share snapshots (with other AWS accounts or make them public), but only if they are unencrypted.
  - You can now encrypt root device volumes upon creation of the EC2 instance.
  - To enable encryption at rest using EC2 and EBS, you must configure encryption when creating the EBS volume.
- Methods for increasing the performance of your volumes:
  - Schedule snapshots of HDD-based volumes for periods of low use (**only applies to HDD-based volumes**)
  - Stripe volumes together in a RAID 0 configuration
  - Ensure that your EC2 instances are types that can be optimized for use with EBS

## EFS

- Amazon EFS is a regional service storing data within and across multiple Availability Zones (AZs) for high availability and durability. Amazon EC2 instances can access your file system across AZs, regions, and VPCs, while on-premises servers can access it using AWS Direct Connect or AWS VPN.
  ![title](/api/file/getImage?fileId=5de6326f494730054e000064)
- Supports the Network File System version 4 (NFSv4) protocol
- EFS storage capacity is elastic, growing and shrinking automatically as you add and remove files.
- You only pay for the storage you use.
- Can scale up to petabytes.
- Can support thousands of concurrent NFS connections.
- Data is stored across multiple AZs within a region.
- Read-after-write consistency.
- Lifecycle management and Infrequent Access storage are available for EFS.
- Using EFS with Microsoft Windows is **NOT** supported.
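
Looking back at the EBS notes on moving a volume between AZs and regions, the procedure can be sketched as a small planner. This is only an illustration of the order of the steps described in those notes: `plan_volume_move` and its parameters are hypothetical names, and no AWS API is called.

```python
def plan_volume_move(src_region, src_az, dst_region, dst_az):
    """Return the ordered steps for relocating an EBS-backed instance.

    Hypothetical helper: it only encodes the sequence from the notes
    (snapshot -> AMI -> optional cross-region copy -> launch).
    """
    if (src_region, src_az) == (dst_region, dst_az):
        return []  # same AZ in the same region: nothing to move

    steps = [
        "take a snapshot of the volume",
        "create an AMI from the snapshot",
    ]
    if src_region != dst_region:
        # The extra step is only needed when the region changes.
        steps.append(f"copy the AMI from {src_region} to {dst_region}")
    steps.append(f"launch a new EC2 instance from the AMI in {dst_az}")
    return steps


same_region = plan_volume_move("us-east-1", "us-east-1a", "us-east-1", "us-east-1b")
cross_region = plan_volume_move("us-east-1", "us-east-1a", "eu-west-1", "eu-west-1a")
```

The region and AZ names are example values; the only difference between the two plans is the AMI copy step.
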
## Storage Gateway

- AWS Storage Gateway's software appliance is available as a virtual machine image (**VMware ESXi** or **Microsoft Hyper-V**) to install on a host in your data center.
- Three types of storage
  - File Gateway (NFS & SMB)
    - Files are stored as objects in S3 and accessed through a Network File System mount point.
    - Once objects are transferred to S3, they can be managed as native S3 objects.
      ![title](/api/file/getImage?fileId=5de7b2744947300553000003)
  - Volume Gateway (iSCSI)
    - Presents cloud-based iSCSI block storage volumes to your on-premises applications.
    - Data written to these volumes can be asynchronously backed up as point-in-time snapshots and stored in the cloud as EBS snapshots.
    - Two modes:
      - **Stored Volumes**: store primary data locally while asynchronously backing up data to AWS (1 GB to 16 TB in size)
        ![title](/api/file/getImage?fileId=5de7b3c84947300553000004)
      - **Cached Volumes**: use S3 as primary data storage while retaining frequently accessed data locally (1 GB to 32 TB in size)
        ![title](/api/file/getImage?fileId=5de7b4534947300553000005)
  - Tape Gateway (VTL)

# Database

## RDS

- Supports 6 types of databases
  - SQL Server
    - Storage up to 16 TB (all other types are up to 64 TB)
  - Oracle
  - MySQL
    - The recommended storage engine for MySQL is InnoDB, not MyISAM.
  - PostgreSQL
  - MariaDB
  - Aurora
    - MySQL-compatible
    - About size
      - Starts at 10 GB and scales in 10 GB increments up to 64 TB (storage autoscaling)
      - Compute resources can scale up to 32 vCPUs and 244 GB of memory
    - About high availability
      - 2 copies of your data are kept in each AZ, with a minimum of 3 AZs, which means 6 copies of your data.
      - Designed to transparently handle the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability.
      - Storage is self-healing. Data blocks are continuously scanned for errors and repaired automatically.
      - 2 types of replicas available (Aurora Replicas and MySQL replicas).
        Automated failover is only available with Aurora Replicas.
    - About backup
      - Automated backups are always enabled. Backups do not impact database performance.
      - Supports taking snapshots, which does not impact performance. Snapshots can be shared with other AWS accounts.
    - Aurora Serverless is serverless.
- RDS runs on virtual machines and is not serverless.
- About backups
  - Automated Backups
    - Allow you to do a point-in-time recovery down to a second within the retention period (1 to 35 days)
    - Automated Backups are enabled by default.
    - The backup data is stored in S3 and you get free storage space equal to the size of your database.
    - Backups are taken within a defined window. During the backup window, storage I/O may be suspended and you may experience elevated latency.
    - The automated backups are deleted once the original RDS instance is deleted.
  - Database Snapshots
    - Done manually
    - They are kept even after you delete the original RDS instance.
- About encryption
  - Encryption at rest is supported by all 6 types of databases.
  - Encryption is done using the KMS service.
  - Once your RDS instance is encrypted, the data stored at rest is encrypted, as are its automated backups, read replicas and snapshots.
- Two key features
  - Multi-AZ
    - Used for Disaster Recovery only
    - Not available for Aurora (since Aurora is Multi-AZ by default)
    - You can force a failover from one AZ to another by rebooting the RDS instance
    - If the primary instance fails, the **CNAME** will be switched from the primary to the standby instance.
    - While the primary instance is running, the standby instance **cannot** be used for read or write operations.
  - Read Replicas
    - Used for performance improvement (**not a solution for system outages**)
    - Not available for SQL Server
    - Creates a read-only copy by using **asynchronous** replication from the primary RDS instance to the read replica
    - Must have automatic backups turned on in order to deploy a read replica
    - You can have up to 5 read replica copies of any database
    - You can have read replicas of read replicas (but watch out for latency)
    - Each read replica will have its own DNS endpoint
    - You can have read replicas that have Multi-AZ
    - Read replicas can be promoted to be their own databases. This breaks the replication.
    - You can have a read replica in a second region.
- **IAM Database Authentication**
  - Manages your database user credentials through IAM users and roles
  - Works with MySQL and PostgreSQL

## DynamoDB

- A fully managed database
- Supports both document and key-value data models
- Stored on SSD storage
- Spread across 3 geographically distinct data centers
- Eventually Consistent Reads by default (supports Strongly Consistent Reads as well)
- Chargeable features
  - Read and write capacity
  - Data storage
- Supports performing operations by using a user-defined primary key. The primary key in DynamoDB can be either a single attribute or a composite of two attributes (partition key and sort key).
- **Auto Scaling**
  - Uses the AWS Application Auto Scaling service to dynamically adjust provisioned throughput capacity on your behalf.
- **DynamoDB Accelerator (DAX)**
  - A fully managed, highly available, in-memory **cache** that can reduce DynamoDB response times from milliseconds to microseconds, even at millions of requests per second.
- **DynamoDB Stream**
  - An ordered flow of information about changes to items in a DynamoDB table
  - Use Case: Enable DynamoDB Stream and create an AWS Lambda trigger. The data from the stream record is processed by the Lambda function, which then publishes a message to an SNS Topic that notifies the subscribers via email.
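
The DynamoDB Stream use case above could look roughly like the following Lambda-style handler. It is a sketch under assumptions: the `publish` callable stands in for an SNS publish call (in a real function this would typically be an SNS client), `summarize_record` is a hypothetical helper, and only the `eventName` and `dynamodb.Keys` fields of each stream record are read.

```python
import json


def summarize_record(record):
    """Turn one DynamoDB Streams record into a short notification line."""
    change = record["eventName"]        # INSERT, MODIFY or REMOVE
    keys = record["dynamodb"]["Keys"]   # attribute-value map of the item key
    return f"{change}: {json.dumps(keys, sort_keys=True)}"


def handler(event, context, publish=print):
    """Process a batch of stream records and hand each summary to `publish`.

    `publish` is an injected stand-in for sending to an SNS topic (assumption),
    which keeps the sketch runnable without AWS credentials.
    """
    messages = [summarize_record(r) for r in event.get("Records", [])]
    for message in messages:
        publish(message)
    return {"published": len(messages)}


# Example invocation with a hand-written event containing a single insert.
sent = []
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"id": {"S": "42"}}}}
    ]
}
result = handler(sample_event, None, publish=sent.append)
```

Injecting `publish` is a design choice for testability; the notes themselves only say that the function publishes to an SNS Topic.
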
## Redshift

- Can be configured as follows
  - Single Node (160 GB)
  - Multi-Node
    - Leader Node (manages client connections and receives queries)
    - Up to 128 Compute Nodes (store data and perform queries and computation)
- Better compression by using columnar data stores
- Saves space by not using indexes or materialized views
- About backup
  - Enabled by default with a 1-day retention period (maximum is 35 days)
  - Always attempts to maintain at least three copies of your data (the original and a replica on the compute nodes and a backup in S3)
  - Supports asynchronously replicating snapshots to S3 in another region for disaster recovery
- About pricing
  - Compute Node Hours
  - Backup
  - Data transfer (no charge between Redshift and S3 within the same region)
- About security
  - Encrypted in transit using SSL
  - Encrypted at rest using AES-256 encryption
  - By default Redshift takes care of key management
    - Manage your own keys through HSM
    - AWS KMS
- Currently only available in 1 AZ (can restore snapshots to a new AZ)

## ElastiCache

- Supports two open-source in-memory caching engines
  - Memcached
    - Supports multi-threading
  - Redis
    - Supports Multi-AZ
    - Supports persistence
    - Supports backup and restore
    - Can use the **Redis AUTH** command to improve data security by requiring the user to enter a password before they are granted permission on a Redis server.

# Network

## Route 53

- About Alias Record
  - An Alias Record can be used for naked domain names (zone apex) but a CNAME cannot. Given the choice, always choose an Alias Record over a CNAME.
  - Alias Records provide a Route 53-specific extension to DNS functionality.
  - Alias Records can point to AWS Resources that are hosted in other accounts by manually entering the ARN.
- About Health Checks
  - Can set Health Checks on individual record sets
  - You can set SNS notifications for when a health check fails
- About supported Routing Policies
  - Simple Routing
    - Returns all values to the user in a random order
  - Weighted Routing
  - Latency-based Routing
  - Failover Routing
  - Geolocation Routing
  - Geoproximity Routing (Traffic Flow Only)
  - Multivalue Answer Routing
    - Similar to Simple Routing, but it allows you to put health checks on each record set.
- There is a default limit of 50 domain names in Route 53. However, this limit can be increased by contacting AWS support.

## VPC

- Think of a VPC as a logical data center in AWS
  ![title](/api/file/getImage?fileId=5de7e1b7494730055300000e)
- By default, **5** VPCs are allowed in each AWS Region for each account.
- **Internet Gateway**
  - You can only have 1 Internet Gateway per VPC.
- **Network Address Translation (NAT) Gateway**
  - **Used to provide Internet traffic to EC2 instances in a private subnet**
  - Redundant inside the AZ
  - Starts at 5 Gbps and currently scales up to 45 Gbps
  - Not associated with security groups
  - Automatically assigned a public IP
  - No need to disable Source/Destination Checks
  - To create an AZ-independent architecture, create a NAT gateway in each AZ and configure your routing to ensure that resources use the NAT gateway in the same AZ.
- **Route Tables**
- **Network Access Control Lists (ACL)**
  - ![title](/api/file/getImage?fileId=5de88695494730055300001d)
  - Your VPC automatically comes with a default network ACL, which allows all inbound and outbound traffic.
  - Each custom network ACL **denies** all inbound and outbound traffic by default.
  - Each subnet must be associated with one (and only one) network ACL.
  - Block IP addresses using a network ACL (not a Security Group)
  - Network ACLs contain a numbered list of rules that is evaluated in order, starting with the lowest numbered rule.
  - ACL rules are applied immediately when a matching allow/deny rule is found (**does not check for conflicts**)
  - Network ACLs are **stateless**. Responses to allowed inbound traffic are subject to the rules for outbound traffic (and vice versa).
- **Subnets**
  - One subnet is in one AZ.
  - AWS always reserves 5 IPs within your subnets.
- **Security Group**
  - Changes to a Security Group take effect immediately.
  - You can have any number of EC2 instances within a Security Group.
  - You can have multiple Security Groups attached to EC2 instances.
  - You can specify allow rules, but not deny rules.
  - Security Groups can't span VPCs.
- **Elastic IP Addresses**
  - How to bring your own IPs
    - Create a Route Origin Authorization (ROA); once that is done, provision and advertise your whitelisted IP address range to your AWS account.
- **Elastic Network Interface**
  - A logical networking component in a VPC that represents a virtual network card
  - Three ways to attach a network interface to an EC2 instance
    - Hot attach: when it's running
    - Warm attach: when it's stopped
    - Cold attach: when it's being launched
  - An ENI is never detached when an instance is stopped.
- **VPC Peering**
  - You can peer VPCs with other AWS accounts as well.
  - You can peer between regions.
  - Peering is in a star configuration; there is no transitive peering.
- **VPC Flow Logs**
  - Enables you to capture the IP traffic going to and from network interfaces in your VPC.
  - The data is stored using CloudWatch Logs.
  - Three levels
    - VPC
    - Subnet
    - Network Interface
  - Not all IP traffic is monitored. For example, the traffic generated by contacting Amazon DNS will not be logged.
  - Works on **Layer 4 of TCP/IP stack (not Layer 7)** and is restricted to network-level monitoring
- **Bastions**
  - Used to securely administer EC2 instances
- **VPC Endpoints**
  - Enables privately connecting your VPC to supported AWS services powered by PrivateLink.
  - Traffic between your VPC and the other service **does not leave** the Amazon network.
  - Two types
    - Interface Endpoints: an Elastic Network Interface with a private IP address that serves as an entry point
    - Gateway Endpoints: Currently supports Amazon S3 and DynamoDB
- **About Multicast/Broadcast**
  - VPC does not support multicast or broadcast networking
  - You can create a virtual overlay network running on the OS level of the instance to migrate the legacy application.

## Direct Connect

- Example
  ![title](/api/file/getImage?fileId=5de7e2c6494730055300000f)
- Used for high throughput workloads or if you need a stable and reliable secure connection

## API Gateway

- Exposes **HTTPS** endpoints to define a RESTful API
- Low cost and scales automatically
- Supports throttling requests to prevent attacks
- Connects to CloudWatch to log all requests for monitoring
- Supports API caching in API Gateway to cache your endpoint's response
- If you are using JavaScript/AJAX that uses multiple domains with API Gateway, ensure that you have **enabled CORS** on API Gateway.

## ELB

- Can spread load across AZs, but not across regions.
- Load Balancers have their own DNS name. You are never given an IP address.
- Health Checks check the instance health by talking to it.
- Three types
  - Application Load Balancers
    - Must be deployed into at least two subnets.
  - Network Load Balancers
  - Classic Load Balancers
- **Sticky Sessions**
  - Classic Load Balancers: Allows you to bind a user's session to **a specific EC2 instance**.
  - Application Load Balancers: Allows you to bind a user's session to **a specific Target Group**.
- **Connection Draining**
  - Enables the load balancer to complete in-flight requests made to instances that are de-registering or unhealthy.
- **Cross Zone Load Balancing**
  - Example
    ![title](/api/file/getImage?fileId=5de7e7d74947300553000010)
    ![title](/api/file/getImage?fileId=5de7e7e84947300553000011)
- **Path Patterns**
  - Example
    ![title](/api/file/getImage?fileId=5de7e83c4947300553000012)
- How to allow multiple domains to serve SSL traffic using ALB
  1. Upload all SSL certificates of the domains in the ALB using the console
  2. Bind multiple certificates to the same secure listener on your load balancer
  3. ALB will automatically choose the optimal TLS certificate for each client using Server Name Indication (SNI)
- ELB provides **access logs** that capture detailed information about requests sent to your load balancer (disabled by default).

# Identity & Security

## IAM

- **Users and Groups**
  - The default maximum limit is **5000** IAM users per AWS account.
- **Roles**
  - Roles can be assigned to an EC2 instance after it is created.
  - Roles are universal and can be used in any region.
- **Web Identity Federation**
  - Lets you give your users access to AWS resources after they have successfully authenticated with a web-based identity provider like Amazon, Facebook or Google.
- **Working with Microsoft AD**
  - **AWS Managed Microsoft AD**
    - Built on actual Microsoft Active Directory and does not require you to synchronize or replicate data from your on-premises Microsoft Active Directory to the AWS Cloud.
    - You can use standard Microsoft Active Directory administration tools and take advantage of built-in Microsoft Active Directory features, such as Group Policy, trusts, and single sign-on (SSO).
  - **Simple Active Directory**
    - A standalone managed directory that is powered by a Samba 4 Active Directory Compatible Server
    - Provides **a subset of the features** offered by AWS Managed Microsoft AD
  - **AWS Directory Service AD Connector**
    - A directory gateway with which you can redirect directory requests to your on-premises Microsoft Active Directory without caching any information in the cloud.
    - You can assign an IAM role to the users or groups from your Active Directory once it is integrated with your VPC via the AWS Directory Service AD Connector.
  - **SAML**
    - AWS supports identity federation with SAML 2.0 (which Microsoft AD implements by default).
    - The feature enables federated single sign-on (SSO), so users can log into the AWS console or call the AWS APIs without creating an IAM user for everyone in your organization.

## Cognito

- Provides Web Identity Federation
- **User Pools**
  - User directories used to manage sign-up and sign-in functionality for mobile and web applications
  - Users can sign in directly to the User Pool or by using Amazon, Facebook or Google.
  - Successful authentication generates a JWT.
- **Identity Pools**
  - Enables providing temporary AWS credentials to access AWS services like S3 or DynamoDB.
- Synchronization
  - Cognito uses **Push Synchronization** to push updates and synchronize user data across multiple services.

## STS

- AWS Security Token Service is the service that you can use to create and provide trusted users with temporary credentials that can control access to your AWS resources.

## KMS

## CloudHSM

- ![title](/api/file/getImage?fileId=5de88612494730055300001c)
- CloudHSM provides secure key storage in tamper-resistant hardware available in **multiple AZs**.

# Management

## CloudFormation

- Templates include several major sections. The **Resources** section is the only required section.

## CloudWatch

- Monitor things like:
  - Compute:
    - EC2 Instances: CPU, network, disk, status check (**no metrics for memory!**)
    - Auto Scaling groups
    - Elastic Load Balancers
    - Route 53 health checks
  - Storage:
    - EBS volumes
    - Storage Gateways
    - CloudFront
- CloudWatch with EC2 will monitor events every 5 minutes by default. You can have 1-minute intervals by turning on detailed monitoring.
- What can you do with CloudWatch?
  - Dashboards
  - Alarms
    - Set alarms to notify you when particular thresholds are hit.
    - You can create alarms that automatically stop, terminate, reboot or recover your EC2 instances using CloudWatch alarm actions.
  - Events
    - Respond to state changes in your AWS resources
  - Logs
    - Aggregate, monitor and store logs

## CloudTrail

- By default, CloudTrail event log files are encrypted using Amazon S3 server-side encryption (SSE).
- Used for tracking the API calls of your AWS resources, not for sending EC2 logs to CloudWatch (use the CloudWatch agent for that)

## AWS Auto Scaling

- Supported services
  - EC2
  - EC2 Spot Fleets
  - ECS
  - DynamoDB
  - Aurora
- Auto Scaling Policies
- Auto Scaling Group
- Scheduled Scaling
- Default termination policy
  1. Select the AZ with the most instances
  2. Select the instance with the oldest launch configuration
  3. Select the instance closest to the next billing hour
- **Cooldown Period**
  - Ensures that the Auto Scaling group does not launch or terminate additional EC2 instances before the previous scaling activity takes effect.
  - The default value is 300 seconds.

# Application Integration

## SNS

- Instantaneous push-based delivery (no polling)

## SQS

- Used for **decoupling** the components of an application
- Pull-based, not push-based
- Messages can contain up to 256 KB of text in any format
- Two types of queues
  - Standard Queue (default)
    - Delivered at least once
    - Cannot guarantee order
  - FIFO (first-in-first-out) Queue
    - Exactly-once delivery
    - Can guarantee order
- **Retention Period**
  - Messages can be kept in the queue from 1 minute to 14 days (4 days by default)
- **Visibility Timeout**
  - The amount of time that the message is invisible in the SQS queue after a reader picks up that message.
  - If the job is not processed within that time, the message will become visible again and another reader will process it.
  - The default value is 30 seconds and the maximum is 12 hours.
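
The visibility timeout described above can be illustrated with a toy in-memory queue. This is a minimal sketch and not the SQS API: `VisibilityQueue` and its methods are hypothetical names, and time is passed in explicitly as `now` (in seconds) instead of reading a real clock.

```python
class VisibilityQueue:
    """Toy queue that mimics how an SQS visibility timeout hides messages."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}         # message id -> body
        self._invisible_until = {}  # message id -> time it becomes visible again
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = body
        self._invisible_until[self._next_id] = 0  # visible right away
        return self._next_id

    def receive(self, now):
        """Return (id, body) of the first visible message, or None."""
        for msg_id, body in self._messages.items():
            if self._invisible_until[msg_id] <= now:
                # Hide the message from other readers for the timeout window.
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """A reader deletes the message once the job is done."""
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)


queue = VisibilityQueue(visibility_timeout=30)
job = queue.send("resize image")
first = queue.receive(now=0)    # reader A picks up the message
hidden = queue.receive(now=10)  # reader B sees nothing: still invisible
retry = queue.receive(now=31)   # A never deleted it, so it is visible again
```

Here the message reappears for a second reader because it was never deleted within the 30-second window, which is the behaviour the notes describe.
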
- **Long Polling vs Short Polling**
  - While regular short polling returns immediately (even if the message queue being polled is empty), long polling doesn't return until a message arrives in the queue (or the long poll times out).
  - *ReceiveMessageWaitTimeSeconds* is the queue attribute that determines whether you are using short or long polling. By default, its value is zero, which means short polling. If it is set to a value greater than zero, then it is long polling.
- Cannot set a priority on individual items in the SQS queue
- A queue can contain an **unlimited** number of messages.

## Step Functions

- Provides serverless orchestration for modern applications

## SWF

- Coordinates various processing steps such as executable code, web service calls, **human actions** and scripts.
- It ensures that a task is never duplicated and is assigned only once.
- Actors
  - Workflow Starters
  - Deciders
  - Activity Workers
- Workflow executions can last up to **1 year**.
- Unlike Step Functions, a user has to manage the infrastructure that runs the workflow logic and tasks when using SWF (i.e. **not serverless!**).

## Amazon MQ

- Supports industry-standard APIs and protocols so you can switch from any standards-based message broker to Amazon MQ without rewriting the messaging code in your applications.

# Analytics

## Kinesis

- Three types
  - Kinesis Streams
    - Kinesis Streams consist of **Shards**.
    - The data capacity of your stream is a function of the number of shards that you specify for the stream. The total capacity of the stream is the sum of the capacities of its shards.
    - ![title](/api/file/getImage?fileId=5de7c52e494730055300000b)
    - **Retention Period**
      - The time period from when a record is added to when it is no longer accessible
      - **24 hours** by default
      - **168 hours (a week)** as maximum
  - Kinesis Firehose
  - Kinesis Analytics

## S3 Select

- Enables applications to retrieve only a subset of data from an object by using simple SQL expressions

## Amazon Redshift Spectrum

- A service that queries data directly from files on Amazon S3
- Allows creation of Redshift tables
- Able to join Redshift tables with Redshift Spectrum tables efficiently

## Amazon Athena

- An interactive query service that makes it easy to analyse data in S3 using standard SQL commands
- Supported data formats
  - JSON
  - Apache ORC
  - Apache Parquet

## AWS Glue

- A fully managed extract, transform, and load (ETL) service

# DevOps

## AWS Elastic Beanstalk

- Supports the deployment of web applications from Docker containers
- Automatically handles tasks such as load balancing, auto scaling, monitoring and placing your containers across your cluster.
- Stores your application files and optionally server log files in Amazon S3.

## AWS OpsWorks

- A configuration management service that provides managed instances of **Chef** and **Puppet**.

# Others

## CloudFront

- Methods to have secure access to private files located in S3
  - CloudFront Origin Access Identity
    - A virtual user identity that is used to give the CloudFront distribution permission to fetch a private object from an S3 bucket
  - CloudFront Signed URLs
  - CloudFront Signed Cookies

## Snowball

- Types of services
  - Snowball
    - A petabyte-scale data transport solution
    - Comes in either 50 TB or 80 TB
  - Snowball Edge
    - Comes in 100 TB
    - Has on-board compute capabilities
  - Snowmobile
    - An exabyte-scale data transfer service
    - Up to 100 PB per Snowmobile
- When should I use Snowball
  ![title](/api/file/getImage?fileId=5de7b0604947300553000002)

**Reference**

1. 
[AWS Certified Solutions Architect Associate by A Cloud Guru](https://acloud.guru/learn/aws-certified-solutions-architect-associate)
2. [AWS Certified Solutions Architect Associate Practice Exams](https://www.udemy.com/course/aws-certified-solutions-architect-associate-amazon-practice-exams/)