Create a scalable cloud architecture
TRANSCRIPT
A platform architecture for supporting high traffic in the cloud
John Chang,
AWS Solutions Architecture
August 2016
A scalable architecture
• Can support growth in users, traffic, data size
• Without practical limits
• Without a drop in performance
• Seamlessly - just by adding more resources
• Efficiently - in terms of cost per user
Sanlih E-Television Uses AWS to Support Online Strategy
Sanlih E-Television is a nationwide cable TV network delivering some of the most popular TV channels in Taiwan.
“I estimate that we’ve saved 30% by selecting AWS over other cloud service providers.”
Andy Wang, Chief Information Officer, Sanlih E-Television
• Wanted to take advantage of online and streaming platforms to build on leading position in the market
• Had to ensure IT infrastructure could handle demand and deliver content
• Began running streaming service, website and mobile apps on AWS
• Successfully integrated internet and mobile into channel mix
• Saved time and money due to stability of AWS platform and competitive pricing of services
Day 1 – Dev & private beta
Single host
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic IP; THE server (e.g. Apache, MySQL); Server Image (AMI)]
Day 2 - Public beta
We need a bigger server
• Add larger & faster storage (EBS)
• Use the right instance type
• Easy to change instance sizes
• Not our long term strategy
• Will hit an endpoint eventually
• No fault tolerance
Separating web and DB
• More capacity
• Scale each tier individually
• Tailor instance for each tier
– Instance type
– Storage
• Security
– Security groups
– DB in a private VPC subnet
But how do I choose what
DB technology I need?
SQL? NoSQL?
Why start with a Relational DB?
• SQL is versatile & feature-rich
• Lots of existing code, tools, knowledge
• Clear patterns to scalability*
• Reality: eventually you will have a polyglot data layer
– There will be workloads where NoSQL is a better fit
– Use the right tool for each workload
* for read-heavy apps
Key Insight: Relational Databases are Complex
• Our experience running Amazon.com taught us that
relational databases can be a pain to manage and
operate with high availability
• Poorly managed relational databases are a leading
cause of lost sleep and downtime in the IT world!
• Especially for startups with small teams
Relational Databases
MySQL, Aurora, PostgreSQL, Oracle, SQL Server
Fully managed; zero admin
Amazon RDS, Aurora
Improving efficiency
Offload static content
• Amazon S3: highly available hosting that scales
– Static files (JavaScript, CSS, images)
– User uploads
• S3 URLs – serve directly from S3
• Let the web server focus on dynamic content
Amazon CloudFront
• Worldwide network of edge locations
• Cache on the edge
– Reduce latency
– Reduce load on origin servers
– Static and dynamic content
– Even a few seconds of caching of popular content can have a huge impact
• Connection optimizations
– Optimize transfer route
– Reuse connections
– Benefits even non-cacheable content
CloudFront for static & dynamic content
[Diagram: Amazon Route 53; CloudFront distribution with path patterns]
• css/*, js/*, Images/*: S3 bucket (static content)
• Default (*): EC2 instance(s) (dynamic content)
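The path-pattern routing in this diagram can be sketched in a few lines of Python. This is an illustration only, not how CloudFront is implemented: the origin names and the `pick_origin` helper are hypothetical, and Python's `fnmatch` stands in for CloudFront's own pattern matching.

```python
from fnmatch import fnmatchcase

# Ordered behaviors: the first matching pattern wins and "*" is the default.
# Origin names are placeholders for the S3 bucket and EC2 instance(s) on the slide.
BEHAVIORS = [
    ("css/*", "s3-static"),
    ("js/*", "s3-static"),
    ("Images/*", "s3-static"),
    ("*", "ec2-dynamic"),
]


def pick_origin(path: str) -> str:
    """Return the origin name for a request path such as '/css/site.css'."""
    key = path.lstrip("/")
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(key, pattern):  # case-sensitive match
            return origin
    raise LookupError("no behavior matched")  # not reached while "*" is listed
```

With this table, `/css/site.css` goes to the static origin and `/checkout` falls through to the default, so the web servers only see dynamic requests.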
Database caching
• Faster response from RAM
• Reduce load on database
[Diagram: application server, Amazon ElastiCache, RDS database]
1. If data in cache, return result
2. If not in cache, read from DB
3. And store in cache
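The three steps can be sketched as a small cache-aside helper. This is a minimal sketch under stated assumptions: an in-process dictionary stands in for ElastiCache, `load_from_db` is a hypothetical callable standing in for an RDS query, and the 60-second TTL is an arbitrary choice.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside helper: check the cache, fall back to the DB, then fill the cache."""

    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float = 60.0):
        self._load_from_db = load_from_db  # hypothetical DB read function
        self._ttl = ttl_seconds            # assumed expiry; tune per workload
        self._cache: Dict[str, Tuple[float, str]] = {}  # key -> (expiry time, value)

    def get(self, key: str) -> str:
        entry = self._cache.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]                      # 1. data in cache: return result
        value = self._load_from_db(key)          # 2. not in cache: read from DB
        self._cache[key] = (time.monotonic() + self._ttl, value)  # 3. store in cache
        return value
```

Repeated reads of the same key within the TTL reach the database only once, which is where the reduced database load comes from.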
Amazon ElastiCache: in-memory cache
• Simple to Deploy
• Managed
– Automatically replaces failed nodes
– Patch management
• Elastic
• Compatible
Day 3 – Paying customers
High Availability
[Diagram: www.example.com; Amazon Route 53 DNS service; Amazon CloudFront; S3 bucket for static assets; Availability Zone a with web server, RDS DB instance and ElastiCache node 1]
High Availability
[Diagram: www.example.com; Amazon Route 53 DNS service; Amazon CloudFront; S3 bucket for static assets; Availability Zones a and b; two web servers; RDS DB instance; ElastiCache node 1]
High Availability
[Diagram: www.example.com; Amazon Route 53 DNS service; Amazon CloudFront; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; two web servers; RDS DB instance; ElastiCache node 1]
Elastic Load Balancing
• Managed Load Balancing Service
• Fault tolerant
• Health Checks
• Distributes traffic across AZs
• Elastic – automatically scales its capacity
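To make the health-check and traffic-distribution bullets concrete, here is a minimal sketch of a balancer that rotates across healthy targets. The slides do not say which routing algorithm ELB uses, so plain round robin is an assumption, and the class and target names are hypothetical.

```python
from itertools import count
from typing import Dict, List


class RoundRobinBalancer:
    """Rotate requests across targets, skipping any that failed a health check."""

    def __init__(self, targets: List[str]):
        self._targets = list(targets)
        self._healthy: Dict[str, bool] = {t: True for t in self._targets}
        self._counter = count()

    def report_health(self, target: str, healthy: bool) -> None:
        # In a real service this would be driven by periodic health checks.
        self._healthy[target] = healthy

    def next_target(self) -> str:
        live = [t for t in self._targets if self._healthy[t]]
        if not live:
            raise RuntimeError("no healthy targets")
        return live[next(self._counter) % len(live)]
```

If the web server in one Availability Zone is marked unhealthy, all traffic flows to the remaining server until it recovers.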
High Availability
[Diagram: www.example.com; Amazon Route 53 DNS service; Amazon CloudFront; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; two web servers; RDS DB instance; RDS DB standby; ElastiCache node 1]
Data layer HA
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; two web servers; RDS DB instance; RDS DB standby; ElastiCache node 1]
Data layer HA
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; two web servers; RDS DB instance; RDS DB standby; ElastiCache nodes 1 and 2]
User sessions
• Problem: Often stored on local disk (not shared)
• Quick fix: ELB session stickiness
• Solution: DynamoDB
[Diagram: Elastic Load Balancing; two web servers, one labeled “Logged in”, the other “Logged out”]
Amazon DynamoDB
• Managed document and key-value store
• Simple to launch and scale
– To millions of IOPS
– Both reads and writes
• Consistent, fast performance
• Durable: perfect for storage of session data
https://github.com/aws/aws-dynamodb-session-tomcat
http://docs.aws.amazon.com/aws-sdk-php/guide/latest/feature-dynamodb-session-handler.html
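The idea behind moving sessions off local disk can be sketched without any AWS dependency. In this illustration an in-memory class stands in for a DynamoDB table, and `SharedSessionStore` and `WebServer` are hypothetical names; the session handlers linked above are the real integrations.

```python
import uuid
from typing import Dict, Optional


class SharedSessionStore:
    """Stand-in for an external session table (DynamoDB in the slides)."""

    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, str]] = {}

    def put(self, session_id: str, data: Dict[str, str]) -> None:
        self._items[session_id] = dict(data)

    def get(self, session_id: str) -> Optional[Dict[str, str]]:
        item = self._items.get(session_id)
        return dict(item) if item is not None else None


class WebServer:
    """A stateless web server: it keeps no session data of its own."""

    def __init__(self, name: str, sessions: SharedSessionStore) -> None:
        self.name = name
        self._sessions = sessions

    def login(self, user: str) -> str:
        session_id = uuid.uuid4().hex
        self._sessions.put(session_id, {"user": user})
        return session_id

    def whoami(self, session_id: str) -> Optional[str]:
        session = self._sessions.get(session_id)
        return session["user"] if session else None
```

A user who logs in through one server is recognised by any other server sharing the store, so the load balancer no longer needs sticky sessions.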
Day 4 – Let’s go!
Replace guesswork with elastic IT
[Chart: Startups pre-AWS, traditional capacity vs. demand. Capacity below demand: unhappy customers. Capacity above demand: waste $$$.]
[Chart: AWS Cloud, capacity vs. demand.]
Scaling the web tier
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; two web servers; RDS DB instance; RDS DB standby; ElastiCache nodes 1 and 2]
Scaling the web tier
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; four web servers; RDS DB instance; RDS DB standby; ElastiCache nodes 1 and 2]
Automatic resizing of compute clusters based on demand
Feature: Details
• Control: Define minimum and maximum instance pool sizes and when scaling and cool down occurs.
• Integrated with Amazon CloudWatch: Use metrics gathered by CloudWatch to drive scaling.
• Instance types: Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name MyGroup \
    --launch-configuration-name MyConfig \
    --min-size 4 \
    --max-size 200 \
    --availability-zones us-west-2c us-west-2b
[Diagram: Amazon CloudWatch triggers the Auto Scaling policy]
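The interaction between minimum and maximum pool sizes, a metric-driven trigger and a cool-down period can be sketched as follows. This is not the Auto Scaling service's logic: the CPU thresholds, step size and cool-down value are assumptions chosen for illustration, and `ScalingPolicy` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    min_size: int
    max_size: int
    scale_out_cpu: float = 70.0      # assumed threshold, percent
    scale_in_cpu: float = 30.0       # assumed threshold, percent
    step: int = 2                    # instances added or removed per action
    cooldown_seconds: float = 300.0  # assumed cool-down period

    def desired_capacity(self, current: int, avg_cpu: float,
                         seconds_since_last_action: float) -> int:
        """Return the new group size for the latest metric reading."""
        if seconds_since_last_action < self.cooldown_seconds:
            return current  # still cooling down: leave the group alone
        if avg_cpu > self.scale_out_cpu:
            target = current + self.step
        elif avg_cpu < self.scale_in_cpu:
            target = current - self.step
        else:
            target = current
        # Never leave the configured pool size range.
        return max(self.min_size, min(self.max_size, target))
```

With `min_size=4` and `max_size=200`, as in the command above, a busy group of 4 grows to 6, while an idle group of 4 stays at 4 because it is already at the minimum.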
Prerequisite: decompose into small, loosely coupled, stateless building blocks
What does this mean in practice?
• Only store transient data on local disk
• Needs to persist beyond a single HTTP request?
– Then store it elsewhere
• User uploads: Amazon S3
• User sessions: Amazon DynamoDB
• Application data: Amazon RDS
Having done that…
Having decomposed into small, loosely coupled, stateless building blocks, you can now scale out with ease.
Having done that…
Having decomposed into small, loosely coupled, stateless building blocks, we can also scale back with ease.
Take the shortcut
• While this architecture is simple, you still need to deal with:
– Configuration details
– Deploying code to multiple instances
– Maintaining multiple environments (Dev, Test, Prod)
– Maintaining different versions of the application
• Solution: Use AWS Elastic Beanstalk
AWS Elastic Beanstalk (EB)
• Easily deploy, monitor, and scale three-tier web
applications and services.
• Infrastructure provisioned and managed by EB
• You maintain control.
• Preconfigured application containers
• Easily customizable.
• Support for these platforms:
Loose coupling with SQS
• Place tasks into Amazon Simple Queue Service (SQS)
• SQS – buffer that protects backend systems
• Process asynchronously - at own pace
• Remove delay from latency sensitive paths
[Diagram: tight coupling vs. loose coupling. Front-end EC2 instance puts message into SQS; back-end EC2 instance gets message from SQS]
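The put-message and get-message flow can be sketched with Python's standard library. Here `queue.Queue` is only a local stand-in for SQS, and `run_pipeline` and the `processed:` prefix are hypothetical; the point is that the front end returns as soon as the task is queued while the back end drains the buffer at its own pace.

```python
import queue
import threading
from typing import List


def run_pipeline(tasks: List[str]) -> List[str]:
    """Front end enqueues tasks; a back-end worker processes them asynchronously."""
    buffer = queue.Queue()   # local stand-in for the SQS queue
    done = object()          # sentinel telling the worker to stop
    results: List[str] = []

    def back_end() -> None:
        while True:
            message = buffer.get()  # "Get Message"
            if message is done:
                break
            results.append(f"processed:{message}")

    worker = threading.Thread(target=back_end)
    worker.start()
    for task in tasks:
        buffer.put(task)            # "Put Message": returns immediately
    buffer.put(done)
    worker.join()
    return results
```

Because the front end never calls the back end directly, a slow or restarting worker delays processing but does not block the request path.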
Day 5 – Add more features
Mobile: Push Notifications, Mobile Analytics, Cognito, Cognito Sync
Analytics: Kinesis, Data Pipeline, Redshift, EMR
Your Applications
AWS Global Infrastructure
Network: VPC, Direct Connect, Route 53
Storage: EBS, S3, Glacier, CloudFront
Database: DynamoDB, RDS, ElastiCache
Deployment & Management: Elastic Beanstalk, OpsWorks, CloudFormation, CodeDeploy, CodePipeline, CodeCommit
Security & Administration: CloudWatch, Config, CloudTrail, IAM, Directory, KMS
Application: SQS, SWF, AppStream, Elastic Transcoder, SES, CloudSearch, SNS
Enterprise Applications: WorkSpaces, WorkMail, WorkDocs
Compute: EC2, ELB, Auto Scaling, Lambda, ECS
AWS building blocks
Inherently Scalable & Highly Available
• Automated: Elastic Load Balancing, Amazon CloudFront, Amazon Route 53, Amazon S3, Amazon SQS, Amazon SES, Amazon CloudSearch, AWS Lambda, …
• Configurable: Amazon DynamoDB, Amazon Redshift, Amazon RDS, Amazon ElastiCache, …
Scalable & Highly Available
• With the right architecture: Amazon EC2, Amazon VPC
Stay focused as you scale your team
[Chart: On-Premise Infrastructure: 30% your business, 70% managing all of the “undifferentiated heavy lifting”. AWS Cloud-Based Infrastructure: 70% more time to focus on your business, 30% configuring your cloud assets.]
Day 6 – Growing fast
Scaling Relational DBs
• Increase RDS instance specs
– Larger instance type
– More storage / more PIOPS
• Read Replicas (Master – Slave)
– Scale out beyond capacity of single DB instance
– Available in Amazon RDS for MySQL, PostgreSQL and Amazon Aurora
– Writes => master
– Replication lag
– Reads with tolerance to stale data => read replica (slave)
– Reads with strong consistency requirements => master
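The routing rules in the last three bullets can be sketched as a small helper that picks an endpoint per query. The endpoint strings and the `ReplicaRouter` name are hypothetical, and choosing a replica at random is an assumption; any distribution scheme would do.

```python
import random
from typing import List, Optional


class ReplicaRouter:
    """Send writes and consistency-sensitive reads to the master, other reads to replicas."""

    def __init__(self, master: str, replicas: List[str],
                 rng: Optional[random.Random] = None):
        self._master = master
        self._replicas = list(replicas)
        self._rng = rng or random.Random()

    def endpoint_for(self, is_write: bool, needs_fresh_read: bool = False) -> str:
        # Writes always go to the master; so do reads that cannot tolerate
        # replication lag. With no replicas configured, everything goes there.
        if is_write or needs_fresh_read or not self._replicas:
            return self._master
        return self._rng.choice(self._replicas)
```

A typical use is to pass `needs_fresh_read=True` right after a user updates their own data, so they never see a stale copy caused by replication lag.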
Scaling the DB
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; four web servers; RDS DB instance; RDS DB standby; ElastiCache nodes 1 and 2]
Scaling the DB
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; four web servers; RDS DB instance; RDS DB standby; one RDS read replica; ElastiCache nodes 1 and 2]
Scaling the DB
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; four web servers; RDS DB instance; RDS DB standby; two RDS read replicas; ElastiCache nodes 1 and 2]
What if your app is write-heavy?
Challenge: You will eventually hit the write throughput or
storage limit of the master node
Solutions:
• Federation (splitting into multiple DBs based on function)
• Sharding (splitting one data set across multiple hosts)
Database federation
• Divide tables into smaller autonomous databases
• Harder to do cross-function queries
• Won’t help with single huge functions/tables
[Diagram: Forums DB, Users DB, Products DB]
Sharded horizontal scaling
• Store a subset of rows in each database shard
• More complex at the application layer
• No practical limit on scalability
• Operational complexity
User ShardID
002345 A
002346 B
002347 C
002348 B
002349 A
Shard C
Shard B
Shard A
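The User/ShardID table above is effectively a lookup directory, and the application-layer complexity comes from consulting it on every query. Below is a minimal sketch: the directory entries are copied from the slide, while the hash-based placement of users not yet in the directory is an assumption added for illustration, as is the `shard_for` name.

```python
import hashlib
from typing import Dict, List

SHARDS: List[str] = ["A", "B", "C"]

# Directory copied from the slide's User/ShardID table.
DIRECTORY: Dict[str, str] = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}


def shard_for(user_id: str) -> str:
    """Look the user up in the directory; place unknown users by a stable hash."""
    if user_id in DIRECTORY:
        return DIRECTORY[user_id]
    # Assumption: users not yet in the directory are placed deterministically
    # so that repeated lookups always land on the same shard.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]
```

Every query for a given user must first call something like `shard_for` to choose the database connection, which is why queries spanning many users become harder once data is sharded.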
NoSQL data stores
• Trade query & integrity features of Relational DBs for
– More flexible data model
– Horizontal scalability & predictable performance
DynamoDB
Provisioned read/write performance per table
Massive and Seamless Scale
• Distributed system that can scale both reads and writes
– Sharding + Replicas
• Automatic partitioning:
– Data set size growth
– Provisioned capacity increases
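Provisioned performance per table is expressed in capacity units, and the arithmetic is easy to sketch. The sizing rules used here (one read unit per strongly consistent read per second of up to 4 KB, half that for eventually consistent reads, one write unit per write per second of up to 1 KB) are taken from the DynamoDB documentation rather than from these slides, and the function names are hypothetical.

```python
import math


def read_capacity_units(item_bytes: int, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """Read units needed for a steady read rate on items of a given size."""
    units_per_read = math.ceil(item_bytes / 4096)  # each read is billed in 4 KB steps
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_bytes: int, writes_per_second: int) -> int:
    """Write units needed for a steady write rate on items of a given size."""
    return math.ceil(item_bytes / 1024) * writes_per_second  # 1 KB steps
```

For example, 100 strongly consistent reads per second of 6,000-byte items need 200 read units, or 100 if eventual consistency is acceptable.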
Summary
[Diagram: www.example.com; Amazon Route 53 DNS service; Elastic Load Balancing; S3 bucket for static assets; Availability Zones a and b; RDS DB instance; RDS DB standby; four RDS read replicas; ElastiCache nodes 1 to 4; DynamoDB; CloudSearch; Lambda; SES; SQS; label: “No limit”]
A quick review
• Keep it simple and stateless
• Make use of managed self-scaling services
• Multi-AZ and AutoScale your EC2 infrastructure
• Use the right DB for each workload
• Cache data at multiple levels
• Simplify operations with deployment tools
Next steps?
READ!
• aws.amazon.com/documentation
• aws.amazon.com/architecture
ASK FOR HELP!
• forums.aws.amazon.com
• aws.amazon.com/support
Q&A