Scalable Architectures
How to Design and Implement Service for Tens of Millions of Users – Case TeamUp
Kalle Launiala, ProtonIT, The Ball
[email protected], +358445575665
Table of Contents
• Case: TeamUp
• Digital service and its user – the definitions..?
• Migration from existing ASP.NET MVC app to scalable
• DEMO – What does the architecture look like in reality?
• Mobile apps with respect to scalable architecture
• The role of the mobile ecosystem (app store) in free data transfer
• Cost structure of digital service
Case: TeamUp
Bringing together talents, fans and sponsors
The Business Model
• Free for talents
• Free for fans
• Talents & fans & products for fans – web shop
  • Free = 0% margin
  • Not even covering the costs of running the web shop
• Sponsors – the basic level is free
  • Pricing starts from clearly measurable benefits = basis for benefit
Requirements & Motives to Join
• Talents are required for the service
  • Motive: Inform the fans, reach their own fan community
• Fans are required for the service
  • Motive: Following and supporting the talents, community
• Sponsors are required (to fund) the service
  • Motive: Visibility in proper context, brought up by the talents themselves, accepted as beneficial by fans
  • Motive: Marketing that becomes effective and measurable
Identified challenges for growth
• Talents and fans are required first, sponsors follow
• Talents and fans create all the expenses
• Sponsors (and thus funding) often come after the fan base is established
• Desirable scenario: big existing fan base
  • For example: NHL or European Soccer League = millions
  • Pre-game activation of the fan base – heavy usage spikes
Questions arising from business management...
• Can we serve million(s) of simultaneous users?
  • If one cannot immediately answer, the answer is NO
• If we can serve them, what would it cost?
  • What options are there to modify the technology or product?
• What is causing the cost, what does it mean?
  • Understanding this will help to negotiate with the partner
  • A club with a big fan base and existing heavy ICT infrastructure is likely to understand the cost and subsidize it
Usage of Resources is the Key
[Diagram components: SQL Server as master data & indexing storage; MVC controller + handler; Razor view + MVC rendering; partial views and partial view combination; web page; Ajax .json. Consumers: individuals with private data; community / shared data]
Data (GB): performance and scalability
CPU time (h): performance and scalability
Web pages, AJAX sources (JSON) = amount of data to transfer
User group total counts = multiplier for the other metrics
ProtonIT’s skills to solve the migration scenario...
• ”The Ball on Azure”
• Available on GitHub since 2012
• In production since 2013
• Very thin authorization HTTPS/WSS layer on top of Azure Blob Storage
= Real world hands on experience on scalable base level architecture
+ Co-operation with TeamUp to build against real scenario with real metrics
Initial Starting Point
• NOTE – Prices dropped – Spring-Autumn 2014:
  • Blob storage price halved – due to matching AWS...
• Metrics to begin with:
  • Browser network profiling – including caching responses
  • Server-side diagnostics for HTTP request begin-completion
  • Estimates on SQL usage per user
  • Estimates on users per time unit (more on this later)
• Thought + tools: architectural sketches + Excel!!!
Before–After – order of 50-100k

Metric                               Before       After
Storage Costs                        11 391 €     19 609 €
Network (GB)                         703 125      410 156
Network Costs                        63 281 €     36 914 €
Total Cost w/o CPU                   74 672 €     56 523 €
CPU Hours                            510 000      51 000
CPU Cost                             30 600 €     3 060 €
Total                                105 272 €    59 583 €
Mbps Avg / Constant Load             2 222        1 296
Storage Transactions / second Avg    0            118 056
Active Users: 1 000 000
Landing Page Uncached Loads / Month / User: 60
Landing Page Cached Loads / Month / User: 3000
CPU 600ms => 60ms
Dataservice JSON => Cacheable BLOB .json
Can we serve it?
How about 10x?
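The before/after figures can be cross-checked with a few lines of arithmetic. This is a minimal sketch, not the original Excel model. It assumes the figures are monthly, a 30-day month, 1 GB = 1024 MB, and the rounded unit prices from the Azure price list in this deck (0,09 €/GB outbound, 0,06 €/h for one vCore). The function names are made up for illustration.

```python
# Rounded unit prices from the Azure price list in this deck (2014).
EUR_PER_GB_OUT = 0.09      # outbound network bandwidth
EUR_PER_CPU_HOUR = 0.06    # 1 vCore + 1,75G
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month


def monthly_cost(storage_eur, network_gb, cpu_hours):
    """Return network, CPU and total cost in euros for one month."""
    network_eur = network_gb * EUR_PER_GB_OUT
    cpu_eur = cpu_hours * EUR_PER_CPU_HOUR
    return {
        "network_eur": network_eur,
        "cpu_eur": cpu_eur,
        "total_eur": storage_eur + network_eur + cpu_eur,
    }


def average_mbps(network_gb):
    """Constant-load bandwidth in Mbit/s (assumes 1 GB = 1024 MB)."""
    return network_gb * 1024 * 8 / SECONDS_PER_MONTH


before = monthly_cost(storage_eur=11_391, network_gb=703_125, cpu_hours=510_000)
after = monthly_cost(storage_eur=19_609, network_gb=410_156, cpu_hours=51_000)

print(round(before["total_eur"]), round(average_mbps(703_125)))
print(round(after["total_eur"]), round(average_mbps(410_156)))
```

For the "10x" question, multiplying the inputs by ten gives a first linear estimate. Whether the load really scales linearly is a separate assumption.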
Digital Service... And its user in context of this presentation (= scalability)
Digital Service
• Web app
• Mobile app
• Open data API/service
• Internet-of-anything service
... User
• Single individual
• User group
• Single user’s device
• Device in the role of (or on behalf of) a user group
• Whatever device...
• ... Another digital service (= endless chain of services)
Basis for calculations: User?
What is the definition of user?
• Human or device?
• Is the user different on a weekend during the game compared to the same person checking status reports on a workday?
How many per time period, how active?
• Is the real-world user identified, or split into multiple user roles based on his or her actions?
  • ”User reading the pages”, ”User making an update”, ”User making a purchase”
Is idle/away time from the service counted/recognized, or do we only care about logged-in and active users?
• User experience can be designed better if the real user is recognized
Why does it matter?
For the calculations it doesn’t matter, as long as hours and months are not mixed up – watch out for the order of magnitude of 300 bytes (cached 304) outbound vs 1 IO transaction!
The definition of a (paying/valuable) active user DOES MATTER for business and for investors, so it’s better not to be confusing...
Example: 50%+ of costs are caused by IoT M2M service chains... What is our user and how many of them do we have?
Results may alter/affect the service productization and business model, which is good if identified!
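The order-of-magnitude remark about a cached 304 response can be put into numbers. A minimal sketch, assuming the unit prices from the Azure price list in this deck (0,09 €/GB outbound, 0,02681 € per million storage transactions), 1 GB = 1024³ bytes, and exactly one storage transaction per 304 response:

```python
EUR_PER_GB_OUT = 0.09                    # outbound bandwidth
EUR_PER_MILLION_TRANSACTIONS = 0.02681   # storage transactions
BYTES_PER_GB = 1024 ** 3                 # assumption: binary gigabyte


def cost_of_304(response_bytes=300, storage_transactions=1):
    """Return (network cost, transaction cost) in euros for one 304 response."""
    network_eur = response_bytes / BYTES_PER_GB * EUR_PER_GB_OUT
    transaction_eur = storage_transactions * EUR_PER_MILLION_TRANSACTIONS / 1_000_000
    return network_eur, transaction_eur


network_eur, transaction_eur = cost_of_304()
print(f"network: {network_eur:.2e} EUR, transaction: {transaction_eur:.2e} EUR")
```

Under these assumptions both parts come out on the order of 10⁻⁸ €, so the storage transaction cannot be ignored next to the 300 bytes of outbound traffic.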
Migration scenarios
ASP.NET MVC and mobile apps architecture
Starting Point
[Diagram components: SQL Server as master data & indexing storage; MVC controller + handler; Razor view + MVC rendering; partial views and partial view combination; web page; Ajax .json. Consumers: individuals with private data; community / shared data]
Understanding Resource Usage
• User specific vs. group (of users) specific
• Processing capacity, storage, network
• Concretize: Turn data into transferable files
  • ETags, expiration, ”visible/concrete file” data
  • Transferring files is what caches and CDNs are all about
• Updates and user-given free-form searches
  • A fraction of the total request count, doesn’t need to be immediate
[Diagram labels: .html, .js = jQuery, .jpg, .json => Already...]
Initial goal: CPU consolidation + network bandwidth caching
[Diagram components: SQL Server as master data & indexing storage; storage restructured to master JSON (.json files with JSON information dependencies); HTML, jQuery, Dust.js etc. served directly as .json BLOB; JSON / REST served directly as .json BLOB; jQuery & browser rendering. Consumers: individuals with private data (personal data); community / shared data]
User-specific ”CPU + SQL queries” moved away from GET requests: 600ms => 60ms
Updates and free-form searches as ASYNC, results as JSON
[Diagram components: scalable cloud storage (HTML/JavaScript/CSS/JSON) with owner-separated storage & operations; web content serving for HTTP GET requests; web operation handling for HTTP POST requests; web page(s) with libraries / frameworks to support viewing, submitting and JSON data management]
Already have: Entity Framework models – for information flows
Already: URL (json) = EF + WCF Dataservice
User-specific authorization is simple to do at the EF level:
+ T4 code generation
+ .JSON model customization as an implicit by-product
An HTTPModule can store the response into a .json file with the URL as filename
= Works as the blob name as-is
=> Directly into Azure Blob
Safe step-by-step migration doable
[Diagram components: the existing pipeline (SQL Server as master data & indexing storage; MVC controller + handler; Razor view + MVC rendering; partial views and partial view combination; web page; Ajax .json) shown together with the new pipeline (storage restructured to master JSON with JSON information dependencies; JSON / REST served directly; web page with jQuery & browser rendering). Consumers: individuals with private data; community / shared data]
Community-wide public content: CDN deliverable / cacheable, browser cacheable
- One blob-storage cache applies to all users
Individual user private data: browser cacheable
- Requires user-specific blob caching on the server side
Identifying user vs. public community data is best done at the database level. The separation can then be applied to the rest of the processing chain with simple rules.
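The "simple rules" for public vs. private data could look roughly like this. A minimal sketch: the blob key layout (public/…, users/…), the Cache-Control values and all names are hypothetical choices for illustration. The only part taken from the slide is the split itself: one shared cache entry for public data, one per user for private data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CachePolicy:
    blob_key: str        # where the cached .json lives in blob storage
    cache_control: str   # HTTP Cache-Control header value


def cache_policy(resource: str, owner_user_id: Optional[str]) -> CachePolicy:
    """Pick a cache location and header from the owner flag set at database level."""
    if owner_user_id is None:
        # Public community data: one blob serves all users, CDN/proxy cacheable.
        return CachePolicy(f"public/{resource}", "public, max-age=300")
    # Private data: one blob per user, cacheable only in that user's browser.
    return CachePolicy(f"users/{owner_user_id}/{resource}", "private, no-cache")


print(cache_policy("teams/42.json", None))
print(cache_policy("profile.json", "user-7"))
```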
Entity Framework
ASP.NET MVC + WCF Dataservice + html & jQuery
.json files from blobs + ”Browser Templates” + html & jQuery
DEMO
Azure Blob Storage view vs. Local Filesystem view of web-app
Boiling down to – gradually
• Server is just an authorizing HTTPS/WSS handler
• Data to be served is .html, .js, binary and .json
• Encrypted cookies/tokens provide state management
• Request sources and handled data as a whole:
  • URL of request = what app/instance, which group of users
  • Cookie(s) of request = who is asking, what is the state of the process
  • Rest of the stuff from blob storage based on the above (either authorized or public)
  • Memory cache is only for encryption keys that are common over the whole infrastructure
  • Request statistics/logging => into blob storage on a per-request basis => background processing to aggregate it
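The request handling described above (URL = which group, cookie = who is asking, everything else from blob storage) can be sketched as a stateless check. Note the swap: this sketch signs the token with HMAC-SHA256 instead of encrypting it, to stay within the Python standard library. The key, the token layout and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key; the only state shared across the whole infrastructure.
SECRET_KEY = b"demo-key-shared-by-all-servers"


def issue_token(user_id: str, group: str) -> str:
    """Create a signed (not encrypted) token carrying the user's identity."""
    claims = json.dumps({"user": user_id, "group": group}).encode()
    payload = base64.urlsafe_b64encode(claims).decode()
    signature = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + signature


def authorize(url_path: str, token: str):
    """Return the blob path to serve, or None if the request is not allowed."""
    payload, sep, signature = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    # URL of request = which group; cookie/token = who is asking.
    group, _, rest = url_path.lstrip("/").partition("/")
    if not rest or group != claims["group"]:
        return None
    return f"{group}/{rest}"


token = issue_token("user-7", "teamA")
print(authorize("/teamA/news.json", token))
print(authorize("/teamB/news.json", token))
```

No per-user session is kept on the server: any instance holding the shared key can validate the request and map it straight to a blob path.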
Scalable by design
• User groups as data owners = data storage folders
  • Like a filesystem folder, authorized access required
• GET requests served directly through the auth service
• POST requests processed by background workers based on group-bound priorities
• Individual blobs scale to 500 IOPS & 60MB/s
• Group-specific blob/Azure Files scale to 60MB/s
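The per-blob targets (500 requests/s, 60 MB/s) give a quick upper bound for how hard a single hot blob can be hit. A minimal sketch, assuming 1 MB = 1024 × 1024 bytes and that spreading load over identical copies of a blob is an acceptable workaround. The copies idea is an assumption for illustration, not something the slides prescribe.

```python
import math

BLOB_MAX_REQUESTS_PER_S = 500
BLOB_MAX_BYTES_PER_S = 60 * 1024 * 1024  # 60 MB/s, assuming binary megabytes


def max_requests_per_second(blob_size_bytes: int) -> float:
    """Requests/s one blob can serve: the tighter of the two targets."""
    return min(BLOB_MAX_REQUESTS_PER_S, BLOB_MAX_BYTES_PER_S / blob_size_bytes)


def copies_needed(target_rps: float, blob_size_bytes: int) -> int:
    """How many identical copies of a blob would cover the target request rate."""
    return math.ceil(target_rps / max_requests_per_second(blob_size_bytes))


print(copies_needed(10_000, 50 * 1024))      # small JSON: request-rate bound
print(copies_needed(10_000, 1024 * 1024))    # 1 MB blob: bandwidth bound
```

Requests answered from a browser cache or a CDN never reach the blob, so these limits apply only to the traffic that actually gets through.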
Layers and levels of caching
• Mobile app has its own local data storage
  • Updating and refreshing data like an HTTP web app
• HTTP cache levels
  • Completely public = HTTP proxy server cacheable, expire time
  • Strictly private = client/browser cache, expire or ETag/MD5
  • HTTPS/SSL => strictly private (proxies don’t see the traffic)
  • Response code 200 ”OK” = full network outbound download
  • Response code 304 ”Not Modified” = <300 bytes + IO transaction
  • Successful expire-time cache hit = no network traffic at all
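The 200 vs. 304 distinction with an ETag/MD5 validator can be sketched as follows. This is a simplified illustration: the function names are made up, and a real handler would also deal with multiple ETags in If-None-Match and with expiry headers.

```python
import hashlib


def etag_for(body: bytes) -> str:
    """ETag derived from the MD5 of the content, quoted as HTTP expects."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET with an optional If-None-Match."""
    etag = etag_for(body)
    if if_none_match == etag:
        # Client copy is current: headers only, no body on the wire.
        return 304, {"ETag": etag}, b""
    # First load or changed content: full outbound download.
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


body = b'{"team": "TeamUp", "score": 3}'
status, headers, payload = conditional_get(body)
print(status, len(payload))
status2, _, payload2 = conditional_get(body, if_none_match=headers["ETag"])
print(status2, len(payload2))
```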
Mobile apps and app stores
• Special perk: free data transfer for the app package
  • Size limits for apps are measured in GBs
  • Should take advantage of this to get semi-static data to users
• Architecture is simplified down to transferring JSON files
• Straightforward to support offline mode
• Can possibly take advantage of other cloud file services
  • OneDrive, Google Drive, Dropbox...
Digital Service Cost Structure
What does it cost to have the requests served..?
Azure pricing = costs

What                                               How Much         Cost       Per Unit
1/x shared vCore + 768M                            750h = 1 month   11,09€     0,015€/h
1 vCore + 1,75G (* multiples derived from this)    750h = 1 month   44,33€     0,06€/h
SQL Database (Web & Business) (RET)                150 GB           168,14€    1,12€/GB
Outbound network bandwidth                         2000 GB          178,29€    0,09€/GB
Block Blob Storage (Locally Red.)                  10000GB          178,73€    0,018€/GB
Block Blob Storage (Geo R. Read Acc.)              10000GB          681,67€    0,068€/GB
Page Blob (= Disk) Storage (LR)                    10000GB          372,35€    0,037€/GB
Page Blob (= Disk) Storage (RA-GRS)                10000GB          759,96€    0,076€/GB
Files (SMB Share) (LR)                             10000GB          297,88€    0,030€/GB
Files (SMB Share) (RA-GRS)                         10000GB          968,11€    0,097€/GB
Storage Transactions (LR & RA-GRS)                 1000M            26,81€     0,02681€/M
Content Delivery Network                           2000GB           129,58€    0,06€/GB
”Own Servers” vs Azure
• Much cheaper CPU time
  • Does not scale to millions of users
• Much cheaper outbound bandwidth
  • When on balanced load, does not scale to spikes
  • ... Real-world service is never under balanced load...
• Reliable and scalable storage costs a LOT
  • ... And still doesn’t scale to millions of users spiking
Add redundancy and PaaS-level maintenance...
Azure Pricing vs Private Cloud
• Disclaimer: UpCloud only as an example
• Not 1:1 comparable, yet still comparable...
• UpCloud Price List: considerable pricing difference between Helsinki & London, better to check with sales... (the cheaper prices were used for the comparison)
Azure Pricing vs UpCloud
• Storage: Azure (LR-3x) Blob Storage = 0,018€/GB
  • MaxIOPS = 0,20€/GB (~ 10x more expensive)
  • HDD = 0,09€/GB (~ 5x more expensive)
  • SSD = 0,36€/GB (~ 20x more expensive)
  • Backup (why?) = 0,05€/GB (+ 2,7x more expensive – if still done with blob storage = 0,036€/GB total)
  • IO transaction = 0€ (cheaper, clearly not identified)
• Outbound Network Bandwidth
  • Azure: 0,09€/GB (blob & processable), 0,06€/GB CDN
  • UpCloud: 0,05€/GB
Differentiating factor? Control!
• ”What is the capacity of a highway in cars per hour?”
• How to measure the capacity of a city, roads, crossroads?
• A highway can be charged as-a-service per served car
  • A primitive storage system can be as well
  • Primitive metrics can be designed against...
Azure Storage Scalability and Performance Targets
http://msdn.microsoft.com/en-us/library/azure/dn249410.aspx
Single Blob: ”Up to 60 MB per second, or up to 500 requests per second”
Thank You!
Questions / explanations..?