Scaling
Rough notes on high-scalability systems
- Scalability
- Scalability is the capability of a system, process, or network to grow and manage increased demand.
- Horizontal
- Scaling out by adding more servers
- Easier to adjust dynamically
- Vertical
- Scaling up by beefing up server resources/power
- Reliability
- The probability a system will perform without failure over a given period
- Achieved through redundancy of both the software components and data.
- Availability
- The proportion of time a system remains operational
- If a system is reliable, it is available.
- However, if it is available, it is not necessarily reliable.
- Efficiency
- Response Time
- Throughput
- Serviceability or Manageability
- Also Observability
- Load Balancing
- Prevents single point of failure
- Improves overall application availability and response time
- Distributes load, so there is no single struggling server
- Provides analytics
- Handles failures
- Can add LBs in three places
- Between the user and the web server
- Between web servers and an internal platform layer, like application servers or cache servers
- Between internal platform layer and database.
- How do they work
- Health checks to backend servers
- Methods
- Round Robin
- Weighted round robin (sketched below)
- Least connections
- Least response time
- IP hash
- LBs must also be redundant
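- A minimal sketch of weighted round robin (the server names and weights below are made up for illustration):

```python
import itertools

# Hypothetical backend pool: (server, weight). A weight of 3 means the server
# should receive 3x the traffic of a weight-1 server.
BACKENDS = [("app-1", 3), ("app-2", 1), ("app-3", 1)]

def weighted_round_robin(backends):
    """Yield backends in proportion to their weights, round-robin style."""
    expanded = [name for name, weight in backends for _ in range(weight)]
    return itertools.cycle(expanded)

balancer = weighted_round_robin(BACKENDS)
for _ in range(10):
    print(next(balancer))  # app-1, app-1, app-1, app-2, app-3, app-1, ...
```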
- Caching
- LB helps scaling, but caching enables:
- Vastly better use of resources
- Implementing requirements that would otherwise be unattainable
- Pros: Better for static data
- Cons: Not good for user-generated data, where freshness and consistency are critical
- Assume 1M requests for 100 items
- If we split 50 items to each of 2 servers, both may still have to deal with all 1M requests (a request can need items from both halves)
- If we replicate 100 items to 2 servers, each will deal with 500K requests instead
- GET requests can be UDP, SET/DELETE can be TCP
- Application server cache
- Could improve request/response time, but if we have scaled with many nodes that are behind a load balancer, we’ll have many cache misses, because requests will be routed to different nodes
- CDNs
- Tip: use a different subdomain (cdn.facebook.com) for static content, will help later when passing DNS to your CDN
- Cache invalidation
- Cache is amazing, but needs management to keep cache coherent with DB
- If data is updated on DB, needs to be updated in cache.
- Known as cache invalidation
- Types of Cache
- Write-through
- Write to cache and DB at the same time
- More reliable, less risk of data loss
- But, with increased latency for write operations
- Write-around
- Bypass cache and write directly to DB
- If the data is then re-read, a cache miss will happen
- Write-back
- Data is written to cache and confirmed immediately to client
- Data will be written to DB at a later time
- Low latency and high throughput for write-intensive apps
- Comes with risk of data loss
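- A rough sketch contrasting the three write policies above (plain dicts stand in for the cache and the DB; single-threaded and illustrative only):

```python
cache, db = {}, {}      # stand-ins for a cache server and a database
dirty = set()           # keys written to the cache but not yet flushed to the DB

def write_through(key, value):
    # Write to cache and DB together: slower writes, no loss if the cache dies.
    cache[key] = value
    db[key] = value

def write_around(key, value):
    # Bypass the cache: the next read of `key` will be a cache miss.
    db[key] = value
    cache.pop(key, None)        # drop any stale cached copy

def write_back(key, value):
    # Acknowledge after caching only; flush to the DB later (risk of data loss).
    cache[key] = value
    dirty.add(key)

def flush():
    # Periodic background flush used with the write-back policy.
    for key in dirty:
        db[key] = cache[key]
    dirty.clear()
```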
- Cache Eviction policies
- FIFO
- LIFO
- LRU
- LFU
- RR
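- A minimal LRU eviction sketch using Python's OrderedDict (the capacity of 2 is picked arbitrarily):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used key once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                     # cache miss
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used key

cache = LRUCache(2)
cache.set("a", 1); cache.set("b", 2)
cache.get("a")
cache.set("c", 3)                           # evicts "b", the LRU entry
```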
- Data Partitioning
- Breaking up big DB in smaller parts
- Easier to scale horizontally than vertically after a point
- Types
- Horizontal Partitioning / Sharding / Range based partitioning
- Put ranges of rows in different tables
- If we split across multiple servers, then we have sharding
- Main problem is that if the range isn't chosen properly it can lead to unbalanced servers
- e.g. using zip code can result in unbalances (bigger vs smaller cities)
- Vertical Partitioning
- Different tables in different servers
- e.g. for instagram, separate servers for
- Profile info
- Friend list
- Photos
- Main problem is that you might need to distribute to more servers if specific server grows big (e.g. billions of photos)
- Federation
- Split into multiple DBs based on function (product, user, forums DBs)
- Directory-based partitioning
- Abstracts away partitioning details
- Partitioning Criteria
- Key or Hash-based
- 100 servers, ID % 100
- Must ensure uniform allocation
- Con: changing the hash function (e.g. when adding servers) requires redistributing the data (see sketch after this list)
- List partitioning
- Each partition is assigned a key
- In order to insert a record, we see which partition has the respective key
- e.g. store Sweden, Norway etc. in the partition for Nordic countries
- Round Robin partitioning
- Ensures uniform distribution
- Composite partitioning
- Combine above methods
- E.g. list partitioning then hash-based
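- A quick illustration of the hash-based con above: going from 4 to 5 servers with `ID % n` remaps most keys (numbers are made up, but ~80% is what uniform hashing gives):

```python
def server_for(key, n_servers):
    # Naive hash partitioning: key -> server index
    return hash(key) % n_servers

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if server_for(k, 4) != server_for(k, 5))
print(f"{moved / len(keys):.0%} of keys change servers")  # roughly 80%
```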
- Problems with partitioning
- Joins and Denormalization
- Join needs to combine data from many servers
- Can be mitigated with denormalization, but we need to deal with data inconsistency if we go down that path
- Referential integrity
- Data integrity constraints, like foreign keys, can be hard
- Enforcing them across partitions is frequently not supported by the RDBMS, so they need to be enforced in application code
- Rebalancing
- Needed when data distribution is not uniform, or if there’s a lot of load in one partition
- Performing rebalancing is difficult to do without downtime
- DB Indexing
- One of the first methods to consider when performance is no longer satisfactory
- Makes searches of records faster
- Slows down data insertion & update
- Can be created using one or more columns
- Makes sense for read-intensive, not for write-intensive apps
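- A small sqlite3 sketch of adding an index and checking the query plan (the table and column names are made up; this ties in with the EXPLAIN tip at the end of these notes):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)

query = "SELECT * FROM users WHERE email = ?"
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("user42@example.com",))
print(plan.fetchall())   # full table SCAN of users

conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("user42@example.com",))
print(plan.fetchall())   # SEARCH using idx_users_email instead of a scan
```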
- Proxy Servers
- Used to
- Filter requests
- Log
- Transform requests
- Add/remove headers
- Encrypt/Decrypt
- Compressing
- Caching
- Redundancy & Replication
- Redundancy is the duplication of critical components or functions of a system
- Goals
- Increasing the reliability of the system, usually in the form of a backup
- Improve actual system performance
- Without redundancy, losing the single server that holds a file means losing the file
- Replication means sharing info to ensure consistency between redundant resources
- Widely used in RDBMS systems, usually with a master-slave relationship
- Master gets all updates which then ripple to the slaves
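- A toy sketch of the master-slave flow described above (in-memory dicts, and the "ripple" is synchronous here purely for brevity; real systems replicate asynchronously over the network):

```python
class Replica:
    def __init__(self):
        self.data = {}

    def read(self, key):              # replicas serve reads
        return self.data.get(key)

class Master:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):      # all writes go to the master...
        self.data[key] = value
        for replica in self.replicas: # ...and ripple out to the slaves
            replica.data[key] = value

replicas = [Replica(), Replica()]
master = Master(replicas)
master.write("user:1", {"name": "Ada"})
print(replicas[0].read("user:1"))
```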
- SQL vs NoSQL
- In SQL you design your schema; in NoSQL you design your queries
- Relational DBs
- Have predefined schema
- Use rows and columns
- Each row contains all info for entity
- ACID compliant
- Atomicity, Consistency, Isolation, Durability
- Atomicity, each transaction is all or nothing
- Consistency, any transaction will bring the database from one valid state to another
- Isolation, executing transactions concurrently has the same results as if the transactions were executed serially
- Durability, once a transaction has been committed, it will remain so
- Non-Relational DBs
- Sacrifice ACID for scalability and performance
- Unstructured, Distributed, with dynamic schema
- A single record typically holds all the data for an entity (related data is stored together rather than joined)
- Types
- Key-Value storage
- Redis
- Document DBs
- Data stored in documents that are grouped in collections
- Each can have entirely different structure
- MongoDB, CouchDB
- Graph DBs
- Store data whose relationship is better represented as a graph
- Data saved in nodes (entities), properties (info about the entities), and lines/edges (connections between them)
- InfiniteGraph, Neo4j
- CAP Theorem
- Consistency, Availability, Partition tolerance
- A distributed system can provide at most two of the three guarantees above
- Consistency
- Every read receives the most recent write or error
- Achieved by updating all nodes before allowing further reads
- Sacrifices Availability
- Different from Consistency as defined for ACID
- Availability
- Every request receives a non-error response, w/o guarantee that it contains the most recent write
- Every request gets a response, even if some nodes fail
- Partition tolerance
- The system continues to function even if nodes can't communicate with each other
- When a network failure happens, we have to decide whether to:
- Cancel the operation, i.e. decrease availability but ensure consistency
- Proceed with the operation, i.e. provide availability w/o consistency
- Consistent Hashing
- Use a Binary Search Tree (supporting successor lookups) to represent the cache nodes on the hash ring
- On average only k/n keys need to be moved when a server is added or removed (k: total keys, n: servers)
- Contrast with normal hashing, where most keys need to be moved
- DHT (Distributed Hash Table) is one of the fundamental components in distributed systems
- Problems with normal hashing (e.g. key % n)
- Not horizontally scalable. It breaks when we add one more node
- Might not be load balanced
- To reduce variance, we can use virtual copies of caching servers
- k copies of each server will make Binary Search Tree k times bigger
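- A minimal consistent-hash ring sketch; a sorted list plus bisect stands in for the BST (the bisect call is the successor lookup), and the virtual-node count is arbitrary:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, servers, virtual_copies=100):
        self.ring = []                         # sorted list of (hash, server)
        for server in servers:
            for i in range(virtual_copies):    # virtual copies reduce variance
                self.ring.append((self._hash(f"{server}#{i}"), server))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def server_for(self, key):
        # Successor lookup: first virtual node clockwise from the key's hash.
        idx = bisect.bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.server_for("user:42"))   # only nearby keys move if a server is added
```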
- Polling strategies
- AJAX Polling
- Repeatedly poll server
- HTTP Long Polling
- Like regular polling, but the server holds the request open until it has data to send (sketch after this list)
- “Hanging GET”
- Websockets
- Low overhead, real-time TCP connection with server
- Server-Sent Events (SSEs)
- Persistent and long-term connection with server, which server uses to send data to client
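- A bare-bones long-polling client loop using only the standard library (the URL and response format are hypothetical; a real endpoint would hold the request open until it has new data):

```python
import json
import urllib.request

POLL_URL = "https://example.com/updates?cursor={cursor}"   # hypothetical endpoint

def long_poll(cursor=0):
    while True:
        # The "hanging GET": the server keeps this request open until it has
        # something to send (or a timeout passes), then we immediately re-poll.
        url = POLL_URL.format(cursor=cursor)
        with urllib.request.urlopen(url, timeout=60) as resp:
            payload = json.load(resp)
        for event in payload.get("events", []):
            print("got event:", event)
        cursor = payload.get("cursor", cursor)
```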
- Eventual Consistency
- Provide BASE, instead of ACID
- Basically Available, Soft state, Eventual consistency
- Strategies
- Android and iPhone push services
- For 10k-100k users, consider using CDN for static content
- Pull vs Push CDN
- For 500k users, consider breaking up service to microservices
- Each microservice can be scaled independently
- For 1M users, consider LBs between tiers (web, app, DB)
- Use SQL EXPLAIN to see how queries are being executed