Dwarves Memo

Properties

Tags: database sharding engineering

Location

Folder: /research/notes

Why?

Data-driven apps need the ability to scale dynamically to cope with significant future growth

What?

Database architecture
Horizontal partitioning - separating one table’s rows into multiple different tables (partitions)

How?

Break up one’s data into 2 or more smaller chunks (logical shards)
Chunks are distributed across separate DB nodes (physical shards)
Shards exemplify a shared-nothing architecture: they don’t share any of the same data or computing resources
A table can be replicated to all shards as a reference table
Can be implemented in either the backend (application) layer or the DB layer (e.g. ClickHouse)
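
The steps above can be sketched at the application layer. This is a minimal illustration, not a production design: the node names, the `Cluster` class, the shard count and the round-robin placement are all hypothetical, and in-memory dicts stand in for real DB nodes.

```python
import zlib

NUM_LOGICAL_SHARDS = 8               # assumed number of logical shards
NODES = ["db-node-a", "db-node-b"]   # assumed physical shards (DB nodes)


def logical_shard(key: str) -> int:
    """Map a row key to a logical shard with a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_LOGICAL_SHARDS


def physical_node(shard_id: int) -> str:
    """Place logical shards on physical nodes round-robin."""
    return NODES[shard_id % len(NODES)]


class Cluster:
    """Toy shared-nothing cluster: every node keeps its own private storage."""

    def __init__(self) -> None:
        self.rows = {node: {} for node in NODES}  # sharded table, one dict per node
        self.ref = {node: {} for node in NODES}   # reference table, copied to every node

    def insert(self, key: str, row: dict) -> str:
        """Store a row on exactly one node and return that node's name."""
        node = physical_node(logical_shard(key))
        self.rows[node][key] = row
        return node

    def insert_reference(self, key: str, row: dict) -> None:
        """Replicate a reference-table row to all nodes."""
        for node in NODES:
            self.ref[node][key] = row

    def get(self, key: str):
        """Read a row by asking only the node that owns its shard."""
        node = physical_node(logical_shard(key))
        return self.rows[node].get(key)
```

Keeping more logical shards than physical nodes is a design choice in this sketch: adding a node later means moving whole logical shards rather than rehashing every row.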

Benefit

Facilitate horizontal scaling (scaling out): add more machines to an existing stack in order to spread out the load and allow for more traffic and faster processing. This contrasts with scaling up, which adds compute power by upgrading hardware
Overcome the storage and compute limits of a non-distributed database
Speed up queries (a query that targets one shard scans only that shard's rows instead of all rows)
Reduce the impact of outages (an outage is likely to affect only a single shard)

Limitation

Complexity of implementing a sharded DB architecture
Data must be managed across multiple shard locations instead of a single point
Shards can eventually become unbalanced
Difficult to return to an unsharded architecture once the data has been sharded
Sharding isn’t natively supported by every database engine
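
To make the "unbalanced shards" limitation concrete, here is a small sketch that counts rows per shard and reports how far the fullest shard is from the average. The routing rule (1,000 user IDs per shard, 4 shards) and the sample IDs are made up for illustration.

```python
from collections import Counter


def shard_counts(keys, route, num_shards):
    """Count how many keys the routing function sends to each shard."""
    counts = Counter(route(k) for k in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


def imbalance_ratio(counts):
    """Rows on the fullest shard divided by the mean rows per shard (1.0 = balanced)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean else 0.0


def route_by_id_range(user_id: int) -> int:
    """Hypothetical rule: 1,000 IDs per shard, the last shard takes everything above."""
    return min(user_id // 1000, 3)


# Assumed workload: most rows fall into the newest ID range.
user_ids = (
    list(range(0, 100))
    + list(range(1000, 1100))
    + list(range(2000, 2100))
    + list(range(3000, 3700))
)
counts = shard_counts(user_ids, route_by_id_range, 4)
ratio = imbalance_ratio(counts)
```

With this sample data the last shard holds 700 of the 1,000 rows, so it carries 2.8 times the average load while the other three sit mostly idle.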

Sharding architectures

Key-based sharding: hash a value from newly written data (the shard key) to determine which shard the data should go to
Range-based sharding: use the range that a given value falls into to determine the target shard
Directory-based sharding: maintain a lookup table that maps each shard key to its shard
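
The three approaches differ only in how a shard key is turned into a shard number. A minimal sketch of each routing function follows; the shard count, range boundaries and directory entries are assumptions chosen for the example.

```python
import bisect
import zlib

NUM_SHARDS = 4  # assumed shard count


def key_based_shard(key: str) -> int:
    """Key-based: hash the shard key, then take it modulo the shard count."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


# Assumed boundaries: shard 0 holds values < 100, shard 1 < 1000,
# shard 2 < 10000, shard 3 everything else.
RANGE_UPPER_BOUNDS = [100, 1000, 10000]


def range_based_shard(value: int) -> int:
    """Range-based: find the range the value falls into."""
    return bisect.bisect_right(RANGE_UPPER_BOUNDS, value)


# Assumed lookup table from a shard key (here a region code) to a shard.
DIRECTORY = {"eu": 0, "us": 1, "asia": 2}


def directory_based_shard(region: str) -> int:
    """Directory-based: look the shard up in an explicit table."""
    if region not in DIRECTORY:
        raise KeyError(f"no shard registered for region {region!r}")
    return DIRECTORY[region]
```

The hash spreads keys fairly evenly but makes range scans touch every shard; ranges keep neighbouring values together but can skew; the directory is the most flexible but the lookup table becomes something every read and write depends on.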
