1. Social Media Feed (e.g., Twitter)

Problem Statement:
Design a system to display a real-time feed of posts from users a person follows. Prioritize freshness, scalability, and low latency.

Key Requirements:

  • Functional: Post creation, follow/unfollow, feed generation, trending topics.
  • Non-Functional: Handle 10M+ users, <1s latency for feed refresh, 99.99% availability.

What’s Being Practiced?

  • Fan-out Strategies: Push vs. pull models (e.g., write-time fan-out to followers’ caches).
  • Caching: Redis for hot feeds, CDNs for static content.
  • Sharding: Distribute users by user_id to balance load.
  • Real-Time Analytics: Track trending hashtags with Apache Kafka + Flink.
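
To make the trending-hashtag idea concrete, here is a minimal single-process sketch of a sliding-window counter. It only stands in for the kind of windowed aggregation a stream processor would run; it is not Kafka or Flink code. The class name `TrendingWindow`, the 5-minute default window, and the assumption that events arrive in timestamp order are all illustrative.

```python
from collections import Counter, deque


class TrendingWindow:
    """Counts hashtags seen in the last `window_s` seconds (hypothetical helper)."""

    def __init__(self, window_s=300.0):
        self.window_s = window_s
        self.events = deque()    # (timestamp, hashtag), oldest first
        self.counts = Counter()  # hashtag -> count inside the window

    def add(self, hashtag, ts):
        # Assumption: events arrive in non-decreasing timestamp order.
        self.events.append((ts, hashtag))
        self.counts[hashtag] += 1

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_s
        while self.events and self.events[0][0] <= cutoff:
            _, old = self.events.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def top(self, k, now):
        """Return the k most frequent hashtags in the current window."""
        self._expire(now)
        return self.counts.most_common(k)
```

In a real deployment the counts would be partitioned by hashtag across workers; the sketch keeps everything in one process to show the windowing logic only.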

Solution Sketch:

  • Hybrid Fan-out: On write, push each post into the cached feeds of active followers; for inactive followers, build the feed on read (pull).
  • Storage: Cassandra for posts (time-series data), Redis for feed caching.
  • APIs:
    • POST /tweet → Kafka → Feed Service → Cache.
    • GET /feed → Fetch from Redis.
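
The hybrid fan-out in this sketch can be illustrated with an in-memory model. This is a rough sketch under stated assumptions, not a production design: plain dicts stand in for Redis and Cassandra, `FeedService` and its method names are hypothetical, and posts are assumed to arrive in timestamp order.

```python
from collections import defaultdict


class FeedService:
    """In-memory stand-in for the feed service (names are illustrative)."""

    def __init__(self, max_feed_len=100):
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        self.posts = defaultdict(list)     # author -> [(ts, text)], post store stand-in
        self.cache = {}                    # active user -> feed, newest first
        self.max_feed_len = max_feed_len

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)
        if user in self.cache:             # keep an active user's cache consistent
            self.cache[user] = self._pull(user)

    def set_active(self, user):
        # Build the cached feed once; later posts are pushed into it.
        self.cache[user] = self._pull(user)

    def post(self, author, ts, text):
        self.posts[author].append((ts, text))
        for follower in self.followers[author]:
            if follower in self.cache:     # push path: active followers only
                feed = self.cache[follower]
                feed.insert(0, (ts, author, text))
                del feed[self.max_feed_len:]

    def _pull(self, user):
        # Pull path: merge followed authors' posts at read time.
        items = [(ts, a, t) for a in self.following[user] for ts, t in self.posts[a]]
        items.sort(reverse=True)
        return items[: self.max_feed_len]

    def get_feed(self, user):
        if user in self.cache:
            return list(self.cache[user])
        return self._pull(user)
```

The tradeoff shows up directly: `post` costs one write per active follower, while `get_feed` for an inactive user costs one read per followed author.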

2. Ride-Sharing App (e.g., Uber)

Problem Statement:
Design a system to match riders with nearby drivers in real time. Include surge pricing and trip tracking.

Key Requirements:

  • Functional: Driver/rider location updates, ride matching, fare calculation, payment.
  • Non-Functional: <500ms match latency, handle 100K+ concurrent rides.

What’s Being Practiced?

  • Geospatial Data: Use Redis GEO or QuadTrees to find nearby drivers.
  • Real-Time Messaging: WebSocket/gRPC for location updates.
  • Dynamic Pricing: Surge pricing algorithm based on demand/supply.
  • Idempotency: Ensure payment transactions aren’t duplicated.
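
As one possible reading of "surge pricing based on demand/supply", the sketch below scales the fare by the ratio of open requests to available drivers, clamped between 1.0 and a cap. The formula, the function name `surge_multiplier`, and the cap of 3.0 are assumptions for illustration; real pricing systems are considerably more involved.

```python
def surge_multiplier(open_requests, available_drivers, cap=3.0):
    """Return a fare multiplier from a simple demand/supply ratio (hypothetical formula)."""
    if open_requests <= 0:
        return 1.0   # no demand, no surge
    if available_drivers <= 0:
        return cap   # demand with no supply: charge the maximum
    ratio = open_requests / available_drivers
    # Never discount below the base fare, never exceed the cap.
    return round(min(cap, max(1.0, ratio)), 2)
```

For example, 30 open requests against 20 available drivers gives a multiplier of 1.5 under this formula.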

Solution Sketch:

  • Location Service: Store driver locations in Redis GEO.
  • Matching Engine: Use grid-based spatial partitioning.
  • Payment Service: Idempotent API with Stripe/PayPal integration.
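
Grid-based spatial partitioning can be sketched as follows: bucket drivers into fixed-size latitude/longitude cells, then search only the rider's cell and its eight neighbors. This is an in-memory illustration rather than Redis GEO; the cell size `CELL_DEG`, the class `DriverGrid`, and the method names are assumptions.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size; roughly 1.1 km of latitude


def cell_of(lat, lon):
    """Map a coordinate to its grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


class DriverGrid:
    def __init__(self):
        self.cells = defaultdict(set)  # cell -> driver ids
        self.pos = {}                  # driver id -> (lat, lon)

    def update(self, driver_id, lat, lon):
        # Move the driver out of the old cell before adding to the new one.
        old = self.pos.get(driver_id)
        if old is not None:
            self.cells[cell_of(*old)].discard(driver_id)
        self.pos[driver_id] = (lat, lon)
        self.cells[cell_of(lat, lon)].add(driver_id)

    def nearest(self, lat, lon, k=1):
        # Only the 3x3 block of cells around the rider is searched.
        cx, cy = cell_of(lat, lon)
        candidates = set()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                candidates |= self.cells.get((cx + dx, cy + dy), set())
        ranked = sorted(candidates, key=lambda d: haversine_km(lat, lon, *self.pos[d]))
        return ranked[:k]
```

Because only neighboring cells are scanned, a driver slightly beyond that block is never returned; a fuller version would widen the search ring when too few candidates are found.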

3. Distributed File Storage (e.g., Dropbox)

Problem Statement:
Design a cloud file storage system with real-time sync across devices, versioning, and conflict resolution.

Key Requirements:

  • Functional: File upload/download, sync, version history, sharing.
  • Non-Functional: Handle 1PB+ storage, <200ms sync latency.

What’s Being Practiced?

  • Delta Sync: Only transfer file chunks that changed (like rsync).
  • Conflict Resolution: Use vector clocks to detect concurrent (conflicting) edits and to order causally related ones.
  • Distributed Storage: Shard files across S3/GlusterFS.
  • Metadata Management: Store file metadata in PostgreSQL.
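
A vector clock comparison is short enough to sketch in full. Each clock maps a device ID to a counter; one edit happened before another only if every counter is less than or equal, and if neither dominates, the edits are concurrent and need conflict handling. The device names in the usage example are hypothetical.

```python
def compare(a, b):
    """Compare two vector clocks: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither clock dominates: a true conflict


def merge(a, b):
    """Element-wise maximum, used after a conflict has been resolved."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}
```

For instance, `compare({"laptop": 2}, {"laptop": 1, "phone": 1})` reports a concurrent edit, which a sync client might surface as a "conflicted copy" of the file.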

Solution Sketch:

  • Chunking: Split files into blocks, hash for deduplication.
  • Versioning: S3 for file history; version records in the metadata store (PostgreSQL, as listed above, or DynamoDB).
  • APIs: WebSocket for real-time sync notifications.
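
The chunking and deduplication steps can be sketched with a content-addressed block store: each chunk is keyed by its SHA-256 hash, and a file version is just an ordered list of hashes (a manifest). The class `BlockStore`, the tiny `CHUNK_SIZE`, and the use of fixed-size chunks are assumptions made to keep the example small.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny for the demo, real systems use MBs


class BlockStore:
    """Content-addressed chunk storage (in-memory stand-in for object storage)."""

    def __init__(self):
        self.blocks = {}  # sha256 hex digest -> chunk bytes

    def upload(self, data, size=CHUNK_SIZE):
        """Store `data`; return (manifest, number of chunks actually transferred)."""
        manifest, sent = [], 0
        for i in range(0, len(data), size):
            chunk = data[i : i + size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.blocks:  # dedup: skip chunks already stored
                self.blocks[digest] = chunk
                sent += 1
            manifest.append(digest)
        return manifest, sent

    def download(self, manifest):
        """Reassemble a file version from its manifest."""
        return b"".join(self.blocks[h] for h in manifest)
```

Uploading `b"aaaabbbbcccc"` and then `b"aaaaXXXXcccc"` transfers three chunks the first time and only one the second, and both manifests remain downloadable as separate versions. Note that fixed-size chunks handle in-place edits well but not insertions, which shift every later boundary; rsync-style rolling checksums address that case.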

4. E-commerce Inventory System (e.g., Amazon)

Problem Statement:
Design an inventory management system that prevents overselling during flash sales.

Key Requirements:

  • Functional: Product catalog, inventory tracking, order processing.
  • Non-Functional: Atomic inventory decrements, 10K+ orders/sec.

What’s Being Practiced?

  • Distributed Locks: Use Redis or ZooKeeper to reserve stock.
  • Idempotent APIs: Prevent duplicate orders.
  • Event Sourcing: Kafka to log inventory changes.
  • Caching: Cache product details with write-through to DB.
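
One common way to make an order API idempotent is to have the client send a unique key with each logical request and have the server return the stored result when it sees the key again. The sketch below assumes that approach; `OrderService`, `create_order`, and the in-memory dict are illustrative stand-ins for a real service and its database.

```python
import uuid


class OrderService:
    """Deduplicates order creation by idempotency key (hypothetical service)."""

    def __init__(self):
        self.by_key = {}  # idempotency key -> previously created order

    def create_order(self, idempotency_key, product_id, qty):
        existing = self.by_key.get(idempotency_key)
        if existing is not None:
            return existing  # retry of a request we already handled
        order = {
            "order_id": str(uuid.uuid4()),
            "product_id": product_id,
            "qty": qty,
        }
        self.by_key[idempotency_key] = order
        return order
```

A client that times out and retries with the same key gets the original order back instead of creating a second one. In a multi-instance deployment the key lookup and insert would need to be a single atomic operation in shared storage, which this sketch does not attempt.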

Solution Sketch:

  • Inventory Service: Use Redis for atomic DECR operations.
  • Order Pipeline: Kafka → Order Service → Inventory (reserve stock) → Payment Service → Inventory (confirm, or release on payment failure).
  • DB Sharding: Split products by product_id.
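
The core of overselling prevention is that "check stock" and "decrement stock" happen as one atomic step. The sketch below shows that invariant with a lock in a single process; it is an analogy for the Redis-based approach, not Redis code. `Inventory`, `reserve`, and `flash_sale` are hypothetical names.

```python
import threading


class Inventory:
    """Check-and-decrement under a lock so stock never goes negative."""

    def __init__(self, stock):
        self._stock = dict(stock)  # product_id -> units available
        self._lock = threading.Lock()

    def reserve(self, product_id, qty=1):
        with self._lock:  # check and decrement as one atomic step
            if self._stock.get(product_id, 0) < qty:
                return False
            self._stock[product_id] -= qty
            return True

    def available(self, product_id):
        with self._lock:
            return self._stock.get(product_id, 0)


def flash_sale(inv, product_id, buyers):
    """Simulate `buyers` concurrent purchase attempts; return how many succeeded."""
    results = []

    def buy():
        results.append(inv.reserve(product_id))

    threads = [threading.Thread(target=buy) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

With 10 units and 50 concurrent buyers, exactly 10 reservations succeed. With Redis, a bare DECR is atomic but can drive the counter below zero, so the check and the decrement are typically combined in a Lua script, or a negative result is compensated with INCR.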

5. Video Streaming Platform (e.g., Netflix)

Problem Statement:
Design a video streaming service supporting adaptive bitrate streaming for 50M+ users.

Key Requirements:

  • Functional: Video upload, transcoding, CDN delivery.
  • Non-Functional: <2s startup latency, 99.9% uptime.

What’s Being Practiced?

  • Transcoding Pipeline: FFmpeg + Kubernetes for parallel processing.
  • CDN Optimization: Edge caching with Cloudflare/Akamai.
  • Adaptive Streaming: HLS/DASH to switch bitrates dynamically.
  • Load Balancing: Round-robin vs. least connections for traffic.
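
The difference between round-robin and least-connections is easiest to see side by side. In this sketch both balancers are plain in-memory classes with hypothetical names; a real load balancer would also handle health checks and weights.

```python
import itertools


class RoundRobin:
    """Hands out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Hands out the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # server -> open connections

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1
```

Round-robin suits short, uniform requests; least-connections tends to fit long-lived streaming sessions better, since session durations vary widely and a fixed rotation can pile slow sessions onto one server.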

Solution Sketch:

  • Upload Service: Store raw videos in S3.
  • Transcoding Service: Convert to multiple resolutions.
  • Delivery: CDN for cached videos, API Gateway for metadata.
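
On the client side, adaptive streaming comes down to choosing a rendition for the next segment based on recent throughput. A minimal sketch, assuming a throughput-based rule with a safety margin: the bitrate ladder `LADDER_KBPS`, the 0.8 safety factor, and the function name are illustrative, and real HLS/DASH players also weigh buffer level.

```python
LADDER_KBPS = [235, 750, 1750, 3000, 5800]  # assumed renditions from transcoding


def pick_bitrate(throughput_kbps, ladder=LADDER_KBPS, safety=0.8):
    """Pick the highest rendition that fits within a fraction of measured throughput."""
    budget = throughput_kbps * safety  # leave headroom for throughput dips
    fitting = [b for b in sorted(ladder) if b <= budget]
    # If nothing fits, fall back to the lowest rendition rather than stalling.
    return fitting[-1] if fitting else min(ladder)
```

With a measured throughput of 4000 kbps, the budget is 3200 kbps and the 3000 kbps rendition is chosen.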

Summary: Core Skills Practiced

Problem               | Key Skills
Social Media Feed     | Fan-out strategies, caching, real-time analytics.
Ride-Sharing          | Geospatial indexing, surge pricing, idempotency.
File Storage          | Delta sync, conflict resolution, chunking.
E-commerce Inventory  | Distributed locks, event sourcing, atomic ops.
Video Streaming       | Transcoding, CDNs, adaptive bitrate streaming.

Why Practice These Katas?

  1. Broaden Pattern Recognition: Each problem teaches unique strategies (e.g., fan-out for feeds, geospatial indexing for ride-sharing).
  2. Interview Readiness: These mirror real FAANG system design questions.
  3. Tradeoff Analysis: Learn which CAP guarantee to favor when a network partition forces a choice (e.g., availability for feeds, consistency for inventory).

Next Step: Pick one kata, sketch a solution, and compare it with a reference architecture! 🚀