1. Database Internals & Storage

  • Database

  • Database indexes

  • Hash Indexes

  • B-trees

  • LSM tree + SSTable

  • Index concluded

  • ACID database transactions

  • Read committed isolation

  • Snapshot isolation

  • Write skew and phantom writes

  • ACID serial execution

  • Database internals - two-phase locking

  • Serializable snapshot isolation

  • Column oriented storage

  • Data serialization frameworks

2. Distributed Systems & Reliability

  • Intro to replication

  • Dealing with stale reads

  • Single leader replication

  • Multi leader replication

  • Dealing with write conflicts

  • CRDTs

  • Leaderless replication intro

  • Quorums - leaderless replication

  • Replication summarized

  • Introduction to partitioning

  • Two-phase commit - distributed transactions

  • Consistent hashing

  • Linearizable database

  • Distributed consensus - Raft leader election

  • Distributed consensus - Raft writes

  • What is ZooKeeper? - coordination services

3. Data Tooling & Specialized Databases

  • SQL vs NoSQL - who wins

  • MySQL vs PostgreSQL

  • What’s VoltDB and why do we care about it?

  • Google Spanner

  • MongoDB vs Apache Cassandra

  • Time series database

  • Graph database

  • Geospatial indexes

  • Search indexes

  • Elasticsearch

4. Big Data, Batch & Stream Processing

  • Hadoop

  • HBase

  • MapReduce

  • Right way to do batch job data joins

  • Spark

  • Stream processing

  • Kafka vs RabbitMQ

  • Stop messing up your stream processing joins

  • Apache Flink

5. Infrastructure, Caching & Networking

  • Distributed caching

  • Cache evictions

  • Redis vs Memcached

  • CDN

  • S3 - object store

  • Load balancing

  • TCP vs UDP

  • Long polling, WebSockets, server-sent events

  • Monolith vs microservices + Docker + Kubernetes