2 posts tagged with "50PaperChallenge"

Why Write Amplification, Not Just Throughput, Shapes Modern Databases [50PaperChallenge]

· 8 min read
Narendra Dubey
Systems builder. Platform tinkerer. Distributed architecture troublemaker.

Lessons from LSM Trees and WiscKey — Paper #2 & #3 of #50PaperChallenge

Introduction: Why This Paper Stayed With Me

In my #50PaperChallenge journey, I've been deliberately alternating between foundational theory and systems papers that quietly changed the industry. This pairing — LSM Tree (O’Neil et al., 1996) and WiscKey: Separating Keys from Values in SSD-Conscious Storage — sits squarely in that second category.

LSM Trees are everywhere today: RocksDB, Cassandra, HBase, and LevelDB all trace their lineage back to the LSM Tree. We configure them, tune them, and occasionally curse them during compaction storms, often without thinking too deeply about why the design works or what exact cost we’re paying for that performance.

When I first encountered LSM Trees years ago, I mentally bucketed them as “the write-optimized alternative to B-Trees” and moved on.

LSM Trees are faster for writes, slower for reads, and compaction is expensive.

That's not wrong — but it's dangerously incomplete.

Why Latency, Not Partitions, Dictates Your Database's Consistency [50PaperChallenge]

· 6 min read
Narendra Dubey
Systems builder. Platform tinkerer. Distributed architecture troublemaker.

Confession: as someone who has difficulty reading a lot of text, I’m definitely not a fan of long, dense academic writing. Video lectures have always been my preferred way to learn. Honestly, reading research papers is something I’ve dodged for years: too much jargon, too many walls of text, and not enough clarity. But that’s exactly why I’m taking on the #50PaperChallenge: I want to see how far I can go if I really stick with it, and whether pushing through helps me learn things that actually last.

My goal isn’t just to skim headlines or collect citations. I want to go deeper—reading seminal technical whitepapers and really figuring out what’s inside, even if that means slowing down, re-reading, and wrestling with tough concepts.

But here’s the twist: I’m doing all this in public, right here, as a sort of open online notebook.

Why? Two big reasons:

Memory for my future self: Writing down my takeaways helps me process, organize, and actually remember what I’ve learned. Putting them out there means I can always come back later when I need a refresher.

Maybe it helps you too: If you’re an engineer, researcher, or just another tech nerd, maybe these notes will help you discover (or rediscover) some classics. Or maybe you’ll just relate to my struggle—and those occasional “aha!” moments—trying to crack technical content.

So, consider this an open journal. I’ll do my best to cut through the jargon, flag the breakthroughs, and be honest about what clicked and what didn’t.

To kick things off, I picked a paper that’s sparked more conversations (and arguments!) in our world than almost any other:

Consistency Tradeoffs in Modern Distributed Database System Design by Daniel Abadi

Let's unpack that.