Designing Systems
How I think about designing services and features — the moves I keep coming back to, written down so they're explicit instead of implicit.
When I sit down to design something — a service, a feature, an integration — I notice I keep running the same set of moves before I write a line of code. Writing them down so they stop being implicit. This is what I currently know.
SALT: what I run before I build
Before I write a contract, before I draw boxes and arrows, I make myself answer four questions:
| | Question | Why it matters |
|---|---|---|
| S — Scope | What are we building, for whom, read-heavy or write-heavy? What’s explicitly out of scope? | Anchors every later decision; stops me from over-building |
| A — Auth | Who can access what? Where does auth happen — here or upstream? Permission granularity? | Security is a boundary decision; it shapes the architecture |
| L — Latency | How fast does this need to feel? Local ms or remote roundtrips? Cached vs live? | Drives caching, async, sync-vs-queue choices |
| T — Tasks | What’s one discrete, testable unit of work? One fat operation or many small ones? | Defines boundaries — fat tool vs many focused, monolith vs services |
Error handling, observability, and scaling all fall out of these four once they’re answered. Skipping SALT is how I end up over-engineering things that didn’t need it, or under-engineering things that did.
The 6-step loop
Once SALT is answered, this is the order I work through:
1. Clarify the problem: functional and non-functional requirements. This is the one step I never skip.
2. Define the contract: inputs, outputs, errors. The interface is the most durable thing I’m designing.
3. Sketch the architecture: boxes and arrows. No code yet.
4. Go deep on 1–2 hard parts: the bottleneck, the consistency problem, the security boundary.
5. Cover cross-cutting concerns: failure modes, scale, observability, security, cost.
6. State the trade-offs and what I’d revisit: “I chose X for now; at 10× I’d move to Y.”
Step 6 is the one I skip the most. Naming the limits of what I built is the difference between a design I trust and a design I’m just defending.
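To make "inputs, outputs, errors" concrete, here is a minimal sketch of what a contract could look like when written down before any architecture exists. Every name in it (`LookupRequest`, `LookupResponse`, `ContractError`, `lookup_user`, the in-memory `_USERS` store) is hypothetical and only there for illustration; the point is that the error shape is part of the interface, not an afterthought.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class ErrorCode(Enum):
    """Closed set of failures a caller can branch on (hypothetical codes)."""
    INVALID_INPUT = "invalid_input"
    NOT_FOUND = "not_found"


@dataclass(frozen=True)
class LookupRequest:
    """Input side of the contract."""
    user_id: str


@dataclass(frozen=True)
class LookupResponse:
    """Output side of the contract."""
    user_id: str
    display_name: str


@dataclass(frozen=True)
class ContractError:
    """Structured error: a stable code, a human message, a retry hint."""
    code: ErrorCode
    message: str
    retryable: bool = False


Result = Union[LookupResponse, ContractError]

# Stand-in for a real data store; an assumption for this sketch only.
_USERS = {"u1": "Ada"}


def lookup_user(request: LookupRequest) -> Result:
    """Return a response or a structured error; never leak raw exceptions."""
    if not request.user_id:
        return ContractError(ErrorCode.INVALID_INPUT, "user_id must be non-empty")
    name = _USERS.get(request.user_id)
    if name is None:
        return ContractError(ErrorCode.NOT_FOUND, f"no user {request.user_id!r}")
    return LookupResponse(request.user_id, name)
```

Under this sketch, adding a new optional field to `LookupResponse` is an additive change, while removing or renaming an `ErrorCode` member would break callers that branch on it.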
The trade-off axes
Almost every design decision comes down to choosing a point on a spectrum. The shortcut I keep coming back to is naming the spectrum out loud before picking a side.
The phrasing that helps me actually decide: “It depends on X. If A matters more, I lean this way; if B matters more, the other. Here’s what I’d need to know to decide.” That’s the difference between “it depends” as a hedge and “it depends” as a real answer.
Concepts I want fluent
Not for lecturing about them. Just for pulling the right one when it shows up:
- Data & consistency — stateless vs stateful, cache-aside / write-through / TTL invalidation, sync vs async, idempotency, strong vs eventual consistency
- Scale & reliability — horizontal vs vertical, rate limiting + backpressure, timeouts + retries with backoff, circuit breakers, graceful degradation, removing SPOFs
- Boundaries & maintainability — clean layering (transport ↔ business logic ↔ connectors), composition over monoliths, versioning (additive safe, removal breaking), structured error contracts
- Operability — structured logging, correlation IDs, p50/p95/p99, audit logging — “how do I know it’s broken before someone tells me?”
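As one example of pulling a concept off this list, the sketch below shows cache-aside with TTL invalidation. It is a simplified, single-process illustration, not a production cache: the `CacheAside` class, its `load` callback, and the injectable `clock` parameter are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

V = TypeVar("V")


class CacheAside(Generic[V]):
    """Cache-aside: the caller asks the cache; on a miss the cache loads
    from the source of truth and stores the value with an expiry time."""

    def __init__(
        self,
        load: Callable[[str], V],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load          # hypothetical loader, e.g. a DB read
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, V]] = {}

    def get(self, key: str) -> V:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh hit: skip the source of truth
        value = self._load(key)    # miss or expired: reload and re-store
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Explicit invalidation, e.g. right after a write to the source."""
        self._entries.pop(key, None)
```

The TTL bounds how stale a read can be, which is the eventual-consistency trade-off stated as a number. A write-through variant would update the cache on every write instead of waiting for expiry.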
Scenarios I keep running into
| Scenario | How I work it |
|---|---|
| Designing something open-ended | SALT → 6-step loop. Depth goes to the contract and one hard component. |
| “Scale this to N× traffic” | Identify the bottleneck first. Measure before scaling blindly. Stateless + horizontal, caching, async/queue for spikes, rate limiting, DB scaling. |
| Slow / failing in prod | Observe before acting — metrics + traces localize it. Hypothesis, isolate, fix, add a regression test or alert. |
| Dependency failure | Timeouts → retries with backoff → circuit breaker → graceful degradation. Structured errors the caller can reason about. |
| Two teams building on the same thing | Clear contracts, versioning, additive changes, backwards compat, observability the consumers can use. |
| A design I’d do differently | Be honest about the trade-off and what I’d revisit. |
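The dependency-failure row can be sketched as a small chain: retries with jittered exponential backoff, a circuit breaker that stops the retries once the dependency looks down, and a fallback as the degraded answer. This is a minimal illustration under stated assumptions: `CircuitBreaker` and `call_with_fallback` are hypothetical names, and the wrapped `call` is assumed to enforce its own timeout and raise `TimeoutError` or `ConnectionError` on failure.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after N consecutive failures; allows a trial call after a cool-down."""

    def __init__(self, failure_threshold: int, reset_after: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True  # closed: calls flow normally
        # Open: only allow a trial call once the cool-down has elapsed.
        return self._clock() - self._opened_at >= self._reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()  # (re)open and restart cool-down


def call_with_fallback(call: Callable[[], T], fallback: Callable[[], T],
                       breaker: CircuitBreaker, attempts: int = 3,
                       base_delay: float = 0.1,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Try the dependency; back off between retries; degrade when it stays down."""
    for attempt in range(attempts):
        if not breaker.allow():
            break  # breaker is open: fail fast instead of piling on
        try:
            result = call()
        except (TimeoutError, ConnectionError):
            breaker.record_failure()
            if attempt < attempts - 1 and breaker.allow():
                # Full jitter: random wait up to base_delay * 2^attempt.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
            continue
        breaker.record_success()
        return result
    return fallback()  # graceful degradation, e.g. a cached or default value
```

A caller might pass a cached value as `fallback` and share one `CircuitBreaker` per dependency, so that every request benefits from failures the others have already observed.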
Habits I want, habits I want to drop
| ✅ Keep | ❌ Drop |
|---|---|
| Clarify the problem before designing | Jump to a stack with no requirements |
| Name the trade-off, then resolve it with criteria | Defend one answer with false certainty |
| Pick one hard part and go deep | Skate the surface on everything |
| Bring up failure, scale, security unprompted | Ignore “what happens when this breaks” |
| Acknowledge the limits of my own design | Defend a choice past its usefulness |
| Think about the consumer and the next maintainer | Treat the interface as an afterthought |
The mantra
Run SALT. Walk the loop. Name the trade-off, then resolve it. Design for the people who’ll call it and the people who’ll maintain it. The AI generates; I verify.