
Engineering Thinking

How to make decisions. Frameworks for when and why, not just how.


TL;DR

Core Principles:

  • Everything is a trade-off - No universally correct solutions
  • Reversibility determines investment - Spend decision energy proportional to cost of being wrong
  • Solve the problem you have - Not the one you imagine
  • Optimize for reading - Code is read 10x more than written

Database:

  • Normalize when data changes frequently
  • Denormalize when read performance is critical and data is stable
  • JSON columns for unstructured, rarely-queried data only

APIs:

  • Idempotency - Make POST safe to retry with idempotency keys
  • Version only for breaking changes (removing fields, changing types)
  • Pagination - Cursor-based for feeds, offset for admin tables

Abstraction:

  • Rule of three - Don't abstract until you have 3 concrete examples
  • Duplication beats the wrong abstraction - Inline bad abstractions, re-abstract later

1. Core Principles

These apply across all technical decisions.

Everything Is a Trade-off

No universally correct solutions. Every decision trades something for something else.

┌─────────────────────────────────────────────────────────┐
│                    Trade-off Triangle                   │
│                                                         │
│                      Flexibility                        │
│                          ▲                              │
│                         ╱ ╲                             │
│                        ╱   ╲                            │
│                       ╱     ╲                           │
│                      ╱       ╲                          │
│                     ╱         ╲                         │
│          Simplicity ◄─────────► Performance             │
│                                                         │
│   You can optimize for two. The third suffers.          │
└─────────────────────────────────────────────────────────┘

Key question: What are we optimizing for, and what are we willing to sacrifice?

Reversibility Determines Investment

┌────────────────────────────────────────────────────────────────┐
│                                                                │
│   Easy to reverse    ──────────────────►    Hard to reverse    │
│                                                                │
│   • Variable names          • Database schema                  │
│   • Internal APIs           • Public API contracts             │
│   • Code structure          • Data deletion                    │
│   • Feature flags           • Architecture boundaries          │
│                                                                │
│   Decide quickly            Invest in getting it right         │
│   Change if wrong           Prototype and validate first       │
│                                                                │
└────────────────────────────────────────────────────────────────┘

Mental model: Spend decision-making energy proportional to the cost of being wrong.

Solve the Problem You Have

Today's problem:        "We need to store user preferences"

Over-engineered:        "Let's build a generic key-value store with
                        pluggable backends, schema validation,
                        multi-tenancy, and real-time sync"

Right-sized:            "Let's add a preferences JSONB column to the
                        users table"

Red flags you're solving the wrong problem:

  • "We might need this later"
  • "Other teams could use this"
  • "It would be cool if..."
  • Building for 1M users when you have 1,000

Optimize for Reading

Code is read 10x more than it's written. Optimize for the reader.

Clever (write-optimized):
  users.filter(u => u.roles.some(r => perms[r]?.includes(action)))

Clear (read-optimized):
  const usersWithPermission = users.filter(user => {
    return userHasPermission(user, action)
  })

Test: Can a new team member understand this in 30 seconds?
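
The clear version relies on a userHasPermission helper that is not shown. A minimal sketch of how it might look follows; the User shape and the perms role-to-actions map are assumptions inferred from the one-liner above, not a prescribed design.

```typescript
// Hypothetical shapes, inferred from the one-liner above.
type Action = string
interface User {
  name: string
  roles: string[]
}

// Assumed lookup table: role name -> actions that role may perform.
const perms: Record<string, Action[] | undefined> = {
  admin: ['read', 'write', 'delete'],
  viewer: ['read'],
}

// The nested lookup lives here once, behind a name that states the intent.
function userHasPermission(user: User, action: Action): boolean {
  return user.roles.some(role => perms[role]?.includes(action) ?? false)
}

const users: User[] = [
  { name: 'Ada', roles: ['admin'] },
  { name: 'Bob', roles: ['viewer'] },
  { name: 'Cy', roles: ['unknown-role'] },
]

const usersWithPermission = users.filter(user => userHasPermission(user, 'write'))
```

The complexity has not gone away; it has moved behind a name, so the call site reads as a sentence and the lookup rules can change in one place.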


2. Database Design

The Fundamental Question

Before any schema decision:

┌─────────────────────────────────────────────────────────┐
│                                                         │
│   How will this data be:                                │
│                                                         │
│   1. Written?    (frequency, volume, patterns)          │
│   2. Read?       (queries, joins, aggregations)         │
│   3. Changed?    (schema evolution, data migrations)    │
│                                                         │
└─────────────────────────────────────────────────────────┘

Most applications are read-heavy. Design for your read patterns first.

Normalization: A Decision Framework

Normalization = splitting data across tables to reduce duplication
Denormalization = duplicating data to optimize reads

                    Normalize                 Denormalize
                        │                          │
                        ▼                          ▼
Write performance      Better                    Worse
Read performance       Worse (joins)             Better
Data consistency       Automatic                 You manage it
Storage                Less                      More
Schema flexibility     Higher                    Lower

Decision guide:

Situation                                 Recommendation
────────────────────────────────────────────────────────────────────────────
Data changes frequently                   Normalize (single source of truth)
Read performance critical                 Consider denormalization
Data rarely changes after creation        Denormalization is safer
You need complex queries                  Normalize (more flexible)
You have clear, simple access patterns    Denormalize for those patterns

Example:

Scenario: E-commerce order history

Option A: Normalized
┌──────────┐     ┌──────────────┐     ┌──────────┐
│ orders   │────►│ order_items  │────►│ products │
└──────────┘     └──────────────┘     └──────────┘

  - Product name changes reflect everywhere
  - Need joins to display order history
  - Order history shows CURRENT product names (bug or feature?)

Option B: Denormalized
┌──────────────────────────────────────┐
│ orders                               │
│  └─ items[] (embedded)               │
│       └─ product_name (copied)       │
│       └─ price_at_purchase (copied)  │
└──────────────────────────────────────┘

  - Order history is a single read
  - Shows what customer ACTUALLY ordered
  - Product changes don't affect history

The right answer: Denormalize. Order history should be immutable — you want to know what they paid and what it was called when they bought it.
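
A minimal sketch of the snapshot idea in application code. The Product and OrderItem shapes are hypothetical; the field names follow the diagram above.

```typescript
// Hypothetical shapes for illustration; field names follow the diagram above.
interface Product {
  id: string
  name: string
  priceCents: number
}

interface OrderItem {
  productId: string
  productName: string          // copied at purchase time
  priceAtPurchaseCents: number // copied at purchase time
  quantity: number
}

// Snapshot the product fields into the order item instead of referencing them.
function toOrderItem(product: Product, quantity: number): OrderItem {
  return {
    productId: product.id,
    productName: product.name,
    priceAtPurchaseCents: product.priceCents,
    quantity,
  }
}

const mug: Product = { id: 'p1', name: 'Blue Mug', priceCents: 1200 }
const orderItem = toOrderItem(mug, 2)

// A later catalog change does not rewrite history.
mug.name = 'Azure Mug'
mug.priceCents = 1500
```

After the catalog edit, orderItem still carries the name and price from the time of purchase, which is the property order history needs.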

Choosing Primary Keys

┌─────────────────────────────────────────────────────────────────┐
│                         Key Type                                │
├───────────────┬───────────────┬───────────────┬─────────────────┤
│ Auto-increment│ UUID          │ ULID          │ Natural Key     │
├───────────────┼───────────────┼───────────────┼─────────────────┤
│ Sequential    │ Random        │ Time-sortable │ Business data   │
│ Compact       │ Large (36ch)  │ Large (26ch)  │ Varies          │
│ Guessable     │ Unguessable   │ Unguessable   │ May change      │
│ Single DB     │ Distributed OK│ Distributed OK│ Coupling risk   │
└───────────────┴───────────────┴───────────────┴─────────────────┘

Decision guide:

Situation                          Recommendation
──────────────────────────────────────────────────────────────
IDs exposed in URLs                UUID/ULID (not guessable)
Need to sort by creation time      ULID or auto-increment
Multi-region/distributed           UUID/ULID
Internal-only, single DB           Auto-increment is fine
The "natural" key might change     Don't use it as PK

Common mistake: Using email as a primary key. Emails change. Use a synthetic key.

When to Use JSON/JSONB Columns

Good fit for JSON:
  ✓ User preferences (arbitrary, user-controlled)
  ✓ API response caching
  ✓ Metadata that varies by type
  ✓ Rarely queried data
  ✓ Data you're still figuring out

Bad fit for JSON:
  ✗ Data you query/filter frequently
  ✗ Data with relationships to other tables
  ✗ Data requiring validation/constraints
  ✗ Core business entities

Rule of thumb: If you're writing WHERE json_column->>'field' = ? in multiple places, that field should be a column.

Indexing Decisions

Indexes speed up reads but slow down writes and consume storage.

Decision guide:

Add an index when...              Skip the index when...
──────────────────────────────────────────────────────────────────
Column is in WHERE clauses        Table is small (<1000 rows)
Column is in JOIN conditions      Column has few unique values
Column is in ORDER BY             Table is write-heavy, rarely read
Query performance is suffering    You're guessing "might need it"

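Beyond the indexes most relational databases create automatically for primary keys and unique constraints, start without indexes. Add them when you have evidence of slow queries.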

Schema Evolution Mindset

Your schema will change. Design for it.

Strategies for safe schema changes:

1. Additive changes (safe)
   - New nullable columns
   - New tables
   - New indexes

2. Backward-compatible changes (careful)
   - Rename: add new column → migrate data → remove old column
   - Type change: usually requires new column approach

3. Breaking changes (coordinate carefully)
   - Removing columns
   - Adding NOT NULL constraints
   - Changing relationships

Rule: Never ship a code change and the schema change it depends on as a single atomic step. Roll out in stages instead:

  1. Deploy code that works with old AND new schema
  2. Migrate schema
  3. Deploy code that only works with new schema
  4. Clean up old column
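
Step 1 can be sketched as a reader that tolerates both schema versions. The rename from name to full_name and the UserRow shape are hypothetical, chosen only to illustrate the pattern.

```typescript
// Hypothetical rename in progress: `name` (old column) -> `full_name` (new column).
// During the migration window a row may carry either one, or both.
interface UserRow {
  id: number
  name?: string | null
  full_name?: string | null
}

// Step 1 code: prefer the new column, fall back to the old one.
function readFullName(row: UserRow): string {
  const value = row.full_name ?? row.name
  if (value == null) {
    throw new Error(`user ${row.id} has neither full_name nor name`)
  }
  return value
}
```

Once every row has been migrated and this code is deployed everywhere, the fallback can be deleted (step 3) and the old column dropped (step 4).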

3. API Design

The Contract Mindset

An API is a contract. Once published, changing it breaks someone.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Internal APIs                    External/Public APIs         │
│   ─────────────                    ─────────────────────        │
│   Can change with coordination     Must be versioned            │
│   Consumers are known              Consumers are unknown        │
│   Can communicate breaking         Must maintain compatibility  │
│   changes directly                                              │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Key question: Who calls this, and can I coordinate with them?

Resource Design

Think in nouns (resources), not verbs (actions).

Poor (action-oriented):
  POST /createUser
  POST /updateUserEmail
  GET  /getUserById

Better (resource-oriented):
  POST   /users           Create
  GET    /users/:id       Read
  PATCH  /users/:id       Update
  DELETE /users/:id       Delete

When actions don't fit resources:

POST /orders/:id/cancel      (action on resource)
POST /users/:id/verify-email (action on resource)
POST /reports/generate       (process, not entity)

Idempotency

An operation is idempotent if doing it multiple times has the same effect as doing it once.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Naturally idempotent:                                         │
│   GET    - Reading doesn't change state                         │
│   PUT    - "Set X to Y" same whether done 1x or 10x             │
│   DELETE - "Delete X" - already deleted? Still deleted.         │
│                                                                 │
│   NOT naturally idempotent:                                     │
│   POST   - "Create X" - doing it twice creates two              │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Why it matters:

Client                           Server
   │                                │
   ├──── POST /payments ───────────►│
   │                                │ (processes payment)
   │◄─────── (network timeout) ─────┤
   │                                │
   │     Did the payment go through?│
   │     Is it safe to retry?       │

Making POST idempotent with idempotency keys:

POST /payments
Headers:
  Idempotency-Key: abc-123-unique-client-generated

Server behavior:
  1. Check if we've seen this key before
  2. If yes, return the original response
  3. If no, process and store key → response mapping
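
The server behavior above can be sketched with an in-memory map. The ChargeInput and ChargeResult shapes and the processPayment stub are hypothetical; a real service would keep the key → response mapping in a shared, durable store and guard against two concurrent requests carrying the same key.

```typescript
// Hypothetical input/output shapes for illustration only.
interface ChargeInput {
  amountCents: number
}
interface ChargeResult {
  paymentId: string
  amountCents: number
}

const responsesByKey = new Map<string, ChargeResult>()
let paymentsProcessed = 0

// Stand-in for the real side effect (charging a card).
function processPayment(input: ChargeInput): ChargeResult {
  paymentsProcessed += 1
  return { paymentId: `pay_${paymentsProcessed}`, amountCents: input.amountCents }
}

function handlePayment(idempotencyKey: string, input: ChargeInput): ChargeResult {
  // Steps 1-2: seen this key before? Return the original response unchanged.
  const previous = responsesByKey.get(idempotencyKey)
  if (previous !== undefined) return previous

  // Step 3: first time, so process and remember the response under the key.
  const response = processPayment(input)
  responsesByKey.set(idempotencyKey, response)
  return response
}

const first = handlePayment('abc-123', { amountCents: 5000 })
const retry = handlePayment('abc-123', { amountCents: 5000 }) // retry after a timeout
```

The retry returns the same paymentId as the first call and the payment is processed once, so the client can safely resend after a timeout.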

API Versioning

When do you need versioning?

Need version:
  ✗ Removing a field
  ✗ Changing field type
  ✗ Changing field meaning
  ✗ Changing required fields

Don't need version:
  ✓ Adding optional fields
  ✓ Adding new endpoints
  ✓ Adding new optional query params

Versioning strategies:

1. URL versioning
   /v1/users
   /v2/users

   ✓ Very explicit, easy to route
   ✗ Suggests whole API versions (usually overkill)

2. Header versioning
   Accept: application/vnd.api+json; version=2

   ✓ Clean URLs
   ✗ Hidden, harder to test

3. Query parameter
   /users?version=2

   ✓ Explicit, easy to use
   ✗ Pollutes query string

Recommendation: URL versioning is clearest. But avoid versioning if possible — design APIs to be evolvable.

Pagination

Offset-based:

GET /items?offset=20&limit=10

✓ Simple, supports "jump to page 5"
✗ Slow for large offsets
✗ Inconsistent if data changes between pages

Cursor-based:

GET /items?cursor=abc123&limit=10
Response: { items: [...], nextCursor: "def456" }

✓ Consistent results
✓ Performant at any position
✗ Can't jump to arbitrary page

Decision: Cursor-based for feeds/timelines. Offset for admin tables with small datasets.
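
A minimal sketch of cursor-based pagination over an in-memory list. The Item and Page shapes and the listItems function are hypothetical; the cursor is simply the last id seen, and items are assumed to be sorted by a unique, ascending id.

```typescript
// Hypothetical item shape; items are assumed sorted by ascending, unique id.
interface Item {
  id: number
  title: string
}

interface Page {
  items: Item[]
  nextCursor: string | null // null means there are no more pages
}

// The cursor is the last id seen, encoded as a string.
// Real APIs often make the cursor opaque so clients don't depend on its format.
function listItems(all: Item[], limit: number, cursor?: string): Page {
  const afterId = cursor === undefined ? -Infinity : Number(cursor)
  const remaining = all.filter(item => item.id > afterId)
  const items = remaining.slice(0, limit)
  const hasMore = remaining.length > limit
  const last = items[items.length - 1]
  return {
    items,
    nextCursor: hasMore && last !== undefined ? String(last.id) : null,
  }
}

const data: Item[] = [1, 2, 3, 4, 5].map(id => ({ id, title: `item ${id}` }))
const page1 = listItems(data, 2)
const page2 = listItems(data, 2, page1.nextCursor ?? undefined)
```

Because each page is defined as "items after id X" rather than "skip N rows", rows inserted or deleted earlier in the list do not shift later pages. Against a database the same idea is usually a WHERE id > cursor ORDER BY id LIMIT n query.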


4. Code Architecture

The Dependency Rule

Higher-level modules should not depend on lower-level modules.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   ┌─────────────────────────────────────────────┐               │
│   │           Business Logic                     │  ← Pure,     │
│   │   (rules, calculations, decisions)           │    no I/O    │
│   └──────────────────────▲──────────────────────┘               │
│                          │                                      │
│   ┌──────────────────────┴──────────────────────┐               │
│   │           Application Layer                  │  ← Orchestr- │
│   │   (use cases, workflows)                     │    ation     │
│   └──────────────────────▲──────────────────────┘               │
│                          │                                      │
│   ┌──────────────────────┴──────────────────────┐               │
│   │           Infrastructure                     │  ← I/O,      │
│   │   (DB, HTTP, external services)              │    details   │
│   └─────────────────────────────────────────────┘               │
│                                                                 │
│   Dependencies point inward. Inner layers don't know about      │
│   outer layers.                                                 │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Why this matters:

  • Business logic can be tested without databases
  • You can swap infrastructure without touching business logic
  • Core doesn't depend on framework choices

Practical example:

typescript
// BAD: Business logic knows about infrastructure
async function calculateOrderTotal(orderId: string) {
  const order = await prisma.order.findUnique({ where: { id: orderId }})
  // calculation logic
}

// GOOD: Business logic receives what it needs
function calculateOrderTotal(order: Order): Money {
  // pure calculation, no I/O
}

// Orchestration layer connects them
async function getOrderTotal(orderId: string) {
  const order = await orderRepository.find(orderId)
  return calculateOrderTotal(order)
}

Boundaries and Modules

A module should have a clear boundary and responsibility.

Signs of a good module:
  ✓ You can describe what it does in one sentence
  ✓ It has a small public interface
  ✓ Changes inside don't ripple outside
  ✓ It can be tested independently

Signs of a poor module:
  ✗ "It handles users and orders and notifications and..."
  ✗ Changing it requires changing many other files
  ✗ You need to understand its internals to use it
  ✗ Circular dependencies with other modules

Coupling and Cohesion

Coupling = how much modules depend on each other
Cohesion = how related things are within a module

Goal: Low coupling, high cohesion.

typescript
// High coupling: OrderService knows too much about UserService
class OrderService {
  createOrder(userId: string) {
    const user = this.userService.findById(userId)
    const address = this.userService.getDefaultAddress(userId)
    const payment = this.userService.getDefaultPaymentMethod(userId)
    // ...
  }
}

// Lower coupling: OrderService receives what it needs
class OrderService {
  createOrder(input: {
    userId: string,
    shippingAddress: Address,
    paymentMethod: PaymentMethod
  }) {
    // ...
  }
}
// Orchestration layer gathers the data

5. Abstraction & Generalization

The Rule of Three

Don't abstract until you have three concrete examples.

Day 1: Need to send email notifications
       → Write email notification code

Day 30: Need to send Slack notifications
        → Write Slack notification code
        → Notice similarity, but don't abstract yet

Day 60: Need to send SMS notifications
        → NOW you have three examples
        → NOW you can see the real pattern
        → NOW abstract

Why wait?

  • Two examples might be coincidentally similar
  • Three reveals the true abstraction
  • Premature abstraction encodes wrong assumptions
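
One shape the Day 60 abstraction might take, as a sketch only. The Notifier interface and the stubbed senders are hypothetical; each stub returns a description of what it would send rather than calling a real service.

```typescript
// Hypothetical abstraction extracted after three concrete senders exist.
interface Notifier {
  send(recipient: string, message: string): string
}

const emailNotifier: Notifier = {
  send: (recipient, message) => `email to ${recipient}: ${message}`,
}
const slackNotifier: Notifier = {
  send: (recipient, message) => `slack to ${recipient}: ${message}`,
}
const smsNotifier: Notifier = {
  // The third example surfaced a constraint the first two never had: length.
  send: (recipient, message) => `sms to ${recipient}: ${message.slice(0, 160)}`,
}

function notifyAll(notifiers: Notifier[], recipient: string, message: string): string[] {
  return notifiers.map(notifier => notifier.send(recipient, message))
}

const sent = notifyAll([emailNotifier, slackNotifier, smsNotifier], 'ada', 'Build passed')
```

Had the interface been extracted on Day 30, it would have been shaped by email and Slack alone; the SMS case is what shows which parts are common and which are channel-specific.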

Duplication vs Wrong Abstraction

"Duplication is far cheaper than the wrong abstraction" — Sandi Metz

The wrong abstraction lifecycle:

1. Developer sees duplication
2. Developer creates abstraction
3. New requirement doesn't quite fit
4. Developer adds parameter/flag
5. Repeat 3-4 many times
6. Abstraction becomes incomprehensible
7. Everyone is afraid to touch it

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   function doThing(                                             │
│     input,                                                      │
│     options = {},                                               │
│     legacyMode = false,                                         │
│     skipValidation = false,                                     │
│     useNewBehavior = true,                                      │
│     customer = null,  // only for enterprise                    │
│     ...etc                                                      │
│   )                                                             │
│                                                                 │
│   This was once "clean." Each flag was a "small change."        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Recovery: Inline the abstraction, live with duplication, re-abstract when patterns emerge.

When to Abstract: Decision Framework

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   1. Do I have 3+ concrete examples?                            │
│      No  → Keep things concrete                                 │
│      Yes → Continue                                             │
│                                                                 │
│   2. Is the pattern stable, or still evolving?                  │
│      Evolving → Wait, patterns still emerging                   │
│      Stable   → Continue                                        │
│                                                                 │
│   3. Does the abstraction simplify or complicate?               │
│      Complicate → Cost outweighs benefit                        │
│      Simplify   → Continue                                      │
│                                                                 │
│   4. Can I explain the abstraction simply?                      │
│      No  → Don't understand it well enough yet                  │
│      Yes → Abstract!                                            │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Parameterize vs Duplicate

Parameterize when...                      Duplicate when...
────────────────────────────────────────────────────────────────────────────────
Variations are truly the same pattern     Variations have different lifecycles
Parameters are simple (types, counts)     Behavior differs significantly
You own all call sites                    Caller contexts are very different
The abstraction is stable                 Requirements are still evolving

Configuration vs Code

Red flags of over-configuration:

  - Config files longer than the code they configure
  - Config that's really a programming language
  - Changes require "just a config change" but break things
  - Nobody understands all the config options

The rule:
  Configuration is for operators.
  If only developers change it, it should be code.

6. Data Modeling

Entities vs Value Objects

Entity: Has identity, tracked over time
  - User (user #123 is the same user even if email changes)
  - Order (order #456 exists independently)

Value Object: Defined by attributes, immutable
  - Money ($10 is $10, no identity)
  - Address (defined by its components)
  - Email (validated format, no identity)

Why it matters:

typescript
// Entity: compare by ID
user1.id === user2.id

// Value Object: compare by value
address1.equals(address2)  // compares all fields
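
A minimal sketch of a value object, using Money from the list above. The class layout and the choice to store integer cents are assumptions for illustration.

```typescript
// Hypothetical Money value object: immutable, compared by value.
class Money {
  readonly amountCents: number
  readonly currency: string

  constructor(amountCents: number, currency: string) {
    this.amountCents = amountCents
    this.currency = currency
  }

  equals(other: Money): boolean {
    return this.amountCents === other.amountCents && this.currency === other.currency
  }

  // Operations return a new value instead of mutating this one.
  add(other: Money): Money {
    if (this.currency !== other.currency) {
      throw new Error(`cannot add ${other.currency} to ${this.currency}`)
    }
    return new Money(this.amountCents + other.amountCents, this.currency)
  }
}

const tenDollars = new Money(1000, 'USD')
const alsoTenDollars = new Money(1000, 'USD')

const sameReference = tenDollars === alsoTenDollars // false: two distinct objects
const sameValue = tenDollars.equals(alsoTenDollars) // true: same amount and currency
```

Comparing value objects with === checks object identity, which is why an explicit equals method is needed.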

State Machines

Many bugs come from invalid state transitions. Model states explicitly.

Order states:

  ┌─────────┐    confirm    ┌───────────┐
  │ PENDING ├──────────────►│ CONFIRMED │
  └────┬────┘               └─────┬─────┘
       │                          │
       │ cancel                   │ ship
       ▼                          ▼
  ┌──────────┐              ┌──────────┐
  │ CANCELED │              │ SHIPPED  │
  └──────────┘              └────┬─────┘
                                 │ deliver
                                 ▼
                            ┌───────────┐
                            │ DELIVERED │
                            └───────────┘

Invalid transitions (code should prevent):
  CANCELED → CONFIRMED  ✗
  DELIVERED → PENDING   ✗

Implementation:

typescript
const validTransitions: Record<State, State[]> = {
  PENDING: ['CONFIRMED', 'CANCELED'],
  CONFIRMED: ['SHIPPED', 'CANCELED'],
  SHIPPED: ['DELIVERED'],
  DELIVERED: [],
  CANCELED: []
}

function transitionOrder(order: Order, newState: State) {
  if (!validTransitions[order.state].includes(newState)) {
    throw new InvalidTransitionError(order.state, newState)
  }
  order.state = newState
}

Temporal Data

When data changes over time:

1. Overwrite (no history)
   ✓ Simple
   ✗ Can't answer "what was their email last month?"

2. Audit log (separate history)
   ✓ History preserved, main table simple
   ✗ Querying history is separate

3. Temporal table (versioned records)
   ✓ Can query "as of" any point
   ✗ More complex queries

4. Event sourcing (store events, not state)
   ✓ Complete history, can rebuild any point
   ✗ Significant complexity

Decision: Start with overwrite + audit log. Temporal tables only if you query historical state.
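
A minimal in-memory sketch of "overwrite + audit log". The Account and AuditEntry shapes and both functions are hypothetical; in a database the log would be a separate table written in the same transaction as the update.

```typescript
// Hypothetical in-memory model of "overwrite + audit log".
interface Account {
  id: number
  email: string
}

interface AuditEntry {
  accountId: number
  field: 'email'
  oldValue: string
  newValue: string
  changedAt: Date
}

// Entries are appended in chronological order.
const auditLog: AuditEntry[] = []

// Overwrite the current value, but record the old one first.
function changeEmail(account: Account, newEmail: string, now: Date): void {
  auditLog.push({
    accountId: account.id,
    field: 'email',
    oldValue: account.email,
    newValue: newEmail,
    changedAt: now,
  })
  account.email = newEmail
}

// Answer "what was their email at time T?" by undoing later changes.
function emailAsOf(account: Account, at: Date): string {
  let email = account.email
  // Walk newest-first; every change made after `at` is rolled back.
  for (let i = auditLog.length - 1; i >= 0; i--) {
    const entry = auditLog[i]
    if (entry === undefined || entry.accountId !== account.id) continue
    if (entry.changedAt.getTime() > at.getTime()) email = entry.oldValue
  }
  return email
}

const account: Account = { id: 1, email: 'old@example.com' }
changeEmail(account, 'new@example.com', new Date('2024-06-01T00:00:00Z'))
const lastMonth = emailAsOf(account, new Date('2024-05-01T00:00:00Z'))
```

The main record stays simple and holds only the current value; the historical question is still answerable, at the cost of a separate query path.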


7. Error Handling

Error Categories

┌─────────────────────────────────────────────────────────────────┐
│   Programming Errors (bugs)                                     │
│   - Null pointer, index out of bounds, type mismatches          │
│   Action: Crash, log, fix the code                              │
├─────────────────────────────────────────────────────────────────┤
│   Operational Errors (expected failures)                        │
│   - Network timeout, file not found, invalid user input         │
│   Action: Handle gracefully, may retry, inform user             │
├─────────────────────────────────────────────────────────────────┤
│   Business Rule Violations                                      │
│   - Insufficient balance, order already shipped                 │
│   Action: Return meaningful error, consider Result types        │
└─────────────────────────────────────────────────────────────────┘

Result Types vs Exceptions

typescript
// Exception approach
function withdraw(amount: number): void {
  if (balance < amount) throw new InsufficientFundsError()
}

// Caller must remember to try/catch (easy to forget)

// Result type approach
function withdraw(amount: number): Result<Transaction, WithdrawError> {
  if (balance < amount) {
    return { ok: false, error: 'INSUFFICIENT_FUNDS' }
  }
  return { ok: true, value: transaction }
}

// Caller must handle the result (forced to consider failure)

Use exceptions for: Unexpected failures, programming errors
Use result types for: Expected business failures, validation
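
The Result type used above is not built into TypeScript. A minimal sketch of one possible definition and a caller follows; Withdrawal, the WithdrawError values, and passing balance as a parameter are assumptions made to keep the example self-contained.

```typescript
// A minimal Result type as a discriminated union on `ok`.
type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E }

// Hypothetical error codes and success payload.
type WithdrawError = 'INSUFFICIENT_FUNDS' | 'INVALID_AMOUNT'
interface Withdrawal {
  amount: number
  balanceAfter: number
}

function withdraw(balance: number, amount: number): Result<Withdrawal, WithdrawError> {
  if (amount <= 0) return { ok: false, error: 'INVALID_AMOUNT' }
  if (balance < amount) return { ok: false, error: 'INSUFFICIENT_FUNDS' }
  return { ok: true, value: { amount, balanceAfter: balance - amount } }
}

// `value` only exists on the success variant, so the caller has to check `ok`
// before the compiler allows reading it.
function describeOutcome(result: Result<Withdrawal, WithdrawError>): string {
  return result.ok === true
    ? `new balance: ${result.value.balanceAfter}`
    : `declined: ${result.error}`
}

const declined = describeOutcome(withdraw(50, 80))
const accepted = describeOutcome(withdraw(100, 30))
```

Here declined is "declined: INSUFFICIENT_FUNDS" and accepted is "new balance: 70". The failure path is part of the function's type, so it shows up in code review and in the editor rather than only at runtime.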

Fail Fast

typescript
// BAD: Error discovered deep in the stack
function processOrder(data: any) {
  // ... 100 lines of code ...
  const email = data.user.email  // crashes here if user is null
}

// GOOD: Validate at the boundary
function processOrder(data: unknown) {
  const validated = validateOrderInput(data)  // fails fast
  if (!validated.ok) return validated.error
  processValidatedOrder(validated.value)
}

Validate at boundaries: API endpoints, event handlers, user input. After validation, trust the data internally.
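
A sketch of what validateOrderInput might do, written without a validation library. The OrderInput shape, the error messages, and the isRecord helper are hypothetical.

```typescript
// Hypothetical Result and input shapes for the boundary check above.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E }

interface OrderInput {
  userEmail: string
  quantity: number
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

// Narrow `unknown` to a typed value once, at the edge of the system.
function validateOrderInput(data: unknown): Result<OrderInput, string> {
  if (!isRecord(data)) return { ok: false, error: 'body must be an object' }

  const user = data['user']
  if (!isRecord(user)) return { ok: false, error: 'user is required' }

  const email = user['email']
  if (typeof email !== 'string' || email.length === 0) {
    return { ok: false, error: 'user.email is required' }
  }

  const quantity = data['quantity']
  if (typeof quantity !== 'number' || !Number.isInteger(quantity) || quantity < 1) {
    return { ok: false, error: 'quantity must be a positive integer' }
  }

  return { ok: true, value: { userEmail: email, quantity } }
}
```

Everything downstream can accept OrderInput and skip defensive null checks, because the only way to obtain one is through the validator.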


8. Testing Strategy

The Testing Pyramid

                    ╱╲
                   ╱  ╲
                  ╱ E2E╲         Few, slow, expensive
                 ╱──────╲
                ╱        ╲
               ╱Integration╲     Some, medium speed
              ╱────────────╲
             ╱              ╲
            ╱   Unit Tests   ╲   Many, fast, cheap
           ╱──────────────────╲

What to Test

Unit tests: Pure logic, calculations

typescript
calculateOrderTotal(items)
validateEmail(input)
applyDiscount(price, discount)

Integration tests: Boundaries, I/O

typescript
"API returns 400 when email is invalid"
"Repository saves and retrieves Order"

E2E tests: Critical journeys only

typescript
"User can sign up, create order, checkout"

Testing Heuristics

Test behavior, not implementation
─────────────────────────────────
BAD:  "calls saveToDatabase exactly once"
GOOD: "after saving, order appears in order list"

Test boundaries, not internals
──────────────────────────────
BAD:  Testing private methods
GOOD: Testing public interface
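
A framework-free sketch of a behavior-level check. OrderStore and the test function are hypothetical; the point is that the assertion goes through the public interface and never inspects how the store keeps its data.

```typescript
// Hypothetical in-memory order store used to illustrate a behavior-level check.
class OrderStore {
  private orders: string[] = []

  save(orderId: string): void {
    this.orders.push(orderId)
  }

  list(): string[] {
    return [...this.orders] // copy, so callers cannot mutate internal state
  }
}

// Behavior test: "after saving, order appears in order list".
function savedOrderAppearsInList(): boolean {
  const store = new OrderStore()
  store.save('order-1')
  return store.list().includes('order-1')
}
```

If the store later switches from an array to a Map or a database, this check keeps passing as long as the observable behavior is unchanged.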

When Not to Test

Skip tests for:
  - Trivial code (getters, setters)
  - Third-party library behavior
  - Code about to be deleted

Definitely test:
  - Business logic
  - State machines
  - Security boundaries
  - Anything that has broken before

Topic                Resource
─────────────────────────────────────────────────────────────────────────
Database Design      "Designing Data-Intensive Applications" by Kleppmann
Code Architecture    "A Philosophy of Software Design" by Ousterhout
Abstractions         "The Wrong Abstraction" by Sandi Metz
API Design           "RESTful Web APIs" by Richardson & Ruby
