Using AI Well

The failure mode isn’t using AI too much. It’s using it without judgment: pasting in a vague question, accepting a mediocre output, and moving on. That’s not AI-assisted work. That’s outsourcing your thinking to a tool that has no context, no stakes, and no accountability for what gets built.

Andrej Karpathy describes this as “Software 2.0”: as neural networks become the primary implementation medium, the programmer’s job shifts from writing code to specifying intent and evaluating output. The model implements. The human architects, directs, and reviews. That shift applies to every role. PM specifies the problem and owns the definition doc. PD specifies the experience. PE directs the implementation. All three are now in the business of clear specification and rigorous evaluation.

Understanding your domain matters more under this model, not less. You can only judge the output if you understand what good looks like. You are the brain. AI is the workhorse. That division only works if you’re actually doing the brain part.

Thinking

Use AI before you talk to anyone.

This is the most underused application across all roles. Before you bring a problem to leadership, before you schedule a Define session, before you send a message you’re uncertain about: open a conversation with AI and think out loud.

For PM: Drop your problem statement in and ask: what’s weak about this? What assumption am I making that could be wrong? What questions would a skeptic ask? This is not about getting AI to validate you. It’s about finding the gaps before someone else does.

For PD: Before presenting a design direction, ask: what usability problems does this create? What edge cases am I not accounting for? What would a user who isn’t familiar with this product think when they see this?

For PE: Before proposing a technical approach, ask: what are the tradeoffs of this approach vs. alternatives? What will make this hard to maintain in six months? What’s the failure mode if the load doubles?

The goal is to arrive at every conversation sharper than you would have alone. AI is an infinitely patient thinking partner who has no agenda and will say what you need to hear if you ask the right questions.

Writing

AI drafts, you edit. If the output requires heavy editing, your prompt was underspecified. Rewrite the prompt, not the output.

What AI handles well: Definition docs, release notes, weekly updates, tickets, client messages, PRDs, retrospective summaries. Any structured writing where the substance exists in your head and the challenge is getting it into clear language.

What good prompting looks like: Give it the context it needs to write what you’d write. Not just “write release notes for this feature.” Instead: “write release notes for a feature that lets users save their cart across sessions. The audience is end users. Tone is clear and direct, no jargon. Here’s what it does: [description]. Here’s what problem it solves: [description].” The more context you give, the less editing you do.

What to watch for: AI writes confidently even when it’s filling in gaps you didn’t provide. If something in the output doesn’t sound right, it’s usually because you left something ambiguous and AI made a choice. Go back to the prompt, be more specific, and regenerate that section.

Research and Synthesis

Give AI a question, not a task.

“Write me a competitive analysis of X” produces a generic list. “What are the key tradeoffs between approach A and approach B for a user who needs to do Y?” produces something you can actually use to make a decision.

PM and PD use cases:

Synthesising user feedback: paste in raw comments or ticket themes, ask for patterns and what they imply
Exploring design patterns: “what are common ways products handle [this UX problem] and what are the tradeoffs?”
Competitive research: ask about how a category of product typically solves a specific problem, then verify what it says
Preparing for leadership sessions: ask AI to steelman the other side of your argument before you walk in

One caution: AI’s training data has a cutoff and it has gaps. Use it to explore and frame, then verify anything that matters with real sources.

Implementation (PE)

The definition doc is the prompt. This is what it was written for.

Feed it in full. Add the codebase context AI needs to produce code that fits: the patterns you use, the adjacent code it should be consistent with, the standards you follow. The more context it has, the less you correct.

What good looks like: You give AI the definition doc plus relevant codebase context. It produces an implementation. You review against three things: does it do what the doc says, is the structure consistent with the codebase, and does it handle the edge cases. If any of these are no, you prompt again. Manual fixes are for small details. Structural problems get fixed by reprompting with better context.

What to build as you go: Every time AI makes a mistake you’ve seen before, add it to CLAUDE.md in the project repo. AI reads this at the start of every session. “Always use the service layer for database calls, never call the ORM directly.” “This module expects ISO date strings, not Date objects.” A maintained CLAUDE.md is compounding leverage. The team that keeps it gets faster every week. The team that doesn’t re-explains the same things indefinitely.

Review

Get a second opinion before sharing anything important.

Before sending a definition doc to the team, paste it and ask: what’s unclear, what’s missing, what edge case isn’t handled? Before sending a client message, ask: how does this land to someone who doesn’t have our context? Before making a technical decision, ask: what are the risks of this approach that I might be underweighting?

This is one of the highest-leverage uses of AI for all roles, and the one most skipped because it requires the discipline to slow down before shipping something.

How Not to Drain Costs

AI usage has real costs. Most waste comes from a few sources.

Bloated context. The cost of most AI APIs scales with tokens in and tokens out. A padded prompt costs more than a tight one and usually gets worse results. Be specific. Cut what isn’t needed. The definition doc should be precise. Every sentence earns its place.

The wrong model for the job. Not everything needs the most capable model. Simple drafts, formatting, quick lookups: use a faster, cheaper model. Complex reasoning, architecture decisions, definition doc review: use the best one you have. Defaulting to the most powerful model for everything is like using a power drill to hang a picture frame.

Unnecessary regeneration. If AI output is 80% right, edit the 20% instead of regenerating. Regenerating costs as much as the original call and often produces something only marginally different. Get specific about what’s wrong and ask for a targeted revision.

Vague prompts that require many iterations. Every back-and-forth costs tokens. A better prompt upfront, with full context, a clear ask, and examples of what good looks like, means fewer iterations. This is why the time you spend on the definition doc pays off in Build: a precise definition means fewer prompting cycles to get the implementation right.

Prompt caching. Tools like Claude support prompt caching for repeated context: system instructions, CLAUDE.md contents, shared background. If you’re sending the same base context across many calls, use a tool that caches it. The first call pays; subsequent calls are significantly cheaper.

By Role

PM — highest leverage: thinking partner before leadership sessions, definition doc drafting, release notes, client communication, synthesising feedback into problem statements.

PD — highest leverage: exploring design alternatives before committing to a direction, writing UX copy, identifying usability gaps in your own designs, synthesising research into insights.

PE — highest leverage: implementation via definition doc prompts, maintaining CLAUDE.md, code review assistance, pressure-testing architecture decisions before proposing them.

For all roles: the habit that compounds fastest is using AI as the first stop before any significant piece of work, to think, to draft, to pressure-test. The people who do this consistently produce sharper work in less time. The people who only use it for grunt work miss most of the value.

Related: Build, Engineering Thinking