This is a systematic, field-tested guide to shipping real production code quickly and safely with an AI coding assistant (specifically Claude), and the principles to follow along the way. It covers both why AI can 10x your development productivity and where it falls short — plus concrete practices to overcome those limits. 🛠️
1. Introduction: AI Coding — From Meme to Practice
- AI coding is a new development paradigm in which AI writes the code and developers direct and supervise it.
- The goal of this guide is to share how AI can genuinely 10x development productivity and the engineering culture and practices required to make that happen.
"Good development habits are not optional. Whether AI amplifies your capabilities or amplifies your chaos depends entirely on those habits."
- Research shows that teams with rigorous engineering habits deploy 46× more frequently and move from commit to deploy 440× faster. (Source: Accelerate: The Science of Lean Software and DevOps)
2. From Meme to Methodology: The Birth of Vibe Coding
- Vibe Coding started as a tongue-in-cheek tweet by Andrej Karpathy — the idea of "AI writes the code, developer just vibes."
- But with tools like Anthropic's Claude Code, that joke has started to become real.
"Using Claude without the right setup is like playing whack-a-mole with an over-eager intern."
- The Julep team is actually shipping production code with Claude on a large backend codebase with deep historical context.
3. Understanding Vibe Coding: How to Collaborate with AI
Traditional Coding vs. Vibe Coding
- Traditional coding: Like sculpting marble — you carve every line yourself.
- Vibe Coding: Like conducting an orchestra — you direct the AI and set the overall structure and direction.
"Instead of crafting every line, you review, refine, and redirect. But you are still the architect. Claude is just a context-free intern."
The 3 Stances of Vibe Coding
- AI as draft author
  - AI quickly generates a baseline implementation; the developer focuses on design and architecture.
  - Best for repetitive or boilerplate code.
- AI as pair programmer
  - Active back-and-forth collaboration with the AI.
  - The developer sets the outline; AI fills in the details.
  - Best for most day-to-day development.
- AI as reviewer
  - AI reviews developer-written code and suggests bug fixes or improvements.
  - Acts as a tireless code reviewer.
4. The Practical Framework: 3 Vibe Coding Modes
4.1. Mode 1: Playground (Experimentation / Prototyping)
- When?
- Weekend hacks, personal scripts, POCs, "does this even work?" experiments.
- Characteristics
- Claude writes 80–90% of the code; developer provides minimal direction.
- Fast — but never use this for production!
"Playground mode is all about speed, but never use it for code that matters. Good engineering principles matter even more right now."
4.2. Mode 2: Pair Programming (Real Work / Small Services)
- When?
- Projects under 5,000 lines, side projects with real users, demos, small services.
- Core tool: CLAUDE.md
  - A project-specific document Claude reads automatically.
  - Clearly documents coding rules, architecture, testing approach, style guide, forbidden patterns, and more.
"CLAUDE.md is a special file Claude reads automatically at the start of every conversation. Write your rules, style, testing approach, and forbidden patterns once — and never explain them again."
Example: Excerpt from CLAUDE.md
## Project: Analytics Dashboard
### Architecture Decisions
- Default to Server Components; use Client Components only when necessary.
- Use tRPC, Prisma, Tailwind.
### Code Style
- Prettier, 100-char line limit, sorted imports.
- Components in PascalCase, co-located with tests.
### Patterns
- Data fetching in Server Components only.
- All external data validated with Zod schemas.
### Forbidden
- No data fetching with useEffect.
- No TypeScript `any`.
Anchor Comments
- Special comments that serve as navigation markers for both AI and humans.
- Example:
// AIDEV-NOTE: This component uses virtual scrolling. Handles 10,000+ items.
// Claude: preserve virtual scrolling when making changes!
"Anchor comments are navigation signposts for both AI and developers. They keep Claude from getting lost."
4.3. Mode 3: Production / Large-Scale (Monorepo) Scale
- When?
- Large codebases, real users, systems where bugs cost money or trust.
- Caveat
- As of 2025, large-scale vibe coding does not scale perfectly.
- The realistic approach: apply it at the service or submodule level.
"Without clear boundaries, Claude can 'improve' an API and break every client downstream."
Example: API boundary comment
# AIDEV-NOTE: API Contract Boundary - v2.3.1
# Any change requires a version bump and migration.
# Claude: never modify the response structure!
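One way to make a boundary like this enforceable rather than advisory is a contract test that pins the exact response shape, so any structural change fails CI and forces a deliberate version bump. A minimal sketch — the handler, version string, and field names here are all hypothetical stand-ins, not the article's actual API:

```python
API_VERSION = "2.3.1"  # assumed version string, mirroring the boundary comment

def get_user_response(user_id: str) -> dict:
    """Stand-in for the real endpoint handler (an assumption for this sketch)."""
    return {
        "version": API_VERSION,
        "data": {"id": user_id, "name": "Ada"},
        "errors": [],
    }

def test_user_response_contract():
    resp = get_user_response("u-1")
    # Pin the top-level keys: nobody (human or AI) can add, remove, or
    # rename a field without this test failing in CI.
    assert set(resp) == {"version", "data", "errors"}
    assert resp["version"] == API_VERSION
    assert set(resp["data"]) == {"id", "name"}
```

With a test like this in the "never modify test files" zone, Claude physically cannot change the response structure without a human noticing.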
5. Infrastructure: The Foundation for Sustainable AI Development
5.1. CLAUDE.md: The Single Source of Truth
- CLAUDE.md is not optional — it is essential.
- It is the constitution of your codebase: rules and forbidden patterns, terminology definitions, architecture decisions, and anti-patterns, all clearly recorded.
"One minute invested in CLAUDE.md saves one hour of cleanup later."
Example: Key sections in CLAUDE.md
## Absolute Rules
- If an implementation is ambiguous, always ask the developer before proceeding.
## Architecture Decisions
- Using Temporal: long-running workflows, automatic recovery.
- PostgreSQL + pgvector: ACID compliance, vector search.
## Forbidden
1. Never modify test files.
2. Never change API contracts.
3. Never modify migration files.
4. Never commit secrets.
5. Never infer business logic.
6. Never delete AIDEV- comments.
Anchor Comments in Practice
# AIDEV-NOTE: Performance-critical path — handles 100k req/sec.
# Do not add DB queries here.
def get_user_feed(...):
    # AIDEV-TODO: pagination not yet implemented
    # AIDEV-QUESTION: why filter private items here?
    # AIDEV-ANSWER: privacy rules may change between cache updates
5.2. Git Workflow
- Isolate AI experiments with git worktrees
- Lets Claude experiment freely without polluting the main branch.
- Cherry-pick only successful commits.
git worktree add ../ai-experiments/cool-feature -b ai/cool-feature
# let Claude experiment
git cherry-pick abc123 # bring only the good commits to main
git worktree remove ../ai-experiments/cool-feature
- Standardize AI commit messages
feat: implement user feed caching [AI]
- core logic AI-generated, tests written by human
"Every commit with AI involvement must be tagged [AI] so reviewers know to look more carefully."
6. The Most Important Rule: Tests Must Always Be Written by Humans
"Never, ever, ever let AI write your tests."
- Tests are not just code verification — they are executable specifications that encode business intent, edge cases, and domain understanding.
- Tests written by AI only verify that "the code does what the code does," not that "the code does what you want."
Example: Memory leak missed by AI
class RateLimiter:
    ...
# AI-written tests only covered the happy path — the memory leak went undetected!
- Julep's rules:
| Item | AI can do | AI must never do |
| -------------- | ----------------------- | ---------------------- |
| Implementation | Generate business logic | Modify test files |
| Test planning | Suggest scenarios | Write test code |
| Debugging | Analyze failures | Modify expected values |
"If AI touches a test file, the PR is rejected. No exceptions."
7. Cautions for Large-Scale Use: Token Economics and Context Management
- Skimping on context costs you more tokens, not fewer.
- Provide information that is "relevant and sufficient," not minimal — that's how AI gets it right the first time.
Example: Insufficient vs. sufficient context
- Thin prompt
  - "Add caching to the user endpoint."
  - → AI reaches for an in-memory cache with no invalidation, no monitoring, and so on.
- Sufficient prompt
  - "Add Redis caching, 12 servers, 1-hour TTL, prevent cache stampede, include Prometheus metrics, follow the caching guide."
  - → Correct implementation on the first try.
"Tokens are the cost of a good tool. Don't skimp on context."
- Keep Claude sessions separate per task
- Do not mix multiple topics in a single session.
- Follow the "one task, one session" principle.
8. Real-World Case Study: Refactoring Error Handling Across 500+ Endpoints
- Humans decide the Why:
- Error hierarchy, error codes, response format — these must be designed by a human.
# SPEC.md — Error Hierarchy
- Client errors (4xx) vs. system errors (5xx).
- All errors JSON-serializable, stable error codes.
- AI executes the How:
- Claude refactored 500+ error-handling sites in a single pass.
# Before
raise Exception("User not found")
# After
raise AuthenticationError(
    message="User not found",
    code="USER_NOT_FOUND",
    details={"identifier": email},
)
"Thanks to clear specs, CLAUDE.md, anchor comments, and documentation, a two-day job took four hours."
9. Leadership and Culture: Running an AI-Era Engineering Team
- The evolving role of the senior engineer
  - From code author → knowledge curator, boundary setter, and mentor for both AI and humans.
- Onboarding checklist
  - Week 1: Read CLAUDE.md, set up environment, submit first PR (human-only).
  - Week 2: Configure Claude, work through toy problems with AI, submit first AI PR (supervised).
  - Week 3: Develop a full AI-assisted feature independently, write tests for others' AI code, lead a code review.
- Culture of transparency
  - Tag every AI-involved commit with [AI], [AI-minor], [AI-review], etc.
  - "Don't hide AI involvement. Use it responsibly and transparently."
10. What Claude Must Never Touch (The Sacred Rules)
❌ Test Files
# Sacred ground.
# No AI entry.
def test_critical_business_logic():
"""This test encodes $10M worth of domain knowledge."""
pass
❌ Database Migrations
-- AI touches this → data gone = career gone
ALTER TABLE users ADD COLUMN subscription_tier VARCHAR(20);
❌ Security Code
# HUMAN EYES ONLY — security boundary
def validate_token(token: str) -> Optional[UserClaims]:
    # Security team approval required
    ...
❌ API Contracts (without versioning)
# openapi.yaml
# Break this and all clients stop working
❌ Config / Secrets
DATABASE_URL = os.environ["DATABASE_URL"] # never hardcode
Tiers of AI Mistakes
- Tier 1: Annoying but harmless
  - Formatting errors, verbose code, inefficient algorithms.
- Tier 2: Expensive to fix
  - Broken internal APIs, changed patterns, unnecessary dependency additions.
- Tier 3: Career-threatening
  - Modified tests, broken API contracts, leaked secrets/PII, corrupted data migrations.
"A Tier 3 mistake is a LinkedIn profile update situation."
11. Future Outlook: The Next Stage of AI Development
- AI that understands the entire codebase
- Persistent memory across sessions and projects
- Proactively suggesting improvements without being prompted
- Learning team patterns and preferences over time
"Documentation is the core of DevOps. In the AI era, docs are the interface between human intent and AI capability."
12. Practical Guide: Start Today
Today
- Create a CLAUDE.md for your project.
- Add 3 anchor comments to your most complex code.
- Set clear boundaries and build one feature with AI.
This Week
- Agree on AI commit message conventions with your team.
- Run an AI pair-coding session with a junior engineer.
- Write tests yourself for code AI generated.
This Month
- Measure deployment frequency before and after AI adoption.
- Build a library of frequently used prompt patterns.
- Run a team retrospective.
"The most important thing is to start. Don't wait for perfect. Begin with a small project, establish your boundaries, and iterate."
13. References and Resources
- 📄 CLAUDE.md template example
- 🤝 Twitter @diwanksingh
- 💬 Vibe Engineering tag
- 📚 Recommended reading
- Accelerate: The Science of Lean Software and DevOps
- Fundamentals of Software Architecture
- The Fifth Discipline
- Beyond the 70%: Maximising the Human 30% of AI-Assisted Coding
"Perfect is the enemy of shipped. Start small, set clear boundaries, and iterate. The future is already here — it's just not evenly distributed yet."
Become part of the distribution. 🚀
