Workflow · March 2026 · 14 min read

My Claude Code workflow for getting production-grade code in one shot

The prompt engineering, skills, plugins, and session habits that separate professional AI output from the generic garbage most people are getting. All of it, for free.

Sayuj Shah

AI Consultant · MS Computer Science (AI)

AI coding workflow with Claude Code skills and plugins

Key takeaways

  • Most AI output is mediocre because most people treat the session like a vending machine: type in a vague request, expect a polished result. The quality of what you get is almost entirely determined by what you set up before you start typing.
  • Skills and plugins are pre-built instruction sets and agents that dramatically expand what Claude can do. Loading the right ones before a task is the difference between a one-shot build and a debugging session that runs all afternoon.
  • CLAUDE.md is the permanent instruction manual for your project. Combined with Auto Memory and the claude-mem plugin, every session starts with full context and no re-explanation required, at a fraction of the token cost.
  • On the standard Claude Pro plan, two or three complex prompts used to drain a session. The right workflow (stronger prompts, smarter memory, and better context management) now gets me through entire project builds without touching the usage cap.

Most people I talk to who have tried AI coding tools and gotten nothing useful out of them share a common experience. They opened a chat window, typed a request, and expected something production-ready to come back. When the code was broken or generic, they concluded the tools were overhyped and moved on.

That is a reasonable conclusion. It is also wrong.

The developers and consultants getting genuinely professional output from AI are not using better tools. They are working differently. The quality of what Claude Code or any other model produces is almost entirely a function of what you give it to work with: context, constraints, examples, project structure, and the supporting infrastructure around the session itself.

What follows is my actual workflow. Not the vendor demo version. The one I use for production builds at Shah Labs, including the prep that happens before I open Claude Code, the skills and plugins I load for every session, concrete examples of what good prompts and setups look like, and how I keep token usage from becoming a wall.

It is all free information. If the free version is useful, you can imagine what the paid version goes into.

Why most AI-generated code is mediocre

The answer is shorter than you might expect. Vague inputs produce vague outputs.

When you open a fresh Claude session with no project context and type “build me a contact form,” the model does its best with nothing to go on. It does not know your tech stack, your design system, your validation rules, your API endpoint, or your error handling conventions. So it guesses. Sometimes well, usually not.

The fix is not a smarter model. It is more and better context, provided earlier. What separates one-shot outputs from four-round debugging sessions is almost always what happened before the first message was sent.

The prep work that happens before you open Claude Code

For complex tasks, I do not start in Claude Code. I start in Google Gemini.

I have a custom Gemini “Gem” configured specifically for prompt generation. It has detailed instructions about what a good prompt should include: what context to specify, what constraints to set, what output format to request, and critically, what clarifying questions to ask when my initial description leaves gaps.

That last part is where the real value is. When I give the Gem an incomplete description, it asks me clarifying questions instead of filling in the blanks by guessing. That exchange forces me to think through the task more carefully. It surfaces gaps I had not noticed. And it produces a cohesive prompt that Claude can act on in a single pass, no clarification round required.

My working theory: AI knows AI best. A model trained on similar data has a better intuition for what Claude needs in a prompt than most humans do. The prompts the Gem generates are structured differently than what I would write on my own. More specific in the right places. Less verbose in the wrong ones.

The result is one-shot prompts that save significant token budget by eliminating the back-and-forth that usually fills the first quarter of any session.
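For reference, here is a condensed sketch of the kind of instructions such a Gem might carry. This is illustrative, not my exact configuration:

```text
You convert rough task descriptions into complete prompts for a coding agent.

Before writing anything, check the description for gaps: tech stack,
target files, constraints, examples of correct patterns, output format.
If anything is missing, ask clarifying questions first. Do not guess.

Structure every final prompt as: goal, project context, hard constraints,
relevant files and components, one example of similar work done correctly,
and explicit "do not" guardrails. Be specific. No filler.
```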

That said, I do not do this for everything. For simpler, well-defined tasks, I skip the Gemini prep entirely and talk directly to Claude. More on that in the next section.

For simpler tasks: talk to your agent like a person

Not every prompt needs a structured framework. For routine or clearly defined tasks, natural language works best, but “natural” does not mean vague.

The mental model I use: you are briefing a smart, capable team member who is new to your project. They can do the work. They just need you to explain it completely before they start, because once they are in it, interruptions are expensive. You have one shot to relay the information before sending them on their way.

The difference between a weak and strong prompt is not about length. It is about specificity.

Weak prompt

Add a dark mode toggle to the site.

Strong prompt

Add a dark mode toggle to the site. The theme system uses next-themes with a data-theme attribute on the HTML element. All colors are CSS custom properties defined in globals.css. The toggle should match the NavBar component styling, sit in the top-right corner on desktop, and collapse to icon-only on mobile. Use the Framer Motion animation patterns from lib/animations.ts and build on the existing ThemeToggle component already in components/ui/. Do not hardcode any color values.

Both prompts ask for the same feature. Only one of them produces something usable on the first pass. The difference is that the strong version tells Claude exactly what constraints matter, where to look, and what to avoid.

A few things always worth including in any prompt: the relevant files or components, the constraints that are not obvious from the code, examples of similar work done correctly elsewhere in the project, and any “do not do this” guardrails. Negative examples are just as valuable as positive ones.
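To make that concrete, here is roughly the kind of logic the strong prompt above steers Claude toward. A hypothetical sketch, not actual output, and `resolveTheme` is a name I made up for illustration:

```typescript
type Theme = "light" | "dark";
type Preference = Theme | "system";

// Resolve the effective theme from the user's stored preference and the
// OS-level setting -- the next-themes-style behavior the prompt describes.
// Note there are no color values anywhere in this logic: per the prompt's
// guardrail, actual colors live in CSS custom properties keyed off the
// data-theme attribute, not in component code.
function resolveTheme(pref: Preference, systemPrefersDark: boolean): Theme {
  return pref === "system" ? (systemPrefersDark ? "dark" : "light") : pref;
}
```

The point is not this particular function. It is that every branch in it was determined by a constraint stated in the prompt, not by a guess.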

Want the exact prompt templates and Gemini Gem configuration I use? I offer a 90-minute information session ($199) that walks through this workflow applied to your specific tech stack and goals. Book a session and we get into the specifics.

What Claude Code skills and plugins actually are

If you have not heard of Claude Code skills, here is the plain version: they are pre-built instruction sets, agents, reference files, and context documents that you add to your Claude session to extend what it can do.

Think of it like a specialist contractor arriving at a job. A generalist with no tools can still get the work done, but slowly and with a lot of improvisation. A specialist who shows up with the right tools, built by people who have done this specific job hundreds of times, does it faster and with better results.

Skills are those tools. They are shared freely across the internet: on skills.sh, in GitHub repositories, and through developer communities. Many are built by engineers with deep domain expertise. A frontend design skill might include hundreds of pages of design system context, aesthetic judgment rules, and component architecture guidelines that took months to develop. You load it, and Claude inherits all of that knowledge.

Claude will detect and use available skills automatically when you give it a relevant task. That said, I have found it is worth being explicit about this at the start of every session.

Session opener I use every time
Before we start: conduct a full audit of all skills and plugins
available to you. List what you found, then identify which ones are
most relevant to the task ahead. Do not begin the task until you
have done this and confirmed your toolset.

Without that instruction, Claude does not always do a complete sweep. Relevant skills go unused. Telling it explicitly to audit and report back before starting means the session is properly equipped from the first message.

Plugins operate at a similar level but tend to be more tool-oriented, connecting Claude to external services or giving it new capabilities like persistent memory or semantic code search.

The skills that actually change output quality

A few stand out from the rest.

Superpowers is the most comprehensive and the one that changes how you work at the deepest level. It gives Claude a structured operating approach for complex tasks: how to write and execute plans, how to dispatch independent subtasks to parallel subagents, how to review its own code, and how to verify work is complete before marking it done. Sessions that use Superpowers feel less like talking to a chatbot and more like managing a small engineering team. The model plans before it builds, checks its own work, and runs independent tasks in parallel.

The frontend-design and impeccable skills completely change how Claude approaches visual work. Instead of producing technically correct but visually generic components, it produces work that reflects design thinking: proper spacing, intentional color use, real visual hierarchy. If you are building anything user-facing, this category is non-optional.

writing-skills is what I use to build custom skills of my own. It provides the scaffolding and conventions for creating a skill Claude will follow reliably. Using it, I have built skills specific to Shah Labs: our brand voice, our design conventions, our component library, our content strategy. Here is a simplified example of what a custom business skill looks like:

Example: Shah Labs custom skill (abbreviated)
# Shah Labs Brand & Design Skill

## Voice
Write in first-person. Conversational, direct, slightly contrarian.
No corporate language. No AI buzzwords ("leverage", "utilize",
"synergy"). Short sentences. Occasional fragments are fine.

## Design rules
- Colors: always CSS custom properties (var(--primary), var(--bg), etc.)
- Never hardcode hex values in component files
- All animations: Framer Motion with whileInView + scrollViewport
- Glass cards: use GlassCard component, not manual glass styles
- Fonts: DM Sans only, no other typefaces

## What NOT to do
- Do not add py-* padding to section elements (handled by globals.css)
- Do not create new files when editing an existing one would work
- Do not add error handling for scenarios that cannot happen

Every session I open for Shah Labs work starts with a Claude that already understands the business because that skill carries the context in. Building your own custom skill is the highest-leverage move in this entire workflow.
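A nice property of rules like "never hardcode hex values" is that they are mechanically checkable, which makes them good skill material. A toy illustration (`containsHardcodedHex` is a hypothetical helper I wrote for this article, not part of any skill):

```typescript
// Flag hardcoded hex colors (#fff, #6C5CE7, #RRGGBBAA) in a source string --
// the kind of guardrail a skill can state and a pre-commit check can enforce.
function containsHardcodedHex(source: string): boolean {
  return /#[0-9a-fA-F]{3,8}\b/.test(source);
}

containsHardcodedHex("color: var(--primary);"); // false -- compliant
containsHardcodedHex("color: #6C5CE7;");        // true  -- violation
```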

I build fully pre-equipped agents for businesses: loaded with the right skills, plugins, and your specific context, ready to handle your most repetitive operations from day one. See what that looks like for your team before you spend a session figuring it out yourself.

MCPs: connecting Claude to real systems

MCP stands for Model Context Protocol. The simplest explanation: MCPs give Claude access to external tools and services during a session.

Without MCPs, Claude works only with what is inside the conversation. With them, it can read from and write to real systems in your stack.

Supabase MCP connects Claude directly to your database. Instead of copying schema definitions into the prompt manually, Claude reads the actual live tables, runs queries, and generates migrations based on what the database looks like right now. No stale schema information, no guessing at column types.

Playwright and Puppeteer MCPs give Claude a live browser. It can navigate to pages, take screenshots, inspect elements, and test interactions. For web development, this means Claude can verify that what it built actually renders correctly instead of assuming it does.

GitHub MCP handles pull request management, issue tracking, and repository navigation without leaving the session. Other commonly useful ones include Slack MCP for reading channel context, and filesystem and Git MCPs for navigating large codebases more efficiently.

The MCP ecosystem is expanding quickly. Directories of available MCPs are worth reviewing before you set up any new project, because the right configuration depends on what systems that project actually touches.
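On the setup side, Claude Code reads project-level MCP server definitions from a .mcp.json file in the project root. A minimal sketch of the shape, with package names that are illustrative; check each server's own docs for the current install command, and keep token values out of the file:

```json
{
  "mcpServers": {
    "supabase": {
      "command": "npx",
      "args": ["-y", "@supabase/mcp-server-supabase@latest"],
      "env": { "SUPABASE_ACCESS_TOKEN": "..." }
    },
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```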

CLAUDE.md: the instruction manual that lives with your project

Every serious Claude Code setup has a CLAUDE.md file in the project root. Claude reads it at the start of every session. It is where you put the things you would otherwise have to explain from scratch every time.

For a software project, that means the tech stack, architecture patterns, naming conventions, which tools to use for which tasks, and any constraints that are not obvious from reading the code itself. Here is the basic structure I use:

CLAUDE.md structure (template)
# CLAUDE.md

## Commands
```bash
npm run dev      # Start dev server at localhost:3000
npm run build    # Production build
npm run lint     # Run ESLint
```

## Architecture
[2-3 sentences: what the project is, the tech stack, the main entry point]

## Key conventions
- Colors: [how colors are defined — CSS vars, design tokens, etc.]
- Components: [where they live, how they are named]
- Animations: [which library, which patterns to follow]
- Fonts: [which typeface, how it is loaded]

## What NOT to do
- [List the guardrails Claude should never cross]
- [Examples: do not hardcode values, do not create files unnecessarily]

## Environment
[API keys, env vars the project needs — names only, not values]

## Brand & copy context
[Voice, tone, audience — especially important for content-heavy projects]

Without CLAUDE.md, the first chunk of every session is re-orienting Claude to the project. With a good one, the first message is actually about the task. The time you invest writing it once pays back across every session that follows.

It is also a living document. As the project evolves, the file evolves with it. One of the practices I follow: at the end of any session where I make a significant architectural decision, I update CLAUDE.md with that decision before closing out.

Auto Memory, introduced recently by Anthropic, extends this further. Instead of a single static file, Auto Memory lets Claude build up and refer to persistent memory across sessions: user preferences, project decisions, feedback you have given it, and context that does not belong in the code itself. The difference between an assistant who starts fresh every morning and one who actually remembers what you worked through last week.

The claude-mem plugin and keeping sessions lean

The biggest practical bottleneck with Claude is usage. It is a token-expensive model by design. The quality that makes it the best coding assistant available is the same quality that drains a session quickly. On the standard Pro plan, before I had this workflow figured out, two or three complex prompts and I was watching the usage limit approach.

The claude-mem plugin changed that significantly. It compresses session context and manages memory persistence in a token-efficient way. Instead of carrying the full conversation history forward as the session grows, it maintains a compact summary of what has been established and decided, clearing out what is no longer needed. The session stays lean. The context stays intact.

For context on what “usage limits” means practically: Claude Pro has a usage cap that resets every five hours, plus a weekly cap on top of that. Before this workflow, I would exhaust a five-hour session in a few prompts during complex tasks. Now I complete full project builds (multiple features, testing, and refinements) without touching the cap once.

Combined with a few other habits, token usage dropped dramatically:

Start with a strong prompt. Every clarification round costs tokens. Front-loading the context means fewer rounds, which means less total usage for the same output. The Gemini prompt prep pays back here.

Use semantic search instead of brute-force reads. Grepai finds relevant code by intent rather than filename. Instead of telling Claude to read 10 files that might be related, you describe what you are looking for and grepai returns the 2 that actually matter.

Clear context at natural stopping points. When one task finishes and another starts, compress or clear the conversation history before moving forward. The memory persists via claude-mem. The raw conversation history does not need to.
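In Claude Code, this maps to two built-in slash commands:

```text
/compact   summarize the conversation so far, then drop the raw history
/clear     wipe the conversation entirely (claude-mem's memory persists separately)
```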

Split large projects across sessions. Memory persists between sessions. A single session does not need to hold everything. Breaking a large project into well-defined sessions with clear handoff points is more efficient than trying to squeeze everything into one.

How to manage a Claude Code session without losing control

The most common way a coding session goes wrong is a lack of oversight. Claude starts a task, makes assumptions along the way, and delivers something that works technically but is not what you had in mind. By the time you notice, there are 200 lines of code to unwind.

The fix is regular check-ins. After each meaningful chunk of work, I ask Claude to report back. Not “how is it going,” but a specific structured request:

Check-in prompt I use after each task batch
Before continuing: give me a summary of what you just completed,
any architectural decisions you made along the way and why,
any assumptions you made that I should be aware of,
and what you are planning to do next. Wait for my confirmation
before proceeding.

That check-in is where misalignments surface before they compound. If Claude made a decision you disagree with, you catch it at one task instead of discovering it after five downstream tasks have been built on top of it.

When Superpowers is loaded, Claude will sometimes dispatch tasks to parallel subagents. This is genuinely useful because multiple independent tasks can run simultaneously. But it also means you have less direct visibility into what each subagent is doing. After a subagent batch completes, I review outputs before moving forward. Every time.

Build verification is non-negotiable. Before declaring any meaningful task complete, Claude confirms the build actually passes:

Verification prompt before marking any task done
Before marking this task complete: run the build and confirm
it passes with no errors or warnings. If anything fails, diagnose
the root cause and fix it. Do not declare this done until
you have a clean build output to show me.

The underlying principle: you hold the vision. Claude handles the execution of specific, well-defined tasks within that vision. Your job is to know what you are building well enough to recognize when the output is drifting from it. That requires genuine expertise in what good output looks like. The tool amplifies the expertise you bring. Without that expertise, you get confident-looking output that has no foundation.

The token efficiency picture, before and after

Claude has usage limits tied to five-hour windows on the Pro plan, with a weekly cap on top. Before this workflow, a couple of complex prompts in a single session would push me up against that wall. Attempting a full project build across one session was not realistic.

The combination of everything described above changed that: stronger prompts, relevant skills loaded from the start, claude-mem for session compression, grepai for targeted code retrieval instead of broad file reads, and CLAUDE.md so there is no re-orientation cost at the start of each session.

I now complete full project builds (multiple features, build verification, and refinements) within a single session budget. Regularly, on a standard Pro plan.

The improvement does not come from Claude getting cheaper. It comes from the workflow getting smarter about what it asks Claude to do, and how it asks.

Cursor as an alternative worth knowing

Claude Code is what I use most. Cursor is the other tool in the rotation, and the two are increasingly interchangeable now that Cursor has added skills and plugin support.

Cursor's practical advantage is model flexibility. You can run sessions against OpenAI models, which are more token-efficient than Claude's models at the cost of some output quality. For tasks where raw reasoning matters less (editing documentation, refactoring boilerplate, generating test data), that tradeoff can make sense. For complex architecture work and anything requiring genuine judgment about the codebase, Claude stays ahead.

The workflow described in this article transfers to Cursor. CLAUDE.md has a direct equivalent in AGENTS.md. Skills and MCPs work across both environments. The investment in setup carries across both tools, which means you are not locked into one if the other serves a specific task better.

It's a tool, not a replacement

There is a version of this article that makes the whole thing sound like magic: set up the workflow and the code ships itself. That version is not true, and writing it would not be useful to anyone who actually tries it.

What this workflow does is remove friction between your expertise and the output. Claude does not provide the expertise. It executes against it. Without a real understanding of how software is built, what good architecture looks like, and where security and performance risks live, the assistant produces confident-sounding wrong answers at high speed. The session management, build verification, and check-in prompts described above only work because there is someone with the judgment to evaluate the answers.

The business owners I help with AI implementation are not using these tools to become developers. They work with someone who holds the technical expertise and uses these tools to produce professional output faster. The tool is in the workflow, not replacing the person making the decisions.

Injecting a coding session with the right skills, plugins, memory, and context is what lifts output from generic to production-grade. The vision behind the session, and the expertise to recognize when the output is wrong, is still yours to bring.

If this is free, imagine what the paid version looks like.

Two ways to go deeper. A 90-minute information session ($199) walks through the full workflow applied to your specific tech stack, your prompts, your setup. Or if you want to skip the learning curve entirely, I build the agent for you, pre-loaded with everything it needs to handle your operation from day one.


Want to apply this to your business?

Book a free 30-minute call and we figure out whether a custom agent build makes sense for your operation, or whether a one-on-one workflow session is the better starting point. Either way, you leave with a clear next step.
