Skills, not features: notes on a methodology that works for one person at a time

This post is overdue. I shipped /add-parallel into NanoClaw in early February. The merge happened the same day I opened the PR. I have been thinking about that experience ever since, not because the PR itself was complicated, but because the contribution model was not normal. Three months on, the pattern has a name (SkDD), a popular framework on top of it (Superpowers, around 95K stars), and a security audit calling it out as a supply chain risk (Snyk's ToxicSkills, 36% of audited skills flagged). It is time to write down what I actually think.

The PR

NanoClaw is Gavriel Cohen's container-isolated alternative to OpenClaw. Around 500 lines of TypeScript on trunk, agents run in actual Linux containers, security is enforced by the OS rather than by application-level allowlists. The contributing rule is exactly five words: don't add features, add skills.

So when I wanted to wire up Parallel AI as an integration, I did not add code to trunk. I wrote a SKILL.md that teaches Claude Code how to transform a NanoClaw fork to include Parallel. A user runs /add-parallel in their own checkout, Claude reads the skill, modifies the local code, and the integration is in place. Trunk never sees the diff. My PR added one skill file and a registration entry. That was it.

The strange part was not the code. The strange part was the conceptual flip. The PR did not extend NanoClaw. It added a recipe that lets each user extend their own copy of NanoClaw differently. That is the whole skills-driven development idea in one example.

The methodology, briefly

The pattern was formalized by Zak El Fassi in March under the name SkDD. Every build loop adds one decision gate: should this become a skill? If yes, you write a SKILL.md. The agent finds it, loads it, runs it next time. Three skill types: operational (do discrete work), meta (create other skills), composed (chain skills into pipelines). The compounding happens over months.

Jesse Vincent's Superpowers is the most-adopted implementation for individual development workflows. It bakes TDD, brainstorming, code review, and subagent-driven implementation into a skills library that works across Claude Code, Codex, Cursor, OpenCode, and Gemini CLI. The format converged organically rather than being formally standardized.

NanoClaw is the most opinionated project-level application of the philosophy I have seen. Other projects use skills as a developer's personal productivity layer. NanoClaw uses skills as a contribution model. The codebase deliberately stays minimal because the extensibility lives in the skills branch.

What it gets right

The trunk stays auditable. This is the security argument, and it is real. NanoClaw is small enough to read in an evening. You add the channel, agent provider, or integration you need; you do not inherit the security surface of fifty modules other users wanted. Compare to OpenClaw at ~400K lines: nobody is reviewing that codebase end to end.

The compounding is real for personal repos. I write a skill once. Future-Claude finds it and uses it. The skill survives session boundaries, model swaps, and harness switches. Three months in, I have skills that I no longer consciously remember writing but that get invoked automatically. That is genuine compounding.

It is harness-agnostic. A SKILL.md is just markdown. Claude Code, Codex, Cursor, OpenCode all read it. You are not locking yourself into a vendor's plugin system. This is the most underrated property of the format.

It forces small, composable units. A skill that tries to do too much is hard to write and unreliable to invoke. The format pushes you toward single-responsibility units, which is the same discipline that makes good Unix tools.

Where it breaks

This is the part I do not see written down enough.

Heterogeneous state is the default outcome. When every user runs different /add-* commands against their own fork, no two installations are the same. For a personal AI assistant, that is the point. For a team product, it is a disaster. You cannot debug a deployment when "the codebase" is a hypothesis rather than a fact. You cannot do incident response when the production environment is a snapshot of one developer's skill choices six months ago.

Skills-driven development does not generalize to team production code. It works for personal projects, dev tooling, and individual workflows. It does not work for software multiple humans need to reason about together. The whole point of trunk-based development, code review, and shared conventions is to keep the team's mental model of the system in sync. Skills-driven development inverts that: each fork drifts intentionally.

The output depends on the LLM. Two users running the same /add-telegram skill with Claude Opus 4.7 and GPT-5.2 get different code. Sometimes meaningfully different. The skill is a prompt, not an executable. The result is what the model decides to do with that prompt in that context. For a deterministic build, this is unacceptable. For a personal assistant that gets close enough to what you wanted, it is fine. Know which one you are building.

Skill quality is mostly invisible to the user. Reading a SKILL.md does not tell you what the agent will actually do. The instructions look reasonable. The output may not be. You find out by running it, which is fine on a personal fork and dangerous when the skill touches credentials, deploys code, or modifies shared infrastructure.

The security context is worse than the methodology suggests

Snyk's ToxicSkills audit (February 2026) scanned 3,984 skills from ClawHub and skills.sh. 36.82% had at least one security issue. 13.4% had critical issues including malware, credential theft, and prompt injection. There is no package-signing standard. There is no central review. The format converged faster than the supply chain hygiene around it did.

The lesson is not that skills are bad. The lesson is that "just install this skill" should carry the same suspicion as "just run this curl pipe to bash". Read the skill before you run it. If you would not paste the contents of the SKILL.md into your terminal manually, do not let an agent do it for you.

Where this actually fits

Skills-driven development is a good fit for personal AI assistants (NanoClaw is the proof), individual developer workflows (Superpowers is the proof), and exploratory tooling where each user's needs diverge by design. The compounding is real and the trunk-minimalism is genuinely useful.

It is not a fit for team production code, regulated environments, or anywhere multiple humans need a shared understanding of what the system does. Determinism matters in those contexts, and skills-driven development trades determinism for compounding personal capability. That is a fine trade for one person. It is a bad trade for an organization.

The reason this is worth being explicit about is that the writeups currently in circulation treat SkDD as a general-purpose development methodology. It is not. It is a specific tool for a specific class of problem. For that class, it works well. For everything else, it produces faster chaos.

/add-parallel was the right way to contribute Parallel AI support to NanoClaw. It would have been the wrong way to add Stripe support to a company's billing service. The methodology is not universal. Knowing which side of that line you are on is the part the hype is glossing over.

SkDD methodology | Superpowers | NanoClaw | Snyk ToxicSkills audit