DevAi

Knowledge graph

home method services case-studies audiences skills projects decisions playbooks notes influences open-questions about
← open questions

Is prompt engineering more leveraged than the harness, long-term?

As base models keep getting better at following short, loose prompts, does the long-term leverage shift back to prompt-craft, or does it stay with the harness around the agent (typed adapters, file-driven planning, adversarial review, test scaffolds)? For code I expect to maintain in a year or three, where’s the compounding return?

The tension

Prompt techniques are getting cheaper, stronger base models need less prompting. Harness work stays expensive to build and to keep current. If model capability keeps climbing fast enough, a lot of scaffolding could look like over-engineering in three years. On the other hand, prompts go stale every time a model changes; typed adapters, file-driven phases, and adversarial review loops don’t.

My current lean

Harness over prompt. The decisions and playbooks already on this site embody that lean:

The bet is that harness pays off more on maintenance than prompt-craft does, because maintenance is when the model, the codebase, and the team have all changed.

What would change my mind

Experienced practitioners shipping and maintaining long-running agent systems, Karpathy and the small group of people doing this work in public, landing on “the harness was over-engineering; prompt discipline would have carried it.” If the consensus among that group shifts toward prompt-first for long-lived code, revisit.

Status

Open. Not holding the lean tightly.

Talk to me

Pick a doorway.

Remote, UTC+2. Open to contract, retainer, or lead-developer roles.

Drop a note to info@devai.co.za