Back in January I posted about how I view LLMs, which included my workflow of doing LLM-assisted coding. To summarize, my workflow was:

  1. Reverse rubber-ducking
  2. Planning and writing a spec file
  3. Implement each phase of the plan, one by one
  4. Validation and commit

I can say a lot has changed since January. For one, models are significantly better and more reliable. I also feel like I’ve gotten better at steering them. If I look at the list above by itself, without the details, it doesn’t feel like much has changed, but look closer and it’s a whole new world. My current workflow is an evolution of the above.

Before continuing, let me make something clear: this is my professional workflow. It’s what I use to write production code on cloud services.

Phase 0: Reverse Rubber-ducking

I still have this, but I no longer use a chat interface. I start directly with an agent (almost always OpenCode) and I now first get the agent to engage with the code before anything else. Let’s say I need to make a change to the flux capacitors, so I go and tell the agent what I think happens:

This is cloud-service-foo and it handles requests to create farbelizer connectors. I believe it then nimbolizes the farbelizers before sending them to cloud-service-bar that processes them through the flux capacitors. Check what the actual flow is and summarize it for me.

Most of the time, though not always, I know exactly how the flow works, but I do this to sort of prime the LLM for discussing what I want to change. I just found that it tends to work well for me; better than just telling it directly what I want to change.

A side effect of doing this is that sometimes it will tell me something that doesn’t match my understanding, so I ask for details to figure out if it’s really something I missed or just something the LLM got wrong.

When I know the LLM has the context, I will say something like:

I have an issue where if the farbelizer connector starts with “foo”, then the flux capacitors should suppress the harmonic back-feeding before it reaches the primary gimbal housing.

I often add a little more about what the actual goal is, but it’s something like this. This usually causes the LLM to tell me what it thinks should be done. It will sometimes ask a question or two, but eventually it will give me a solution. Many times the solution is one I know I don’t want due to some constraint and I will tell it. Sometimes I steer it a bit more and tell it what I think we should do.

And then when I’m happy, I’ll say:

Plan a series of independent PRs to implement this. List which ones can be done in parallel vs linearly

That’s it. That’s the entire plan phase now. I no longer need a spec file.

Phase 1: Implementation

Given the list of PRs, I will ask it to implement them either a few in parallel or, if there’s a dependency, one by one. I also now let the LLM agent commit its changes. (If they were done in parallel, I also let it push the changes.)

Again, that’s it.

Phase 2: Validation

I then test the work locally. I still insist on doing that because I don’t want to cause a SEV. I’ll then push the branches and review the diffs myself in GitHub before asking others to review: I want to avoid wasting people’s time.

You still have to watch them

I had an interesting interaction with GPT 5.5 a few weeks ago, where it wrote code that was akin to this:

attempt := 0

for {
	attempt++
	if attempt > maxAttempts {
		return errTooManyAttemps
	}
	err := fetchData(ctx)
	if err != nil {
		switch {
		case errors.Is(err, errTimeout):
			fmt.Println("Warning: Timeout occurred. Retrying...")

		case errors.Is(err, context.Canceled):
			fmt.Println(" -> Context was canceled. Exiting...")

		default:
			return err
		}
		
		time.Sleep(100 * time.Millisecond) 
	}
}

When I saw that, I immediately knew it didn’t look right, so I asked the agent about the case with the context.Canceled and it happily explained to me that it would log the error and then return with default:. I said, no, it won’t, that’s not how Go switches work. And it insisted! “I understand your confusion, but because there is no break statement, the code will simply fall through the next case.”

No, it forking won’t! So I told it to prove it by writing a test that returned a context canceled. It did, caught the infinite loop and conceded.

My point? They can still make mistakes. I have to check them.

Conclusion

That said, I will concede that the LLMs are so much better now and that these errors are getting more and more rare. My flow is much quicker than before. I still review code like a caveman, I still make sure the LLM gets what I want it to do. But I basically killed the entire “plan” step. It’s just not needed. And I almost never write code by hand.