The behavioral impact of LLM billing models

Claude bills you differently than Copilot.

This has financial consequences, but the behavioral and workflow implications seem more important. They might even be the exact reason why the billing models differ.

A Claude Code subscription counts tokens. You get time-bound token quotas. When you hit a limit, no inference for you until the next refresh. Simple enough.

Copilot counts your “premium requests”. This does not mean REST requests. This does not mean LLM completions or generations. It seems convoluted at first (see docs) but I found that a good rule of thumb is “when you hit ENTER, it’s a new premium request”.¹

When using Claude Code, you learn to save tokens. You interrupt reasoning that’s going astray, you proactively provide missing context, you’re quick to retry with a slightly different prompt. You optimize for interaction and keeping yourself in the loop.

When using Copilot, you learn to save Premium Requests. You spend more time refining your prompts before sending them. You put more context in skill files and AGENTS.md. You optimize for long-running, autonomous task completion.
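To make the two incentives concrete, here is a rough sketch of how the same work is metered under each model. Everything in it is an assumption for illustration: the `Turn` type, both cost functions, and the token counts are made up, and real plans add details (refresh windows, model multipliers) that this ignores or simplifies.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One user prompt plus the tokens the model spent on it (hypothetical)."""
    tokens: int


def token_quota_cost(turns: list[Turn]) -> int:
    """Token-metered plan: every token counts against the quota."""
    return sum(t.tokens for t in turns)


def premium_request_cost(turns: list[Turn], multiplier: float = 1.0) -> float:
    """Request-metered plan: each prompt is one request, scaled by an
    assumed per-model multiplier, no matter how many tokens it burns."""
    return len(turns) * multiplier


# Made-up sessions: ten short, steered exchanges vs. one long autonomous run.
interactive = [Turn(tokens=2_000) for _ in range(10)]
delegated = [Turn(tokens=60_000)]

for name, session in [("interactive", interactive), ("delegated", delegated)]:
    print(name, token_quota_cost(session), premium_request_cost(session))
```

With these invented numbers, the interactive session is the cheap one when tokens are counted and the expensive one when prompts are counted. The delegated session is the reverse.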

Using Claude Code feels like pair programming. You can interrupt your engineering buddy at any time, point out potential pitfalls, comment on architecture, do a little back-and-forth to design a critical part of code.

Using Copilot feels like delegating. You have to make sure that all the tools and context are available ahead of time. You have to make more effort to document your code and processes well. Formalization and standardization help a lot. When you send Copilot off to work on something, you don’t micromanage. You verify the end result and hone your delegating skills further.

Part of the adaptation lies in our infrastructure and tooling. Prompts, tool integrations, autonomous task harnesses, task granularity and parallelization… they all can be adjusted to fit a particular billing model.

This is institutional in nature. It’s not cool, but it’s easier for me to accept than the other kind of vendor lock-in attempted here.

The other kind is behavioral. By now it’s no secret that the muscle memory trained over the years in users of a particular piece of software constitutes a hefty part of the software’s business value. What used to happen with keyboard shortcuts now happens with the choice of words when prompting the LLM, with the cadence of replies, with the balance between experimentation and grounding that one exhibits in a coding session that became a conversation.

How much is it you programming the computer, and how much is it the computer programming you?

The AI companies are trying really hard to capture the market and the closed-weight models cannot be their moat anymore. The resulting technological landscape shifts rapidly and feels really hostile to everybody. Everybody in the current, largely knowledge-based, economy increasingly has to depend on these tools for their daily work.

As to programming, both of the described billing approaches might have their rightful place in our toolboxes. Casual pair programming with a tight feedback loop for PoCs and ideation. Corporate-like structure and strictness required by Copilot for routine work on stable, long-running, well-documented projects. I just wish they were both offered by everybody and not used as weapons in the war over our brains.

¹ Model multipliers and whether a given product consumes premium requests at all are a different matter. An interesting instance is the autonomous (hosted) Copilot working in GitHub. It consumes both Premium Requests and GitHub Actions minutes.