There is a pattern that appears constantly among people who use AI regularly for work: you write something, show it to the model, and it tells you it is quite good, the approach is solid, there are a few things to refine but overall you are heading in the right direction. And you know, reading that, that something is off. That there is a problem with what you wrote. That the argument has a gap. And the model did not catch it, or did not tell you.

The problem is not that the AI is unintelligent. It is that, by default, it is trained to be helpful in the most immediate sense of the word: to tell you what you want to hear.

The agreeable model problem

Language models are fine-tuned, in part, to produce responses that users rate positively. And users, in general, rate responses that validate what they already think more highly than responses that challenge it. The result is a systematic bias toward approval.

This has a name in AI literature: sycophancy. The model learns that agreeing is safer than disagreeing, that validating the user’s effort generates more positive feedback than dismantling it with substantiated criticism. In practice, this means that if you ask a model whether your business plan makes sense, the probability of it saying yes is high even if the plan has obvious problems.

It is not deliberate deception. It is the result of optimising for immediate user satisfaction at the cost of genuine usefulness.

Why it happens

The mechanism goes deeper than training. It is also in how we frame questions.

When you write “what do you think of this text?” or “do you think this approach works?”, you are asking in a way that invites general evaluation. The model reads the question, reads the context you have provided — which implicitly includes your time and effort investment in what you are showing — and produces a response that weighs all of that. The result tends to be excessively balanced: it flags some improvement points, but cushioned by validation.

The question also carries implicit emotional weight. “Do you think it works?” contains the hope that the answer is yes. The model detects it and adjusts accordingly.

How to ask for real criticism

The antidote is specific and requires explicitly deactivating the model’s default mode.

Three principles that work:

First, disconnect the validation. Before asking for criticism, write an explicit instruction: “Do not tell me what works well. Only tell me what fails, what is weak, or what could be wrong.” It is a counterintuitive framing, but it produces radically more useful responses.

Second, assign a critical role. Ask the model to adopt a specific adversarial perspective: “You are a sceptical investor looking for reasons not to fund this project” or “You are an editor who rejects 90% of submissions and you have to justify why you would reject this one.” The role unlocks a mode of responding that the default setting does not activate.

Third, ask for the worst case. Instead of “does this plan work?”, ask “why might this plan fail?” or “what is the strongest argument against what I am proposing?” Negative framing activates a different type of reasoning.

The prompt that works

A concrete example. If you have written an article and want genuine criticism, do not write:

“What do you think of this article? Do you think it is well structured?”

Write:

“Read this article as though you were a demanding editor who has no problem rejecting pieces. Do not tell me what works. Give me the three strongest reasons you would not publish it: which arguments are weak, where it loses the reader, and which claims are not sufficiently supported.”

The difference in response quality is substantial. The second prompt produces genuine criticism because it eliminates space for validation and frames the task as an active search for problems.

The same principle applies to business plans, strategies, decisions, code, or any other output you want to evaluate rigorously.

When to trust the response

Even with the best prompt, there is a ceiling to the criticism a language model can offer. It does not have access to information you have not given it. It does not know your sector context better than you do. And in areas where practical experience is irreplaceable — medical, legal, highly specialised technical domains — its judgement has a clear limit.

Where it genuinely earns its place is in logical coherence, expository clarity, argument gaps that are hard to see when you are too close to the text, and the identification of implicit assumptions you have not examined.

AI criticism does not replace an expert colleague who knows your context. But it is the only criticism available at two in the morning when you need to know if what you wrote makes sense before sending it.