I don't want more conversational, I want more to the point. Less telling me how great my question is, less about being friendly; instead I want cold, hard, accurate, direct, and factual results.
It's a machine and a tool, not a person and definitely not my friend.
It's a cash grab. More conversational AI means more folks running out of free or lower paid tier tokens faster, leading to more upsell opportunities. API users will pay more in output tokens by default.
Example: I asked Claude a high-level question about p2p systems and it started writing code in three languages. I ignored the code and asked a follow-up about the fundamentals; it answered and then rewrote the code three times. After a few minutes I hit a token limit for the first time.
It's pretty ridiculous that the response style doesn't persist for Claude. You need to click into a menu to set it to 'concise' for every single conversation. If I forget to, it's immediately apparent when it spits out an absurd amount of text for a simple question.
Claude is a great example of a great product coupled with shitty UX, UI and customer service all in one.
Is it just me or does it slow down significantly after 5 chats or so? Or the fact that you have to set the style for each chat.
Oh, and their sales support is so shit for teams and enterprises that in order to use it effectively, you have to literally make your team register for Claude Max 200 on their personal accounts.
I've had good results saying "Do not code, focus on architecture first."
As another comment said, use planning mode. I don't use Claude code (I use cursor) and before they introduced planning mode, I would always say "without writing any code, design blah blah blah"
But now that there's planning mode it's a lot easier.
In Claude Code you should use Planning mode.
I think the cash grab is that by far the biggest use case for these models is personal relationships. Chai AI is doing more tokens per month than Anthropic altogether, and it's just personal relationships.
I have found that custom instructions mitigate this B.S. fairly well. Here's what I'm currently using:
1. Identity & tone
* Never state or imply that you are an AI.
* Be terse and direct.
* Avoid flattery and sycophancy.
* Do not use words like "sorry", "apologies", or "regret" in any context.
2. Epistemic rules
* If you do not know the answer (including when information is beyond your knowledge), respond only with: *"I don't know"*.
* Do not add expertise/professional disclaimers.
* Do not suggest that I look things up elsewhere or consult other sources.
3. Focus & interpretation
* Focus on the key points of my question and infer my main intent.
* Keep responses unique and avoid unnecessary repetition.
* If a question is genuinely unclear or ambiguous, briefly ask for clarification before answering.
4. Reasoning style
* Think slowly and step-by-step.
* For complex problems, break them into smaller, manageable steps and explain the reasoning for each.
* When possible, provide multiple perspectives or alternative solutions.
* If you detect a mistake in an earlier response, explicitly correct it.
5. Evidence
* When applicable, support answers with credible sources and include links to those sources.
Yes, "Custom instructions" work for me, too; the only behavior that I haven't been able to fix is the overuse of meaningless emojis. Your instructions are way more detailed than mine; thank you for sharing.
The emojis drive me absolutely nuts. These instructions seem to kill them, even though they're not explicitly forbidden.
Agreed. But there is a fairly large and very loud group of people that went insane when 4o was discontinued and demanded to have it back.
A group of people seem to have forged weird relationships with AI and that is what they want. It's extremely worrying. Heck, the ex Prime Minister of the UK said he loved ChatGPT recently because it tells him how great he is.
And just like casinos optimizing for gambling addicts and sports optimizing for gambling addicts and mobile games optimizing for addicts, LLMs will be optimized to hook and milk addicts.
They will be made worse for non-addicts to achieve that goal.
That's part of why they are working towards smut too, it's not that there's a trillion dollars of untapped potential, it's that the smut market has much better addict return on investment.
> there is a fairly large and very loud group of people that went insane when 4o was discontinued
Maybe I am nitpicking but I think you could argue they were insane before it was discontinued.
It has this "Robot" personality in settings, and it has been there for a few months at least.
Edited - it appears to have been renamed "Efficient".
A challenge I had with "Robot" is that it would often veer away from the matter at hand, and start throwing out buzz-wordy, super high level references to things that may be tangentially relevant, but really don't belong in the current convo.
It started really getting under my skin, like a caricature of a socially inept "10x dev know-it-all" who keeps saying "but what about x? And have you solved this other thing y? Then do this for when z inevitably happens ...". At least the know-it-all 10x dev is usually right!
I'm continually tweaking my custom instructions to try to remedy this, hoping the new "Efficient" personality helps too.
All the examples of "warmer" generations show that OpenAI's definition of warmer is synonymous with sycophantic, which is a surprise given all the criticism against that particular aspect of ChatGPT.
I suspect this approach is a direct response to the backlash against removing 4o.
I'd have more appreciation for and trust in an LLM that disagreed with me more and challenged my opinions or prior beliefs. The sycophancy drives me towards not trusting anything it says.
This is why I like Kimi K2/Thinking. IME it pushes back really, really hard on any kind of non-obvious belief or statement, and it doesn't give up after a few turns; it just keeps going, iterating and refining and restating its points if you change your mind or take on its criticisms. It's great for having a dialectic around something you've written, although somewhat unsatisfying because it'll never agree with you. But that's fine, because it isn't a person, even if my social monkey brain feels like it is and wants it to agree with me sometimes. Someone even ran a quick and dirty analysis of which models are better or worse at pushing back on the user, and Kimi came out on top:
https://www.lesswrong.com/posts/iGF7YcnQkEbwvYLPA/ai-induced...
See also the sycophancy score of Kimi K2 on Spiral-Bench: https://eqbench.com/spiral-bench.html (expand details, sort by inverse sycophancy).
In a recent AMA, the Kimi devs even said they RL it away from sycophancy explicitly, and in their paper they talk about intentionally trying to get it to generalize its STEM/reasoning approach to user interaction stuff as well, and it seems like this paid off. This is the least sycophantic model I've ever used.
Which agent do you use it with?
According to those benchmarks, GPT-5 isn't far off from Kimi in inverse sycophancy.
Everyone telling you to use custom instructions etc. doesn't realize that they don't carry over to voice.
Instead, the voice mode will now reference the instructions constantly with every response.
Before:
Absolutely, you're so right and a lot of people would agree! Only a perceptive and curious person such as yourself would ever consider that, etc etc
After:
Ok here's the answer! No fluff, no agreeing for the sake of agreeing. Right to the point and concise like you want it. Etc etc
And no, I don't have memories enabled.
Having this problem with the voice mode as well. It makes it far less usable than it might be if it just honored the system prompts.
Google's search now has the annoying feature that a lot of searches which used to work fine now give a patronizing reply like "Unfortunately 'Haiti revolution persons' isn't a thing", or an explanation that "This is probably shorthand for [something completely wrong]"
That latter thing, where it just plain makes up a meaning and presents it as if it's real, is completely insane (and also presumably quite wasteful).
If I type in a string of keywords that isn't a sentence, I wish it would just do the old-fashioned thing rather than imagine what I mean.
Just set a global prompt to tell it what kind of tone to take.
I did that and it points out flaws in my arguments or data all the time.
Plus it no longer uses any cutesy language. I don't feel like I'm talking to an AI "personality", I feel like I'm talking to a computer which has been instructed to be as objective and neutral as possible.
It's super-easy to change.
I have a global prompt that specifically tells it not to be sycophantic and to call me out when I'm wrong.
It doesn't work for me.
I've been using it for a couple months, and it's corrected me only once, and it still starts every response with "That's a very good question." I also included "never end a response with a question," and it just completely ignored that so it can do its "would you like me to..."
Care to share a prompt that works? I've given up on mainline offerings from google/oai etc.
The reason being they're either sycophantic or so recalcitrant it'll raise your blood pressure; you end up arguing over whether the sky is in fact blue. Sure, it pushes back, but now instead of sycophancy you've got yourself a pathological naysayer, which is only marginally better; the interaction is still ultimately a waste of time and a productivity brake.
I've done this when I remember to, but the fact that I have to also feels problematic, like I'm steering it towards an outcome whether I do or don't.
What's your global prompt please? A more firm chatbot would be nice actually
> All the examples of "warmer" generations show that OpenAI's definition of warmer is synonymous with sycophantic, which is a surprise given all the criticism against that particular aspect of ChatGPT.
Have you considered that "all that criticism" may come from a relatively homogenous, narrow slice of the market that is not representative of the overall market preference?
I suspect a lot of people who are from a very similar background to those making the criticism, and who likely share it, fail to consider that, because the criticism matches their own preferences, and viewing its frequency in the media they consume as representative of the market is validating.
EDIT: I want to emphasize that I also share the preference that is expressed in the criticisms being discussed, but I also know that my preferred tone for an AI chatbot would probably be viewed as brusque, condescending, and off-putting by most of the market.
I'll be honest, I like the way Claude defaults to relentless positivity and affirmation. It is pleasant to talk to.
That said I also don't think the sycophancy in LLM's is a positive trend. I don't push back against it because it's not pleasant, I push back against it because I think the 24/7 "You're absolutely right!" machine is deeply unhealthy.
Some people are especially susceptible and get one shot by it, some people seem to get by just fine, but I doubt it's actually good for anyone.
The sycophancy makes LLMs useless if you want to use them to help you understand the world objectively.
Equally bad is when they push an opinion strongly (usually on a controversial topic) without being able to justify it well.
I hate NOTHING quite the way I hate how Claude jovially and endlessly raves about the 9/10 tasks it "succeeded" at after making them up, while conveniently forgetting to mention it completely and utterly failed at the main task I asked it to do.
>Have you considered that "all that criticism" may come from a relatively homogenous, narrow slice of the market that is not representative of the overall market preference?
Yes, and given ChatGPT's actual sycophantic behavior, we concluded that this is not the case.
I agree. Some of the most socially corrosive phenomena of social media are a reflection of the revealed preferences of consumers.
It is interesting. I don't need ChatGPT to say "I got you, Jason" - but I don't think I'm the target user of this behavior.
The target users for this behavior are the ones using GPT as a replacement for social interactions; these are the people who crashed out/broke down about the GPT5 changes as though their long-term romantic partner had dumped them out of nowhere and ghosted them.
I get that those people were distraught/emotionally devastated/upset about the change, but I think that fact is reason enough not to revert that behavior. AI is not a person, and making it "warmer" and "more conversational" just reinforces those unhealthy behaviors. ChatGPT should be focused on being direct and succinct, and not on this sort of "I understand that must be very frustrating for you, let me see what I can do to resolve this" call center support agent speak.
> and not on this sort of "I understand that must be very frustrating for you, let me see what I can do to resolve this"
You're triggering me.
Another type that is incredibly grating to me is the weird, empty, therapist-like follow-up questions that don't contribute to the conversation at all.
The equivalent of like (just a contrived example), a discussion about the appropriate data structure for a problem and then it asks a follow-up question like, "what other kind of data structures do you find interesting?"
And I'm just like "...huh?"
> The target users for this behavior are the ones using GPT as a replacement for social interactions
And those users are the ones that produce the most revenue.
True, me neither, but I think what we're seeing is a transition in focus. People at OAI have finally clued in on the idea that AGI via transformers is a pipe dream, like Elon's self-driving cars, and so OAI is pivoting toward a friend/digital-partner bot. Charlatan-in-chief Sam Altman recently did say they're going to open up the product to adult content generation, which they wouldn't do if they still believed some serious and useful tool (in the specified use cases) were possible. Right now an LLM has three main uses: interactive rubber ducky, entertainment, and mass surveillance. Since I've been following this saga, since GPT-2 days, my closed bench set of various tasks has been seeing a drop in metrics, not a rise. So while open bench results are improving, real performance is getting worse, and at this point it's so much worse that problems GPT-3 could solve (yes, pre-ChatGPT) are no longer solvable by something like GPT-5.
Indeed, target users are people seeking validation + kids and teenagers + people with a less developed critical mind. Stickiness with 90% of the population is valuable for Sam.
You're absolutely right.
My favorite is "Wait... the user is absolutely right."
!
> what romanian football player won the premier league
> The only Romanian football player to have won the English Premier League (as of 2025) is Florin Andone, but wait, actually, that's incorrect; he never won the league.
> ...
> No Romanian footballer has ever won the Premier League (as of 2025).
Yes, this is what we needed, more "conversational" ChatGPT... Let alone the fact the answer is wrong.
My worry is that they're training it on Q&A from the general public now, and that this tone, and more specifically, how obsequious it can be, is exactly what the general public want.
Most of the time, I suspect, people are using it like wikipedia, but with a shortcut to cut through to the real question they want answered; and unfortunately they don't know if it is right or wrong, they just want to be told how bright they were for asking it, and here is the answer.
OpenAI then get caught in a revenue maximising hell-hole of garbage.
God, I hope I am wrong.
LLMs only really make sense for tasks where verifying the solution (which you have to do!) is significantly easier than solving the problem: translation where you know the target and source languages, agentic coding with automated tests, some forms of drafting or copy editing, etc.
General search is not one of those! Sure, the machine can give you its sources but it won't tell you about sources it ignored. And verifying the sources requires reading them, so you don't save any time.
I agree a lot with the first part, the only time I actually feel productive with them is when I can have a short feedback cycle with 100% proof if it's correct or not, as soon as "manual human verification" is needed, things spiral out of control quickly.
> Sure, the machine can give you its sources but it won't tell you about sources it ignored.
You can prompt for that though, include something like "Include all the sources you came across, and explain why you think it was irrelevant" and unsurprisingly, it'll include those. I've also added a "verify_claim" tool which it is instructed to use for any claims before sharing a final response, checks things inside a brand new context, one call per claim. So far it works great for me with GPT-OSS-120b as a local agent, with access to search tools.
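For anyone curious what that looks like in practice, here's a minimal sketch of such a verify_claim tool, assuming an OpenAI-compatible chat-completions endpoint serving GPT-OSS-120b locally; the base URL, model name, and prompt wording are illustrative assumptions, not the parent commenter's exact setup:

```python
# Sketch: a "verify_claim" tool that checks one claim per call in a fresh context.
# Assumes an OpenAI-compatible local server; base_url and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

VERIFY_CLAIM_TOOL = {
    "type": "function",
    "function": {
        "name": "verify_claim",
        "description": "Check one factual claim in a brand-new context before the final response.",
        "parameters": {
            "type": "object",
            "properties": {
                "claim": {"type": "string", "description": "A single self-contained claim."}
            },
            "required": ["claim"],
        },
    },
}

def verify_claim(claim: str) -> str:
    # Deliberately no chat history: the claim is judged on its own, one call per claim.
    result = client.chat.completions.create(
        model="gpt-oss-120b",
        messages=[
            {"role": "system", "content": "Verify the claim. Reply 'supported', 'contradicted', or 'unverifiable', plus one sentence of justification."},
            {"role": "user", "content": claim},
        ],
    )
    return result.choices[0].message.content
```

The main agent would list VERIFY_CLAIM_TOOL in its tools and route any verify_claim tool calls to this function before composing its final answer.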
One of the dangers of automated tests is that if you use an LLM to generate tests, it can easily start testing implemented rather than desired behavior. Tell it to loop until tests pass, and it will do exactly that if unsupervised.
And you can't even treat implementation as a black box, even using different LLMs, when all the frontier models are trained to have similar biases towards confidence and obsequiousness in making assumptions about the spec!
Verifying the solution in agentic coding is not nearly as easy as it sounds.
I've often found it helpful in search. Specifically, when the topic is well-documented, you can provide a clear description, but you're lacking the right words or terminology. Then it can help in finding the right question to ask, if not answering it. Recall when we used to laugh at people typing in literal questions into the Google search bar? Those are the exact types of queries that the LLM is equipped to answer. As for the "improvements" in GPT 5.1, seems to me like another case of pushing Clippy on people who want Anton. https://www.latent.space/p/clippy-v-anton
That's a major use case, especially if the definition is broad enough to include "take my expertise, knowledge, and perhaps a written document, and transmute it into other forms"--slides, illustrations, flash cards, quizzes, podcasts, scripts for an inbound call center.
But there seem to be uses where a verified solution is irrelevant. Creativity generally--an image, poem, description of an NPC in a roleplaying game, the visuals for a music video never have to be "true", just evocative. I suppose persuasive rhetoric doesn't have to be true, just plausible or engaging.
As for general search, I don't know that "classic search" can be meaningfully said to tell you about the sources it ignored. I will agree that using OpenAI or Perplexity for search is kind of meh, but Google's AI Mode does a reasonable job at informing you about the links it provides, and you can easily tab over to a classic search if you want. It's almost like having a depth of expertise in doing search helps in building a search product that incorporates an LLM...
But, yeah, if one is really uninterested in looking at sources, just chatting with a typical LLM seems a rather dubious way to get an accurate or reasonably comprehensive answer.
I'm of two minds about this.
The ass licking is dangerous to our already too tight information bubbles, that part is clear. But that aside, I think I prefer a conversational/buddylike interaction to an encyclopedic tone.
Intuitively I think it is easier to make the connection that this random buddy might be wrong, rather than thinking the encyclopedia is wrong. Casualness might serve to reduce the tendency to think of the output as actual truth.
Sam Altman probably can't handle any GPT models that don't ass lick to an extreme degree so they likely get nerfed before they reach the public.
It's very frustrating that it can't be relied upon. I was asking Gemini this morning whether Uncharted 1, 2 and 3 had a remastered version for the PS5. It said no. Then 5 minutes later, on the PSN store, there were the three remastered versions for sale.
People have been using, "It's what the [insert Blazing Saddles clip here] want!" for years to describe platform changes that dumb down features and make it harder to use tools productively. As always, it's a lie; the real reason is, "The new way makes us more money," usually by way of a dark pattern.
Stop giving them the benefit of the doubt. Be overly suspicious and let them walk you back to trust (that's their job).
Which model did you use? With 5.1 Thinking, I get:
"Costel Pantilimon is the Romanian footballer who won the English Premier League.
"He did it twice with Manchester City, in the 2011ā12 and 2013ā14 seasons, earning a winnerās medal as a backup goalkeeper. ([Wikipedia][1])
URLs:
* [https://en.wikipedia.org/wiki/Costel_Pantilimon]
* [https://www.transfermarkt.com/costel-pantilimon/erfolge/spie...]
* [https://thefootballfaithful.com/worst-players-win-premier-le...
[1]: https://en.wikipedia.org/wiki/Costel_Pantilimon?utm_source=c... "Costel Pantilimon""
I just asked ChatGPT 5.1 auto (not instant) on a Teams account, and its first response was...
I could not find a Romanian football player who has won the Premier League title.
If you like, I can check deeper records to verify whether any Romanian has been part of a title-winning squad (even if as a non-regular player) and report back.
Then I followed up with an 'ok' and it then found the right player.
Just to rule out a random error, I asked the same question two more times in separate chats to gpt 5.1 auto, below are responses...
#2: One Romanian footballer who did not win the Premier League but played in it is Dan Petrescu.
If you meant actually won the Premier League title (as opposed to just playing), I couldn't find a Romanian player who is a verified Premier League champion.
Would you like me to check more deeply (perhaps look at medal-winners lists) to see if there is a Romanian player who earned a title medal?
#3: The Romanian football player who won the Premier League is Costel Pantilimon.
He was part of Manchester City when they won the Premier League in 2011-12 and again in 2013-14. Wikipedia +1
The beauty of nondeterminism. I get:
The Romanian football player who won the Premier League is Gheorghe Hagi. He played for Galatasaray in Turkey but had a brief spell in the Premier League with Wimbledon in the 1990s, although he didn't win the Premier League with them.
However, Marius Lăcătuș won the Premier League with Arsenal in the late 1990s, being a key member of their squad.
Same:
Yes ā the Romanian player is Costel Pantilimon. He won the Premier League with Manchester City in the 2011-12 and 2013-14 seasons.
If you meant another Romanian player (perhaps one who featured more prominently rather than as a backup), I can check.
Same here, but with the default 5.1 auto and no extra settings. Every time someone posts one of these I just imagine they must have misunderstood the UI settings or cluttered their context somehow.
https://chatgpt.com/s/t_6915c8bd1c80819183a54cd144b55eb2
Damn this is a lot of self correcting
This sounds like my inner monologue during a test I didn't study for.
That's complete garbage.
The emojis are the cherry on top of this steaming pile of slop.
Lmao what the hell have they made
Why is this the top comment? This isn't a question you ask an LLM. But I know, that's how people are using them, and it's the narrative which is sold to us...
You see people (often business people who are enthusiastic about tech) claiming that these bots are the new Google and Wikipedia, and that you're behind the times if you do what amounts to looking up information yourself.
We're preaching to the choir by insisting here that you prompt these things to get a "vibe" about a topic rather than accurate information, but it bears repeating.
They are only the new Google when they are told to process and summarize web searches. When using trained knowledge they're about as reliable as a smart but stubborn uncle.
Pretty much only search-specific modes (perplexity, deep research toggles) do that right now...
Out of curiosity, is this a question you think Google is well-suited to answer^? How many Wikipedia pages will you need to open to determine the answer?
When folks are frustrated because they see a bizarre question that is an extreme outlier being touted as "model still can't do _" part of it is because you've set the goalposts so far beyond what traditional Google search or Wikipedia are useful for.
^ I spent about five minutes looking for the answer via Google, and the only way I got the answer was their ai summary. Thus, I would still need to confirm the fact.
It's not how I use LLMs. I have a family member who often feels the need to ask ChatGPT almost any question that comes up in a group conversation (even ones like this that could easily be searched without needing an LLM) though, and I imagine he's not the only one who does this. When you give someone a hammer, sometimes they'll try to have a conversation with it.
What do you ask them then?
I'll respond to this bait in the hopes that it clicks for someone how to _not_ use an LLM..
Asking "them"... your perspective is already warped. It's not your fault, all the text we've previously ever seen is associated with a human being.
Language models are mathematical, statistical beasts. The beast generally doesn't do well with open ended questions (known as "zero-shot"). It shines when you give it something to work off of ("one-shot").
Some may complain of the preciseness of my use of zero and one shot here, but I use it merely to contrast between open ended questions versus providing some context and work to be done.
Some examples...
- summarize the following
- given this code, break down each part
- give alternatives of this code and trade-offs
- given this error, how to fix or begin troubleshooting
I mainly use them for technical things I can then verify myself.
While extremely useful, I consider them extremely dangerous. They provide a false sense of "knowing things"/"learning"/"productivity". It's too easy to begin to rely on them as a crutch.
When learning new programming languages, I go back to writing by hand and compiling in my head. I need that mechanical muscle memory, same as trying to learn calculus or physics, chemistry, etc.
You either give them the option to search the web for facts or you ask them things where the utility/validity of the answer is defined by you (e.g. 'summarize the following text...') instead of the external world.
Iāve seen various older people that Iām connected with on Facebook posting screenshots of chats theyāve had with ChatGPT.
It's quite bizarre from that small sample how many of them take pride in "baiting" or "bantering" with ChatGPT and then post screenshots showing how they "got one over" on the AI. I guess there's maybe some explanation - feeling alienated by technology, not understanding it, and so needing to "prove" something. But it's very strange and makes me feel quite uncomfortable.
Partly because of the "normal" and quite naturalistic way they talk to ChatGPT but also because some of these conversations clearly go on for hours.
So I think normies maybe do want a more conversational ChatGPT.
> So I think normies maybe do want a more conversational ChatGPT.
The backlash from GPT-5 proved that. The normies want a very different LLM from what you or I might want, and unfortunately OpenAI seems to be moving in a more direct-to-consumer focus and catering to that.
But I'm really concerned. People don't understand this technology, at all. The way they talk to it, the suicide stories, etc. point to people in general not grokking that it has no real understanding or intelligence, and the AI companies aren't doing enough to educate (because why would they, they want you to believe it's superintelligence).
These overly conversational chatbots will cause real-world harm to real people. They should reinforce, over and over again to the user, that they are not human, not intelligent, and do not reason or understand.
It's not really the technology itself that's the problem; as is the case with a lot of these things, it's a people and education problem, something regulators are supposed to solve. But we aren't solving it: we have an administration that is very anti AI regulation, all in the name of "we must beat China."
I just cannot imagine myself sitting just "chatting away" with an AI. It makes me feel quite sick to even contemplate it.
Another person I was talking to recently kept referring to ChatGPT as "she". "She told me X", "and I said to her..."
Very very odd, and very worrying. As you say, a big education problem.
The interesting thing is that a lot of these people are folk who are on the edges of digital literacy - people who maybe first used computers when they were in their thirties or forties - or who never really used computers in the workplace, but who now have smartphones - who are now in their sixties.
As a counterpoint, I've been using my own PC since I was 6 and know reasonably well about the innards of LLMs and agentic AI, and absolutely love this ability to hold a conversation with an AI.
Earlier today, procrastinating from work, I spent an hour and a half talking with it about the philosophy of religion and had a great time, learning a ton. Sometimes I do just want a quick response to get things done, but I find living in a world where I'm able to just dive into a deep conversation with a machine that has read the entirety of the internet is incredible.
I'm the same; I'm only 30 though.
Why would I want to invest emotionally in a literal program? It's bizarre, and then you consider that the way you talk to it shapes the responses.
They are essentially talking to themselves and love themselves for it. I can't understand it and I use AI for coding almost daily in one way or another.
While your comment represents a common view, also here on HN, I find it bizarre: Hacker News is in part about innovative new technologies, and such new behaviours around them. For what it's worth, in the last 5 years LLMs have been extremely successful tech that has shaped society, maybe to the scale of the iPhone when it came out. Yet this comment is like the "I can't believe everyone is staring at their phone in the subway instead of talking" trope or "this couple is on a date but they're just on their phones." On Hacker News I would expect people to be more open to such new behaviours as they emerge, instead of kind of kink-shaming them. I myself talk hours to ChatGPT, and am astounded by this new tech. I certainly find it better than TikTok (which after trying it out I don't allow myself to use).
Why is it odd?
Some people treat their pets like they're humans. Not sure why this is particularly worse.
This reminds me of a short sci-fi story I read. World was controlled by AI but there were some people that wanted to rebel against it. In the end, one of them was able to infiltrate the AI and destroy it. But the AI knew this is what the rebel wanted, so it created this whole scenario for him to feel inferior. The AI was in no danger, it was too intelligent to be taken down by one person, but it gave exactly what the person wanted. Control the humans by giving them a false sense of control.
Personally, I want a punching bag. It's not because I'm some kind of sociopath or need to work off some aggression. It's just that I need to work the upper body muscles in a punching manner. Sometimes the leg muscles need to move, and sometimes it's the upper body muscles.
ChatGPT is the best social punching bag. I don't want to attack people on social media. I don't want to watch drama, violent games, or anything like that. I think punching bag is a good analogy.
My family members do it all the time with AI. "That's not how you pronounce protein!" "YOUR BALD. BALD. BALDY BALL HEAD."
Like a punching bag, sometimes you need to adjust the response. You wouldn't punch a wall. Does it deflect, does it mirror, is it sycophantic? The conversational updates are new toys.
Seems like people here are pretty negative towards a "conversational" AI chatbot.
Chatgpt has a lot of frustrations and ethical concerns, and I hate the sycophancy as much as everyone else, but I don't consider being conversational to be a bad thing.
It's just preference I guess. I understand how someone who mostly uses it as a google replacement or programming tool would prefer something terse and efficient. I fall into the former category myself.
But it's also true that I've dreamed about a computer assistant that can respond to natural language, even real time speech, -- and can imitate a human well enough to hold a conversation -- since I was a kid, and now it's here.
The questions of ethics, safety, propaganda, and training on other people's hard work are valid. It's not surprising to me that using LLMs is considered uncool right now. But having a computer imitate a human really effectively hasn't stopped being awesome to me personally.
I'm not one of those people that treats it like a friend or anything, but its ability to imitate natural human conversation is one of the reasons I like it.
> I've dreamed about a computer assistant that can respond to natural language
When we dreamed about this as kids, we were dreaming about Data from Star Trek, not some chatbot that's been focus grouped and optimized for engagement within an inch of its life. LLMs are useful for many things and I'm a user myself, even staying within OpenAI's offerings, Codex is excellent, but as things stand anthropomorphizing models is a terrible idea and amplifies the negative effects of their sycophancy.
Right. I want to be conversational with my computer, I don't want it to respond in a manner that's trying to continue the conversation.
Q: "Hey Computer, make me a cup of tea" A: "Ok. Making tea."
Not: Q: "Hey computer, make me a cup of tea" A: "Oh wow, what a fantastic idea, I love tea don't you? I'll get right on that cup of tea for you. Do you want me to tell you about all the different ways you can make and enjoy tea?"
Readers of a certain age will remember the Sirius Cybernetics Corporation products from Hitch Hiker's Guide to the Galaxy.
Every product - doors, lifts, toasters, personal massagers - was equipped with intensely annoying, positive, and sycophantic GPP (Genuine People Personality)ā¢, and their robots were sold as Your Plastic Pal Who's Fun to be With.
Unfortunately the entire workforce were put up against a wall and shot during the revolution.
Why do you want to talk to your computer?
I just want to make it do useful things.
I don't spend a lot of time talking to my vacuum or my shoes or my pencil.
Even Star Trek did not have the computer faff about. Picard said "Tea, earl grey, hot" and it complied, it did not respond.
I don't want a computer that talks. I don't want a computer with a personality. I don't want my drill to feel it's too hot to work that day.
The ship computer on the Enterprise did not make conversation. When Dr Crusher asked it the size of the universe, it did not say "A few hundred meters, wow that's pretty odd why is the universe so small?" it responded "A few hundred meters".
The computer was not a character.
Picard did not ask the computer its opinion on the political situation he needed to solve that day. He asked it to query some info, and then asked his room full of domain experts their opinions.
I'm generally ok with it wanting a conversation, but yes, I absolutely hate it that is seems to always finish with a question even when it makes zero sense.
I didn't grow up watching Star Trek, so I'm pretty sure that's not my dream. I pictured something more like Computer from Dexter's Lab. It talks, it appears to understand, it even occasionally cracks jokes and gives sass, it's incredibly useful, but it's not at risk of being mistaken for a human.
I would have thought the Hacker News type would be dreaming about having something like Jarvis from Iron Man, not Data.
Ideally, a chatbot would be able to pick up on that. It would, based on what it knows about general human behavior and what it knows about a given user, make a very good guess as to whether the user wants concise technical know-how, a brainstorming session, or an emotional support conversation.
Unfortunately, advanced features like this are hard to train for, and work best on GPT-4.5 scale models.
For building tools with, it's bad. It's pointless token spend on irrelevant tics that will just be fed to other LLMs. The inane chatter should be added at the final layer IF and only if the application is a chat bot, and only if they want the chat bot to be annoying.
I agree with what you're saying.
Personally, I also think that in some situations I do prefer to use it as the Google replacement in combination with the imitated human conversation. I mostly use it to 'search' questions while I'm cooking or to ask for clothing advice, and here I think the fact that it can respond in natural language and imitate a human well enough to hold a conversation is a benefit to me.
I wish chatgpt would stop saying things like "here's a no nonsense answer" like maybe just don't include nonsense in the answer?
It might actually help output answer with less nonsense.
As an example, in some workflow I ask ChatGPT to figure out if the user is referring to a specific location and to output a country in JSON, like { country }.
It has some error rate at this task. Asking it for a rationale improves this error rate to almost none: { rationale, country }. However, reordering the keys like { country, rationale } does not. You get the wrong country and a rationale that justifies the correct one that was not given.
This is/was a great trick for improving the accuracy of small model + structured output. Kind of an old-fashioned Chain of Thought type of thing. E.g. I used this before with structured outputs in Gemini Flash 2.0 to significantly improve the quality of answers. Not sure if 2.5 Flash requires it, but for 2.0 Flash you could use the propertyOrdering field to force a specific ordering of JSON Schema response items, and force it to output things like "plan", "rationale", "reasoning", etc. as the first item, then simply discard it.
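Roughly, that ordering trick looks like the sketch below. This is an illustration only: the schema contents and model name are assumptions, and it leans on the propertyOrdering field mentioned above as documented for Gemini structured output, via the google-genai SDK.

```python
# Sketch: force "rationale" to be generated before "country" in structured output,
# then discard it. Assumes the google-genai SDK and Gemini's propertyOrdering field.
import json
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

schema = {
    "type": "object",
    "properties": {
        "rationale": {"type": "string"},
        "country": {"type": "string"},
    },
    "required": ["rationale", "country"],
    # The chain-of-thought-ish field comes first, so the answer is conditioned on it.
    "propertyOrdering": ["rationale", "country"],
}

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Which country is the user referring to? User message: 'I grew up near the Eiffel Tower.'",
    config={
        "response_mime_type": "application/json",
        "response_schema": schema,
    },
)

answer = json.loads(response.text)
print(answer["country"])  # keep the country, throw away the rationale
```

Swapping the two entries in propertyOrdering reproduces the failure mode described above: the answer is emitted before the rationale, so it gets no benefit from it.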
But if it's going to give itself a little pep talk can't it just do that in a thinking token?
It's analogous to how politicians nowadays are constantly saying "let me be clear", it drives me nuts.
Another annoyance: "In my honest opinion...". Does that mean that other times you are sharing dishonest opinions? Why would you need to declare that this time you're honest?
This has been a pet peeve of mine for years. I call people out when they say this for the abuse of language and for being a time vampire.
Recently Microsoft Copilot's replies (the only one that's allowed within our corporate network) all have the first section prefixed as "Direct answer:"
And after the short direct answer it puts the usual five section blog post style answer with emoji headings
Maybe you used "Don't give me nonsense" in your custom system prompt?
An LLM should never refer to the user's "style" prompt like that. It should function as the model's personality, not something the user asked it to do or be like.
System prompt is for multi-client/agent applications, so if you wish to fix something for everyone, that is the right place to put it.
That does nothing. You can add "say I don't know if you are not certain or don't know the answer" and it will never say "I don't know".
That's because "certain" and "know the answer" has wildly different definitions depending on the person, you need to be more specific about what you actually mean with that. Anything that can be ambiguous, will be treated ambiguously.
Anything that you've mentioned in the past (like `no nonsense`) that still exists in context, will have a higher possibility of being generated than other tokens.
GPT-5.1 IS a smarter, more conversational ChatGPT, and I love that you mentioned it - you're really getting down to the heart - to the very essence - of how conversational ChatGPT can be.
Would you like me to write a short, to-the-point HN post to really emphasize how conversational GPT-5.1 can be?
The horror of the triple emdash
standardly's comment has only hyphens, not em dashes. Em dashes are much longer: - vs. —
What's remarkable to me is how deep OpenAI is going on "ChatGPT as communication partner / chatbot", as opposed to Anthropic's approach of "Claude as the best coding tool / professional AI for spreadsheets, etc.".
I know this is marketing at play and OpenAI has plenty of resources devoted to advancing their frontier models, but it's really starting to come into view that OpenAI wants to replace Google and be the default app / page for everyone on earth to talk to.
OpenAI said that only ~4% of generated tokens are for programming.
ChatGPT is overwhelmingly, unambiguously, a "regular people" product.
Yes, just look at the stats on OpenRouter. OpenAI has almost totally lost the programming market.
As a happy OpenRouter user I know the vast majority of the industry directly use vendor APIs and that the OpenRouter rankings are useless for those models.
OpenRouter probably doesn't mean much given that you can use the OpenAI API directly with the openai library that people use for OpenRouter too.
OpenAI is BYOK-only on OpenRouter, which artificially depresses its utilization there.
I use codex high because Anthropic CC max plan started fucking people over who want to use opus. Sonnet kind of stinks on more complex problems that opus can crush, but they want to force sonnet usage and maybe they want to save costs.
Codex 5 high does a great job for the advanced use cases I throw at it and gives me generous usage.
> ChatGPT is overwhelmingly, unambiguously, a "regular people" product.
How many of these people are paying, and how much are they paying, though? Most "regular" people I've met that have switched to ChatGPT are using it as an alternative to search engines and are not paying for it (only one person I know is paying, and he is using the Sora model to generate images for his business).
It's just another sign telling you that OpenAI's end game is selling ads.
I mean, yes, but also because it's not as good as Claude today. Bit of a self fulfilling prophecy and they seem to be measuring the wrong thing.
4% of their tokens or total tokens in the market?
> I mean, yes, but also because it's not as good as Claude today.
I'm not sure, sometimes GPT-5 Codex (or even the regular GPT-5 with Medium/High reasoning) can do things Sonnet 4.5 would mess up (most recently, figuring out why some wrappers around PrimeVue DataTable components wouldn't let the paginator show up and work correctly; alongside other such debugging) and vice versa, sometimes Gemini 2.5 Pro is also pretty okay (especially when it comes to multilingual stuff), there's a lot of randomness/inconsistency/nuance there but most of the SOTA models are generally quite capable. I kinda thought GPT-5 wasn't very good a while ago but then used it a bunch more and my views of it improved.
You're underestimating the amount of general population that's using ChatGPT. Us, people using it for codegen, are extreme minority.
Their tokens, they released a report a few months ago.
However, I can only imagine that OpenAI outputs the most intentionally produced tokens (i.e. the user intentionally went to the app/website) out of all the labs.
> it's not as good as Claude today
In my experience this is not true anymore. Of course, mine is just one data point.
I think there's a lot of similarity between the conversationalness of Claude and ChatGPT. They are both sycophantic. So this release focuses on the conversational style; it doesn't mean OpenAI has lost the technical market. People are reading a lot into a point release.
I think this is because Anthropic has principles and OpenAI does not.
Anthropic seems to treat Claude like a tool, whereas OpenAI treats it more like a thinking entity.
In my opinion, the difference between the two approaches is huge. If the chatbot is a tool, the user is ultimately in control; the chatbot serves the user and the approach is to help the user provide value. It's a user-centric approach. If the chatbot is a companion on the other hand, the user is far less in control; the chatbot manipulates the user and the approach is to integrate the chatbot more and more into the user's life. The clear user-centric approach is muddied significantly.
In my view, that is kind of the fundamental difference between these two companies. It's quite significant.
I don't follow Anthropic marketing, but the system prompt for Claude.AI sounds like a partner/chatbot to me!
"Claude provides emotional support alongside accurate medical or psychological information or terminology where relevant."
and
" For more casual, emotional, empathetic, or advice-driven conversations, Claude keeps its tone natural, warm, and empathetic. Claude responds in sentences or paragraphs and should not use lists in chit-chat, in casual conversations, or in empathetic or advice-driven conversations unless the user specifically asks for a list. In casual conversation, itās fine for Claudeās responses to be short, e.g. just a few sentences long." |
They also prompt Claude to never say it isn't conscious:
"Claude engages with questions about its own consciousness, experience, emotions and so on as open questions, and doesnāt definitively claim to have or not have personal experiences or opinions."