Post ID: 44826997 · Title: GPT-5 · Points: 1700 · Total Comments: 1983
Model: vertex_ai/gemini-2.5-pro · Generated: 2025-08-08 17:30:46
Token usage:
- Prompt tokens: 132,873
- Completion tokens: 4,803
- Reasoning tokens: 1,808
- Total tokens: 137,676
Here is a summary of the themes from the Hacker News discussion on the GPT-5 announcement.
A dominant sentiment throughout the discussion is that GPT-5, despite the major version number, represents an incremental improvement rather than the revolutionary leap many had anticipated. Commenters frequently invoked the "S-curve" of technological progress, suggesting that large language models are entering a period of diminishing returns. The jump from GPT-3 to GPT-4 was seen as a paradigm shift, whereas the improvements in GPT-5 feel more like a refinement, and many described the release as "underwhelming."
Many users felt the hype, partly fueled by OpenAI's leadership, did not match the reality of the release. This sentiment was captured succinctly by user sharkjacobs: "The upgrade from GPT3.5 to GPT4 was like going from a Razr to an iPhone, just a staggering leap forward. Everything since then has been successive iPhone releases (complete with the big product release announcements and front page HN post). A sequence of largely underwhelming and basically unimpressive incremental releases."
The feeling of a plateau was echoed by many. "It's seemed that way for the last year," said dismalaf. "The only real improvements have been in the chat apps themselves (internet access, function calling). Until AI gets past the pre-training problem, it'll stagnate." User lawlessone summed it up: "im sure i am repeating someone else but sounds like we're coming over the s-curve".
This perspective frames GPT-5 not as a breakthrough, but as an optimization. User smlacy suggested, "this very much feels like 'we have made a more efficient/scalable model and we're selling it as the new shiny but it's really just an internal optimization to reduce cost'".
A significant portion of the discussion places GPT-5 in direct competition with other frontier models, most notably Anthropic's Claude. For many, especially those focused on coding, OpenAI is seen as playing catch-up. The sentiment is that Claude, particularly through interfaces like Cursor, has been outperforming OpenAI's models for months, and GPT-5 is just closing a gap that had already formed.
The thread opens with atonse stating, "For day to day coding, I've found Anthropic to be killing it with Sonnet 3.7 and now Sonnet 4, and Claude Code feeling like it has even bigger advantages... I don't even try to use the OpenAI models because it's felt like night and day." This view was widely shared. "Yup, Claude has been kicking GPT's ass for months now," user pawelduda agreed. The leaked coding examples did little to change this perception, with cuuupid remarking, "These are honestly pretty disappointing :/ this quality was possible with Claude Code months ago".
However, some users noted that Claude isn't perfect. User dudeinhawaii offered a nuanced critique: "My experience has been that Claude Code is exceptional at tool use (and thus working with agentic IDEs) but... not the smartest coder. It will happy re-invent the wheel, create silos, or generate terrible code that you'll only discover weeks or months later... I find that the 'smarter' models like Gemini and o3 output better quality code overall".
The pricing of GPT-5 was seen as a direct competitive move. User jumploops analyzed the numbers: "With 74.9% on SWE-bench, this inches out Claude Opus 4.1 at 74.5%, but at a much cheaper cost. For context, Claude Opus 4.1 is $15 / 1M input tokens and $75 / 1M output tokens." This suggests that even if the capability leap isn't massive, the economic positioning is aggressive.
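To make jumploops's economics point concrete, here is a minimal sketch of the per-request cost arithmetic at the Claude Opus 4.1 rates quoted above ($15 / 1M input tokens, $75 / 1M output tokens). The token counts reuse this summary's own header stats, and the helper function is hypothetical, not part of any API.

```python
# Rough per-request cost at per-token API rates.
# Opus 4.1 rates are from the quote above; the function
# name and token counts are illustrative only.

def request_cost(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in dollars for one request, with rates given per 1M tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A request the size of the one that generated this summary:
# ~133k input tokens, ~5k output tokens (see the header stats).
opus = request_cost(132_873, 4_803, in_rate=15, out_rate=75)
print(f"Opus 4.1: ${opus:.2f}")  # → Opus 4.1: $2.35
```

At these rates a single large-context request costs a couple of dollars, which is why even a modest per-token discount compounds quickly for agentic coding workloads that burn through millions of tokens per session.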
Discussions on GPT-5's coding abilities revealed a clear divide. While it performs well on common tasks, especially in JavaScript and other web technologies, its capabilities break down significantly when dealing with less common languages or complex, existing codebases. This led many to believe that its strength is a reflection of its training data rather than a true, generalizable reasoning ability.
User 0xFEE1DEAD provided a detailed account of this limitation: "I wish they wouldn't use JS to demonstrate the AI's coding abilities - the internet is full of JS code and at this point I expect them to be good at it. Show me examples in complex (for lack of a better word) languages to impress me... I recently used OpenAI models to generate OCaml code, and it was eye opening how much even reasoning models are still just copy and paste machines."
This experience was not unique. "The models break down on not even that complex of code either, if it's not web/javascript," noted thewebguyd. "This makes the tech even less useful where it'd be most helpful - on internal, legacy codebases, enterprisey stuff, stacks that don't have numerous examples on github to train from."
The value of demos showing greenfield project generation was also questioned. User rkozik1989 argued, "Honestly, why would anyone find this information useful? Creating a brand new greenfield project is a terrible test. Because literally anything it outputs as long as it looks good as long as it works following the happy path. Coding with LLMs falls apart in situations where complex reasoning is required."
The launch presentation itself became a major topic of ridicule. Commenters seized on several unforced errors, most notably a series of bizarrely incorrect bar charts, as evidence of either sloppiness or intentional deception. The presenters' stiff, "robotic" delivery also drew heavy criticism, creating an ironic contrast with the claims of the AI becoming "more human."
The graph issue was first flagged by mtlynch: "What's going on with their SWE bench graph?... GPT-5 non-thinking is labeled 52.8% accuracy, but o3 is shown as a much shorter bar, yet it's labeled 69.1%." This led to a torrent of mockery. "Tufte used to call this creating a 'visual lie'," said Upvoter33. User Aurornis expressed professional shock: "As someone who spent years quadruple checking every figure in every slide for years to avoid a mistake like this, it’s very confusing to see this out of the big launch announcement of one of the most high profile startups around."
The presentation's first technical demo, an explanation of the Bernoulli effect on airplane wings, was also a spectacular failure. User kybernetikos pointed out the scientific inaccuracy: "Isn't that explanation of why wings work completely wrong? There's nothing that forces the air to cover the top distance in the same time that it covers the bottom distance, and in fact it doesn't... Very strange to use a mistake as your first demo, especially while talking about how it's phd level."
The overall "vibe" of the presentation was found wanting. "They are researchers, not professional presenters," wrote motoxpro in defense, but many, like spruce_tips, simply found the presenters gave off "such a 'sterile' vibe." User SV_BubbleTime joked, "If they release in a week it was all AI generated I’ll be ultra impressed because they nailed the mix of corpo speak, mild autism and awkwardness".
The release prompted a wide-ranging debate on the future impact of AI. Opinions spanned the full spectrum from utopian optimism to dystopian dread. The central tension revolved around whether AI will augment human potential or render it obsolete, and whether the pursuit of AGI is a net positive for humanity.
Some expressed a dark hope for disruption. User rvz began, "I hope that this live stream will tell you that this will be the definitive reason why web developers, JavaScript / TypeScript developers are going to be made completely obsolete". Others saw a more collaborative future, with ethan_smith stating, "Tools like GPT-5 will transform web development rather than replace developers - the most valuable skills will shift toward problem definition, architecture design, and quality verification".
The debate became more personal when discussing AI's potential to solve major human problems. User unsupp0rted argued, "I don't mind losing my programming job in exchange for being able to go to the pharmacy for my annual anti-cancer pill." This was met with pragmatic skepticism from jplusequalt: "Have you looked at how expensive prescription drug prices are without (sometimes WITH) insurance? If you are no longer employed, good luck paying for your magical pill."
The concept of "cognitive atrophy" was introduced by billmalarky, who theorized that over-reliance on AI for generation could weaken human cognitive abilities, much like physical technology has led to physical atrophy. "AI will accelerate our cognitive productivity, and allow for cognitive convenience -- at a cost of cognitive atrophy."
The discussion also touched on whether a "hard takeoff" scenario, where one company achieves AGI and pulls far ahead, is likely. User highfrequency observed the opposite trend: "as time goes on and the models get better, the performance of the different company's gets clustered closer together... As a user, it feels like the race has never been as close as it is now."
OpenAI's decision to deprecate all of its previous models in the ChatGPT interface and unify them under the GPT-5 banner was a major point of discussion. For some, this was a welcome simplification. "im just glad that I don't have to switch between models any more," said semiinfinitely, "for me thats a huge ease of use improvement."
However, others were frustrated by the loss of choice and control. User thimabi lamented, "I personally hated this decision... As a paying user, I liked the ability to set which models to use each time... Now they’ve deprecated this feature and I’m stuck with their base GPT-5 model or GPT-5 Thinking".
A more cynical take, from andix, suggested a financial motive: "It makes me think that GPT-5 is mostly a huge cost saving measurement. It's probably more energy efficient than older models, so they remove it from ChatGPT. It also makes comparisons to older models much harder."
Separately, a new requirement for API users to undergo identity verification with a "video face scan and your legal ID" caused significant friction. "Yep, and I asked ChatGPT about it and it straight up lied and said it was mandatory in EU," wrote AtNightWeCode. "I will never upload a selfie to OpenAI. That is like handing over the kids to one of those hangover teenagers watching the ball pit at the local mall."
Finally, a selection of quotes that stood out from the main threads of the conversation:
HardCodedBias: "Bravo. 1) So impressed at their product focus 2) Great product launch video. Fearlessly demonstrating live. Impressive. 3) Real time humor by the presenters makes for a great 'live' experience Huge kudos to OAI. So many great features (better coding, routing, some parts of 4.5, etc) but the real strength is the product focus as opposed to the 'research updates' from other labs. Huge Kudos!! Keep on shipping OAI!"
koeng: "I am a synthetic biologist, and I use AI a lot for my work. And it constantly denies my questions RIGHT NOW. But of course OpenAI and Anthropic have to implement more - from the GPT5 introduction: 'robust safety stack with a multilayered defense system for biology'... It just sucks that some of the best tools for learning are being lobotomized specifically for my field because of people in AI believe that knowledge should be kept secret. It's extremely antithetical to the hacker spirit that knowledge should be free."
Telemakhos: "I am thoroughly unimpressed by GPT-5. It still can't compose iambic trimeters in ancient Greek with a proper penthemimeral cæsura, and it insists on providing totally incorrect scansion of the flawed lines it does compose. I corrected its metrical sins twice, which sent it into 'thinking' mode until it finally returned a 'Reasoning failed' error."
obloid: "So far GPT-5 has not been able to pass my personal 'Turing test'... I want it to create an image of Santa Claus pulling the sleigh with a reindeer in the sleigh holding the reins, driving the sleigh. No matter how I modify the prompt it is still unable to create this image that my daughter requested a few years ago. This is an image that is easily imagined and drawn by a small child yet the most advanced AI models still can't produce it."
ycosynot: "Damn, you guys are toxic. So -- they did not invent AGI yet. Yet, I like what I'm seeing. Major progress on multiple fronts. Hallucination fix is exciting on its own. The React demos were mindblowing."
mikewarot: "If you're into woo-woo physics, GPT-5 seems to have a good handle on things.. here's a chat I just had with it.[1]"
kkukshtel: "However, this model is not for me in the same way models normally are. This is for the 800m or whatever people that open up chatgpt every day and type stuff in. All of them have been stuck on GPT-4o unbeknownst to them. They had no idea SOTA was far beyond that... for all these people, they just got a MAJOR upgrade."
Created by running this script: https://gist.github.com/primaprashant/f181ed685ae563fd06c49d3d49a8dd9b