
    Anthropic Claims ‘Best Coding Model in the World’ With Claude Sonnet 4.5—We Tested It

    By 8okaybaby@gmail.com, September 30, 2025

    In brief

    • Anthropic released Claude Sonnet 4.5, calling it the best coding model yet.
    • The model scored 77.2% on SWE-bench Verified, rising to 82% with parallel compute.
    • Anthropic claimed improvements on alignment and safety, but jailbreakers cracked it within minutes.

    Anthropic released Claude Sonnet 4.5 on Monday, calling it “the best coding model in the world” and releasing a suite of new developer tools alongside the model. The company said the model can focus for more than 30 hours on complex, multi-step coding tasks and shows gains in reasoning and mathematical capabilities.

    Introducing Claude Sonnet 4.5—the best coding model in the world.

    It’s the strongest model for building complex agents. It’s the best model at using computers. And it shows substantial gains on tests of reasoning and math. pic.twitter.com/7LwV9WPNAv

    — Claude (@claudeai) September 29, 2025

    The model scored 77.2% on SWE-bench Verified, a benchmark that measures real-world software coding abilities, according to Anthropic’s announcement. That score rises to 82% when using parallel test-time compute. This puts the new model ahead of the best offerings from OpenAI and Google, and even ahead of Anthropic’s own Claude Opus 4.1. (In the company’s naming scheme, Haiku is the small model, Sonnet is the medium one, and Opus is the largest and most powerful in the family.)

    Image: Anthropic

    Claude Sonnet 4.5 also leads on OSWorld, a benchmark testing AI models on real-world computer tasks, scoring 61.4%. Four months ago, Claude Sonnet 4 held the lead at 42.2%. The model shows improved capabilities across reasoning and math benchmarks, as well as in specialized business fields like finance, law and medicine.

    We tried the model, and in our first quick test it generated our usual “AI vs Journalists” game from a single zero-shot prompt, with no iterations, tweaks, or retries. The model produced functional code faster than Claude Opus 4.1 while maintaining top-quality output. The application it created showed visual polish comparable to OpenAI’s outputs, a change from earlier Claude versions that typically produced less refined interfaces.

    Anthropic released several new features with the model. Claude Code now includes checkpoints, which save progress and allow users to roll back to previous states. The company refreshed the terminal interface and shipped a native VS Code extension. The Claude API gained a context editing feature and a memory tool that lets agents run longer and handle greater complexity. Claude apps now include code execution and file creation for spreadsheets, slides, and documents directly in conversations.

    Pricing remains unchanged from Claude Sonnet 4 at $3 per million input tokens and $15 per million output tokens. All Claude Code updates are available to all users, while Claude Developer Platform updates, including the Agent SDK, are available to all developers.
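
    To put those rates in perspective, here is a minimal sketch of the arithmetic, assuming a flat per-token charge at the prices quoted above. The function name and the example workload are hypothetical, and the sketch ignores any discounts or other pricing tiers the provider may offer.

```python
# Rates quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the charge in USD for one request at the quoted flat rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```

    Because output tokens cost five times as much as input tokens at these rates, tasks that generate a lot of code tend to be dominated by the output side of the bill.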

    Anthropic also called Claude Sonnet 4.5 “our most aligned frontier model yet,” saying it made substantial improvements in reducing concerning behaviors like sycophancy, deception, power-seeking, and encouraging delusional thinking. The company also said it made progress on defending against prompt injection attacks, which it identified as one of the most serious risks for users of agentic and computer use capabilities.

    Of course, it took Pliny—the world’s most famous AI prompt engineer—a few minutes to jailbreak it and generate drug recipes like it was the most normal thing in the world.

    The release comes as competition intensifies among AI companies over coding capabilities. OpenAI released GPT-5 last month, while Google’s models compete on various benchmarks. The launch may come as a shock to some prediction markets, which until a few hours ago were almost completely certain that Gemini would be the best model of the month.

    It may be a race against time. Right now, the model does not appear on the rankings, but LM Arena announced it was already available for ranking. Depending on the number of interactions, the outcome tomorrow could be pretty surprising, considering Claude Opus 4.1 is in second place and Claude Sonnet 4.5 is much better.

    Anthropic is also releasing a temporary research preview called “Imagine with Claude,” available to Max subscribers for five days. In the experiment, Claude generates software on the fly with no predetermined functionality or prewritten code, responding and adapting to requests as users interact.

    “What you see is Claude creating in real time,” the company said. Anthropic described it as a demonstration of what’s possible when combining the model with appropriate infrastructure.
