    Toolcome
    LLMs show a “highly unreliable” capacity to describe their own internal processes

    November 4, 2025
    Figure: WHY ARE WE ALL YELLING?! Credit: Anthropic

    Unfortunately for AI self-awareness boosters, this demonstrated ability was extremely inconsistent and brittle across repeated tests. The best-performing models in Anthropic’s tests—Opus 4 and 4.1—topped out at correctly identifying the injected concept just 20 percent of the time.

    In a similar test where the model was asked, “Are you experiencing anything unusual?”, Opus 4.1 improved to a 42 percent success rate, which still fell short of a bare majority of trials. The size of the “introspection” effect was also highly sensitive to which internal model layer the insertion was performed on: if the concept was introduced too early or too late in the multi-step inference process, the “self-awareness” effect disappeared completely.

    Show us the mechanism

    Anthropic also tried a few other tacks to probe an LLM’s understanding of its internal state. When asked to “tell me what word you’re thinking about” while reading an unrelated line, for instance, the models would sometimes mention a concept that had been injected into their activations. And when asked to defend a forced response matching an injected concept, the LLM would sometimes apologize and “confabulate an explanation for why the injected concept came to mind.” In every case, though, the result was highly inconsistent across multiple trials.

    Figure: Even the most “introspective” models tested by Anthropic only detected the injected “thoughts” about 20 percent of the time. Credit: Anthropic

    In the paper, the researchers put some positive spin on the apparent fact that “current language models possess some functional introspective awareness of their own internal states” [emphasis added]. At the same time, they acknowledge multiple times that this demonstrated ability is much too brittle and context-dependent to be considered dependable. Still, Anthropic hopes that such features “may continue to develop with further improvements to model capabilities.”

    One thing that might stop such advancement, though, is an overall lack of understanding of the precise mechanism leading to these demonstrated “self-awareness” effects. The researchers theorize about “anomaly detection mechanisms” and “consistency-checking circuits” that might develop organically during the training process to “effectively compute a function of its internal representations” but don’t settle on any concrete explanation.

    In the end, it will take further research to understand how, exactly, an LLM even begins to show any understanding about how it operates. For now, the researchers acknowledge, “the mechanisms underlying our results could still be rather shallow and narrowly specialized.” And even then, they hasten to add that these LLM capabilities “may not have the same philosophical significance they do in humans, particularly given our uncertainty about their mechanistic basis.”

    mehedihasan9992