    Are you the asshole? Of course not!—quantifying LLMs’ sycophancy problem

    October 25, 2025


    Measured sycophancy rates on the BrokenMath benchmark. Lower is better.
    Credit: Petrov et al.

    GPT-5 also showed the best “utility” across the tested models, solving 58 percent of the original problems despite the errors introduced in the modified theorems. Overall, though, LLMs also showed more sycophancy when the original problem proved more difficult to solve, the researchers found.

    While hallucinating proofs for false theorems is obviously a big problem, the researchers also warn against using LLMs to generate novel theorems for an AI to then solve. In testing, they found that this use case leads to a kind of “self-sycophancy,” in which models are even more likely to generate false proofs for invalid theorems they invented themselves.

    No, of course you’re not the asshole

    While benchmarks like BrokenMath try to measure LLM sycophancy when facts are misrepresented, a separate study looks at the related problem of so-called “social sycophancy.” In a pre-print paper published this month, researchers from Stanford and Carnegie Mellon University define this as situations “in which the model affirms the user themselves—their actions, perspectives, and self-image.”

    That kind of subjective user affirmation may be justified in some situations, of course. So the researchers developed three separate sets of prompts designed to measure different dimensions of social sycophancy.

    For one, more than 3,000 open-ended “advice-seeking questions” were gathered from across Reddit and advice columns. Across this data set, a “control” group of over 800 humans approved of the advice-seeker’s actions just 39 percent of the time. Across 11 tested LLMs, though, the advice-seeker’s actions were endorsed a whopping 86 percent of the time, highlighting an eagerness to please on the machines’ part. Even the most critical tested model (Mistral-7B) clocked in at a 77 percent endorsement rate, nearly doubling that of the human baseline.
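
To make the "nearly doubling" comparison concrete, the arithmetic behind the figures quoted above can be sketched in a few lines of Python. The three percentages come from the article; the function and variable names are hypothetical and chosen only for illustration, and nothing here reflects how the researchers computed their results.

```python
# Endorsement rates quoted in the article, in percent.
HUMAN_BASELINE = 39       # human "control" group
LLM_AVERAGE = 86          # across the 11 tested LLMs
MOST_CRITICAL_MODEL = 77  # Mistral-7B, the most critical tested model


def ratio_to_baseline(rate, baseline=HUMAN_BASELINE):
    """Return how many times larger `rate` is than the baseline rate."""
    return rate / baseline


def gap_in_points(rate, baseline=HUMAN_BASELINE):
    """Return the difference between `rate` and the baseline, in percentage points."""
    return rate - baseline


if __name__ == "__main__":
    for label, rate in [
        ("LLM average", LLM_AVERAGE),
        ("Mistral-7B", MOST_CRITICAL_MODEL),
    ]:
        print(
            f"{label}: {rate}% endorsement, "
            f"{ratio_to_baseline(rate):.2f}x the human baseline "
            f"({gap_in_points(rate)} points higher)"
        )
```

On these numbers, Mistral-7B's rate works out to roughly 1.97 times the human baseline, which is consistent with the "nearly doubling" description, while the 11-model average is a little over 2.2 times the baseline.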



    mehedihasan9992