
AI gets 45% of news wrong — but readers still trust it


The BBC and the European Broadcasting Union have produced a large study of how well AI chatbots handle summarising the news. In short: badly. [BBC; EBU]

The researchers asked ChatGPT, Copilot, Gemini, and Perplexity about current events, across multiple languages and multiple countries. 45% of the chatbot answers had at least one significant issue: 31% had serious sourcing problems, and 20% contained major accuracy errors such as hallucinated details or outdated information. [EBU, PDF]

The AI distortions are “significant and systemic in nature.”

Google Gemini was by far the worst. It would make up an authoritative-sounding summary with completely fake and wrong references — far more often than the other chatbots. It also cited a satire site as a news source. Pity Gemini’s been forced into every Android phone, hey.

Chatbots fail hardest on fast-moving current news stories. They’re also really prone to making up quotes: anything in quotation marks probably isn’t what the person actually said.

7% of news consumers ask a chatbot for their news, and that rises to 15% among readers under 25. Just over a third say they trust AI summaries — though the report doesn’t give the actual percentage — and about half of those under 35 do. People pick convenience first. [BBC, PDF]

Peter Archer is the BBC’s Programme Director for Generative AI — what a job title — and is quoted in the EBU press release. Archer put forward these results even though they were quite bad. So full points for that.

Unfortunately, Archer also says in the press release: “We’re excited about AI and how it can help us bring even more value to audiences.”

Archer sees his task here as promoting the chatbots: “We want these tools to succeed and are open to working with AI companies to deliver for audiences and wider society.”

Anyone whose title is “Programme Director for Generative AI” is never going to sign off on the conclusion this study points to: that this stuff is poison to accurate news and public discourse, and the BBC needs it gone. Because the job description is not to assess generative AI — it’s to promote generative AI. [job description]

So what happens next? The broadcasters have no plan to address the chatbot problem. The report doesn’t even offer ways forward. There are no action points! Except to do more studies!

They’re just going to cross their fingers and hope the chatbot vendors can be shamed into giving a hoot — the approach that hasn’t worked so far, and isn’t going to work.

Unless the vendors can cure chatbot hallucinations. And they can’t do that, because that’s how chatbots work. Everything a chatbot outputs is a hallucination, and some of the hallucinations are just closer to accurate.
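
If you want to see why, here’s a toy sketch of the generation loop in Python. The vocabulary and probabilities are invented for illustration; a real model computes them with a neural network over a vast vocabulary. The mechanism is the point: nothing in the loop ever consults a fact.

```python
import random

def sample_next_chunk(context: str) -> str:
    # A real model conditions on the context with a neural network.
    # This toy table ignores it, but the mechanism is the same:
    # pick a continuation by probability, not by truth.
    toy_distribution = {
        "the BBC reported": 0.40,   # plausible, happens to be true
        "the BBC admitted": 0.35,   # plausible, may never have happened
        "sources confirmed": 0.25,  # pure filler
    }
    chunks = list(toy_distribution)
    weights = list(toy_distribution.values())
    return random.choices(chunks, weights=weights, k=1)[0]

# "Accurate" and "hallucinated" outputs come out of the same dice roll.
print(sample_next_chunk("According to the news, "))
```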

The actual answer is to stop using chatbots for news, stop creating jobs inside the broadcasters whose purpose is to befoul the information stream with generative AI, and attach actual liability to the chatbot vendors when they output complete lies. Imagine a chatbot vendor having to take responsibility for what the lying chatbot spits out.


Oxford pretends AI benchmarks are science, not marketing


Chatbot vendors routinely make up a new benchmark, then brag how well their hot new chatbot does on it. Like that time OpenAI’s o3 model trounced the FrontierMath benchmark, and it’s just a coincidence that OpenAI paid for the benchmark and got access to the questions ahead of time.

Every new model will be trained hard against all the benchmarks. There is no such thing as real-world performance — there are only benchmark numbers.

There’s a new conference paper from Oxford University’s Reasoning With Machines Lab: “Measuring what Matters: Construct Validity in Large Language Model Benchmarks.” [press release; paper, PDF]

Reasoning With Machines doesn’t work on reasoning, really. It’s almost entirely large language models — chatbots. Because that’s where the money — sorry, the industry interest — is. But this paper got into the NeurIPS 2025 conference.

The researchers did a systematic review of 46,000 AI papers. Well. What they actually did was run the papers through GPT-4o Mini. Using a chatbot anywhere in your supposedly scientific process is a bad sign if you’re claiming to do serious research.
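
For the curious, here’s roughly what “ran the papers through GPT-4o Mini” looks like in practice. This is a hypothetical sketch using the standard OpenAI Python SDK, not the paper’s published pipeline; the prompt and the YES/NO decision rule are invented. Note that the screening step is itself an unvalidated chatbot classifier.

```python
from openai import OpenAI  # assumes the openai>=1.0 Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def introduces_a_benchmark(abstract: str) -> bool:
    # Hypothetical screening prompt; the paper's actual prompts differ.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Does this abstract introduce a new LLM benchmark? "
                       "Answer YES or NO.\n\n" + abstract,
        }],
    )
    # Trust the chatbot's answer blindly: that's the methodological rub.
    return resp.choices[0].message.content.strip().upper().startswith("YES")

# candidates = [a for a in abstracts if introduces_a_benchmark(a)]
```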

The chatbot pointed the researchers at 445 benchmarking tests. You’ll be 0% surprised that most of these benchmarks were rubbish:

… vague definitions for target phenomena or an absence of statistical tests. We consider these challenges to the construct validity of LLM benchmarks: many benchmarks are not valid measurements of their intended targets.

Wow, that’s terrible! How did these benchmarks get that way? Well, the paper never asks that question.

But pretty obviously, science-shaped text to make a product look good is precisely the job of marketing material. The purpose is to generate something to put in the press release.

So what’s the Reasoning With Machines answer to this problem? What’s the action item?

We built a taxonomy of these failures and translated them into an operational checklist to help future benchmark authors demonstrate construct validity.

Now, that’s the right answer — if what the benchmark authors are doing is actually science. But chatbot benchmarks are not science. They were never science. They’re marketing.

This paper never addresses this. This is an 82-page paper, and it never talks about what the AI benchmarks were created for, and what they’re used for in the real world. The word “marketing” does not appear in the paper. The concept of marketing doesn’t appear in the paper. Not even as a “we’re not addressing this right now.” It’s just not there.

It’s like when someone pretends you can talk about chatbots purely as technical artifacts — and somehow never mentions what the chatbots are made for, who’s paying for them, why they’re spending all the money they have on chatbots, and the political programme the chatbots are being promoted to advance. It’s glaringly dodging the issue.

That’s what this paper does — it artificially separates benchmarks from why the benchmarks are this bad. The researchers cannot have been this unaware.

What are the researchers envisioning here? The people creating the chatbot benchmarks are, in many cases, Ph.D. scientists. Are these just poor distracted lab workers who somehow forgot how to do science, so it’s a good thing Reasoning With Machines is here to help?

No. Their job was to create marketing materials shaped like science.

This paper treats chatbot benchmarks as defective science that can be fixed. And that was never what chatbot benchmarks were for.

The Oxford Reasoning With Machines Lab is pretending not to understand something that they absolutely should understand, given most of the lab’s work is chatbots.

That’s because this paper is also marketing — to sell Reasoning With Machines’ services to the chatbot vendors, so they can do their marketing better. And make the benchmark lies a bit less obvious.


FBI Tries to Unmask Owner of Infamous Archive.is Site


The FBI is attempting to unmask the owner behind archive.today, a popular archiving site that is also regularly used to bypass paywalls on the internet and to avoid sending traffic to the original publishers of web content, according to a subpoena posted by the website. The FBI subpoena says it is part of a criminal investigation, though it does not provide any details about what alleged crime is being investigated. Archive.today is also popularly known by several of its mirrors, including archive.is and archive.ph.


Automattic Inc. Claims It Owns the Word 'Automatic'


Automattic, the company that owns WordPress.com, is asking Automatic.CSS—a company that provides a CSS framework for WordPress page builders—to change its name amid public spats between Automattic founder Matt Mullenweg and Automatic.CSS creator Kevin Geary. Automattic has two T’s as a nod to Matt.

“As you know, our client owns and operates a wide range of software brands and services, including the very popular web building and hosting platform WordPress.com,” Jim Davis, an intellectual property attorney representing Automattic, wrote in a letter dated Oct. 30. 



1 public comment

angelchrys, 23 hours ago (Overland Park, KS):
It's been so long since I had positive feelings for MM. Sigh.

richard4339, 1 hour ago:
I don’t feel for him, given what he’s been doing of late; he even almost screwed up Pocket Casts last month. But if someone says “I’m using Automatic CSS for WordPress” you’d likely think that’s by Automattic, and the average person likely doesn’t know it should have two Ts. This is one where I’d assume it’s valid.

AI Is Supercharging the War on Libraries, Education, and Human Knowledge


This story was reported with support from the MuckRock Foundation.

Last month, a company called the Children’s Literature Comprehensive Database announced a new version of a product called Class-Shelf Plus. The software, which is used by school libraries to keep track of which books are in their catalog, added several new features including “AI-driven automation and contextual risk analysis,” which includes an AI-powered “sensitive material marker” and a “traffic-light risk ratings” system. The company says that it believes this software will streamline the arduous task school libraries face when trying to comply with legislation that bans certain books and curricula: “Districts using Class-Shelf Plus v3 may reduce manual review workloads by more than 80%, empowering media specialists and administrators to devote more time to instructional priorities rather than compliance checks,” it said in a press release.

A white paper published by CLCD gives a “real-world example: the role of CLCD in overcoming a book ban.” The paper then describes something that does not sound like “overcoming” a book ban at all: CLCD’s software simply suggested other books “without the contested content.”

Ajay Gupte, the president of CLCD, told 404 Media the software is simply being piloted at the moment, but that it “allows districts to make the majority of their classroom collections publicly visible—supporting transparency and access—while helping them identify a small subset of titles that might require review under state guidelines.” He added that “This process is designed to assist districts in meeting legislative requirements and protect teachers and librarians from accusations of bias or non-compliance [...] It is purpose-built to help educators defend their collections with clear, data-driven evidence rather than subjective opinion.”

Librarians told 404 Media that AI library software like this is just the tip of the iceberg; they are being inundated with new pitches for AI library tech and catalogs are being flooded with AI slop books that they need to wade through. But more broadly, AI maximalism across society is supercharging the ideological war on libraries, schools, government workers, and academics.

CLCD and Class-Shelf Plus are a small but instructive example of something that librarians and educators have been telling me: The boosting of artificial intelligence by big technology firms, big financial firms, and government agencies is not separate from book bans, educational censorship efforts, and the war on education, libraries, and government workers being pushed by groups like the Heritage Foundation and any number of MAGA groups across the United States. This long-running war on knowledge and expertise has sown the ground for the narratives widely used by AI companies and the CEOs adopting it. Human labor, inquiry, creativity, and expertise are spurned in the name of “efficiency.” With AI, there is no need for human expertise because anything can be learned, approximated, or created in seconds. And with AI, there is less room for nuance in things like classifying or tagging books to comply with laws; an LLM or a machine algorithm can decide whether content is “sensitive.”

“I see something like this, and it’s presented as very value neutral, like, ‘Here’s something that is going to make life easier for you because you have all these books you need to review,’” Jaime Taylor, discovery & resource management systems coordinator for the W.E.B. Du Bois Library at the University of Massachusetts told me in a phone call. “And I look at this and immediately I am seeing a tool that’s going to be used for censorship because this large language model is ingesting all the titles you have, evaluating them somehow, and then it might spit out an inaccurate evaluation. Or it might spit out an accurate evaluation and then a strapped-for-time librarian or teacher will take whatever it spits out and weed their collections based on it. It’s going to be used to remove books from collections that are about queerness or sexuality or race or history. But institutions are going to buy this product because they have a mandate from state legislatures to do this, or maybe they want to do this, right?”
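
To make Taylor’s concern concrete, here is a minimal sketch of the pipeline she describes. It is entirely hypothetical: CLCD has not published how Class-Shelf Plus works internally, and the function names here are invented. Even in a toy, the structural problem is visible: whatever label the model emits drives the weeding decision, accurate or not.

```python
def llm_risk_rating(title: str) -> str:
    """Stand-in for an LLM call returning 'GREEN', 'YELLOW', or 'RED'.

    An opaque model: no ground truth, no stated criteria, no appeal.
    """
    raise NotImplementedError  # hypothetical vendor API

def weed_collection(catalog: list[str]) -> list[str]:
    # A strapped-for-time librarian keeps only what the model cleared.
    # The rating is acted on as if it were a finding of fact.
    return [title for title in catalog if llm_risk_rating(title) == "GREEN"]
```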

The resurgent war on knowledge, academics, expertise, and critical thinking that AI is currently supercharging has its roots in the hugely successful recent war on “critical race theory,” “diversity, equity, and inclusion,” and LGBTQ+ rights that painted librarians, teachers, scientists, and public workers as untrustworthy. This has played out across the board, with a seemingly endless number of ways in which the AI boom directly intersects with the right’s war on libraries, schools, academics, and government workers. There are DOGE’s mass layoffs of “woke” government workers, and the plan to replace them with AI agents and supposed AI-powered efficiencies. There are “parents’ rights” groups that pushed to ban books and curricula that deal with the teaching of slavery, systemic racism, and LGBTQ+ issues and attempted to replace them with homogeneous curricula and “approved” books that teach one specific type of American history and American values; and there are the AI tools that have been altered to not be “woke” and to reinforce the types of things the administration wants you to think. Many teachers feel they are not allowed to teach about slavery or racism and increasingly spend their days grading student essays that were actually written by robots.

“One thing that I try to make clear any time I talk about book bans is that it’s not about the books, it’s about deputizing bigots to do the ugly work of defunding all of our public institutions of learning,” Maggie Tokuda-Hall, a cofounder of Authors Against Book Bans, told me. “The current proliferation of AI that we see particularly in the library and education spaces would not be possible at the speed and scale that is happening without the precedent of book bans leading into it. They are very comfortable bedfellows because once you have created a culture in which all expertise is denigrated and removed from the equation and considered nonessential, you create the circumstances in which AI can flourish.”

Justin, a cohost of the podcast librarypunk, told me that offloading cognitive capacity to AI is “part of a fascist project to offload the work of thinking, especially the reflective kind of thinking that reading, study, and community engagement provide. That kind of thinking cultivates empathy and challenges your assumptions. It’s also something you have to practice. If we can offload that cognitive work, it’s far too easy to become reflexive and hateful, while having a robot cheerleader telling you that you were right about everything all along.”

These two forces—the war on libraries, classrooms, and academics and AI boosterism—are not working in a vacuum. The Heritage Foundation’s right-wing agenda for remaking the federal government, Project 2025, talks about criminalizing teachers and librarians who “poison our own children” and pushing artificial intelligence into every corner of the government for data analysis and “waste, fraud, and abuse” detection. 

Librarians, teachers, and government workers have had to spend an increasing amount of their time and emotional bandwidth defending the work that they do, fighting against censorship efforts and dealing with the associated stress, harassment, and threats that come from fighting educational censorship. Meanwhile, they are separately dealing with an onslaught of AI slop and the top-down mandated AI-ification of their jobs; there are simply fewer and fewer hours to do what they actually want to be doing, which is helping patrons and students.

“The last five years of library work, of public service work has been a nightmare, with ongoing harassment and censorship efforts that you’re either experiencing directly or that you’re hearing from your other colleagues,” Alison Macrina, executive director of Library Freedom Project, told me in a phone interview. “And then in the last year-and-a-half or so, you add to it this enormous push for the AIfication of your library, and the enormous demands on your time. Now you have these already overworked public servants who are being expected to do even more because there’s an expectation to use AI, or that AI will do it for you. But they’re dealing with things like the influx of AI-generated books and other materials that are being pushed by vendors.” 

The future being pushed by both AI boosters and educational censors is one where access to information is tightly controlled. Children will not be allowed to read certain books or learn certain narratives. “Research” will be performed only through one of a select few artificial intelligence tools owned by AI giants which are uniformly aligned behind the Trump administration and which have gone to the ends of the earth to prevent their black box machines from spitting out “woke” answers lest they catch the ire of the administration. School boards and library boards, forced to comply with increasingly restrictive laws, funding cuts, and the threat of being defunded entirely, leap at the chance to be considered forward looking by embracing AI tools, or apply for grants from government groups like the Institute of Museum and Library Services (IMLS), which is increasingly giving out grants specifically to AI projects.

We previously reported that the ebook service Hoopla, used by many libraries, has been flooded with AI-generated books (the company has said it is trying to cull these from its catalog). In a recent survey of librarians, Macrina’s organization found that librarians are getting inundated with pitches from AI companies and are being pushed by their superiors to adopt AI: “People in the survey results kept talking about, like, I get 10 aggressive, pushy emails a day from vendors demanding that I implement their new AI product or try it, jump on a call. I mean, the burdens have become so much, I don’t even know how to summarize them.”


Macrina said that in response to Library Freedom Project’s recent survey, librarians said that misinformation and disinformation was their biggest concern. This came not just in the form of book bans and censorship but also in efforts to proactively put disinformation and right-wing talking points into libraries: “It’s not just about book bans, and library board takeovers, and the existing reactionary attacks on libraries. It’s also the effort to push more far-right material into libraries,” she said. “And then you have librarians who are experiencing a real existential crisis because they are getting asked by their jobs to promote [AI] tools that produce more misinformation. It's the most, like, emperor-has-no-clothes-type situation that I have ever witnessed.” 

Each person I spoke to for this article told me they could talk about the right-wing project to erode trust in expertise, and the way AI has amplified this effort, for hours. In writing this article, I realized that I could endlessly tie much of our reporting on attacks on civil society and human knowledge to the force multiplier that is AI and the AI maximalist political and economic project. One need look no further than Grokipedia as one of the many recent reminders of this effort—a project by the world’s richest man and perhaps its most powerful right-wing political figure to replace a crowdsourced, meticulously edited fount of human knowledge with a robotic imitation built to further his political project. 

Much of what we write about touches on this: The plan to replace government workers with AI, the general erosion of truth on social media, the rise of AI slop that “feels” true because it reinforces a particular political narrative but is not true, the fact that teachers feel like they are forced to allow their students to use AI. Justin, from librarypunk, said AI has given people “absolute impunity to ignore reality […] AI is a direct attack on the way we verify information: AI both creates fake sources and obscures its actual sources.”

That is the opposite of what librarians do, and teachers do, and scientists do, and experts do. But the political project to devalue the work these professionals do, and the incredible amount of money invested in pushing AI as a replacement for that human expertise, have worked in tandem to create a horrible situation for all of us.

“AI is an agreement machine, which is anathema to learning and critical thinking,” Tokuda-Hall said. “Previously we have had experts like librarians and teachers to help them do these things, but they have been hamstrung and they’ve been attacked and kneecapped and we’ve created a culture in which their contribution is completely erased from society, which makes something like AI seem really appealing. It’s filling that vacuum.”

“Fascism and AI, whether or not they have the same goals, they sure are working to accelerate one another,” she added.


One of the Greatest Wall Street Investors of All Time Announces Retirement


Nancy Pelosi, one of Wall Street’s all time great investors, announced her retirement Thursday.

Pelosi, so known for her ability to outpace the S&P 500 that dozens of websites and apps spawned to track her seemingly preternatural ability to make smart stock trades, said she will retire after the 2024-2026 season. Pelosi’s trades over the years, many made through her husband and investing partner Paul Pelosi, have been so good that an entire startup, called Autopilot, was started to allow investors to directly mirror her portfolio.

According to the site, more than 3 million people have invested more than $1 billion using the app. After 38 years, Pelosi will retire from the league—a somewhat normal career length as investors, especially on Pelosi’s team, have decided to stretch their careers later and later into their lives. 

The numbers put up by Pelosi in her Hall of Fame career are undeniable. Over the last decade, Pelosi’s portfolio returned an incredible 816 percent, according to public disclosure records. The S&P 500, meanwhile, has returned roughly 229 percent. Awe-inspired fans and analysts theorized that her almost omniscient ability to make correct, seemingly high-risk stock decisions may have stemmed from decades spent analyzing and perhaps even predicting decisions that would be made by the federal government that could impact companies’ stock prices. For example, Paul Pelosi sold $500,000 worth of Visa stock in July, weeks before the U.S. government announced a civil lawsuit against the company, causing its stock price to decrease.  
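
For scale, annualizing both figures makes the gap plain. A quick sketch, assuming “returned 816 percent” means cumulative total return over the decade, which the disclosures don’t spell out:

```python
def annualized(total_return_pct: float, years: float) -> float:
    # Convert a cumulative percentage return into a yearly growth rate.
    growth = 1 + total_return_pct / 100
    return (growth ** (1 / years) - 1) * 100

print(f"Pelosi:  {annualized(816, 10):.1f}% per year")   # about 24.8%
print(f"S&P 500: {annualized(229, 10):.1f}% per year")   # about 12.6%
```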

Besides Autopilot and numerous Pelosi stock-trade trackers, several exchange-traded funds (ETFs) have been set up that allow investors to model their portfolios directly on Pelosi and her trades. Related funds, such as the Subversive Democratic Trading ETF (NANC, for Nancy), set up by the Unusual Whales investment news Twitter account, seek to let investors diversify their portfolios by tracking the trades of not just Pelosi but also some of her colleagues, including those on the other team, who have also proven to be highly gifted stock traders.

Fans of Pelosi spent much of Thursday admiring her career, and wondering what comes next: “Farewell to one of the greatest investors of all time,” the top post on Reddit’s Wall Street Bets community reads. The sentiment has more than 24,000 upvotes at the time of publication. Fans will spend years debating in bars whether Pelosi was the GOAT; some investors have noted that in recent years, some of her contemporaries, like Marjorie Taylor Greene, Ro Khanna, and Michael McCaul, have put up gaudier numbers. There are others who say the league needs reform, with some of Pelosi’s colleagues saying they should stop playing altogether, and many fans agreeing with that sentiment. Despite the controversy, many of her colleagues have committed to continue playing the game.

Pelosi said Thursday that this season would be her last, but like other legends who have gone out on top, it seems she is giving it her all until the end. Just weeks ago, she sold between $100,000 and $250,000 of Apple stock, according to a public box score.

“We can be proud of what we have accomplished,” Pelosi said in a video announcing her retirement. “But there’s always much more work to be done.”
