Michael's Shares

Trying to Read by Reza
Friday September 19^th, 2025 at 11:38 AM

Poorly Drawn Lines

Read the whole story

mkalus

5 hours ago

reply

iPhone: 49.287476,-123.142136

Birds on the Moon by Reza
Friday September 19^th, 2025 at 11:38 AM

Poorly Drawn Lines

Read the whole story

mkalus

5 hours ago

reply

iPhone: 49.287476,-123.142136

#ricoh #519 #rangefinder #35mm #film by nobody@flickr.com (ecstaticist - evanleeson.art)
Friday September 19^th, 2025 at 10:56 AM

Uploads from ecstaticist - evanleeson.art

ecstaticist - evanleeson.art posted a photo:

Read the whole story

mkalus

6 hours ago

reply

iPhone: 49.287476,-123.142136

OpenAI fights the evil scheming AI! — which doesn’t exist yet by David Gerard
Friday September 19^th, 2025 at 9:57 AM

Pivot to AI

AI vendors sell access to chatbots. You can have conversations with the chatbot!

This convinces far too many people there’s an actual person in there. But there isn’t — they’re text completing machines with a bit of randomness.

That doesn’t sound very cool. So the AI companies encourage what we call criti-hype — the sort of AI “criticism” that makes the robot sound dangerous in a cool and edgy way. Our chatbot is so capable, it could take over the world! So it can definitely answer your email.

If you can’t get enough criti-hype, make up your own! Ask the AI doomsday crowd. The AI companies are full of these people. They will always tell you the super-autocomplete might take over and kill us all.

Anthropic puts out a lot of reports to stoke the fear of chatbots. Reasoning AI is lying to you! Or maybe it’s just hallucinating again. Anthropic did one report with Apollo Research where they got a chatbot to lie to them — by telling it to lie to them.

Apollo Research is an “AI safety” group — that is, they’re AI doomsday cultists. [Apollo]

After GPT-5 fell flat, OpenAI is getting a bit desperate. So they sponsored Apollo to write a report on the threat of scheming chatbots that will deceive you! [OpenAI; Apollo; paper, PDF]

OpenAI and Apollo “found behaviors consistent with scheming in controlled tests across frontier models.”

No, they didn’t. They’re anthropomorphising the chatbot without a licence. The bots are text completers. They don’t have intelligence, they don’t have plans or goals.

But people project their own intelligence onto the chatbot — and if they’re paranoid AI doomsday cultists, they project AI doom onto the chatbot.

AI scheming – pretending to be aligned while secretly pursuing some other agenda – is a significant risk that we’ve been studying.

No, it is not a “significant risk,” they don’t show that at all, and the claim is as stupid as it sounds.

The paper is 94 pages, but if you read through, they openly admit they’ve got nothing. Section 3.1, “Covert actions as a proxy for scheming”, admits directly:

Current frontier models likely lack the sophisticated awareness and goal-directedness required for competent and concerning scheming.

The researchers just said chatbots don’t scheme — but they really want to study this made-up threat. So they look for supposed “covert actions”. And they just assume — on no evidence — there are goals in there.

Apollo is assuming their conclusion as hard as they can, then they’re presenting the results of this assumption as evidence for their conclusion.

The researchers even set up “anti-scheming environments that create conflicts to elicit covert rule violations” — so whatever the bot output is, it’s evidence of wrongdoing. So’s the opposite.

One of the tasks is an “Impossible Coding Task.” Now, you might have noticed that vibe coding is terrible and chatbots are not actually good at coding. But these guys assume bad code presented as good code is evidence of evil.

“Misciting the spec” is cited as evidence of malign intent. In that case, the Github Copilot bot is a malevolent intelligence that’s out to mess you up. (… Hmm.)

If they can’t get anything else, they just tell the bot to please act evil:

Training the model to covertly pursue the misaligned goal. In order to instill the misaligned, covert goal into o4-mini, we use deliberative alignment.

Yes — if you train the bot to act like an evil robot, it’ll act like an evil robot.

After way too many pages of this guff, the conclusions straight up admit they’ve got nothing:

While current models likely lack the goal-directedness and situational awareness required for dangerous scheming, this failure mode may become critical in future AI systems.

We admit this is useless and dumb, but you can’t prove it won’t be huge in the future!

Scheming represents a significant risk for future AI systems

This is just after they said they’ve no evidence this is even a thing.

The whole paper is full of claims so stupid you think, I must be reading it wrong. But then they just come out and say the stupid version.

I bet these guys are haunted by the malevolent artificial intelligence power of thermostats. It switched itself on!!

Video version

Read the whole story

mkalus

7 hours ago

reply

iPhone: 49.287476,-123.142136

We simulated the human mind with a chatbot. It didn’t work by David Gerard
Friday September 19^th, 2025 at 9:56 AM

Pivot to AI

We talked previously about using machine-learning AI to produce fake data — sorry, synthetic data — for medical research. Avoid pesky human subjects and ethics requirements. The data is not real data — but you can get so many papers out of it!

Synthetic data generally uses old-fashioned machine learning. It didn’t come from the AI bubble, and it isn’t chatbots.

But what if … it was chatbots?

Here’s a remarkable paper: “A foundation model to predict and capture human cognition”. The researchers talk up their exciting new model called Centaur. They collected 160 psychological experiments and retrained Llama 3.1 on them. The researchers claim Centaur is so good that: [Nature]

it also generalizes to previously unseen cover stories, structural task modifications and entirely new domains.

That is, they asked Centaur about some experiments they didn’t train it on, and it got better answers than an untrained chatbot. They’d fooled themselves, and that was enough.

The paper ends with an example of “model-guided scientific discovery”. Their example is that Centaur does better at designing an experiment than DeepSeek-R1.

Now, the researchers are not saying you should go out and fake experiments with data from Centaur. Perish the thought!

They’re just talking about how you might use Centaur for science, and then they tweet that “Centaur is a computational model that predicts and simulates human behavior for any experiment described in natural language.” They’re just saying it. [Twitter]

I won’t name names, but I’ve seen academics vociferously defending this paper against the charge that it’s suggesting you could synthesize your data with Centaur, because they didn’t expressly say those words. This is a complaint that the paper said 2 and 2, and how dare you add them up and get 4.

You or I might think that claiming a chatbot model simulates human psychology was obviously a weird and foolish claim. So we have a response paper: “Large Language Models Do Not Simulate Human Psychology.” [arXiv, PDF]

This paper doesn’t let the first guys get away with weasel wording. It also replies to the first paper’s implied suggestions that Centaur would be just dandy for data synthesis:

Recently, some research has suggested that LLMs may even be able to simulate human psychology and can therefore replace human participants in psychological studies. We caution against this approach.

A chatbot doesn’t react consistently, it doesn’t show human levels of variance, and — being a chatbot — it hallucinates.

The second research team tested multiple chatbots, including Centaur, against 400 human subjects on a standard series of ethical questions, and subtly reworded them. Centaur’s human fidelity regressed to about average:

If inputs are re-worded, we would need LLMs to still align with human responses. But they do not.

Their conclusion should be obvious, but it looks like they did have to say it out loud:

LLMs should not be treated as (consistent or reliable) simulators of human psychology. Therefore, we recommend that psychologists should refrain from using LLMs as participants for psychological studies.

Some researchers will do anything not to deal with messy humans and ethics boards. They only want an excuse to synthesize their data.

The Centaur paper was precisely that excuse. The researchers could not have not known it was that excuse, in the context of the world they live and work in, in 2025. Especially as the first tool they reached for was everyone’s favourite academic cheat code — a chatbot.

Video version

Read the whole story

mkalus

7 hours ago

reply

iPhone: 49.287476,-123.142136

The hype after AI: lads, it’s looking really quantum by David Gerard
Friday September 19^th, 2025 at 9:55 AM

Pivot to AI

The tech press and the finance press have seen a barrage of quantum computing hype in the past few weeks This is entirely because venture capital is worried about the AI bubble.

The MIT report that 95% of AI projects don’t make any money frightened the AI bubble investors. This is even as the actual report is trash, and its purpose was to sell you on the authors’ Web3 crypto scam. (I still appear to be the only one to notice that bit.) They got the right answer entirely by accident.

Venture capital needs a bubble party to get lottery wins. The hype is the product. The tech itself is an annoying bag on the side of the hype.

This new wave of quantum hype is trying to pump up a bubble with rather more desperation than before.

Quantum computing is not a product yet. It’s as if investors were being sold the fabulous vision of the full Internet, ten minutes after the telegraph was invented. Get in early!

Now, quantum computing is a real thing. It is not a technology yet — right now, it’s physics experiments. You can’t buy it in a box.

The big prize in quantum computing is where you use quantum bits (qubits) to do particular difficult calculations fast — or at all. Such as factoring huge numbers to break encryption!

Here in the real-life present day, quantum computing still can’t factor numbers higher than 21. Three times seven. That’s the best the brightest minds of IBM have achieved.

But there’s also a lot of fudging results. The recent Peter Gutmann paper is a list of ways to cheat at quantum factoring. [IACR, PDF]

In this paper, Gutmann is telling cryptographers not to worry too much about quantum computing. Though cryptographers have still been on the case for a couple of decades, just in case there’s a breakthrough.

There’s other things that are not this qubit-based version of quantum computing, but the companies can technically call them computing that’s quantum, so they do, ’cos there’s money in it.

D-Wave will sell you a “quantum computer” that does a different thing called quantum annealing. Amazon and IBM will rent you “quantum computers,” and you look and they’re quantum computing simulators. IBM also runs a lot of vendor-funded trials.

The hype version of quantum computing, that uses qubits to factor numbers fast and so on, does not exist as yet. The hype is that it will exist any time soon. But there’s not a lot of sign of that.

Look up “quantum computing” in Google News. Some of this is science, such as a university press office talking up a small advance.

The hype results are PR company placements, funded by venture capital dollars. Today’s press release is a new physics experiment, and you have to read several paragraphs down to where they admit anything practical is a decade away. They say “this decade,” I say I wish them all the best. [press release]

Press releases like this come out all the time. Mostly they don’t go anywhere. I am not saying none of them will — but I am saying these are funding pitches, not scientific papers. Any results are always years away.

If everything goes really well, we might have a product in five years. I won’t say it can’t happen! I will say, show me.

The Financial Times ran an editorial on 21st August: “The world should prepare for the looming quantum era: New breakthroughs underscore the technology’s potential and perils.” [FT]

The “new breakthroughs” aren’t breakthroughs. All of this is “could”. The results the FT says “could make this imminent” are press releases from IBM and Google. It’s handwaving about big companies making promises and the earliest actual date anyone will put on the start of a result is 2033.

The FT editors can’t come up with any actual business advice, given all of this is years away. For almost anyone, there is no meaningful business action to take in 2025. No CEO or CTO needs to think about quantum computing until there’s an actual product in front of you. Unless you’re the CEO of IBM.

You must, of course, put all your money into funds investing in quantum computing companies. That’ll keep the numbers going up!

So can venture capital actually ignite a quantum computing bubble? I’m not going to say no, because stupid things happen every day. I will say they’ve got an uphill battle.

The AI bubble launched with a super-impressive demo called ChatGPT, and quantum computing doesn’t have anything like that. There are no products. But the physics experiments are very pretty.

Video version

Read the whole story

mkalus

7 hours ago

reply

iPhone: 49.287476,-123.142136

Trying to Read by Reza Friday September 19th, 2025 at 11:38 AM

Birds on the Moon by Reza Friday September 19th, 2025 at 11:38 AM

#ricoh #519 #rangefinder #35mm #film by nobody@flickr.com (ecstaticist - evanleeson.art) Friday September 19th, 2025 at 10:56 AM

OpenAI fights the evil scheming AI! — which doesn’t exist yet by David Gerard Friday September 19th, 2025 at 9:57 AM

We simulated the human mind with a chatbot. It didn’t work by David Gerard Friday September 19th, 2025 at 9:56 AM

The hype after AI: lads, it’s looking really quantum by David Gerard Friday September 19th, 2025 at 9:55 AM

Trying to Read by Reza
Friday September 19^th, 2025 at 11:38 AM

Birds on the Moon by Reza
Friday September 19^th, 2025 at 11:38 AM

#ricoh #519 #rangefinder #35mm #film by nobody@flickr.com (ecstaticist - evanleeson.art)
Friday September 19^th, 2025 at 10:56 AM

OpenAI fights the evil scheming AI! — which doesn’t exist yet by David Gerard
Friday September 19^th, 2025 at 9:57 AM

We simulated the human mind with a chatbot. It didn’t work by David Gerard
Friday September 19^th, 2025 at 9:56 AM

The hype after AI: lads, it’s looking really quantum by David Gerard
Friday September 19^th, 2025 at 9:55 AM