Resident of the world, traveling the road of life
67952 stories · 20 followers

Medical research ethics is hard — but fake AI data is easy!

1 Comment and 2 Shares

Medical research means dealing with ethics boards — who keep asking all these pointed questions on how you’re going to use people’s most personal data.

What if we just … fake the data? Sorry, synthesise the data. Feed some real data to machine learning, churn out statistically similar synthetic numbers, then write up this fake data! Just as if you did science!

Remember: it’s not technically data fraud if you list it in your methodology!

Journal articles have pushed the idea of synthetic data for a few decades now. It isn’t actually very popular. But also, they keep pushing it.

You can only dodge the ethics board like this if your institution lets you. Nature spoke to a few institutions who do let researchers use AI-faked data so they don’t have to think about ethics. [Nature]

Here’s the use case for synthetic data:

protecting patient privacy, being more easily able to share data between sites and speeding up research.

The data is literally fake — but they get so many more papers out!

Synthetic data also solves data scarcity — there just isn’t enough real data in the world for all the researchers with papers to write.

So we turn a small real dataset into a huge fake dataset. Then we send the huge fake ethics-laundered datasets around the world! So everyone is using derivatives of the same small original dataset! With any noise in the original treated as the finest A-grade data that tells you things!
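To make that last point concrete, here is a minimal sketch in Python (standard library only). Everything in it is an assumption for illustration: the function names, the sample sizes, and the use of a fitted bivariate normal as the "generator" are hypothetical and not taken from any of the linked papers. The two real columns are drawn independently, so the true correlation is zero. Whatever correlation the small sample shows is chance, and the generator reproduces it at any size you ask for.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def run(seed=1, real_n=20, synth_n=20_000):
    rng = random.Random(seed)

    # "Real" data: x and y are independent, so the true correlation is 0.
    xs = [rng.gauss(0, 1) for _ in range(real_n)]
    ys = [rng.gauss(0, 1) for _ in range(real_n)]
    r_real = pearson(xs, ys)  # nonzero only because the sample is small

    # Hypothetical generator: a bivariate normal fitted to the small sample.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    rest = math.sqrt(max(0.0, 1 - r_real ** 2))

    synth_x, synth_y = [], []
    for _ in range(synth_n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        synth_x.append(mx + sx * z1)
        # Mix z1 into y so the synthetic pair has correlation r_real.
        synth_y.append(my + sy * (r_real * z1 + rest * z2))

    return r_real, pearson(synth_x, synth_y)


if __name__ == "__main__":
    r_real, r_synth = run()
    print(f"chance correlation in the real sample:  {r_real:+.3f}")
    print(f"same correlation in the synthetic data: {r_synth:+.3f}")
```

Whatever chance correlation the 20 real rows happen to have, the 20,000 fake rows repeat it with a straight face. A real synthetic data pipeline is more elaborate than this sketch, but it is still fitted to the same small sample.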

Sometimes the synthetic data fans admit this might cause issues: [Nature]

bias amplification, low interpretability, and an absence of robust methods for auditing data quality.

Are you sure the real data set you started with was good? That it doesn’t turn out to be wrong or horribly biased for some reason? Did you actually capture the statistics of the original — or did you oversimplify because you were in a hurry?

None of that matters! You already decided ethics was for dodging!

The other use for synthetic medical data is … training medical machine learning models. [ScienceDirect]

The first model fakes the data, then the second model trains on the fake data. Any problems in the synthetic data set are amplified further. Then the second model — based on fake data — is used to treat real patients. This is, of course, all fine.

The abstract of that paper makes this amazing claim for synthetic data:

unbiased data with sufficient sample size and statistical power.

You’re talking about a fake data set synthesized from a real data set. Even if you’ve duplicated its character completely, you can’t synthesize statistical power that wasn’t in the original, so the fake set can’t tell you anything new about the world. You can’t do a statistical CSI “enhance!”

But sure, you can fake more data to write down a bigger N.
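A rough simulation of the bigger-N trick, as a sketch under stated assumptions: the population (a normal with mean 0 and standard deviation 1), the sample sizes, and the fitted-normal "generator" are all hypothetical choices made for illustration, not anyone's published method. Because the population is simulated, the true mean is known, so the precision the synthetic data claims can be compared with how far off it actually is.

```python
import random
import statistics

# Hypothetical population; known only because we are simulating it.
TRUE_MEAN, TRUE_SD = 0.0, 1.0
REAL_N, SYNTH_N, TRIALS = 30, 5000, 100


def one_trial(rng):
    # Small real study.
    real = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(REAL_N)]
    fit_mean = statistics.fmean(real)
    fit_sd = statistics.stdev(real)

    # Big synthetic study drawn from a normal fitted to the real one.
    synth = [rng.gauss(fit_mean, fit_sd) for _ in range(SYNTH_N)]
    synth_mean = statistics.fmean(synth)

    # Standard error you would report if you believed N = SYNTH_N.
    claimed_se = statistics.stdev(synth) / SYNTH_N ** 0.5

    real_error = abs(fit_mean - TRUE_MEAN)
    synth_error = abs(synth_mean - TRUE_MEAN)
    return real_error, synth_error, claimed_se


def run(seed=0):
    rng = random.Random(seed)
    trials = [one_trial(rng) for _ in range(TRIALS)]
    real_err = statistics.fmean([t[0] for t in trials])
    synth_err = statistics.fmean([t[1] for t in trials])
    claimed = statistics.fmean([t[2] for t in trials])
    return real_err, synth_err, claimed


if __name__ == "__main__":
    real_err, synth_err, claimed = run()
    print(f"average error of the real estimate (n={REAL_N}):        {real_err:.3f}")
    print(f"average error of the synthetic estimate (N={SYNTH_N}):  {synth_err:.3f}")
    print(f"standard error claimed from the synthetic N:        {claimed:.3f}")
```

The synthetic estimate misses the truth by about as much as the 30 real rows did, because it inherits their sampling error. Only the claimed standard error shrinks with N, and that number describes the generator, not the patients.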

Synthetic data is not widely used in medical research yet because most researchers still actually give a hoot. It’s still at the hype stage — like this effusive bilge in BMJ Evidence Based Medicine in July, about the incredible potential of faking the evidence base for your medicine. Their main use case is: [BMJ EBM]

overcoming technical and regulatory barriers to assembling sufficiently large datasets for modern AI methods is paramount.

At least they’re just doing machine learning, not generative AI. So far.

mkalus
9 hours ago
iPhone: 49.287476,-123.142136
1 public comment
tante
4 hours ago
"The first model fakes the data, then the second model trains on the fake data. Any problems in the synthetic data set are amplified further. Then the second model — based on fake data — is used to treat real patients. This is, of course, all fine."

Synthetic data using "AI" is such a toxic pattern that keeps being amplified (because of the structures that guide "science")
Berlin/Germany

Saturday Morning Breakfast Cereal - Bet

1 Share



Hovertext:
I don't know if you can build ethics out of expected value, but I know that if I'm on a deserted island with someone who DOES think that, I'm running in the opposite direction.


mkalus
9 hours ago

Canadian education report riddled with fake AI references

1 Share

Newfoundland and Labrador has unveiled Education Accord NL, a report on the next 10 years for education in the province. [Education Accord NL, PDF, archive]

Unfortunately, the report has 15 fake citations — very much in chatbot style. [CBC]

One cite was to a 2008 movie called Schoolyard Games. This doesn’t exist. But that precise cite is an example in a University of Victoria style guide. [UVic]

Josh Lepawsky from Memorial University was on the report’s advisory board, but quit in January. He said it was a “deeply flawed process” leading to “top down recommendations” — someone decreed what the report was going to say.

NL education minister Bernard Davis insists this is fine. “There was an error made, the error’s been rectified, and that’s essentially where the story begins and ends in my opinion.” [CBC, video]

But did the report use AI? “Absolutely not,” says Davis. “It’s preposterous to even think.”

You’ll be glad to hear the report has a whole section on teaching the kids Artificial Intelligence:

It is about creating a system that uses AI to personalize education, empower educators, streamline operations, and create an understanding of AI’s ethical and societal implications.

They sure streamlined these ethics. The report’s co-chairs, Anne Burke and Karen Goodnough, insist the report was written by them with expert assistance, and “any suggestion otherwise is inaccurate.” [CBC]

Burke and Goodnough are working to rectify the report. That sounds like removing the fake stuff but not the conclusions based on it. Those were determined well ahead of time.

mkalus
1 day ago

Watching You

1 Share

Michael Kalus posted a photo:

Watching You



mkalus
1 day ago

Saturday Morning Breakfast Cereal - AI

2 Shares



Hovertext:
People who say there's no good use for AI need to think harder.


mkalus
1 day ago

Saturday Morning Breakfast Cereal - Suffering

4 Shares



Hovertext:
Anyone complaining about my delineations should decide they are not in fact a self and so no email need be sent to correct me.


mkalus
1 day ago