My favourite story about this is the one where a neural network was trained on x-rays to recognise tumours, I think, and it performed amazingly in the study, better than any human could.
Later it turned out that the network had been trained on real-life x-rays from confirmed cases, and it was looking for pen marks. Pen marks mean the image was examined by several doctors, which means it was more likely to be a case that needed a second opinion, which more often than not means there is a tumour. Which obviously means that on cases no human had examined first, the machine performed worse than random chance.
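To make that failure mode concrete, here's a minimal toy sketch of the same trap. Everything below (the features, the proportions, the 5% noise) is invented for illustration, not the actual study's data or model: a classifier is given both a weak genuine signal and a "pen mark" flag that leaks the label during training.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_cases(n, reviewed):
    """Toy 'x-ray' features: one weak genuine signal plus a pen-mark flag."""
    y = rng.integers(0, 2, n)                    # 1 = tumour
    signal = y + rng.normal(0, 2.0, n)           # weak real tumour signal
    if reviewed:
        # training data: pen-marked images almost always turned out to be tumours
        penmark = ((y == 1) ^ (rng.random(n) < 0.05)).astype(float)
    else:
        # deployment: nobody has looked at these images yet, so no pen marks
        penmark = np.zeros(n)
    return np.column_stack([signal, penmark]), y

X_train, y_train = make_cases(5000, reviewed=True)
X_new, y_new = make_cases(5000, reviewed=False)

model = LogisticRegression().fit(X_train, y_train)
print("reviewed data:  ", model.score(X_train, y_train))  # ~0.95, looks superhuman
print("unreviewed data:", model.score(X_new, y_new))      # roughly a coin flip
```

The training score looks superhuman because the pen-mark flag is doing all the work; take the flag away and the model collapses to about chance.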
That's the problem with neural networks: it's incredibly hard to figure out what exactly is happening under the hood, and you can never be sure of anything.
And I'm not even talking about LLMs; those are a completely different level of bullshit.
Well, it's also that they used biased data. Biased data is garbage data. The problem with these neural networks is the human factor: humans tend to be biased, consciously or subconsciously, so the data they provide to these networks will often be biased as well. It's like that ML model that was designed to judge human faces and consistently gave non-white people lower scores, because it turned out the input data was mostly white faces. You can reproduce that effect in a few lines, as in the sketch below.
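A hedged toy sketch (the groups, offsets, and proportions are all invented for illustration): one model is fit to data where one group supplies 95% of the examples, then scored per group.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_group(n, offset):
    """Both groups have the same labels but express the feature differently."""
    y = rng.integers(0, 2, n)
    X = (y + offset + rng.normal(0, 0.8, n)).reshape(-1, 1)
    return X, y

# 95% of the training data comes from group A
Xa, ya = make_group(9500, offset=0.0)  # group A, well represented
Xb, yb = make_group(500, offset=1.5)   # group B, under-represented
model = LogisticRegression().fit(np.vstack([Xa, Xb]), np.concatenate([ya, yb]))

print("group A:", model.score(*make_group(2000, 0.0)))  # decent
print("group B:", model.score(*make_group(2000, 1.5)))  # much worse
```

Nothing in the model is "prejudiced"; it just fit the majority of its inputs, which is exactly how the data carries the bias in.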
I am convinced that unbiased data doesn't exist, and at this point I'm not sure it can exist in principle. Then you take your data, full of unknown biases, and feed it to a black box that creates more unknown bias.
If you get enough data on a specific enough task, I'm fairly confident you can get something that's relatively unbiased. Almost no company wants to risk it, though, because the training would require that no human decisions are made.
The problem with thinking your data is unbiased is that you don't know where it's biased, and you've stopped looking.
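Which is why the only practical defence is to keep looking: slice every metric you report by every attribute you have. A minimal sketch (the groups and results are hypothetical, invented for illustration):

```python
import pandas as pd

# Hypothetical per-example evaluation results from some test set
results = pd.DataFrame({
    "group":   ["A", "A", "A", "A", "B", "B", "B", "B"],
    "correct": [1,   1,   1,   1,   1,   0,   0,   1],
})

print("overall accuracy:", results["correct"].mean())  # 0.75, looks fine
print(results.groupby("group")["correct"].mean())      # A: 1.00, B: 0.50
```

The aggregate number hides exactly the thing you stopped looking for: all the errors sit in one subgroup.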