this post was submitted on 01 Nov 2024
Technology
This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
If you're planning to use LLMs for anything along these lines, you should filter out irrelevant details like names before any evaluation step. Honestly, humans should do the same, but it's impractical. Ironically, that kind of filtering is something LLMs are very well suited for.
Of course, that doesn't mean off-the-shelf tools are actually doing that, and there are other potential issues as well, such as biases around cities, schools, or any non-personal info on a resume that might correlate with race/gender/etc.
I think there's great potential for LLMs to reduce bias compared to humans, but half-assed implementations are currently the norm, so be careful.
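To make the "filter before evaluating" idea concrete, here's a rough sketch of a redaction pass that runs before any resume text reaches a scoring step. Everything in it is an assumption for illustration: the `redact` function, the two regexes, and the placeholder tokens are made up, and it assumes you already know the candidate's name from the application form. A real pipeline would need proper entity detection, and as noted above it does nothing about schools, cities, and similar proxies.

```python
import re

# Hypothetical patterns for illustration; real resumes need sturdier detection.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str, known_names: list[str]) -> str:
    """Replace emails, phone numbers, and the supplied names with placeholders."""
    # Emails go first so a name inside an address is not half-replaced.
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    # Longest names first so "Jane Doe" is replaced before "Jane".
    for name in sorted(known_names, key=len, reverse=True):
        pattern = rf"\b{re.escape(name)}\b"
        text = re.sub(pattern, "[NAME]", text, flags=re.IGNORECASE)
    return text


resume = "Jane Doe\njane.doe@example.com | +1 (555) 010-9999\nJane led a team of five."
print(redact(resume, ["Jane Doe", "Jane"]))
```

Only the redacted string would then be handed to whatever does the evaluation, human or LLM.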
Unfortunately, doing this can make things worse. It's not a simple problem to solve, but you're generally on the right track. A good example of how it's about more than just names is how orchestras screen applicants: they play their piece behind a curtain so the panel can't see the gender of the individual. But the obfuscation doesn't stop there. They also make sure female applicants don't wear heels (which make a distinct sound), and they even have someone stand on stage and step loudly to mask the applicant's footsteps and gait. It's that second level of thinking that's needed to actually obscure gender from an AI, and the more complex a data set is, the harder that gets.
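One way to look for the "heels behind the curtain" kind of leak is to check whether a field you left in still predicts the attribute you tried to hide. The sketch below is a crude version of that check, not an established fairness metric: `proxy_leakage` and the toy rows are invented for illustration, and because it scores on the same rows it learns from, the result is optimistic.

```python
from collections import Counter, defaultdict


def proxy_leakage(rows, feature, protected):
    """Return (baseline, accuracy) for guessing `protected` from `feature` alone.

    baseline: accuracy of always guessing the most common protected value.
    accuracy: accuracy of guessing the most common protected value
              within each distinct value of `feature` (in-sample).
    """
    overall = Counter(r[protected] for r in rows)
    baseline = overall.most_common(1)[0][1] / len(rows)

    by_value = defaultdict(Counter)
    for r in rows:
        by_value[r[feature]][r[protected]] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return baseline, correct / len(rows)


# Toy, made-up data: names are already removed, but "club" is still there.
rows = [
    {"club": "womens_chess", "gender": "F"},
    {"club": "womens_chess", "gender": "F"},
    {"club": "rugby", "gender": "M"},
    {"club": "rugby", "gender": "M"},
    {"club": "none", "gender": "F"},
    {"club": "none", "gender": "M"},
]
baseline, accuracy = proxy_leakage(rows, "club", "gender")
print(f"baseline={baseline:.2f} accuracy={accuracy:.2f}")
```

If the accuracy is well above the baseline, the field is acting as a proxy and removing names alone hasn't hidden much.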
Interesting read, thanks! I'll finish it later, but already this bit is quite interesting: