this post was submitted on 18 Jan 2024
508 points (98.1% liked)

Technology

A ‘Shocking’ Amount of the Web Is Already AI-Translated Trash, Scientists Determine::Researchers warn that most of the text we view online has been poorly translated into one or more languages—usually by a machine.

[–] Brkdncr@lemmy.world 69 points 10 months ago (1 children)

I recently was searching for some tips on overlanding routes. So many sites are just long strung together SEO word salad.

[–] 1984@lemmy.today 4 points 10 months ago

I bet you get better results with Kagi. I don't see much crap in my results with it.

[–] paddirn@lemmy.world 57 points 10 months ago (1 children)
[–] wikibot@lemmy.world 62 points 10 months ago (5 children)

Here's the summary for the wikipedia article you mentioned in your comment:

The dead Internet theory is an online conspiracy theory that asserts that the Internet now consists mainly of bot activity and automatically generated content that is manipulated by algorithmic curation, marginalizing organic human activity. Proponents of the theory believe these bots are created intentionally to help manipulate algorithms and boost search results in order to ultimately manipulate consumers. Furthermore, some proponents of the theory accuse government agencies of using bots to manipulate public perception, stating "The U. S. government is engaging in an artificial intelligence powered gaslighting of the entire world population".

[–] BananaOnionJuice@lemmy.dbzer0.com 75 points 10 months ago (1 children)

Best time for a bot to reply.

[–] Asudox@lemmy.world 18 points 10 months ago* (last edited 10 months ago)
[–] riodoro1@lemmy.world 39 points 10 months ago

Fucking ironic

[–] stewsters@lemmy.world 24 points 10 months ago

Lol, read the room bot.

[–] Octopus1348@lemy.lol 13 points 10 months ago

WikiBot on Lemmy!

[–] robocall@lemmy.world 4 points 10 months ago
[–] Linssiili@sopuli.xyz 56 points 10 months ago* (last edited 10 months ago) (4 children)

Recently I was looking for info (in Finnish) on how to prevent car windows from fogging. I found a really weird website all about car windows, but it kept confusing car and house windows. It instructed readers to clean car windows by "opening the window and cleaning between the panels".

It was obviously AI-generated, but I couldn't figure out why. They weren't selling anything, and there were no ads and no links to other websites or services.

Edit: I found the site again, I cannot spot anything nefarious, but proceed with caution: https://www.lasinvaihto.fi/

[–] theluddite@lemmy.ml 55 points 10 months ago (4 children)

It's probably either waiting for approval to sell ads or was denied and they're adding more stuff. Google has a virtual monopoly on ads, and their approval process can take 1-2 weeks. Google's content policy basically demands that your site be full of generated trash to sell ads. I did a case study here, in which Google denied my popular and useful website for ads until I filled it with the lowest-quality generated trash imaginable. That might help clarify what's up.

[–] cashews_best_nut@lemmy.world 28 points 10 months ago

What an absolute ballbag Google is.

[–] Linssiili@sopuli.xyz 7 points 10 months ago (1 children)

The posts are from March 2023, and there are no ads yet :/

[–] theluddite@lemmy.ml 5 points 10 months ago* (last edited 10 months ago)

Dates could be made up, too. The blog posts that I generated for my site included made-up dates in the past. The Internet Archive says it has a snapshot for March of 2023, but when I click it, it says it doesn't, so I have no way of verifying. The theory about parking real estate hoping to sell it also seems pretty plausible to me. Who knows what dumb shit they're up to.

[–] aubertlone@lemmy.world 3 points 10 months ago

Hey man! I've read this article a few times, perhaps from other comments on Lemmy!

Thanks for the write-up. I'm a programmer myself.

Stuck in operations in my new job until we're done with the data center exit/ migration. Anyway cool beans, and very interesting article. Will keep all this in mind if any of my hobby projects take off.

[–] Lemminary@lemmy.world 3 points 10 months ago (1 children)

Instead of feeling defeated, like every other millennial that doesn't want to work,

That is one weird glib to throw in there.

[–] jdf038@mander.xyz 12 points 10 months ago

Perhaps parking a site for traffic and then using the enshittified data to sell it?

It makes me sick how dumb it sounds.

[–] crazyCat@sh.itjust.works 8 points 10 months ago (1 children)

People who care about SEO for their window-related businesses will pay the blog to link to them from there.

[–] Linssiili@sopuli.xyz 3 points 10 months ago

That would make sense. Also, the domain is really good (lasinvaihto.fi, translates to windscreenreplacement.fi). Maybe they are planning to sell the domain?

[–] aesthelete@lemmy.world 43 points 10 months ago* (last edited 10 months ago) (2 children)

For a time I thought this Fediverse thing would help or change things or something, but honestly...the Internet is just plain boring now...and it's pretty clear what is causing that: AI / SEO trash content, social media's rise, and commercialization of the Internet generally.

One day I was even feeling nostalgic so I went back to where I spent hours upon hours of my youth: EFNet on IRC...there was basically nobody there and of the few channels I saw some were even Trump-leaning weirdo "communities".

It's basically finished. I can't even find a decent place to procrastinate or hang out anymore on this POS. It's all just a giant ad surface and e-commerce portal. The fucking owners won.

[–] AMDIsOurLord@lemmy.ml 8 points 10 months ago (2 children)

EFNet is boomer shit. Most of IRC happens on other servers now, like LiberaChat, or on new protocols like Matrix.

We're still here, we're still alive

[–] Just_Pizza_Crust@lemmy.world 8 points 10 months ago* (last edited 10 months ago) (1 children)

The fucking owners won.

Always has been 🔫

That said, I would suggest smaller communities and private messaging. Find your niche and make it home.

[–] Jax@sh.itjust.works 1 points 10 months ago* (last edited 10 months ago)

Yep, it might have been hijacked by consumers but it's still a communication network.

[–] Jayu@lemm.ee 28 points 10 months ago (2 children)

The most annoying aspect of this is when you know actual information has to be out there, but it is being drowned out by dozens of sites reposting the less relevant and low quality information... And then you go to search in another language and you see substandard machine translations of all the garbage you were just fleeing, lol.

[–] TheRealKuni@lemmy.world 4 points 10 months ago* (last edited 10 months ago)

I was trying to find the radius of the corner of the iPad Pro. Not the screen, the actual device. No matter what I changed my search term to, all I could find was information about the screen corner (and how it isn't a true radius and blah blah blah) or AI-generated bullshit.

Eventually I gave up and changed the way I was tackling my project. I know the info is out there, people make cases for these things.

[–] ABCDE@lemmy.world 25 points 10 months ago (2 children)

Thanks, scientists, couldn't have known that without you.

[–] ForgotAboutDre@lemmy.world 30 points 10 months ago (1 children)

There is value in verifying and quantifying an opinion, even if you're sure the opinion is true.

[–] jaybone@lemmy.world 3 points 10 months ago (1 children)

Next up: scientists detect sarcasm.

[–] ABCDE@lemmy.world 1 points 10 months ago
[–] KingThrillgore@lemmy.ml 15 points 10 months ago

Turing tests solving turing tests solving turing tests

[–] maegul@lemmy.ml 11 points 10 months ago (1 children)

The whole webring idea needs to come back. Human curated recommendations of good resources and pages. So long as these pages remain in the control of humans and dedicated to curation and are decentralised, unlike the search engines, then they’ll be reliable.

Plug in some social and community organisation, perhaps like a wiki, and you could get even more out of it.

[–] Euphoma@lemmy.ml 4 points 10 months ago (1 children)

There are modern webrings. Dang the yesterweb webring shut down, that was a really good one.

Got any other reccos? I'm brand new to the concept

[–] BetaDoggo_@lemmy.world 9 points 10 months ago* (last edited 10 months ago) (1 children)

This isn't shocking at all. The markets for obscure language content are incredibly small so there's no incentive for most to spend resources on it. I'd argue mediocre machine translation is better than nothing at all in many cases, but for unsupervised training it does pose a challenge.

[–] xantoxis@lemmy.world 10 points 10 months ago* (last edited 10 months ago)

They didn't only look at low-resource languages; they just started there because that was the problem domain. They found that 57% of ALL sentences on the Internet appeared to be machine translated, including translations into high-resource languages. The remaining 43% might also be machine generated; it just wasn't found to be part of a multi-way parallel group.

[–] Falcon@lemmy.world 8 points 10 months ago

Translation is very different from generation.

As a matter of fact, even AI generation has different grades of quality.

SEO garbage is certainly not the same as an article with AI generated components and very different from a translated article.

[–] random_character_a@lemmy.world 4 points 10 months ago (1 children)

Too good not to be ruined by humanity

[–] laurelraven@lemmy.blahaj.zone 2 points 10 months ago (1 children)

In the beginning humanity was created. This has made many people very angry and has been widely regarded as a bad move.

Douglas Adams, probably...

[–] GilgameshCatBeard@lemmy.ca 3 points 10 months ago (1 children)

AI is going to fuck up everything we’ve ever done.

[–] stewsters@lemmy.world 2 points 10 months ago* (last edited 10 months ago)

We fucked it up on our own with SEO long before ChatGPT came along. Google has been going downhill for years as people learn to game the algorithm.

It will speed it along, sure, but the core problem is that it is profitable to dump garbage on the internet and put ads on it. The monetization is the root of this.
