this post was submitted on 21 Oct 2024
88 points (100.0% liked)

Technology

[–] t3rmit3@beehaw.org 46 points 3 weeks ago (3 children)

> Santa Clara County alone has 24 million property records, but the study team focused mostly on 5.2 million records from the period 1902 to 1980. The artificial intelligence model completed its review of those records in six days for $258, according to the Stanford study. A manual review would have taken five years at a cost of more than $1.4 million, the study estimated.

This is an awesome use of an LLM. Talk about the cost savings of automation, especially when the alternative was the reviews just not getting done.
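To put the quoted figures side by side, here is a rough back-of-the-envelope sketch. The numbers are the ones quoted from the study above; the variable names are only illustrative, and "more than $1.4 million" is treated as exactly $1.4 million, so the ratios are lower bounds at best.

```python
# Figures quoted from the Stanford study, as reported in the article.
records = 5_200_000
model_cost_usd = 258
model_days = 6
manual_cost_usd = 1_400_000   # "more than $1.4 million", taken as a floor
manual_years = 5

# Per-record cost for each approach.
cost_per_record_model = model_cost_usd / records
cost_per_record_manual = manual_cost_usd / records

# How many times cheaper and faster the model run was, roughly.
cost_ratio = manual_cost_usd / model_cost_usd
time_ratio = (manual_years * 365) / model_days

print(f"model:  ${cost_per_record_model:.6f} per record")
print(f"manual: ${cost_per_record_manual:.4f} per record")
print(f"cost ratio: about {cost_ratio:,.0f}x")
print(f"time ratio: about {time_ratio:,.0f}x")
```

On these figures the model run works out to a small fraction of a cent per record, against roughly 27 cents per record for the estimated manual review.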

[–] Killer_Tree@beehaw.org 36 points 3 weeks ago

Specialized LLMs trained for specific tasks can be immensely beneficial! I'm glad to see some of that happening instead of "Company XYZ is now needlessly adding AI to its products because buzzwords!"

[–] knightly@pawb.social 9 points 3 weeks ago (2 children)

Given the error rate of LLMs, it seems more like they wasted $258 and a week that could have been spent on a human review.

[–] OmnipotentEntity@beehaw.org 22 points 3 weeks ago

LLMs are bad for the uses they've been recently pushed for, yes. But this is legitimately a very good use of them. This is natural language processing, within a narrow scope with a specific intention. This is exactly what it can be good at. Even if it does have a high false negative rate, that's still thousands and thousands of true positive cases that were addressed quickly and cheaply, and that a human auditor no longer needs to touch.

[–] t3rmit3@beehaw.org 17 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

What do you believe would make this particular use prone to errors?

[–] knightly@pawb.social 3 points 3 weeks ago (2 children)

The use of LLMs instead of someone that can actually understand context.

[–] t3rmit3@beehaw.org 14 points 3 weeks ago (1 children)

I think you may have misunderstood the purpose of this tool.

It doesn't read the deeds, make a decision, and submit them for termination all on its own. It reads them, identifies racial covenants based on patterns of language (which is exactly what LLMs are very good at), and then flags them for a human to review.

This tool is not replacing jobs, because the whole point is that these reviews were never going to get the budget and manpower to be done manually, and instead would have simply remained on the books.

I get being disdainful or even angry about LLMs in our unregulated-capitalism anti-worker hellhole because of the way that most companies are using them, but tools aren't themselves good or bad, they're just tools. And using a tool to identify racial covenants in legal documents that otherwise would go un-remediated, seems like a pretty good use to me.
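The "flag it, then a human reviews it" workflow described above can be sketched roughly as follows. This is a minimal illustration and not the study's pipeline: `Deed`, `looks_like_covenant`, `triage`, and the phrase list are all hypothetical, and a crude phrase match stands in for the language model.

```python
from dataclasses import dataclass


@dataclass
class Deed:
    deed_id: str
    text: str


# Hypothetical stand-in for the model: a crude phrase match.
# The real tool uses a fine-tuned language model, not a phrase list.
SUSPECT_PHRASES = (
    "not of the caucasian race",
    "shall not be occupied by any person of",
)


def looks_like_covenant(text: str) -> bool:
    """Return True if the text contains any of the suspect phrases."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in SUSPECT_PHRASES)


def triage(deeds):
    """Split deeds into a human review queue and an unflagged remainder."""
    review_queue, unflagged = [], []
    for deed in deeds:
        if looks_like_covenant(deed.text):
            review_queue.append(deed)   # a person decides what happens next
        else:
            unflagged.append(deed)
    return review_queue, unflagged


deeds = [
    Deed("A-1", "No part of said property shall be occupied by any person not of the Caucasian race."),
    Deed("A-2", "The grantee shall maintain the fence along the northern boundary."),
]
review_queue, unflagged = triage(deeds)
print([d.deed_id for d in review_queue])
```

The point of the shape is that nothing is changed automatically: the classifier only decides which documents a person looks at first.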

[–] knightly@pawb.social 3 points 3 weeks ago (1 children)

So, what? They're going to pay a human to OK the output and the whole lot of them never even gets seen?

Say 12 minutes per covenant, that's 1 million work hours that humans could get paid for. Pay them $50 an hour and it's $50 million. That's nothing, less than 36 hours' worth of the $12.5 billion in weapons shipments we've sent to Israel in the last year. We could pay for projects like this with the rounding errors on the budget for blowing up foreign kids, and the people we pay to do it could afford to put their kids through college.

Instead, we get a project to train a robotic bigotry filter for real estate legalese and 50 more cruise missiles from the savings.
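The arithmetic in this comment can be checked with a short sketch. It assumes the 12 minutes apply to each of the 5.2 million records from the study (the comment does not say so explicitly) and takes the wage and the annual dollar figure as given in the comment; the variable names are illustrative.

```python
records = 5_200_000            # assumption: the records the study focused on
minutes_per_record = 12        # the commenter's assumption
hourly_wage_usd = 50           # the commenter's assumption
annual_figure_usd = 12.5e9     # annual figure cited in the comment

# Unrounded work hours and labor cost.
work_hours = records * minutes_per_record / 60
labor_cost_usd = work_hours * hourly_wage_usd

# Convert the annual figure to a per-hour rate, then express the
# labor cost as a number of hours at that rate.
figure_per_hour_usd = annual_figure_usd / (365 * 24)
equivalent_hours = labor_cost_usd / figure_per_hour_usd

# The comment rounds to 1 million hours and $50 million.
rounded_equivalent_hours = 50_000_000 / figure_per_hour_usd

print(f"work hours: {work_hours:,.0f}")
print(f"labor cost: ${labor_cost_usd:,.0f}")
print(f"equivalent hours (unrounded): {equivalent_hours:.1f}")
print(f"equivalent hours (rounded):   {rounded_equivalent_hours:.1f}")
```

Under these assumptions the unrounded total is about 1.04 million hours and $52 million. The "less than 36 hours" comparison holds for the rounded $50 million figure and comes out slightly above 36 hours for the unrounded one.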

[–] t3rmit3@beehaw.org 10 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

I think you are confused about the delineation between local and federal governments. It's not all one giant pool of tax money. None of Santa Clara County's budget goes to missiles.

Also, this feels like you are too capitalism-pilled, and rather than just spending the $258 to do this work, and using the remaining $49,999,742 to just fund free college or UBI programs, you're like, "how about we pay these people to do the most mind-numbingly, soul-crushingly boring work there is, reading old legal documents?"

You know what would actually happen if you did that? People would seriously read through them for 1 day, and then they'd be like, "clear", "clear", "clear" without looking at half of them. It's not like you're gonna find and fund another group to review the first group's work, after all. So you'd still be where we are now, but you'd also have wasted x people's time that they could have spent enjoying literally anything else.

[–] knightly@pawb.social 2 points 3 weeks ago (1 children)

> I think you are confused about the delineation between local and federal governments.

I am not, I simply don't believe the delineation is relevant since taxpayers fund both the state and federal budgets.

> Also, this feels like you are too capitalism-pilled

This is me being "reasonable" and working within the constraints of the system. If we aren't going to have free universal college et al then we can at least trade some of the bloated military budget for a public works program.

> People would seriously read through them for 1 day, and then they'd be like, "clear", "clear", "clear" without looking at half of them.

Sounds to me like a 50% improvement over zero human eyes.

> It's not like you're gonna find and fund another group to review the first group's work, after all.

Why not? We could hire three teams to do it simultaneously in every state in the country and the cost would still be a tiny fraction of how much was wasted on the F-35 program.

[–] howrar@lemmy.ca 6 points 3 weeks ago (1 children)

> Sounds to me like a 50% improvement over zero human eyes.

It certainly would be. Thankfully, there's many more than zero human eyes involved in this.

[–] knightly@pawb.social 1 points 3 weeks ago (1 children)
[–] howrar@lemmy.ca 7 points 3 weeks ago

Quickly filtering out a subset of them to prioritize so that we get the most value possible out of the time that humans spend on it.

[–] GetOffMyLan@programming.dev 5 points 3 weeks ago (1 children)

One of LLMs' main strengths over traditional text analysis tools is the ability to "understand" context.

They are bad at generating factual responses. They are amazing at analysing text.

[–] knightly@pawb.social 5 points 3 weeks ago (1 children)

LLMs neither understand nor analyze text. They are statistical models of the text they were trained on. A map of language.

And, like any map, they should not be confused for the territory they represent.

If you admit that they have issues with facts, why would you assume that the randomly generated facts their "analysis" produces must be accurate?

[–] GetOffMyLan@programming.dev 3 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

I mean they literally do analyze text. They're great at it. Give it some text and it will analyze it really well. I do it with code at work all the time.

Because they are two completely different tasks. Asking them to recall information from their training is a very bad use. Asking them to analyze information passed into them is what they are great at.

Give it a sample of code and it will very accurately analyse and explain it. Ask it to generate code and the results are wildly varied in accuracy.

I'm not assuming anything; you can literally go and use one right now and see.

[–] apotheotic@beehaw.org 7 points 3 weeks ago (1 children)

The person you're replying to is correct though. They do not understand, they do not analyse. They generate (roughly) the most statistically likely answer to your prompt, which may very well end up being text representing an accurate analysis. They might even be incredibly reliable at doing so. But this person is just pushing back against the idea of these models actually understanding or analysing. It's slightly pedantic, sure, but it's important to distinguish in the world of machine intelligence.

[–] GetOffMyLan@programming.dev 2 points 3 weeks ago (1 children)

I literally quoted the word for that exact reason. It just gets really tiring when you talk about AIs and someone always has to make this point. We all know they don't think or understand in the same way we do. No one gains anything by it being pointed out constantly.

[–] apotheotic@beehaw.org 4 points 3 weeks ago (2 children)

You said "they literally do analyze text" when that is not, literally, what they do.

And no, we don't "all know" that. Lay persons have no way of knowing whether AI products currently in use have any capacity for genuine understanding and reasoning, other than the fact that the promotional material uses words like "understanding", "reasoning", "thought process", and people talking about it use the same words. The language we choose to use is important!

[–] GetOffMyLan@programming.dev 4 points 3 weeks ago* (last edited 3 weeks ago) (2 children)

No it's not. It's pedantic and arguing semantics. It is essentially useless and a waste of everyone's time.

It applies a statistical model and returns an analysis.

I've never heard anyone argue when you say they used a computer to analyse it.

It's just the same AI bad bullshit and it's tiring in every single thread about them.

[–] apotheotic@beehaw.org 4 points 3 weeks ago

I never made any "AI bad" arguments (in fact, I said that they may be incredibly well suited to this). I just argued for the correct use of words, and you hallucinated.

[–] knightly@pawb.social 3 points 3 weeks ago (1 children)

LLMs aren't "bad" (ignoring, of course, the massive content theft necessary to train them), but they are being wildly misused.

"Analysis" is precisely one of those misuses. Grand Theft Autocomplete can't even count: ask it how many 'e's are in "elephant" and you'll get an answer anywhere from 1 to 3.

This is because they do not read or understand, they produce strings of tokens based on a statistical likelihood of what comes next. If prompted for an analysis they'll output something that looks like an analysis, but to determine whether it is accurate or not a human has to do the work.
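As a toy illustration of "producing tokens based on a statistical likelihood of what comes next", here is a tiny word-level bigram sampler. It is a deliberately simplified sketch: real LLMs are neural networks over subword tokens with far longer context, not lookup tables. The corpus and the names `following` and `next_word` are made up for the example.

```python
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat ate the fish".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1


def next_word(word, rng):
    """Sample a next word in proportion to how often it followed `word`."""
    counts = following[word]
    words = list(counts)
    weights = [counts[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(0)   # fixed seed so repeated runs match
word = "the"
generated = [word]
for _ in range(5):
    if not following[word]:   # dead end: nothing ever followed this word
        break
    word = next_word(word, rng)
    generated.append(word)

print(" ".join(generated))
```

Every step only asks which word tended to come next in the training text, so the output can look fluent without any check on whether it is true.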

[–] howrar@lemmy.ca 2 points 3 weeks ago (1 children)

LLMs cannot:

  • Tell fact from fiction
  • Accurately recall data from their training set
  • Count

LLMs can:

  • Translate
  • Get the general vibe of a text (sentiment analysis)
  • Generate plausible text

Semantics aside, they're very different skills that require different setups to accomplish. Just because counting is an easier task than analysing text for humans doesn't mean it's the same for an LLM. You can't use that as evidence for its inability to do the "harder" tasks.

[–] knightly@pawb.social 1 points 3 weeks ago* (last edited 3 weeks ago)

You forgot to put caveats on all the things you claim LLMs can do, but only one of them doesn't need them.

Why would you think that LLMs can do sentiment analysis when they have no concept of context or euphemism and are wholly incapable of distinguishing sarcasm from genuine sentiment?

Why would you think that their translations are of any use given the above?

https://www.inc.com/kit-eaton/mother-of-teen-who-died-by-suicide-sues-ai-startup/90994040

[–] Rivalarrival@lemmy.today 2 points 3 weeks ago (1 children)

The human capacity for reason is greatly overrated. The overwhelming majority of conversation is regurgitated thought, which is exactly what LLMs are designed to do.

[–] apotheotic@beehaw.org 2 points 3 weeks ago

I don't really dispute that but at least we are able to apply formal analytical methods with repeatable outcomes. LLMs might (and do) achieve a similar result but they do so without any formal approach that can be reviewed, which has its drawbacks.

[–] dan@upvote.au 4 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

Did you see something that said it was an LLM?

Edit: Indeed it's an LLM. They published the model here: https://huggingface.co/reglab-rrc/mistral-rrc

[–] howrar@lemmy.ca 2 points 3 weeks ago

Considering that it's a language task, LLMs exist, and the cost, it's a reasonable assumption. It'd be pretty silly to analyse a bag of words when you have tools you can use with minimal work with much better results. Even sillier to spend over $200 for something that can be run on a decade old machine in a few hours.
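For readers unfamiliar with the term, a "bag of words" just counts which words appear and discards their order. A minimal sketch, with a hypothetical `bag_of_words` helper and made-up sentences, shows why that loses context:

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count each word, ignoring word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


a = bag_of_words("The buyer may sell the lot, but the seller may not.")
b = bag_of_words("The seller may sell the lot, but the buyer may not.")

# The two sentences mean opposite things but have identical word counts.
print(a == b)
```

Because the two bags are identical, any method that sees only word counts cannot tell these sentences apart, which is the gap a model that reads words in sequence is meant to close.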

[–] Melody@lemmy.one 29 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

This is exactly the kind of task I'd expect AI to be useful for: it goes through a massive amount of freshly digitized data, scans for things a human has specified, and flags them for human action and/or review.

Basically AI doing data-processing drudge work that no human could ever hope to achieve with any level of speed approaching that at which the AI can do it.

Do I think the AI should be doing these tasks unsupervised? Absolutely not! But the fact of the matter is that the AIs are being supervised in this task by the human clerks, who are, at least in theory, expected to read the deed over and make sure it makes some sort of legal sense and that the AI didn't just cut out some harmless turn of phrase written into the covenant that actually has no racist meaning, intention or function. I'm assuming a lot of good faith here, but I'm guessing the human who is guiding the AI making these mass edits can simply pull out the original physical document and see which language originally existed if it became an issue.

To be clear, I do think it's a good thing that the law is mandating and making these kinds of edits to property covenants in general to bring them more in line with modern law.

[–] bane_killgrind@slrpnk.net 5 points 3 weeks ago

> didn’t just cut out some harmless turn of phrase written into the covenant that actually has no racist meaning

I gotta say, because of the nature of systemic racism, turns of phrase that are ambiguous or explicitly neutral can be prejudiced or discriminatory in different ways.

We can't rely on a statistical model to tell us what is infringing on rights. We have to be critical.

[–] Computerchairgeneral@fedia.io 20 points 3 weeks ago

This actually isn't a terrible use of an LLM. It's actually kind of refreshing to see a news story about a beneficial use of it in a very specific context.

[–] nickwitha_k@lemmy.sdf.org 9 points 3 weeks ago

Next, ban SFH HOAs.

[–] PhlubbaDubba@lemm.ee 6 points 3 weeks ago (1 children)

Could be a decent moderating tool too, since increasing layers of innuendo wouldn't be as likely to dodge a pattern-seeking algoriðm as ðey would be an underpaid, overworked, hand-sorting mod.

[–] BarryZuckerkorn@beehaw.org 5 points 3 weeks ago

> increasing layers of Innuendo

Well, also, these are documents written in the past, before 1948, when the Supreme Court invalidated the effect of racial covenants.

But the language remains, with no legal effect. But it's still there and should be eliminated. There's no cat and mouse game, just the need for cleanup of something left from the past.

[–] cotlovan@lemm.ee 1 points 3 weeks ago

I'm a bit confused. What do property records have to do with racist language?