this post was submitted on 05 Jun 2024
Technology
There is a huge corporate incentive here that everyone is overlooking. With screen recording + OCR, there is a real possibility of using this data to replace some labor-intensive but simple tasks of running a business. If you can build an RPA+ML+LLM pipeline that can rerun repetitive tasks, you have the holy grail on your hands. I think this is one of the big reasons why M$ is pushing this.
I expect to be downvoted to oblivion, but I do business automation and integration for a living, and I am scared and excited at the same time.
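For the curious, here's a minimal sketch of the kind of loop I mean: capture the screen, OCR it, and let an LLM pick the next action. The `ask_llm` and `perform` helpers are hypothetical placeholders, not any real product's API:

```python
# Hypothetical capture -> OCR -> decide -> act loop.
# Requires: pip install pillow pytesseract (plus a Tesseract install)
from PIL import ImageGrab
import pytesseract

def ask_llm(prompt: str) -> str:
    """Placeholder: send the prompt to whatever LLM you use and return its reply."""
    raise NotImplementedError

def perform(action: str) -> None:
    """Placeholder: turn the suggested action into clicks/keystrokes (e.g. via an RPA tool)."""
    raise NotImplementedError

def automation_step(task: str) -> None:
    screenshot = ImageGrab.grab()                           # capture the current screen
    screen_text = pytesseract.image_to_string(screenshot)   # OCR it into plain text
    action = ask_llm(
        f"Task: {task}\nVisible screen text:\n{screen_text}\n"
        "What single UI action should be taken next?"
    )
    perform(action)                                          # replay the suggested action
```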
Absolutely. Corporations - at least the shitty ones (most of them) - are salivating at the prospect of using this. They want to be able to see and easily summarize eeeeeeverything you're doing.
Some are already using a form of this. It's not hypothetical - it's happening right now, and many want way, way more.
Lmao do you have any idea how quickly that’s going to go off the rails? They’re going to get into a hallucination feedback loop, which will destroy the integrity of their systems and processes, and they’ll richly deserve it.
At any rate, most highly-effective technical teams have already automated the shit out of all their rote operations without using ML.
Automation suites exist, and they are very much tuned to individual apps. It seems that giving an ML model an OCR readout of a page is not enough for it to know what it should do (accurately). We have had a training set for "booking flights in a browser" for about 6 years now, and no one has figured out how to use it to disrupt automated testing: https://miniwob.farama.org/
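To illustrate what "tuned to individual apps" means in practice: a conventional automation/test script hard-codes selectors for one specific page layout. The URL and selectors below are made up for a hypothetical booking page, just to show the contrast with a flat OCR dump:

```python
# Conventional UI automation: every selector is specific to one app's DOM.
# Requires: pip install selenium (plus a matching browser driver)
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/book-flight")   # hypothetical booking page

# These selectors only work for this exact page layout:
driver.find_element(By.ID, "origin").send_keys("HEL")
driver.find_element(By.ID, "destination").send_keys("BER")
driver.find_element(By.CSS_SELECTOR, "button.search").click()

# An OCR readout of the same screen is just flat text - no element IDs,
# no structure to act on - which is why "read the pixels" alone isn't enough.
driver.quit()
```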
I was thinking about this, but I don't know what the plan is for annotating new flows with descriptions of the actions. There's no point in learning how to send an email or open a webpage; that's already easy. The value is in a database of uncommon interactions, but it's only valuable if there is a description to train on.
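Something like this is what I imagine an annotated flow record would need to look like for training to be worthwhile. The structure and field names are just my guess, not anything the product actually stores:

```python
# Hypothetical schema for one recorded, annotated interaction flow.
from dataclasses import dataclass, field

@dataclass
class Step:
    screen_text: str   # OCR dump of the screen before the action
    action: str        # e.g. 'click "Submit expense"', 'type "42.50" into Amount'

@dataclass
class AnnotatedFlow:
    description: str   # natural-language label to train against,
                       # e.g. "file a travel expense report in the internal ERP"
    app: str           # which application the flow belongs to
    steps: list[Step] = field(default_factory=list)

# Without `description`, a pile of recorded steps is just noise;
# the annotation is what makes an uncommon flow worth learning from.
```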