submitted 10 months ago by Meuzzin@lemmy.world to c/selfhosted@lemmy.world

Heyas, wondering if there's an open-source piece of software, or the like, that could scrape media platforms for a specific topic: platforms like YT, X, Lemmy, news media, etc., perhaps using RSS? Ideally a program I can host on my own server that only I have access to, via web page, CLI, whatever...

Thanks for any info...

[-] november@iusearchlinux.fyi 24 points 10 months ago* (last edited 10 months ago)

FreshRSS has been working great for me! It even has the ability for web scraping if you need it.

[-] Meuzzin@lemmy.world 3 points 10 months ago

Right when I saw your reply, I saw a post about it. Digging into it now. Thanks!

[-] charles@lemmy.ca 1 points 10 months ago* (last edited 10 months ago)

Seconding the recommendation for FreshRSS. It's the one I ended up hosting when I looked into this a while back, and it's been really great. It takes a minute to get everything set up, especially if you want different settings for different types of feeds, but once it's all set it's perfect (for my needs at least).

I've also got it set up with my domain so I can access the feed from anywhere, and that's been one of my favourite features.

[-] anzo@programming.dev 16 points 10 months ago* (last edited 10 months ago)

Everyone is suggesting readers. I think you are looking for something like https://docs.rsshub.app. It's capable of generating RSS feeds from pretty much everything.

[-] Meuzzin@lemmy.world 7 points 10 months ago

This is it. Exactly what I was looking for. Thanks much!

[-] pipe01@programming.dev 11 points 10 months ago

I use Miniflux; it's a lightweight RSS reader.

[-] Meuzzin@lemmy.world 2 points 10 months ago

That looks great as well. I like that I can integrate with my own domain.

[-] Father_Redbeard@lemmy.ml 1 points 10 months ago

And they just added Omnivore integration, which I'm so excited for.

[-] namelivia@lemmy.world 1 points 10 months ago

Oh, this looks nice! I need to try this!

[-] ComradeMiao@lemmy.world 9 points 10 months ago

FreshRSS is really great!

[-] Urist@lemmy.ml 2 points 10 months ago

I would recommend Miniflux, a "minimalist and opinionated feed reader". It is great on mobile and desktop, and dead simple to set up and use.

[-] i_am_not_a_robot@discuss.tchncs.de 1 points 10 months ago

YouTube has RSS feeds you can access without scraping, but they're per channel, so if you follow a lot of channels you'll be following a lot of RSS feeds.

Lemmy also has RSS feeds for each community.
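For anyone who wants to build those feed URLs by hand, here is a minimal sketch in Python. The URL patterns are the ones I understand YouTube and Lemmy to use, so treat them as assumptions and check them against your own instance; the function names are made up for illustration.

```python
from urllib.parse import urlencode


def youtube_channel_feed(channel_id: str) -> str:
    """Build the per-channel feed URL (assumed pattern: /feeds/videos.xml?channel_id=...)."""
    return "https://www.youtube.com/feeds/videos.xml?" + urlencode({"channel_id": channel_id})


def lemmy_community_feed(instance: str, community: str) -> str:
    """Build a community feed URL (assumed pattern: /feeds/c/<community>.xml)."""
    return f"https://{instance}/feeds/c/{community}.xml"


if __name__ == "__main__":
    # "UC_EXAMPLE_ID" is a placeholder, not a real channel ID.
    print(youtube_channel_feed("UC_EXAMPLE_ID"))
    print(lemmy_community_feed("lemmy.world", "selfhosted"))
```

Either URL can then be pasted into whichever reader you end up hosting.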

Are you looking for a reader instead? A reader aggregates the feeds and displays them. Usually it keeps track of which items you've already read.
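To make the "reader" idea concrete, here is a rough sketch of the core loop, assuming a plain RSS 2.0 document with `<item>` elements (Atom feeds, which use `<entry>`, would need different handling). It is not how FreshRSS or Miniflux work internally; `new_items`, `seen`, and `topic` are hypothetical names, and fetching the XML is left out.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def new_items(rss_xml: str, seen: set, topic: Optional[str] = None) -> list:
    """Return unread items from an RSS 2.0 string, optionally filtered by topic.

    `seen` holds the IDs of items already processed and is updated in place,
    which is how a reader remembers what you have already been shown.
    """
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Fall back to the link when the item carries no <guid>.
        item_id = item.findtext("guid") or link
        if item_id in seen:
            continue
        # Every new item is marked as seen, even if the topic filter drops it.
        seen.add(item_id)
        # Simple case-insensitive match on the title only.
        if topic and topic.lower() not in title.lower():
            continue
        fresh.append({"title": title, "link": link})
    return fresh
```

Calling it twice with the same `seen` set returns nothing the second time, which is the "already read" behaviour in miniature. A real reader would persist that set in a database rather than in memory.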

[-] tsl@lemmy.stefanoprenna.com 1 points 10 months ago

I use RSS-Bridge for scraping sites that don't offer RSS feeds: https://rss-bridge.github.io/rss-bridge/index.html

[-] virtueisdead@lemmy.dbzer0.com 1 points 10 months ago

Seconded. The built-in custom CSS selection is excellent. I've been strongly considering self-hosting an RSS bridge, but I think my server has too much unpredictable downtime for it.

[-] AchtungDrempels@lemmy.world 0 points 10 months ago

Jumping in to ask if there'd be a good reason to use a standalone feed reader instead of the Nextcloud "News" app?

this post was submitted on 21 Dec 2023