1
27
submitted 1 year ago by archivist@lemmy.ml to c/datahoarder@lemmy.ml

@ray@lemmy.ml Got it done. I'm the first of the mods here and will be learning a little Lemmy over the next few weeks.

While everything is up in the air with the Reddit changes, I'll be very busy working on replacing the historical Pushshift API without Reddit's bastardizations, should a PS version come back.

In the meantime, you should all mirror this data to ensure its survival. Do what you do best and HOARD!!

https://the-eye.eu/redarcs/
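If you do grab a mirror, it's worth verifying your copy against whatever checksum list the host publishes before trusting it. A minimal sketch in Python (the expected-hash dict here is hypothetical; you'd build it from the mirror's actual checksum file):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large archives don't need to fit in RAM."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_mirror(root: Path, expected: dict[str, str]) -> list[str]:
    """Return the names of files that are missing or whose hash doesn't match."""
    bad = []
    for name, digest in expected.items():
        p = root / name
        if not p.is_file() or sha256_of(p) != digest:
            bad.append(name)
    return bad
```

Run it once after the download finishes and again before you rely on the copy years later; silent bit rot is exactly the failure this catches.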

2
18

Years ago I came across Filecoin/Sia decentralized data storage and started trying them, but then stopped due to lack of time. A few days ago I heard on a podcast about a kind of NAS that does roughly the same thing: it spreads chunks of data across devices owned by other users.

Is there a service that does this but with your own hardware, or, even better, something open source where you get X GB as long as you share the same amount of space plus something extra?

It would be great for backup.
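Under the hood, "spread chunks across other people's devices" is erasure coding. As a toy illustration (not how Sia or Filecoin actually implement it, they use Reed-Solomon codes that survive multiple losses), here is a single-parity sketch in the spirit of RAID 5:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_with_parity(data: bytes, n: int) -> list[bytes]:
    """Split data into n equal chunks plus one XOR parity chunk.
    Any ONE lost chunk can be rebuilt from the survivors."""
    size = -(-len(data) // n)           # ceiling division
    data = data.ljust(size * n, b"\0")  # pad; a real scheme records the true length
    chunks = [data[i * size:(i + 1) * size] for i in range(n)]
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor_bytes(parity, c)
    return chunks + [parity]

def rebuild(chunks):
    """Recover the single missing chunk (marked None) by XOR-ing the survivors."""
    missing = chunks.index(None)
    survivors = [c for c in chunks if c is not None]
    rec = survivors[0]
    for c in survivors[1:]:
        rec = xor_bytes(rec, c)
    out = list(chunks)
    out[missing] = rec
    return out
```

Each chunk would live on a different peer's hardware; losing any one peer loses nothing. Real systems tolerate losing many peers by adding more parity chunks.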

3
16
submitted 1 week ago by BlueKey@fedia.io to c/datahoarder@lemmy.ml

I'm entertaining the thought of writing my backups onto tape storage. So my question to this community is: does anyone know where (if anywhere) to get cheap and simple used tape drives? My requirements are just "writes & reads the data and is usable with a Linux machine".

Thanks for any hints.
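For the "just writes & reads" requirement: on Linux a tape drive typically shows up as a character device like /dev/st0 (or /dev/nst0 for the non-rewinding node), and plain tar can stream straight to it. A sketch of that workflow using Python's tarfile, with an ordinary file standing in for the device (the device paths are the usual convention; check what your drive actually registers as):

```python
import tarfile
from pathlib import Path

def write_backup(tape_device: str, paths: list[str]) -> None:
    """Stream files sequentially, tape-style. On a real drive, point
    tape_device at /dev/st0 (or /dev/nst0 to avoid auto-rewind)."""
    with tarfile.open(tape_device, "w") as tar:
        for p in paths:
            tar.add(p, arcname=Path(p).name)

def read_backup(tape_device: str, dest: str) -> list[str]:
    """Read the archive back and return the stored member names."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    with tarfile.open(tape_device, "r") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names
```

Tape is sequential, so there's no filesystem to mount by default (LTFS adds one, but plain tar keeps the drive requirements minimal).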

4
24
5
21
Digitizing notebooks (katiesonger.com)

[The guide isn't mine and I'm not affiliated with it, I'm just sharing a mind-blown moment for me.]

Over the years, I have gathered many notebooks that, admittedly, don't all contain very important information, and they take up a lot of space (possibly a cubic meter or more). But being kind of a (data)hoarder, I don't want to just throw them away. It's work that took years.

My solution: scanning them. My phone has a built-in camera scanner that does a surprisingly good job (it helps that the camera is pretty good too), so I have scanned thousands of pages so far. But the process is slow and takes a lot of manual labor (flipping pages, aligning pages, retaking bad photos, creating PDFs, etc.). A typical notebook (~120 pages) may take me 15 minutes or more.

So I thought that maybe I could speed up the process (partially at least) by either buying a scanner or paying someone to scan them (I don't have a proper scanner, yet). Removing the pages without damaging them is a challenge though. That's where the guide in the link comes in: it turns out it's very easy to remove the spiral spring from the notebooks! I was gonna pull the pages until I found that guide. I suppose it's also very easy to remove the staples from staple-bound notebooks too. I might just have "won" many hours of my life with this idea.

The video in the guide that helped me:

https://www.youtube.com/watch?v=lfMUVpwLZGM

(For the record, my Xiaomi 10 phone can scan items by creating ~20 MP images, which translates to typical-to-high resolutions if I scan A4 or A5 pages. Fortunately, many scanners can reach that quality. I just need them not to apply any weird effects or compression to the scanned document.)
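The megapixels-to-DPI conversion above is easy to sanity-check. Assuming the photo is framed to cover exactly the page area, the effective resolution works out like this:

```python
def effective_dpi(megapixels: float, width_in: float, height_in: float) -> float:
    """DPI you get if the sensor's pixels are spread evenly over the page.
    Assumes the frame covers exactly the page, with no wasted border."""
    pixels = megapixels * 1_000_000
    return (pixels / (width_in * height_in)) ** 0.5

# A4 is 8.27 x 11.69 inches
print(round(effective_dpi(20, 8.27, 11.69)))  # roughly 455 dpi
```

So a 20 MP frame over a full A4 page lands comfortably above the common 300 DPI archival baseline; for the smaller A5 it is higher still.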

6
5

Hi everyone,

I'm looking for a more efficient way to save and archive Lemmy comments and posts on my Android phone.
Currently, when I come across a comment I want to keep for future reference, I manually copy the text and link, then paste it into a note in my Obsidian vault. If there's an image or other media in the original post, I save and include that as well.

However, this process feels a bit cumbersome. Ideally, I’d like a way to quickly save or share a comment or post URL and automatically archive the top 20 or so comment chains, along with the original post, including any images, videos, or articles.

Has anyone found a streamlined method for doing this? I often find that by the time I return to check the responses or review the content, the post or article has disappeared. Any tips or tools that could help simplify this process would be greatly appreciated!

Thanks in advance for your suggestions!
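One semi-automated route: Lemmy instances expose an HTTP API, so a small script can fetch a post's top comments and render them as a markdown note for an Obsidian vault. A sketch along those lines; the endpoint and response shape are the v3 API as I understand it, and the instance/post id below are placeholders, so verify against your instance's API docs before relying on this:

```python
import json
import urllib.request

def fetch_top_comments(instance: str, post_id: int, limit: int = 20) -> list[dict]:
    """Ask a Lemmy instance for a post's top comments via the v3 HTTP API."""
    url = f"{instance}/api/v3/comment/list?post_id={post_id}&sort=Top&limit={limit}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["comments"]

def to_markdown(comment_views: list[dict]) -> str:
    """Render comment views as blockquoted markdown ready for an Obsidian note."""
    blocks = []
    for cv in comment_views:
        author = cv["creator"]["name"]
        body = cv["comment"]["content"]
        link = cv["comment"]["ap_id"]
        blocks.append(f"> {body}\n>\n> -- [{author}]({link})")
    return "\n\n".join(blocks)

if __name__ == "__main__":
    # Network call, so it only runs when executed directly; 123456 is a dummy id.
    print(to_markdown(fetch_top_comments("https://lemmy.ml", 123456)))
```

Saving the note at read time also solves the disappearing-content problem, since you capture the text before the post can be deleted.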

7
29
How do you handle backup? (programming.dev)

cross-posted from: https://lemmy.dbzer0.com/post/26278528

I'm running my media server with a 36 TB RAID 5 array across 3 disks, so I do have some resilience to drives failing. But currently I can only afford to lose a single drive at a time, which got me thinking about backups. Normally I'd just back up to my NAS, but that quickly gets ridiculous given the size of my library, which is significantly larger than my NAS storage of only a few TB. And buying cloud storage is much too expensive for my liking at these amounts.

Do you backup only the most valuable parts of your library?

8
99
submitted 2 weeks ago by Wojwo@lemmy.ml to c/datahoarder@lemmy.ml

I'm celebrating my datahoarding problem.

9
267
submitted 3 weeks ago by lars@lemmy.sdf.org to c/datahoarder@lemmy.ml
10
8

cross-posted from: https://beehaw.org/post/15404535

Data: https://archive.org/details/gamefaqs_txt

Mirror upload for faster download, 1 Mbit (expires in 30 days): https://ufile.io/f/r0tmt

GameFAQs at https://gamefaqs.gamespot.com hosts user-created FAQs and documents. Unfortunately they are baked into the HTML webpage and cannot be downloaded on their own. I have scraped a lot of pages and extracted those documents as regular TXT files. Because of the sheer amount of data, I focused on only a few systems.

In 2020, a Reddit user named "prograc" archived FAQs for all systems at https://archive.org/details/Gamespot_Gamefaqs_TXTs , so most of it is already preserved. I took a different approach to organizing the files and folders. Here are a few notes about my attempt:

  • only 17 selected systems are included, so it's incomplete
  • system folders use the long name instead of the short one, i.e. "Playstation" instead of "ps"
  • similarly, game titles use their full name with spaces, and a leading "The" is moved to the end of the name for sorting reasons, such as "King of Fighters 98, The"
  • in addition to the document id, the filename also contains the category (such as "Guide and Walkthrough"), the short system name such as "(GB)", and the author's name, e.g. "Guide and Walkthrough (SNES) by BSebby_6792.txt"
  • the FAQ documents contain an additional header taken from the HTML page, including a version number, the last update, the filename explained above, and a web address pointing to the original publication
  • HTML documents are also included with a very poor and simple conversion, but only the first page, so multi-page HTML FAQs are still incomplete
  • no zip archives or images are included; note that the 2020 archive from "prograc" contains falsely renamed .txt files that are in reality .zip and other files mistakenly included, such as nes/519689-metroid/faqs/519689-metroid-faqs-3058.txt; in my archive those files are correctly excluded
  • I included the same collection in an alternative arrangement, where games are listed without system folders; this has the side effect of removing duplicates (by system: 67,277 files vs. by title: 55,694 files), because the same document is linked under many systems and was therefore downloaded multiple times
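The cross-system duplicates described above (the same FAQ linked under many systems and downloaded once per system) can be found mechanically by grouping files on their content hash. A small sketch of that idea; the directory layout in the example is hypothetical:

```python
import hashlib
from pathlib import Path

def find_duplicates(root: str) -> dict[str, list[Path]]:
    """Group .txt files under root by SHA-256 of their content. Any group
    with more than one path is the same document saved under several
    system folders."""
    groups: dict[str, list[Path]] = {}
    for p in sorted(Path(root).rglob("*.txt")):
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        groups.setdefault(digest, []).append(p)
    return {h: ps for h, ps in groups.items() if len(ps) > 1}
```

This explains the file-count gap in the note above: collapsing each hash group to one file takes the by-system count down to the by-title count.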
11
12

Hey guys, so it seems that Linkwarden isn't as good as I was hoping, since some websites will throw up a cookie popup or some other screen that basically prevents the capture.

Firefox Screenshot seems to work well, but it saves a PNG, which isn't really text searchable.

FF's "save page as..." feature seems to break things when viewing them back.

Save to PDF is another option, and that seems to be decent.

I'm not looking to copy entire websites, but I like to save web pages for later reference (e.g. instructions/specs).

I use Synology Note Station, but they don't have a web clipper for Firefox...

I'm fine with using a folder structure to store files, despite not being totally ideal when compared to Linkwarden.

Does anyone have any other suggestions that perhaps I've missed? Nothing too complicated... ideally, as simple as a button click would be great.

12
24
13
13
submitted 1 month ago by Sprokes@jlai.lu to c/datahoarder@lemmy.ml

YouTube is cracking down on ad blockers, and they may stop working entirely within a year or so.

I don't watch YouTube that much, and most of the time I rewatch the same things. So I'm thinking of mirroring the videos I watch to other platforms, but I don't know which one. I was considering ok.ru, though I don't know whether they respond to DMCA requests.

Did anyone do something similar?
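A common alternative to re-uploading is keeping a local mirror with yt-dlp, which also has a Python API. A sketch using a few of its documented options (outtmpl, download_archive, writeinfojson); it requires `pip install yt-dlp`, and the helper below just parses yt-dlp's one-entry-per-line archive file:

```python
def mirror_videos(urls, dest="yt-mirror"):
    """Download videos for a local mirror using yt-dlp's Python API."""
    import yt_dlp  # imported lazily so the helper below works without it

    opts = {
        "outtmpl": f"{dest}/%(uploader)s/%(title)s [%(id)s].%(ext)s",
        "download_archive": f"{dest}/archive.txt",  # skip already-mirrored videos
        "writeinfojson": True,                      # keep metadata alongside the file
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        ydl.download(urls)

def already_mirrored(archive_file_text: str, video_id: str) -> bool:
    """Check yt-dlp's archive format: one 'extractor video_id' entry per line."""
    return any(line.split()[-1] == video_id
               for line in archive_file_text.splitlines() if line.strip())
```

The download_archive option is what makes repeated runs cheap: anything already in archive.txt is skipped, so you can re-run the same subscription list on a schedule.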

14
24

Running GParted gives me an error that says

fsyncing/closing /dev/sdb: input/output error

Using GNOME Disk Utility, under the assessment section it says

Disk is OK, one bad sector

Clicking to format it to ext4, I get a message that says

Error formatting volume

Error wiping device: Failed to probe the device 'dev/SDB' (udisks-error-quark, 0)

Running sudo smartctl -a /dev/sdb I get a few messages that all say

... SCSI error badly formed scsi parameters


On the physical side, I've swapped out the SATA data and power cables with the same results.


Any suggestions?

Amazon has a decent return policy so I'm not incredibly concerned, but if I can avoid that hassle it would be nice.

15
19

A few days ago I saw a post on the Reddit datahoarder community asking how to back up keys and other small files for a long time. It reminded me of a script I made some time ago to save my OTP secrets in case of loss of device or a reenactment of the Raivo OTP incident, so I decided to make it public on GitHub. I hope someone here finds it useful.

github.com/Leviticoh/weedcup

The density is not great, about 1 kB per A4 page, but it can recover from losing up to half of the printed surface, and if stored properly, paper should last a very long time.
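For a sense of what paper-backup encoding involves (this is not the linked script, just an illustrative stdlib sketch): encode the data as printable lines, each tagged with an index and a CRC32 so a mistyped or damaged line is caught when the backup is typed back in. Unlike the linked script's erasure coding, this sketch has no redundancy, so a destroyed line is flagged rather than rebuilt:

```python
import base64
import zlib

def to_paper_lines(data: bytes, width: int = 48) -> list[str]:
    """Encode data as printable lines: 'index crc32 base64-chunk'."""
    text = base64.b64encode(data).decode()
    lines = []
    for i in range(0, len(text), width):
        chunk = text[i:i + width]
        crc = zlib.crc32(chunk.encode())
        lines.append(f"{i // width:04d} {crc:08x} {chunk}")
    return lines

def from_paper_lines(lines: list[str]) -> bytes:
    """Decode, refusing any line whose checksum doesn't match what was typed."""
    chunks = []
    for line in lines:
        idx, crc, chunk = line.split(" ", 2)
        if zlib.crc32(chunk.encode()) != int(crc, 16):
            raise ValueError(f"line {idx} failed its checksum")
        chunks.append(chunk)
    return base64.b64decode("".join(chunks))
```

The per-line index also means the sheets can be retyped in any order and reassembled, which matters when a stack of pages gets shuffled over the years.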

16
30

Basically title!

I want to run it through my NAS to free up some space.

Thanks in advance.

17
8
submitted 1 month ago* (last edited 1 month ago) by andioop@programming.dev to c/datahoarder@lemmy.ml

I read something about once-reliable sites that would tell you the best [tech thing] no longer giving legit reviews and being paid to say good things about certain companies, but I don't remember where I read that or which sites it was, so I figured I'd bypass the issue and ask people here. I'm pretty new to anything near the level of complexity and technical detail that I see on datahoarder communities. I know about the 3-2-1 backup rule, and that's it. This is me trying to find something to hold copy 3 of my data.

18
39
submitted 1 month ago by evasync@lemmy.world to c/datahoarder@lemmy.ml

I want to buy a few hard drives for backups.

What is the most reliable option for longevity? I was looking at the WD Ae, which they claim is fit for this purpose, but knowing nothing about hard drives, I wouldn't know if that's just a marketing claim.

19
302
submitted 1 month ago by lars@lemmy.sdf.org to c/datahoarder@lemmy.ml

cross-posted from: https://lemmy.world/post/17689141

I'll just save them in this folder so that I can totally come back later and read them.

20
-3
submitted 1 month ago* (last edited 1 month ago) by velox_vulnus@lemmy.ml to c/datahoarder@lemmy.ml

They're all available on this Tumblr page. I'd appreciate it a lot.

Edit: I've fixed the Mediafire links:

21
107
22
23
submitted 2 months ago by Thavron@lemmy.ca to c/datahoarder@lemmy.ml
23
14

I was considering building a 30+ TB NAS to simplify and streamline my current setup, but because it's a relatively low priority for me, I'm wondering: is it worth holding off for a year or two?

I am unsure if prices have more or less plateaued and the difference won't be all that substantial. Maybe I should just wait for Black Friday.

For context, it seems two 16 TB HDDs would currently cost about $320.
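A quick back-of-the-envelope helps frame the wait-or-buy question. The 5%-per-year decline below is purely an assumed figure for illustration (consistent with the plateau worry, not a forecast):

```python
def price_per_tb(total_price: float, drives: int, tb_each: float) -> float:
    """Normalize a drive purchase to dollars per terabyte."""
    return total_price / (drives * tb_each)

now = price_per_tb(320, 2, 16)  # $10.00/TB at today's example price
# If prices only fall ~5%/year now that the curve has flattened,
# waiting two years saves roughly a dollar per TB:
later = now * 0.95 ** 2
print(f"${now:.2f}/TB now vs ~${later:.2f}/TB in two years")
```

If the real decline rate is that shallow, the savings from waiting are small next to the two years of not having the storage.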


Here's some related links:

  • This article by Our World in Data contains a chart of how the price per GB has decreased over time.

  • This article by Tom's Hardware talks about how SSD prices bottomed out in July 2023 before climbing back up, and predicted further increases in 2024.

24
16
Renewed drives (slrpnk.net)
submitted 2 months ago by greengnu@slrpnk.net to c/datahoarder@lemmy.ml

Are they worth considering or only worth it at certain price points?

25
148
submitted 3 months ago by xnx@slrpnk.net to c/datahoarder@lemmy.ml

cross-posted from: https://slrpnk.net/post/10273849

Vimm's Lair is getting removal notices from Nintendo and others. We need someone to help make a ROM pack archive. Can you help?

Vimm's Lair is starting to remove many ROMs at the request of Nintendo and others, so many original ROMs, hacks, and translations will soon be lost forever. Can any of you help make archive torrents of ROMs from Vimm's Lair and CDRomance? They have hacks and translations that don't exist anywhere else and will probably be removed soon, now that iOS emulation and retro handhelds are bringing so much attention to ROMs and these sites.


datahoarder

6591 readers

Who are we?

We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.

We are one. We are legion. And we're trying really hard not to forget.

-- 5-4-3-2-1-bang from this thread

founded 4 years ago
MODERATORS