this post was submitted on 18 Mar 2024
I'm trying to do that, but all of the newer drives I have are in use in machines, while the ones that aren't connected to anything are old 80 GB IDE drives, so they aren't really practical for backing up 1 TB of data.
For the most part I've kept myself from making the same mistake again by adding a 1 GB swap partition at the beginning of the disk, so it doesn't immediately kill the partition if I mess up again.
It's possible to make that work through discipline and mechanism.
You'd need around 12 of them, but if you carve your data into <80GB chunks, you could store each chunk on a separate scrap drive and thereby back up 1TB of data.
Individual files >80GB are a bit more tricky but can also be handled by splitting them into parts.
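A minimal sketch of the split-and-reassemble idea, assuming GNU coreutils (`split -d`, `sha256sum`). The file name `big.bin` and the tiny 1 MiB part size are made up so the example runs quickly; for real 80GB drives you'd pick a part size with some headroom, e.g. `-b 70G`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for a file that is too big for one drive (3 MiB, hypothetical name).
head -c 3145728 /dev/urandom > big.bin

# Record a checksum of the whole file before splitting.
sha256sum big.bin > big.bin.sha256

# Split into parts of at most 1 MiB: big.bin.part00, big.bin.part01, ...
split -b 1M -d big.bin big.bin.part

# Restore: concatenate the parts in order, then verify against the checksum.
mkdir restore
cat big.bin.part* > restore/big.bin
(cd restore && sha256sum -c ../big.bin.sha256)
```

Each part would go onto a different drive; keeping the checksum file next to the index lets you confirm that a reassembled file is intact.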
What such a system requires is rigorous documentation of where stuff is: an index. I use git-annex for this purpose, which comes with many mechanisms to aid this sort of setup, but it's quite a beast in terms of complexity. With discipline, you could do every important thing it does manually without unreasonable effort.
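As a rough idea of what a manual index could look like (this is not how git-annex works internally, just a hand-rolled stand-in): one tab-separated line per file with checksum, drive label and relative path. The function name `index_drive`, the labels and the file layout are all made up for illustration, and it assumes GNU find/xargs and file names without newlines or backslashes.

```shell
#!/usr/bin/env bash
set -euo pipefail

# index_drive LABEL MOUNTPOINT INDEXFILE
# Appends one line per file on the drive: <sha256> TAB <label> TAB <relative path>
index_drive() {
    local label=$1 mountpoint=$2 index=$3
    (
        cd "$mountpoint"
        find . -type f -print0 | sort -z | xargs -0 -r sha256sum
    ) | while read -r sum path; do
        printf '%s\t%s\t%s\n' "$sum" "$label" "${path#./}"
    done >> "$index"
}

# Demo with two directories standing in for mounted scrap drives.
demo=$(mktemp -d)
mkdir -p "$demo/drive01/photos" "$demo/drive02"
echo "holiday" > "$demo/drive01/photos/a.txt"
echo "taxes" > "$demo/drive02/b.txt"

index="$demo/index.tsv"
index_drive scrap-01 "$demo/drive01" "$index"
index_drive scrap-02 "$demo/drive02" "$index"
cat "$index"
```

You'd physically label each drive with the same name you put in the index and keep the index file itself somewhere that isn't one of the backup drives.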
Another good practice is to attempt any changes on a test model. You'd create a sparse test image (`truncate -s 1TB disk.img`), mount it via loopback and apply the same partition and filesystem layout that your actual disk has. Then you first attempt any changes you plan to make on that loopback device and verify that its filesystems still work.
The problem is that I didn't mean to write to the HDD, but to a USB stick; I typed the wrong letter out of habit from the old PC.
As for the hard drives, I'm already trying to do that; for bigger files I just break them up with split. I'm just waiting until I have enough disks to do that.
For that issue, I recommend never using unstable device names and always using `/dev/disk/by-id/`.
I'd highly recommend starting to back up the most important data ASAP rather than waiting until you're able to back up all of it.
Or mount it in RAID0/whatever the zfs equivalent is.
The downside compared to one disk is that many disks mean more possible points of failure, any of which takes out the whole array, so ideally another RAID would be best.
That would require all of those disks to be connected at once which is a logistical nightmare. It would be hard with modern drives already but also consider that we're talking IDE drives here; it's hard enough to connect one of them to a modern system, let alone 12 simultaneously.
With an index, you also gain the ability to lose and restore partial data. With a RAID array it's all or nothing, and you have to waste a bunch of space to be able to restore everything at once. Using an index, you can simply check which data was lost and prepare another copy of that data on a spare drive.
I'm just talking prebuilt solutions here, but how would you use an index'd storage base if the drives weren't connected? Sounds like that's an issue regardless
Note that all of this is in the context of backups: duplicates for the purpose of restoring the originals in case something happens to them. Though it is at least possible to use an indexed cold storage system like the one I describe for more frequent access, I would find that very inconvenient for "hot" data.
You look up in your index where the data you need is located, connect to that single location (i.e. plug in a drive) and then copy it back to the place it went missing from.
The difference is that, with an Index, you gain granularity. If you only need file A, you don't need to connect all 12 backup drives, just the one that has file A on it.
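To make the lookup step concrete, here's a small sketch assuming a hand-rolled index file with one tab-separated line per file (checksum, drive label, relative path). That format, the `find_in_index` name, the drive labels and the placeholder checksums are all made up for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# find_in_index INDEXFILE PATH
# Prints the label of every drive that holds a copy of PATH.
find_in_index() {
    local index=$1 wanted=$2
    awk -F'\t' -v p="$wanted" '$3 == p { print $2 }' "$index"
}

# Tiny sample index; the checksums are placeholders, not real hashes.
index=$(mktemp)
printf '%s\t%s\t%s\n' \
    placeholder-sum-1 scrap-01 photos/a.txt \
    placeholder-sum-2 scrap-02 docs/b.txt \
    placeholder-sum-1 scrap-07 photos/a.txt > "$index"

# Which drives do I need to plug in to get photos/a.txt back?
drives=$(find_in_index "$index" photos/a.txt)
echo "$drives"
```

Here only scrap-01 or scrap-07 needs to be connected; the other drives can stay in the drawer.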