14

I have been backing up to hard drives for a while, which I believe a lot of you do as well, but I am having trouble figuring out a better way to store them offsite. How do you handle that? Any policies or tips and tricks for storing backups offsite, mainly hard drives rather than tapes, would be welcome.

Thanks in advance.

[update] Thanks for mentioning online backup. We are actually in the middle of that process, and I 100% agree that it's the ultimate way to go. However, cost sometimes rules it out, as it gets quite expensive once you also consider the application level. I guess online backup would make a very good separate topic. :)

kentchen
  • 754

11 Answers

9

Depending on how much you need to back up I would recommend the following:

  1. JungleDisk / Amazon S3 - Works VERY well.

  2. RSYNC to a remote machine also works very well. CRON job every XX hours.
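As a rough illustration of the rsync-plus-cron option, here is a minimal Python sketch that assembles an rsync command line and the matching crontab entry. The paths, the host name `backup@offsite.example.com`, and both helper functions are hypothetical placeholders, not anything from the setup described here; only the standard `-a`, `-z`, `-e` and `--delete` rsync options are used.

```python
import shlex


def build_rsync_command(source_dir, remote_host, remote_dir, delete=False):
    """Return the argv list for a one-way rsync push over ssh."""
    # -a preserves permissions and timestamps, -z compresses data in transit.
    argv = ["rsync", "-az", "-e", "ssh"]
    if delete:
        # --delete removes remote files that no longer exist locally.
        argv.append("--delete")
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    argv.append(source_dir.rstrip("/") + "/")
    argv.append(f"{remote_host}:{remote_dir}")
    return argv


def crontab_line(argv, every_hours):
    """Format a crontab entry that runs argv at minute 0 every N hours."""
    if not 1 <= every_hours <= 23:
        raise ValueError("every_hours must be between 1 and 23")
    # Note: cron restarts the step at midnight, so intervals that do not
    # divide 24 evenly give one shorter gap per day.
    return f"0 */{every_hours} * * * {shlex.join(argv)}"


if __name__ == "__main__":
    # Hypothetical source path and remote target, for illustration only.
    cmd = build_rsync_command(
        "/srv/data", "backup@offsite.example.com", "/backups/data"
    )
    print(crontab_line(cmd, 6))
```

Leaving `--delete` off by default is deliberate in this sketch: an accidental deletion on the master would otherwise be mirrored to the offsite copy on the next run.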

We back up almost a TB of data to Amazon's S3 cloud and have a "warm standby" at our colo backing up from the master several times a day (via rsync). The cost for transfer/storage on Amazon S3 is extremely cheap (i.e. cheaper than burning to DVD, but not cheaper than backing up to HDD). I know some folks who simply plug a 1TB USB "My Book" or something similar into the server and back it up weekly/monthly. Depending on your needs, one or two of those might be the cheapest solution for you.

Now that's just talking about DATA backups. Several comments below talk about backing up the server itself...

Depending on your needs, Norton Ghost or even Acronis (http://www.acronis.com) might be of help to you. Things like Norton Ghost tend to rely on you being able to actually turn OFF the computer to make the backup. Some of us don't have that luxury, but if YOU do then Norton Ghost is a VERY good product.

KPWINC
  • 11,554
3

Don't stop at just backing up the data. We make regular Ghost images of our main servers and keep those offsite as well.

Bill B
  • 399
2

Bit of a luxury, I'll admit, but we live-live our SANs and back up at each site, with periodic tapes or disks going to an external company (such as IronMountain).

Chopper3
  • 101,808
2

It depends on the size/shape of your backup requirement, your technical ability and the frequency of data change...!

The simplest option is to hire another server from Rackspace (or another provider), VPN over to it and Robocopy your files. Write a script that does a simple Grandfather-Father-Son rotation and test, test, test... All of this can be automated.
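For anyone unsure what the Grandfather-Father-Son part of such a script might look like, here is a minimal Python sketch of the retention decision only (which backup dates to keep). The function name `gfs_keep` and the default counts of 7 daily, 4 weekly and 12 monthly backups are assumptions chosen for illustration; the copying itself (Robocopy or otherwise) is not shown.

```python
from datetime import date, timedelta


def gfs_keep(backup_dates, today, daily=7, weekly=4, monthly=12):
    """Return the subset of backup_dates that a simple GFS policy retains."""
    keep = set()
    # Newest first, ignoring anything dated in the future.
    dates = sorted({d for d in backup_dates if d <= today}, reverse=True)

    # Sons: every backup taken within the last `daily` days.
    for d in dates:
        if (today - d).days < daily:
            keep.add(d)

    # Fathers: the newest backup in each of the most recent `weekly` ISO weeks.
    weeks_seen = []
    for d in dates:
        week = tuple(d.isocalendar()[:2])  # (ISO year, ISO week number)
        if week not in weeks_seen:
            if len(weeks_seen) == weekly:
                break
            weeks_seen.append(week)
            keep.add(d)

    # Grandfathers: the newest backup in each of the most recent `monthly` months.
    months_seen = []
    for d in dates:
        month = (d.year, d.month)
        if month not in months_seen:
            if len(months_seen) == monthly:
                break
            months_seen.append(month)
            keep.add(d)

    return keep


if __name__ == "__main__":
    today = date(2024, 3, 31)
    backups = [today - timedelta(days=i) for i in range(60)]
    for kept in sorted(gfs_keep(backups, today)):
        print(kept.isoformat())
```

Anything not returned by `gfs_keep` is a candidate for deletion, which is exactly the step worth testing on a DR day before trusting it.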

As with all backups, it's important to do a DR day periodically, because you may have your data, but what if you forgot to keep a copy of the application you are running handy...!

It's often the simplest things that fail, NOT the most complex or the ones you think...

Good luck

Mike

2

Things to consider:

  1. Whose responsibility is it to take the backups off site, and who takes over if they're out/sick/on vacation/etc?
  2. How are you storing your HDDs? Padded containers? Climate controlled area?
  3. Who can get to your stored disks? If it is only one person, what happens if they get hit by a bus?
  4. When was the last time you tested restoring from one of these disks?
  5. When was the last time you tested EACH disk to make sure they were all still good? Media doesn't last forever.
  6. Is your rotation scheme and procedure documented so even Sandy from the mail room, or Dan from the reception desk, can rotate the media?
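One way to make the "test each disk" check repeatable is to record checksums when a disk is written and compare them whenever it comes back from storage. The following is only a sketch of that idea in Python; the function names and the manifest layout (relative path mapped to a SHA-256 hex digest) are assumptions for illustration, and passing this check is not a substitute for an actual test restore.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backup files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_against(manifest, root):
    """Return (missing, corrupted) lists of relative paths under root."""
    root = Path(root)
    missing, corrupted = [], []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            missing.append(rel)
        elif sha256_of(target) != expected:
            corrupted.append(rel)
    return missing, corrupted
```

A typical use would be to call `build_manifest` on the source right after the backup run, store the result somewhere other than the disk itself, and call `verify_against` with the disk's mount point on each rotation.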

Storing disks in someone's home is only a good idea if multiple people have keys to that home. There are of course companies which provide offsite media services, including pickup/delivery and management of who has access. This of course costs money, but not necessarily very much compared to the loss of your data. We use IronMountain, and I was shocked to discover how little it actually costs per month to have a container from them. We actually have 4 containers, 3 of which are off site at any given time.

Laura Thomas
  • 2,855
1

I guess this depends on how much data you are storing, but online backup is where it's at.

You don't have to worry about cycling through hard drives, having them stolen out of your car, or the other hazards of carrying your data around.

Online backup is getting cheap - currently I am paying 50 cents a gig. Backups are run once a night, and old versions are kept to a specification that I decide. It is all encrypted before it leaves our site.

Switching from tape based (or HD based) backups to online backups was one of the best decisions we ever made! Hopefully you are at an organization that will consider online backups.

Dave Drager
  • 8,455
1

i have backups coming from a couple of places [ countries in fact ] to one central server. backups are driven by backupninja; i use rdiff-backup, rsync and custom scripts.

central server keeps online 14days of history.

every morning [ after all data arrives ] i rsync the whole content of the online data to a usb-attached 1TB disk. during the day the content of the disk is verified [ at least the part done with rdiff-backup ] so i'm quite sure it can be recovered in the future. usb drives get rotated weekly and are stored 'away' from the server. data on the usb drives sits on an encrypted partition, so there is no need for secure storage.

this works fine for a reasonably small amount of data - in my case it's < 200GB of data, ~5GB of diff every 24 hours. if there is a need to restore - in 90% of cases i can do it from the online copy. if the data that needs to be recovered is older than 14 days - i can quickly fetch it from the offsite location.
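As a back-of-the-envelope check on those numbers, the Python sketch below estimates how much space a full copy plus a window of daily diffs needs. It assumes the history grows linearly (one full copy plus one diff per retained day) and treats 1TB as 1000GB; real rdiff-backup usage depends on how compressible the changes are, so treat this as an upper-bound style estimate, not a measurement. Both function names are made up for the example.

```python
def history_size_gb(full_gb, daily_diff_gb, days_kept):
    """Rough space needed for one full copy plus `days_kept` daily diffs."""
    return full_gb + daily_diff_gb * days_kept


def days_that_fit(disk_gb, full_gb, daily_diff_gb):
    """How many daily diffs fit on the disk next to one full copy."""
    if disk_gb < full_gb:
        return 0
    return int((disk_gb - full_gb) // daily_diff_gb)


if __name__ == "__main__":
    # Figures quoted above: < 200GB of data, ~5GB of diff per day, 14 days kept.
    print(history_size_gb(200, 5, 14))
    print(days_that_fit(1000, 200, 5))
```

Under these assumptions the 14-day history comes to roughly 270GB, which leaves plenty of headroom on a 1TB disk.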

pQd
  • 30,537
1

One method within the limits you set would be to find a nearby bank with safety-deposit boxes and put the drives in there. How close it is would be a trade-off between convenience and risk. The closer it is, the higher the chances that a disaster would also affect that location.

Kyle Brandt
  • 85,693
1

As Laura said, more requirements would help here; however, I can tell you what I use to handle my personal backups.

I have a backup arrangement with a good friend. They host an external USB drive on their workstation and I host one of theirs. We each have limited ssh access to each other's machines to push data to our remote drives. Before this I pushed data to a server I had access to with adequate space.

My backups are scheduled with cron and then performed by duplicity. I chose this tool because it supports many methods of moving your backup data (ssh, sftp, s3, local, ...). More importantly, you can use it to do encrypted backups. This is handy when you're dumping data to another location that you don't have as much control over.

1

In our operation, we have found it somewhat important to consider the purpose of the storage and pick the appropriate media depending on that purpose. As a recording and video production facility, we have terabytes of data which we have to move in and out of operation. We use an online, nearline, and offline thought process. The online stuff is of course on the local server. In our case I use nearline to refer to storage used for quick restores in the case of a server failure. This is usually sets of terabyte external hard drives stored both onsite and offsite. They can be plugged in quickly in order to rebuild a file server.

Where it gets interesting is offline storage. In our case this may be a video project which we know we may need to come back to in a year, but which does not need to be immediately online. We need an archival medium for large amounts of data. This is becoming more and more important in my industry, as many of the hi-definition cameras are shooting directly to banks of 16 or 32 GB P2 cards, so there is no tape or film media to go back to. The initial product is digital files. I know many production companies that are using firewire drives for this offline storage purpose. They copy the project to an external firewire drive and set it on a shelf.

However, we have had an abysmal failure rate on these drives. We had close to 20 of these drives at one point and have sent over a third of them back for repair at one point or another. After losing both our primary and secondary external drive backup for a project in the same week, we finally abandoned that concept and have returned to tape for long term storage. In our case LTO4.

To summarize, IMHO the media to use depends on the application and the longevity. We have tapes from over a decade ago that restore just fine. I am not convinced that a hard drive sitting on a shelf for ten years will necessarily come back though.

AudioDan
  • 398
0

We manage our backups like this:

  • 1 small backup machine
  • 1 RAID1-Array (1TB, Linux software RAID)
    • 1 internal HDD
    • 2 external HDDs (now: USB, future: eSATA)
  • md-Device is a LUKS-encrypted partition
  • mdadm, udev, UUIDs ... manage automatic array-resyncing when USB-Discs get (re-)attached
  • actual backups done with dirvish via ssh and curlftpfs
    • where needed: LVM-Snapshots, mysqldumps, whatever ... is done by dirvish pre-client scripts
    • passwordless SSH Key with rsync-validation script as allowed command
    • backup container is mounted just before beginning to back up and unmounted afterwards
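The "mount just before, unmount afterwards" step can be sketched as a Python context manager, so the unmount happens even if the backup job fails halfway. This is an illustrative assumption, not the setup described above: the `mounted` helper is hypothetical, it only wraps plain `mount` and `umount` calls, and opening the LUKS container beforehand is left out.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def mounted(device, mount_point, run=subprocess.run):
    """Mount device for the duration of the with-block, then unmount it.

    `run` is injectable so the logic can be exercised without root;
    by default it is subprocess.run and needs sufficient privileges.
    """
    run(["mount", device, mount_point], check=True)
    try:
        yield mount_point
    finally:
        # Always unmount, even when the backup inside the block raised.
        run(["umount", mount_point], check=True)
```

A backup job would then wrap its work as `with mounted(device, mount_point): ...`, which keeps the encrypted volume offline whenever no backup is running.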

Pros/Cons:

[+] we always have a 'good copy' of the whole data on the internal HDD (which is closely monitored)

[+] you can pick any of the external HDDs and take it home for off-site backup

[+] if you're not going home straight after work and lose the backup-disc in a bar, the data should be safe because of the encrypted container

[-] every time a disc gets (re-)attached, the whole disc has to get synced from the other(s)

[-] if the backup gets bigger than one single disc, it gets much more complicated (you could span an LV across multiple discs and use those as md-devices - but then you always have to pick the 2+ discs belonging together)

m.sr
  • 1,060