Originally published on LinkedIn.
By Todd Brous, President at Untwist, Inc.
Over the past 10 years, I have observed that many people use Hard Drives for Archiving their data. In general, this is fine as long as one understands the technical limitations and tradeoffs of using hard drives. The big questions are:
- Will “Stiction” prevent the drive from spinning up?
- Will the electronic components still function over time?
- Will any of the data deteriorate due to Data Degradation or “bit rot”?
It is important to remember that hard drives were not originally designed for sitting on a shelf. They are supposed to be powered up and spinning, so using a hard drive for cold archival storage introduces some concerns.
Why Use Internal Hard Drives for Data Archiving?
Hard drives are inexpensive, high capacity, and provide random access to stored data. The significant issue is data longevity. In a nutshell, if you need to store data on a shelf for more than 1-3 years, then you should consider investing in LTO Tape, Cloud archiving options, or newer Archival Hard Drives.
In addition, you can save up to 50% by switching from external hard drives to an internal drive solution for archiving. You no longer need to pay for the enclosure, the power supply, the cables, or the extra shelf space, and you will no longer suffer when someone inevitably loses a unique power adapter. All you need are the guts that actually store the data.
So, how does this work?
First You Need a Hard Drive Docking Station
Search for “Hard Drive Dock” on Amazon.com. There are lots of options out there, and most are very inexpensive. Docks that support multiple hard drives and automatically perform full “drive cloning” are now available. You can connect a dock to a computer via USB, eSATA, or Thunderbolt. If a dock breaks, then it is super easy and inexpensive to replace. A nice feature is that most Drive Docks can read or write to any SATA drive, and should work with original SATA v1.0 drives that were manufactured in 2004. That is a very long backwards compatibility timeline. (See LTO Side Note at the end of this article.)
Purchase Some Bare Internal Hard Drives
Search Amazon for “internal hard drive”. Western Digital, Seagate, Samsung, Hitachi (now owned by Western Digital), and Toshiba are the main manufacturers. You do not need RAID-class drives. These are not the drives you are looking for.
If your budget allows, and your data is very important, then I highly suggest taking a look at the new Seagate Archive Hard Disk Drives. These are perfect for long term Cold storage. Unfortunately, these can be a bit expensive but are well worth the investment.
I have worked with many customers who use “desktop series” drives for archiving because they are very inexpensive. The issue is that these are not rated for long-term data archiving. As mentioned earlier, desktop series drives are good for 1-3 years sitting on a shelf. If your data retention needs extend beyond that time frame, then consider investing in LTO Tape, an appropriate Cloud based archive solution, or the Archive series drives.
Also, do not forget that every hard drive (or tape drive) ever manufactured will eventually fail. Thus, it is important to make multiple copies of your data and store it in multiple locations.
Making an Archive
You will need to format the hard drive for your operating system, and you will need to remember to properly unmount and eject the drive when you are finished. Once your drive is formatted and mounted on your system you are ready to go.
The best way to perform a data archive depends on the specific needs of each situation, but one of the easiest ways is to use a Cloning Utility to do the work for you. You could simply do a click-and-drag copy of your data, but this is not a recommended practice because a plain copy does not verify the result. PLEASE, Please, please, make sure that your data has been VERIFIED, and that you are 100% sure that your clone is a complete duplicate of the original data. Double check that you can restore your data, because I have seen people forget to make sure that their Archive worked.
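One way to check that a copy really is a complete duplicate is to compare a checksum of every file in the source against the same file on the archive drive. The following is only a minimal Python sketch of that idea, not a replacement for a proper cloning utility. The function names (`sha256_of`, `build_manifest`, `verify_copy`) are made up for illustration, and SHA-256 is an assumption; any strong checksum would do.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(source_root, archive_root):
    """Compare two folder trees; return (missing, mismatched, extra) paths."""
    source = build_manifest(source_root)
    archive = build_manifest(archive_root)
    missing = sorted(set(source) - set(archive))   # in source, not on archive
    extra = sorted(set(archive) - set(source))     # on archive, not in source
    mismatched = sorted(
        name for name in source.keys() & archive.keys()
        if source[name] != archive[name]           # same name, different bytes
    )
    return missing, mismatched, extra
```

Calling `verify_copy("/path/to/project", "/path/to/archive/project")` with your own paths (these two are placeholders) should return three empty lists when the clone is complete. Anything else means the archive is not finished yet.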
Now, make a second copy. If your data does not exist in at least two places, then it does not exist. One copy of your data is not enough.
You may be thinking about how to organize multiple projects onto a single hard drive to use all available disk space. This becomes a numbers game that, unless you are using some backup software to manage everything, often requires keeping track of each individual archive’s data usage. To avoid this entirely, I have seen some clients place one project per hard drive. Hard Drives come in multiple sizes, so using an appropriately sized drive for each job is relatively easy to do.
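If you do want to fill drives with several projects, the numbers game can be sketched in a few lines of Python. This is only an illustration using a simple "first-fit decreasing" approach (place the largest projects first, each on the first drive with room). The function name `pack_projects`, the project names, the sizes, and the 3700 GB usable capacity are all made-up assumptions, not measurements.

```python
def pack_projects(project_sizes_gb, drive_capacity_gb):
    """Assign projects to drives, largest project first (first-fit decreasing).

    Returns a list of drives; each drive is a dict with the remaining
    free space in GB and the list of project names placed on it.
    """
    drives = []
    largest_first = sorted(
        project_sizes_gb.items(), key=lambda item: item[1], reverse=True
    )
    for name, size in largest_first:
        if size > drive_capacity_gb:
            raise ValueError(f"{name} ({size} GB) does not fit on one drive")
        for drive in drives:
            if drive["free"] >= size:        # first drive with enough room
                drive["projects"].append(name)
                drive["free"] -= size
                break
        else:                                # no existing drive had room
            drives.append({"free": drive_capacity_gb - size, "projects": [name]})
    return drives


# Hypothetical example: four projects, drives with 3700 GB usable each.
example = pack_projects(
    {"commercial_a": 1800, "feature_b": 2500, "promo_c": 900, "doc_d": 1200},
    drive_capacity_gb=3700,
)
for number, drive in enumerate(example, start=1):
    print(number, drive["projects"], "free:", drive["free"])
# 1 ['feature_b', 'doc_d'] free: 0
# 2 ['commercial_a', 'promo_c'] free: 1000
```

This does not guarantee the fewest possible drives, and in practice you would leave some headroom on each drive. One project per drive remains the simpler bookkeeping choice.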
The Hudzee Case
Now that you have completed your data archive, where do you put the Bare Internal Hard Drive? Introducing the Hudzee Case. The Hudzee is a protective antistatic case with a secure latch that is specifically designed for storing and organizing bare internal hard drives.
The Hudzee Case comes with a reversible paper label insert to record the contents of your Archive. It is a good idea to include the date, the software and version number used to make the archive, and any other pertinent data. Also, record information on the spine and the front of the label. This way it is easy to find a drive while looking at them on a shelf.
Place your hard drive with the electronics facing down inside the Hudzee Case. The antistatic foam and mylar are specifically designed to protect the hard drive and the electronic components.
Store everything in an environmentally controlled space. Keep your data archives away from magnetic fields and power lines. Do not let them fall over or drop.
(Full disclosure: I make the Hudzee Hard Drive Case, so I am biased. There are alternatives on the market, and they will certainly work for you as well, but I sincerely hope that you will choose to buy a high quality Hudzee. Also, I can get you a discount if you email me: email@example.com.)
Proper Hard Drive Care
Do not touch, lick, or chew the electronics on the Hard Drive. Do not drop, bump, or shake the Hard Drive. Do not allow internal hard drives to touch each other. Handle the drives in an antistatic environment, and use a grounding wrist strap. If you are going to label the hard drive itself, place the label close to the drive’s edge. Do NOT cover the “breather hole” or cover the drive information label. Do not write on the top cover of the hard drive. Do not squeeze or press on the hard drive.
Do not wash or rinse the Hard Drive with soap or water.
I have to stress how important it is to properly care for hard drives. I have seen people accidentally drop a hard drive, resulting in unreadable data. Hard drives are sensitive components, and they do not react well to bullets.
Consider numbering all of your archives, and using a spreadsheet or database to keep track of everything. A simple Google Spreadsheet or Microsoft Office 365 Excel document can work. Bonus: the document gets stored in the cloud, and can be shared with others.
The amount of time it takes to complete your data archive unfortunately depends on entirely too many variables. These are some of the questions I would ask a client while designing a hard-drive-based archiving pipeline:
- How much data are you archiving?
- How much time do we have to complete a data archive?
- How fast is the hard drive that you are using?
- What bus is the Dock connected to? (USB2, USB3, eSATA, Thunderbolt?)
- How fast is the data source volume?
- Is the data source on a direct attached high-speed RAID?
- Is the data source on a mounted file system over the network?
- How fast is your network connectivity to the data source? (Gigabit? 10-Gigabit?)
- What protocol is the network volume mounted with? (CIFS/SMB, AFP, NFS?)
- What software are you using to make the archive and perform the checksum (verification)?
Answering some of these questions can help narrow down a proper solution.
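Once you have rough answers, a back-of-the-envelope estimate is just the amount of data divided by the slowest link in the chain. The Python sketch below shows that arithmetic. The function name `estimate_hours` and every throughput figure in the example are illustrative assumptions, not benchmarks; measure your own setup. It uses decimal units (1 GB = 1000 MB).

```python
def estimate_hours(data_gb, throughputs_mb_per_s, passes=1):
    """Estimate archive time in hours.

    data_gb               -- amount of data to archive, in GB (1 GB = 1000 MB)
    throughputs_mb_per_s  -- dict of link name -> sustained MB/s
    passes                -- how many times the data is read end to end
                             (for example 2 if a verify pass re-reads it)
    """
    bottleneck = min(throughputs_mb_per_s.values())  # slowest link wins
    seconds = (data_gb * 1000) / bottleneck * passes
    return seconds / 3600


# Hypothetical setup: 2000 GB read from a network volume onto a docked drive.
links = {
    "source RAID": 400,       # assumed MB/s
    "gigabit network": 110,   # assumed MB/s
    "archive drive": 150,     # assumed MB/s
}
print(f"{estimate_hours(2000, links):.1f} hours per pass")  # about 5.1 hours
```

In this made-up case the network is the bottleneck, so a faster drive or dock would not shorten the job. Treat the result as a lower bound, since many small files and protocol overhead usually slow things down further.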
As all archives get older, the likelihood of data loss increases significantly. The only way to prevent data loss is to Migrate data from an old archive onto a new archive. This involves periodically restoring information from a data archive, verifying that the data is 100% intact, and then writing it to a new archive set.
If the restored data is incomplete, then you must refer to a redundant copy of the archived data in order to repair any holes. If both archives were damaged in the same places, then the data loss will be permanent.
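The repair logic above can be sketched as a small Python function. It assumes you recorded a checksum for every file when the archive was made, and that you have computed fresh checksums for both copies during migration. The names `plan_migration`, `copy_a`, and `copy_b`, and the short checksum strings in the example, are hypothetical.

```python
def plan_migration(expected, copy_a, copy_b):
    """Decide which copy to restore each file from.

    All three arguments map file name -> checksum:
    expected -- checksums recorded when the archive was made
    copy_a   -- checksums computed now from the first archive copy
    copy_b   -- checksums computed now from the redundant copy
    """
    plan = {}
    for name, checksum in expected.items():
        if copy_a.get(name) == checksum:
            plan[name] = "copy_a"   # first copy is intact
        elif copy_b.get(name) == checksum:
            plan[name] = "copy_b"   # repair the hole from the redundant copy
        else:
            plan[name] = "lost"     # both copies damaged or missing
    return plan


# Made-up checksums for illustration only.
expected = {"reel1.mov": "aa11", "reel2.mov": "bb22", "notes.txt": "cc33"}
copy_a = {"reel1.mov": "aa11", "reel2.mov": "XXXX", "notes.txt": "YYYY"}
copy_b = {"reel1.mov": "aa11", "reel2.mov": "bb22"}
print(plan_migration(expected, copy_a, copy_b))
# {'reel1.mov': 'copy_a', 'reel2.mov': 'copy_b', 'notes.txt': 'lost'}
```

Only files marked `lost` are gone for good, which is why the two copies should live in different locations and be checked before either gets too old.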
Data migration tends to be expensive and time consuming, but is the only way to ensure long term access to digital information.
If you find yourself in the nightmare situation where you are unable to recover mission critical data from a hard drive, then there still may be hope. Prepare yourself. This is going to be expensive. This is your last hope. This is the only option left, and the amount of data that can be recovered depends on the amount of damage to the drive.
First, please STOP WHAT YOU ARE DOING. Power down everything. Do not use any “data recovery software”. Call DriveSavers immediately. They are open 24/7. When you speak to one of their sales engineers, tell them Todd from Untwist sent you, and use Discount Code #DS23083.
Personally, I do not recommend any other data recovery services. DriveSavers was able to recover 100% of the data from an entire file server that was soaked in water from a burst water pipe. DriveSavers was able to recover one client’s wedding photos a week before his anniversary, and “before he told his wife” that the photos were ever lost. DriveSavers was able to recover all of the files from a TV shoot’s external hard drive that was dropped down a flight of stairs.
In my opinion, they are the best. They can also recover Tape media.
When properly used, modern hard drives can be a cost effective short-term archiving solution. Most drives no longer suffer from stiction, and with proper care and handling, a cold drive sitting on a shelf can still power up after a few years, and the data will be safe. Newer Archival Hard Drives are designed to hold data for long term storage, and time will tell how well these perform.
Please feel free to reach out to me with questions or comments! I would love to hear your thoughts.
Please take a look at www.hudzee.com for more information.
LTO Side Note:
Compare the history of SATA Drives to LTO Tape Drives, which are only backwards compatible for 2 generations. This means that a brand new LTO7 tape drive cannot read data from an LTO4 tape. I would argue that, historically speaking, maintaining old hardware and software is one of the biggest hidden costs of using LTO Tape. That being said, despite the significant upfront costs, I think LTO7 is amazing technology. LTO is always the first on-site archiving solution that I recommend to my clients. It is super fast, and LTFS solves some big software issues of previous generations. Unfortunately, the LTFS Format Specification does not allow any file names or folder names containing the characters below to be archived.
/ : " * ? < > \ |
An LTFS archive will fail if anyone uses any of those characters in a file or folder name. This requires everyone in your facility to follow these rules without fail.
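Since people will not always follow the rules, it can help to scan a project folder for these characters before it goes to tape. Here is a minimal Python sketch. The names `FLAGGED`, `flagged_chars`, and `scan_tree` are made up for illustration, and the character set is simply the list given above.

```python
import os

# The characters listed above, as a set for quick lookups.
FLAGGED = set('/:"*?<>\\|')


def flagged_chars(name):
    """Return the sorted list of flagged characters found in one name."""
    return sorted(set(name) & FLAGGED)


def scan_tree(root):
    """Walk a folder tree; map each offending path to its flagged characters."""
    problems = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            bad = flagged_chars(name)
            if bad:
                problems[os.path.join(dirpath, name)] = bad
    return problems


print(flagged_chars("final cut: take 2?.mov"))  # [':', '?']
print(flagged_chars("final_cut_take_2.mov"))    # []
```

Running `scan_tree` on a project folder (the path is yours to supply) returns an empty dict when every name is clean. Note that some of these characters cannot appear in file names at all on certain operating systems, so what the scan finds depends on where the files were created.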