Thwarting the Demons of Digital Archiving

Copyright ©2009 by Ken Loge


A Little History

I embraced the digital revolution pretty much at the time of its birth. The actual birth date of the digital revolution may be disputable, but from my perspective it happened when computers were affordable enough to be purchased by the average consumer, and powerful enough to be useful at home or work.

My first computer was a Commodore 64, which at the time was probably the best computer that could be purchased for $250 or less. Of course I loved to play games with it, but I was especially enamored with its potential to allow me new possibilities of creative expression. The first digital muse I indulged was using my personal computer for writing. I tried several word processors, but fell in love with Paperclip, by Batteries Included. With Paperclip I wrote papers for college classes, wrote letters, fiction, and a screenplay. The latter pushed the limits of practicality, as I needed several 177 KB 5.25″ floppy disks to store the 120+ page screenplay, but the idea of a cursor and no whiteout was just too appealing to go back to a clunky, unruly typewriter, so I was definitely hooked.

Soon I was using my personal computer for much more than writing. I wrote programs in BASIC for my Dungeons and Dragons characters and campaign, made artwork, games, and lots of music. In essence I had knocked down the fourth wall. I was a virtual author and performer, and the computer was my proscenium. It was quite a lovely time to create digital media, but eventually I met the first demon of digital archiving, the storage medium itself.

From a 2009 perspective, 177 KB of storage is quite unremarkable. (Modern musical gift cards that are given away have more storage capability.) The disks themselves were flimsy (floppy too, but the industry wouldn’t have been able to sell “flimsy disks”), and they were highly susceptible to erasure by any nearby magnet, or whims of nature. As my collection of 5.25″ floppy disks increased I realized I would eventually need to transfer my Commodore 64 files to another medium. This was something I didn’t look forward to, so I essentially ignored the problem for more than 20 years. When I later realized I wanted some of my early creative files I met the second demon of digital archiving, the data format of my files.

With the advent of desktop publishing in the late 1980s I became a fan of Macintosh computers. They were much more practical for things like screenplays, but they eventually led me to the third demon of digital archiving, application-specific files. An example of my encounter with this demon was when I was given the task of converting a book written with PageMaker 1.0 to a newer version. At the time I had PageMaker 7, but no file translator by any publisher I could find could open this Precambrian format, so I was stuck having to scan the entire printed book, page-by-page, to try to get some semblance of the original. Let it be known that I would not recommend this any more than I would recommend exfoliating your skin to the bone with a carrot peeler.

There were, of course, many other demons I encountered on the primrose path to digital determinism, but the aforementioned three are the meanest, overall. To summarize, the first-order Demons of Digital Archiving are:

  1. Storage Medium
  2. Data Format
  3. Application-Specific Files

Storage Medium

Think of the storage media you use like produce. Eventually it will go bad. With that in mind the most obvious way to slay this demon is to keep the medium fresh and edible, and your hardware as up-to-date as possible. Think of all of those 100 MB Iomega Zip disks you purchased 10 years ago. The disks themselves may be fine, but do you have access to a drive that can read them? In any event, 100 MB Zip disks, for digital intents and purposes, are dead. So get those files off of those Zip disks, and at least onto CD or DVD-ROM discs. Better yet, put those files on a hard drive, or flash drive, and keep them in a safe place.
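
When you move files off an aging medium, it helps to confirm that the copies really match the originals. Here is a minimal Python sketch of one way to do that, not a definitive tool. The function names `sha256_of` and `copy_and_verify` are made up for this example, and it assumes the old disk is still readable and that the target folder does not exist yet.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source_dir, target_dir):
    """Copy source_dir to a new target_dir; return files whose copies differ."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    # copytree raises FileExistsError if target_dir already exists,
    # so an existing archive is never overwritten by accident.
    shutil.copytree(source_dir, target_dir)
    mismatched = []
    for source in sorted(source_dir.rglob("*")):
        if source.is_file():
            copy = target_dir / source.relative_to(source_dir)
            if not copy.is_file() or sha256_of(source) != sha256_of(copy):
                mismatched.append(source)
    return mismatched
```

An empty list back from `copy_and_verify` suggests every file arrived intact. Anything else is a list of files worth copying again before the old disk goes in the drawer.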

Some more friendly advice. If your archival medium of choice is a hard disk, make sure the filesystem of the hard drive is one that will still be readable in 5 or more years. For example, I currently use the EXT3 and EXT4 filesystems on my archival hard drives. The EXT filesystem is long-legged, open-source, and has a large installed user base through various Linux distributions. You are probably fine with other filesystem standards such as Apple’s HFS+, or Microsoft’s NTFS, but these are proprietary formats, and may be abandoned or changed at the discretion of their respective companies. Note that FAT32 is also a decent format, but you can’t store large files on it. A single file must be smaller than 4 GB, and that’s not even enough for an ISO image of your favorite “home movies” DVD. Also, if you plan to lock away your hard drive somewhere, like in a safe deposit box, physically write down the filesystem type, and any other pertinent information on the outside of the drive’s case. (A Sharpie pen works well for this.) This information will be especially welcome in 5 years when you blow the dust off of the drive and wonder what machine you should use to mount the drive.
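
Before copying an archive to a FAT32 drive, it may help to check for files that will not fit. The following is a minimal Python sketch under stated assumptions. The function name `files_too_large_for_fat32` is made up for this example, and it assumes the largest file FAT32 can hold is one byte less than 4 GiB, because the filesystem records each file's size in a 32-bit field.

```python
import os

# FAT32 stores a file's size in a 32-bit field, so the largest
# possible file is 2**32 - 1 bytes (just under 4 GiB).
FAT32_MAX_FILE_SIZE = 2**32 - 1


def files_too_large_for_fat32(folder, limit=FAT32_MAX_FILE_SIZE):
    """Return sorted (path, size) pairs for files under folder larger than limit."""
    too_large = []
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip broken symbolic links and other non-regular entries.
            if not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size > limit:
                too_large.append((path, size))
    return sorted(too_large)
```

If the list comes back empty, everything in the folder should fit. Otherwise, the files listed need a different filesystem, or need to be split up first.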

Data Format

The data format is the structure of digital data in a file, for example an AIF audio file or a PNG image file. The medium onto which you store your files may be readable by your computer, but your computer will also need to be able to determine what kind of file it is before examining the available software that may open it. Carefully note the software you use to create your digital masterpieces, and don’t count on the file’s metadata to provide enough details for you to know what spawned it, or when. In other words, always add a file extension to the end of a file’s name, and group like files in a common folder. For example, all original files you create with Adobe Photoshop should have a .PSD extension, or be in a folder with other Photoshop files. It’s also not a bad idea to write the date you created the file into the name itself. That way the creation date survives even if copying the file, or an ill-maintained system clock, changes the file’s date and time stamp.
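
The naming advice above can be sketched in a few lines of Python. This is only an illustration. The function name `dated_filename`, the underscore separator, and the year-month-day date layout are my own choices for the example, not a standard you must follow.

```python
from datetime import date


def dated_filename(title, created, extension):
    """Build a name like 'title_YYYY-MM-DD.ext' from its parts."""
    # ISO 8601 dates (year-month-day) sort chronologically by name.
    stamp = created.isoformat()
    # Accept the extension with or without its leading dot.
    ext = extension.lstrip(".")
    return f"{title}_{stamp}.{ext}"
```

For instance, `dated_filename("sunset", date(2009, 3, 14), ".PSD")` returns `sunset_2009-03-14.PSD`. Putting the year first means an alphabetical file listing is also a chronological one.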

Application-Specific Files

Application-specific files are files saved in a native format for a particular program. The danger of having files saved only in an application-native format is that most commercial software is updated at least every couple of years. Along with more “features you can’t live without” in the newest version of the software come file format changes and potential incompatibilities. For example, I have been an Adobe Photoshop user since before it was version 1.0 in the early 1990s. However, I would be foolish to expect files I made 20 years ago with Photoshop to be readable with the latest version. File formats change as often as hair styles, or are usurped by more modern or popular formats. Always keep the original file as well as a common, or more standard version of the file. For example, keep original Photoshop files, but save a copy as a PNG or TIFF as well. This will help ensure that you will always have a readable format.
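
One way to keep yourself honest about this is to scan an archive for native files that have no standard-format twin. The Python sketch below is a rough illustration. The function name `native_files_without_standard_copy` and the two extension sets are assumptions made for the example, and it assumes the standard copy sits in the same folder under the same base name.

```python
from pathlib import Path

# Assumed for illustration: which extensions count as application-native
# and which count as standard. Adjust both sets to your own files.
NATIVE_EXTENSIONS = {".psd"}
STANDARD_EXTENSIONS = {".png", ".tif", ".tiff"}


def native_files_without_standard_copy(folder):
    """List native-format files that have no same-named standard copy."""
    missing = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in NATIVE_EXTENSIONS:
            continue
        # Look for a sibling such as poster.png or poster.TIF next to
        # poster.psd. Extensions are compared case-insensitively.
        has_copy = any(
            sibling.is_file()
            and sibling.stem == path.stem
            and sibling.suffix.lower() in STANDARD_EXTENSIONS
            for sibling in path.parent.iterdir()
        )
        if not has_copy:
            missing.append(path)
    return missing
```

Every path it returns is a file that only one program can open, which makes it a candidate for a "Save As" session.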

Here are Some General Archiving Tips for Files

  • Keep your archive up-to-date. Update your archive at least once per month, or more frequently if you create a lot of files each day, or week. Also, write a note that you keep with the hard drive or DVD-ROM discs that tells you when the files were last backed up, and possibly what files, or types of files the archive includes.
  • Keep multiple backups of all files you care about. Redundancy helps thwart all data demons.
  • Keep at least one off-site copy of your archive. Your home may be broken into, or burn to the ground. (Yes, these are grim thoughts, but you have to consider the possibilities.) Keep a hard drive clone of your precious work in a safe deposit box, or in a fireproof and waterproof safe.
  • Keep all original files as well as a non-application-specific version of the file. The original file is like the master, and the other version of the file provides for maximum accessibility. Common standard image file formats include: PNG, JPEG, GIF, TIFF, TGA, and BMP. Common audio file formats include: AIF, WAV, MP3, and FLAC.
  • Update application-specific files regularly. Yes, this is a pain in the USB port, but you will be happier when you are able to easily open your files in the future. Plan to update application-specific files at least every two years.
  • Document your files with simple, plain text files. For example, in each directory where it might be useful create a file called “info.txt”. Write out specific information about the files in that folder, and save it to the “info.txt” file. For example, the program, and the version of the program you used to create a specific file, the typeface you used, and why you made the file (for context). Creating a descriptive text file is extra work, but well worth the effort.
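
The “info.txt” tip in the last bullet is easy to automate. Here is a minimal Python sketch, offered as one possible approach. The function name `write_info_file`, its parameters, and the layout of the text file are all made up for this example.

```python
from pathlib import Path


def write_info_file(folder, program, version, purpose, typeface=None):
    """Write info.txt in folder describing how its files were made."""
    folder = Path(folder)
    lines = [
        f"Program: {program}",
        f"Version: {version}",
        f"Purpose: {purpose}",
    ]
    if typeface:
        lines.append(f"Typeface: {typeface}")
    # List the files being described, skipping info.txt itself.
    lines.append("Files:")
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.name != "info.txt":
            lines.append(f"  {path.name}")
    info_path = folder / "info.txt"
    info_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return info_path
```

Calling it again after adding files simply rewrites the note, so the file list stays current. The result is plain text, which is about as demon-proof as a format gets.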