There’s a crisis looming in scientific data, with a leading scientific journal estimating 80 per cent of current scientific research will be lost in 20 years.
The study, published in Nature, included all data – even paper stored in garages or mouldy basements – but digital-only information may be under even greater threat. Digital conservation and what’s called “digital archaeology” (picking up the pieces after data loss) are going to be increasingly important strategies.
Rackspace chief technology officer Alan Perkins had to restore data from files written by discontinued and unsupported software. “I had to reverse-engineer the encrypted data,” he says. “And I noticed several repeating characters every 673 characters, which let me decipher it.”
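Perkins does not say how he found the repetition, so the following is only a rough sketch of one common way to look for it: slide the data against itself and count how often a byte matches the byte a fixed distance later. A spike at one distance suggests a record length. The function names and the synthetic sample (16-byte records with a three-byte marker) are assumptions made for illustration, not details from his recovery job.

```python
def coincidence_counts(data: bytes, max_shift: int) -> dict[int, int]:
    """For each shift, count positions where data[i] == data[i + shift]."""
    counts = {}
    for shift in range(1, max_shift + 1):
        counts[shift] = sum(a == b for a, b in zip(data, data[shift:]))
    return counts


def likely_period(data: bytes, max_shift: int) -> int:
    """Return the shift with the most coincidences (smallest one on a tie)."""
    counts = coincidence_counts(data, max_shift)
    return max(counts, key=counts.get)


def make_sample(record_len: int = 16, records: int = 40) -> bytes:
    """Build hypothetical test data: fixed marker bytes at the start of
    every record, followed by filler that never repeats at short shifts."""
    marker = bytes([0xFF, 0xFE, 0xFD])
    out = bytearray()
    for i in range(record_len * records):
        offset = i % record_len
        if offset < len(marker):
            out.append(marker[offset])
        else:
            # Filler stays below 251, so it never collides with the marker.
            out.append((i * 7 + 3) % 251)
    return bytes(out)


if __name__ == "__main__":
    sample = make_sample()
    print(likely_period(sample, max_shift=50))
```

On real data the same scan would simply use a larger max_shift, for example a little over 673 in the case Perkins describes. Multiples of the true period also score highly, which is why the sketch prefers the smallest top-scoring shift.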
Few of us need to be reminded that simply backing up regularly can keep us from reaching that point, but data conservation is a complicated world. You might inherit information that was written decades ago by a program lost to time, or stored on 8-inch floppy disks.
As Andrew Martin, media migration manager at data migration provider DAMsmart, says, guarding against data loss “can only be achieved through digital stewardship, constantly monitoring the sustainability of the codecs and file formats to ensure they’re useable.”
Matthew Davies, acting manager of collection stewardship at the National Film and Sound Archive, comes up against the challenge of preserving data that is “born digital” every day. “In the old days university professors would put notes or books into an archive and you had them when you needed them,” he says. “Digital information needs regular attention. Standardisation is very important – we need to be all speaking the same language. Is the file format proprietary or open? In wide use or rare? Easy or hard to implement? Benign neglect is no longer an option.”
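The questions Davies asks start with knowing which formats a collection actually holds. As a minimal sketch of that first step, assuming nothing about the tools DAMsmart or the National Film and Sound Archive use, the snippet below tallies files by the well-known signature bytes at the start of each one. The signature table is a tiny illustrative subset, and the names identify and survey are hypothetical.

```python
from collections import Counter
from pathlib import Path

# A few widely documented file signatures ("magic numbers").
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"PK\x03\x04", "ZIP container"),
    (b"RIFF", "RIFF container (e.g. WAV, AVI)"),
]


def identify(header: bytes) -> str:
    """Name the format whose signature the header starts with."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def survey(paths) -> Counter:
    """Count formats across the given files by reading their first bytes."""
    tally = Counter()
    for path in paths:
        with open(path, "rb") as handle:
            tally[identify(handle.read(16))] += 1
    return tally


if __name__ == "__main__":
    # Survey every regular file under the current directory.
    files = [p for p in Path(".").rglob("*") if p.is_file()]
    for name, count in survey(files).most_common():
        print(f"{count:6d}  {name}")
```

Files that land in the "unknown" bucket are the ones most likely to need attention first, since nothing in the table recognises them.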
Source: The Age