Every Format on the Face of the Planet by Leslie Johnston

Some weeks ago I gave a presentation that I jokingly titled “The Challenges of Preserving Every Digital Format on the Face of the Planet.”

Except it’s not really a joke.

We often have little or no control over what comes into the Library of Congress Digital Collections, and we manage and preserve a wide variety of formats. One collection brings in TIFF, JPEG, JPEG 2000, and XML. Another brings in MPEG-4, MP3, BWF, AVI, and an assortment of specialized commercial media formats. Another brings in JPEG, PDF, XML, and a variety of metadata formats. One is all JSON. One is every flavor of GIS data. The Web Archives include every format that has ever appeared online. And yet another collection has, in 18 months, included almost 50 different file extensions (but not that many actual file types) with a huge number of metadata variations.

