When it comes to eDiscovery, the guru is Craig Ball.
Craig recently posted about the upcoming 2019 Georgetown eDiscovery Training Academy, a week-long, immersive boot camp in electronically stored information (ESI).
In Craig’s words: “The technology of e-discovery is its centerpiece, and I’ve lately added a 21-point synopsis of the storage concepts, technical takeaways and vocabulary covered.”
- Common law imposes a duty to preserve potentially-relevant information in anticipation of litigation
- Most information is electronically-stored information (ESI)
- Understanding ESI entails knowledge of information storage media, encodings and formats
- There are many types of e-storage media of differing capacities, form factors and formats:
  a) analog (phonograph record) or digital (hard drive, thumb drive, optical media)
  b) mechanical (electromagnetic hard drive, tape, etc.) or solid-state (thumb drive, SIM card, etc.)
- Computers don’t store “text,” “documents,” “pictures,” “sounds.” They only store bits (ones or zeroes)
- Digital information is encoded as numbers by applying various encoding schemes:
  a) ASCII or Unicode for alphanumeric characters;
  b) JPG for photos, DOCX for Word files, MP3 for sound files, etc.
- We express these numbers in a base or radix (base 2 binary, 10 decimal, 16 hexadecimal, 60 sexagesimal). E-mail messages encode attachments in base 64.
- The bigger the base, the smaller the space required to notate and convey the information
- Digitally encoded information is stored (written):
  a) physically as bytes (8-bit blocks) in sectors and partitions
  b) logically as clusters, files, folders and volumes
- Files use binary header signatures to identify file formats (type and structure) of data
- Operating systems use file systems to group information as files and manage filenames and metadata
- File systems employ filename extensions (e.g., .txt, .jpg, .exe) to flag formats
- All ESI includes a component of metadata (data about data) even if no more than needed to locate it
- A file’s metadata may be greater in volume or utility than the contents of the file it describes
- File tables hold system metadata about the file (e.g., name, locations on disk, MAC dates): it’s CONTEXT
- Files hold application metadata (e.g., EXIF geolocation data in photos, comments in docs): it’s CONTENT
- File systems allocate clusters for file storage; deleting files releases cluster allocations for reuse
- If unallocated clusters aren’t reused, deleted files may be recovered (“carved”) via computer forensics
- Forensic (“bitstream”) imaging is a method to preserve both allocated and unallocated clusters
- Because data are numbers, data can be digitally “fingerprinted” using one-way hash algorithms (MD5, SHA1)
- Hashing facilitates identification, deduplication and de-NISTing of ESI in e-discovery
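
Several of the points above (text encoded as numbers, the same numbers notated in different bases, header signatures, and hash fingerprints) can be tried out in a few lines of Python. The sketch below is purely illustrative and is not taken from Craig’s materials: the function names (`in_bases`, `sniff_format`, `fingerprint`, `deduplicate`) and the four-entry signature table are assumptions made for this example, and real e-discovery tools rely on far larger signature and hash databases.

```python
import base64
import hashlib


def in_bases(data: bytes) -> dict:
    """Notate the same bytes in several bases (hypothetical helper)."""
    return {
        "binary": " ".join(f"{b:08b}" for b in data),
        "decimal": " ".join(str(b) for b in data),
        "hex": data.hex(),
        "base64": base64.b64encode(data).decode("ascii"),
    }


# A few well-known header signatures ("magic numbers").
# Illustrative only; this table is nowhere near exhaustive.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG image",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP container (DOCX, XLSX and PPTX are ZIP-based)",
}


def sniff_format(header: bytes) -> str:
    """Guess a format from the leading bytes, ignoring the filename extension."""
    for magic, label in SIGNATURES.items():
        if header.startswith(magic):
            return label
    return "unknown"


def fingerprint(data: bytes, algorithm: str = "md5") -> str:
    """Return the hex digest of the data using a one-way hash (MD5 by default)."""
    return hashlib.new(algorithm, data).hexdigest()


def deduplicate(files: dict) -> dict:
    """Group filenames by content hash; identical content lands in one group."""
    groups = {}
    for name, content in files.items():
        groups.setdefault(fingerprint(content), []).append(name)
    return groups


if __name__ == "__main__":
    # The three letters "ESI" are stored as the numbers 69, 83 and 73.
    for base, notation in in_bases("ESI".encode("ascii")).items():
        print(f"{base:>8}: {notation}")

    # The header bytes, not the extension, reveal the format.
    print(sniff_format(b"%PDF-1.7 ..."))

    # Two files with different names but identical content share one hash.
    sample = {"memo.txt": b"same", "copy of memo.txt": b"same", "other.txt": b"other"}
    for digest, names in deduplicate(sample).items():
        print(digest, names)
```

Run as a script, the first loop shows “ESI” as 24 binary digits, six hexadecimal digits (455349) and four Base64 characters (RVNJ), which is the “bigger base, smaller space” point in miniature. De-NISTing works on the same principle as `deduplicate`: a file’s hash is compared against a list of hashes of known software files, and matches are set aside.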
If you don’t follow Ball in Your Court, you should.
All Rights Reserved 2019 – Beverly Michaelis