/ˈdʒɜrnəlɪŋ/
noun — "tracks changes to protect data integrity."
Journaling is a technique used in modern file systems and databases to maintain data integrity by recording changes in a sequential log, called a journal, before applying them to the primary storage structures. This ensures that in the event of a system crash, power failure, or software error, the system can replay or roll back incomplete operations to restore consistency. Journaling reduces the risk of corruption and speeds up recovery: after an unexpected shutdown, only the journal needs to be examined, rather than running a full consistency check of the storage medium.
Technically, a journaling system records metadata or full data changes in a dedicated log area. File systems such as NTFS, ext3, ext4, HFS+, and XFS implement journaling to varying degrees. Metadata journaling records only changes to the file system structure, like directory updates, file creation, or allocation table modifications, while full data journaling writes both metadata and the actual file contents to the journal before committing. The journal is often circular and sequential, which optimizes write performance and ensures ordered recovery.
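The difference between metadata and full data journaling, and the circular reuse of journal space, can be sketched in Python. This is a minimal in-memory illustration, not how any particular file system lays out its log; the names Record, CircularJournal, append, checkpoint, and pending are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    seq: int                # monotonically increasing sequence number
    description: str        # the metadata change, e.g. "add directory entry"
    data: Optional[bytes]   # file contents; None under metadata-only journaling


class CircularJournal:
    """Fixed-size log whose slots are reused once records are checkpointed."""

    def __init__(self, capacity: int, journal_data: bool = False):
        self.slots: List[Optional[Record]] = [None] * capacity
        self.journal_data = journal_data  # True = full data journaling
        self.head = 0  # sequence number of the next record to write
        self.tail = 0  # sequence number of the oldest record not yet checkpointed

    def append(self, description: str, data: bytes = b"") -> Record:
        # A full journal must not overwrite records that are still needed.
        if self.head - self.tail == len(self.slots):
            raise RuntimeError("journal full: checkpoint before logging more")
        payload = data if self.journal_data else None
        rec = Record(self.head, description, payload)
        self.slots[self.head % len(self.slots)] = rec  # wrap around the log area
        self.head += 1
        return rec

    def checkpoint(self, count: int = 1) -> None:
        """Mark the oldest `count` records as applied so their slots can be reused."""
        self.tail = min(self.tail + count, self.head)

    def pending(self) -> List[Record]:
        """Records in sequence order, the order in which recovery would replay them."""
        return [self.slots[s % len(self.slots)] for s in range(self.tail, self.head)]
```

With journal_data left at False, append() drops the file contents and keeps only the description of the structural change. With journal_data=True, the contents travel through the journal as well, at the cost of writing them twice.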
In workflow terms, consider creating a new file on a journaling file system. The system first writes the intended changes (block allocation, directory entry, file size, timestamps) to the journal. Once these journal entries are safely committed to storage, the changes are applied to their final on-disk locations. If a crash occurs partway through, the system reads the journal at the next startup, replays the operations that were fully committed, and discards those that were not, preserving the file system’s consistency without manual intervention.
A simplified example illustrating journaling behavior conceptually:
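// Pseudocode for metadata journaling (journal first, then apply)
journal.begin()
journal.log("allocate blocks for /docs/report.txt")
journal.log("add directory entry report.txt to /docs")
journal.commit()                         // intended changes are now durable
allocateBlocks("/docs/report.txt")       // apply the changes in place
updateDirectory("/docs", "report.txt")
journal.checkpoint()                     // journal space can be reclaimed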
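The log-then-apply workflow and the recovery step can also be sketched as runnable Python. This is a toy model under stated assumptions: the "disk" is a dictionary, the journal is a list of JSON lines, and JournaledStore, log_transaction, and recover are hypothetical names, not the API of any real file system.

```python
import json


class JournaledStore:
    """Toy key-value 'disk' protected by a write-ahead journal."""

    def __init__(self):
        self.disk = {}     # stands in for the primary storage structures
        self.journal = []  # sequential log of JSON-encoded records

    def log_transaction(self, txid, changes, commit=True):
        """Record intended changes, optionally followed by a commit record."""
        for key, value in changes.items():
            record = {"tx": txid, "op": "set", "key": key, "value": value}
            self.journal.append(json.dumps(record))
        if commit:
            self.journal.append(json.dumps({"tx": txid, "op": "commit"}))

    def recover(self):
        """Replay committed transactions in log order; discard the rest."""
        records = [json.loads(line) for line in self.journal]
        committed = {r["tx"] for r in records if r["op"] == "commit"}
        for r in records:
            if r["op"] == "set" and r["tx"] in committed:
                self.disk[r["key"]] = r["value"]  # idempotent, safe to repeat
        self.journal.clear()  # checkpoint: the journal space is reclaimed
        return committed


if __name__ == "__main__":
    store = JournaledStore()
    store.log_transaction(1, {"/docs/report.txt": "inode 12"})
    # Simulate a crash before the second transaction's commit record is written.
    store.log_transaction(2, {"/docs/draft.txt": "inode 13"}, commit=False)
    store.recover()
    print(store.disk)  # {'/docs/report.txt': 'inode 12'}
```

Transaction 2 never received a commit record, so recovery ignores it. The store ends up in the state it had after transaction 1, which is consistent even though the later work is lost.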
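Journaling file systems typically offer several modes; ext3 and ext4, for example, provide writeback, ordered, and journal (full data) modes. Writeback mode prioritizes speed: only metadata is journaled, and data blocks may reach disk before or after the related metadata is committed, so a recently written file can contain stale data after a crash even though the file system structure is consistent. Journal mode writes both data and metadata to the journal before the operation completes, which gives the strongest guarantee but means data is written twice. Ordered mode journals only metadata but guarantees that data blocks are written to their final location before the related metadata is committed. These strategies balance performance, reliability, and crash recovery needs depending on the workload and criticality of the data.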
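The ordering differences between modes can be made concrete with a small Python sketch. It uses the ext3/ext4 mode names (writeback, ordered, and journal, the last being full data journaling) and lists one plausible sequence of steps for a single file write. The step strings and the functions write_plan and data_durable_at_commit are illustrative assumptions, not part of any real file system interface.

```python
from enum import Enum
from typing import List


class Mode(Enum):
    WRITEBACK = "writeback"
    ORDERED = "ordered"
    JOURNAL = "journal"


# Steps that put file contents on stable storage.
DATA_STEPS = {"journal data", "write data in place"}


def write_plan(mode: Mode) -> List[str]:
    """Return one possible sequence of steps for a file write in the given mode."""
    if mode is Mode.JOURNAL:
        return ["journal data", "journal metadata", "commit",
                "checkpoint data and metadata in place"]
    if mode is Mode.ORDERED:
        return ["write data in place", "journal metadata", "commit",
                "checkpoint metadata in place"]
    # Writeback: the data write has no ordering constraint relative to the
    # commit; placing it last is just one schedule the system may choose.
    return ["journal metadata", "commit", "checkpoint metadata in place",
            "write data in place"]


def data_durable_at_commit(mode: Mode) -> bool:
    """True if the file contents are on stable storage before the commit record."""
    plan = write_plan(mode)
    return any(step in DATA_STEPS for step in plan[:plan.index("commit")])


if __name__ == "__main__":
    for mode in Mode:
        print(mode.value, data_durable_at_commit(mode))
```

Under these assumptions, ordered and journal modes both have the data on disk by the time the metadata commits, while writeback makes no such promise. That is why writeback is the fastest of the three and the one most likely to expose stale file contents after a crash.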
Conceptually, journaling is like writing every planned change in a notebook before editing the ledger book itself. If an error occurs midway through an edit, the notebook can be consulted to either complete or undo the change, so that no entry is corrupted or lost.
See FileSystem, NTFS, Transaction.