File system fragmentation

File system fragmentation increases seek overhead on spinning storage media, which is known to hinder throughput.

Fragmentation can be remedied by re-organizing files and free space back into contiguous areas, a process called defragmentation.

It is recommended not to manually defragment solid-state storage, because doing so wears the drive prematurely through unnecessary write–erase operations.

[1] When a file system is first initialized on a partition, it contains only a few small internal structures and is otherwise one contiguous block of empty space.

As time goes on and these same factors continue to operate, free space as well as frequently appended files tend to fragment further.

This is especially true when the file system becomes full and large contiguous regions of free space are unavailable.
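To make this concrete, below is a minimal, hypothetical Python sketch of a block-based volume in which files are placed with a simple first-fit scan, deleted, and replaced. The names and sizes are illustrative, and no real file system allocates exactly this way; the point is only that a few ordinary operations split the initially contiguous free space into separate runs.

```python
# Hypothetical toy model of a freshly formatted volume: 20 blocks, all free (None).
disk = [None] * 20

def allocate(disk, name, size):
    """Place `size` blocks of file `name` into the first free blocks found."""
    placed = 0
    for i, block in enumerate(disk):
        if block is None and placed < size:
            disk[i] = name
            placed += 1

def delete(disk, name):
    """Free every block belonging to `name`."""
    for i, block in enumerate(disk):
        if block == name:
            disk[i] = None

def free_runs(disk):
    """Lengths of the contiguous runs of free blocks."""
    runs, length = [], 0
    for block in disk:
        if block is None:
            length += 1
        elif length:
            runs.append(length)
            length = 0
    if length:
        runs.append(length)
    return runs

allocate(disk, "A", 5)    # files written one after another stay contiguous
allocate(disk, "B", 5)
allocate(disk, "C", 5)
delete(disk, "B")         # deleting B leaves a hole between A and C
allocate(disk, "D", 3)    # D only partly fills the hole
print(free_runs(disk))    # [2, 5] -- free space is now in two separate runs
```

If file A were later appended beyond its original five blocks, the new blocks would likewise have to be placed in whichever free run remained, fragmenting the file itself as well as the free space.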

[b] The file system could defragment the disk immediately after a deletion, but doing so would incur a severe performance penalty at unpredictable times.

Using the example in the table above, the attempt to expand file F in step five would have failed on such a system with a "can't extend" error message.

The user would find themselves dumped back at the command prompt, and all the data that had yet to be appended to the file would be lost.

Unlike the previous two types of fragmentation, file scattering is a much more vague concept, as it heavily depends on the access pattern of specific applications.

However, it is arguably the most critical type of fragmentation, because studies have found that the most frequently accessed files tend to be small compared to the amount of data a disk can transfer per second; reading such files is therefore dominated by positioning delays between them rather than by transfer speed.

Thus, a file system that simply orders all writes successively might work faster for the given application.

The catalogs or indices used by a file system itself can also become fragmented over time, as the entries they contain are created, changed, or deleted.

[7] File system fragmentation is more problematic with consumer-grade hard disk drives because of the increasing disparity between sequential access speed and rotational latency (and to a lesser extent seek time) on which file systems are usually placed.

[9] File system fragmentation has less performance impact upon solid-state drives, as there is no mechanical seek time involved.

Preemptive techniques attempt to keep fragmentation to a minimum at the time data is being written on the disk.

Due to the difficulty of predicting access patterns, these techniques are most often heuristic in nature and may degrade performance under unexpected workloads.

BitTorrent and other peer-to-peer filesharing applications limit fragmentation by preallocating the full space needed for a file when initiating downloads.
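As a small illustrative sketch, assuming a Unix-like system where Python's os.posix_fallocate is available, a client might reserve a file's full length before any pieces are written, letting the file system choose one contiguous region rather than growing the file piecemeal; the path and size below are made up for the example.

```python
import os

def preallocate(path, total_size):
    """Reserve the full size of a download target before writing any data."""
    with open(path, "wb") as f:
        if hasattr(os, "posix_fallocate"):
            # On Unix-like systems this asks the file system to reserve
            # real blocks for the whole file up front.
            os.posix_fallocate(f.fileno(), 0, total_size)
        else:
            # Portable fallback: extend the file to its final length
            # (this may produce a sparse file rather than reserved blocks).
            f.truncate(total_size)

preallocate("example.part", 64 * 1024 * 1024)  # reserve 64 MiB up front
```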

The defragmentation process is almost completely stateless (apart from the location it is working on), so that it can be stopped and started instantly.
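As a rough illustration of that property (a sketch only, not how any real defragmenter works), the toy compaction pass below keeps no state other than the block index it is currently working on, so the loop can be stopped after any single step and resumed later from the same index. The block-map representation matches the earlier toy example.

```python
disk = ["A", None, "B", None, None, "A", "C", None, "B"]  # toy block map

def defrag_step(disk, position):
    """Do one small unit of work and return where to resume, or None if done."""
    try:
        hole = disk.index(None, position)   # first free block at/after position
    except ValueError:
        return None                         # no free blocks left to fill
    for i in range(hole + 1, len(disk)):
        if disk[i] is not None:
            # Move the next allocated block down into the hole.
            disk[hole], disk[i] = disk[i], None
            return hole + 1                 # resume just past the filled hole
    return None                             # everything after the hole is free

# The driver keeps nothing but the current position, so it can be
# interrupted between any two steps and restarted from that position.
position = 0
while position is not None:
    position = defrag_step(disk, position)
print(disk)  # ['A', 'B', 'A', 'C', 'B', None, None, None, None]
```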

Figure: Visualization of fragmentation and then of defragmentation.

Table: Simplified example of how free space fragmentation and file fragmentation occur.