Log-structured merge-tree

In computer science, the log-structured merge-tree (also known as LSM tree, or LSMT[1]) is a data structure with performance characteristics that make it attractive for providing indexed access to files with high insert volume, such as transactional log data.

[2] As described by Patrick O'Neil, a two-level LSM tree comprises two tree-like structures, called C0 and C1.

If the insertion causes the C0 component to exceed a certain size threshold, a contiguous segment of entries is removed from C0 and merged into C1 on disk.

The performance characteristics of LSM trees stem from the fact that each component is tuned to the characteristics of its underlying storage medium, and that data is efficiently migrated across media in rolling batches, using an algorithm reminiscent of merge sort.

Such tuning involves writing data in a sequential manner as opposed to as a series of separate random access requests.

To perform a query on a particular key to get its associated value, one must search in the Level 0 tree and also each run.

Extensions to the 'leveled' method to incorporate B+ tree structures have been suggested, for example bLSM[5] and Diff-Index.

As increasingly more read and write workloads co-exist under an LSM-tree storage structure, read data accesses can experience high latency and low throughput due to frequent invalidations of cached data in buffer caches by LSM-tree compaction operations.

To re-enable effective buffer caching for fast data accesses, a Log-Structured buffered-Merged tree (LSbM-tree) is proposed and implemented.

Once the in-memory buffer becomes full, it is flushed to the disk as an immutable sorted component at the first level (C1).

This flush is performed sequentially, ensuring high I/O efficiency by avoiding the costly random writes typical of traditional indexing methods.

To maintain durability, the system may use a write-ahead log (WAL) that records all incoming writes before they are added to the memory buffer.

As data accumulates across levels on the disk, LSM trees employ a merge process similar to merge sort to consolidate entries and maintain a consistent sorted order across levels.

This process also handles updates and deletes, removing redundant or obsolete entries.

In LSM trees, because of the multi-level structure and the immutability of disk components, point lookups involve checking several levels to ensure the most up-to-date value for the key is returned.

The process begins with a search in the in-memory component (C0), which holds the latest data.

To make the search faster, LSM trees often use a bloom filter for each on-disk component.

The lookup operation stops as soon as the key is found, ensuring the most recent version is returned.

The operation involves scanning multiple components across the LSM tree's hierarchical structure and consolidating entries to ensure accurate and complete results.

Since this component is typically a sorted data structure, range queries can be done efficiently.

For efficiency, LSM trees often divide disk components into smaller, disjoint key ranges.

For long-range queries, which access many keys, the price is dominated by scanning the largest level, as it contains most of the data.