Maximal unique match

A maximal unique match, or MUM for short, is part of a key step [1] in the multiple sequence alignment of genomes in computational biology.

Finally, "maximal" means that the substring is not part of a longer substring that also fulfills both prior requirements, namely matching exactly and occurring only once in each genome.

The idea behind this is that long sequences that match exactly and occur only once in each genome are almost certainly part of the global alignment.

"Given two genomes A and B, Maximal Unique Match (MUM) substring is a common substring of A and B of length longer than a specified minimum length d (by default d = 20) such that Identifying the set of MUMs in two very long genome sequences is not computationally trivial.

To illustrate a case where a match must be extended to ensure that the MUM is unique and not part of a larger matching sequence, take the following:

MUMs are a subset of a larger set referred to as maximal exact matches, or MEMs.
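The subset relation can be sketched in Python by first enumerating MEMs, taken here to be exact matches that cannot be extended in either direction, and then keeping only those whose text occurs exactly once in each sequence. The names `find_mems`, `occurs_once` and `mums_from_mems` and the toy sequences are assumptions made for illustration; the quadratic scan is a minimal sketch, not how production tools work.

```python
def find_mems(a, b, d=20):
    """Return (pos_in_a, pos_in_b, length) for each MEM longer than d."""
    mems = []
    for i in range(len(a)):
        for j in range(len(b)):
            # Skip pairs that could be extended to the left: not maximal.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right until a mismatch or the end of a sequence.
            n = 0
            while i + n < len(a) and j + n < len(b) and a[i + n] == b[j + n]:
                n += 1
            if n > d:
                mems.append((i, j, n))
    return mems


def occurs_once(text, pattern):
    """True if pattern occurs exactly once in text (overlaps included)."""
    first = text.find(pattern)
    return first != -1 and text.find(pattern, first + 1) == -1


def mums_from_mems(a, b, mems):
    """Keep only the MEMs whose matched text is unique in both sequences."""
    return [
        (i, j, n) for (i, j, n) in mems
        if occurs_once(a, a[i:i + n]) and occurs_once(b, a[i:i + n])
    ]


# Toy sequences (assumed for illustration).
a, b = "ACGTACG", "TACGA"
mems = find_mems(a, b, d=1)
print(mems)                        # [(0, 1, 3), (3, 0, 4)]
print(mums_from_mems(a, b, mems))  # [(3, 0, 4)]
```

Here "ACG" is a MEM but not a MUM, because it occurs twice in the first sequence, whereas "TACG" is both.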

MUM identification using a suffix tree