To find a single match of a single pattern, the expected time of the algorithm is linear in the combined length of the pattern and text, although its worst-case time complexity is the product of the two lengths.
A naive string matching algorithm compares the given pattern against all positions in the given text.
In many practical cases, this time can be significantly reduced by cutting short the comparison at each position as soon as a mismatch is found, but this idea cannot guarantee any speedup.
These positions contribute to the running time of the algorithm unnecessarily, without producing a match.
Naively computing the hash value for the substring s[i+1..i+m] requires O(m) time because each character is examined.
A trivial (but not very good) rolling hash function just adds the values of each character in the substring.
The key to the Rabin–Karp algorithm's performance is the efficient computation of hash values of the successive substrings of the text.
For example, if the substring is "hi", the base is 256, and prime modulus is 101, then the hash value would be '%' is 'mod' or modulo, or remainder after integer division, operator Technically, this algorithm is only similar to the true number in a non-decimal system representation, since for example we could have the "base" less than one of the "digits".
For example, if we have text "abracadabra" and we are searching for a pattern of length 3, the hash of the first substring, "abr", using 256 as the base, and 101 as the prime modulus is:
for prime modulus $p$, we can ensure no underflow by adding p to the old hash before subtracting the value corresponding to the old 'a' (mod p).
the last '* 256' is the shift of the subtracted hash to the left although ((256%101)*256)%101 is the same as 2562 % 101, to avoid overflowing integer maximums when the pattern string is longer (e.g. 'Rabin-Karp' is 10 characters, 2569 is the offset without modulation ), the pattern length base offset is pre-calculated in a loop, modulating the result each iteration If we are matching the search string "bra", using similar calculation of hash("abr"), If the substrings in question are long, this algorithm achieves great savings compared with many other hashing schemes.
To find any of a large number, say k, fixed length patterns in a text, a simple variant of the Rabin–Karp algorithm uses a Bloom filter or a set data structure to check whether the hash of a given string belongs to a set of hash values of patterns we are looking for: We assume all the substrings have a fixed length m. A naïve way to search for k patterns is to repeat a single-pattern search taking O(n+m) time, totaling in O((n+m)k) time.