Manacher

Algorithm

Step 1. Inserting special characters

Assume the original string is s, and the string after insertion is t.

We insert special characters into s so as to avoid handling palindromes of odd length and even length separately.

We insert * around every character, prepend a ^ and append a $ to the string.

For example:

// Odd case
"aba" // the palindrome was of length 3
// becomes
"^*a*b*a*$" // the palindrome is of length 7
//   ^
//   this b is still the center.


// Even case
"abba" // the palindrome was of length 4
// becomes
"^*a*b*b*a*$" // the palindrome is of length 9
//    ^
//    this * becomes the new center

How to convert the indexes between s and t?

We only need to consider the indexes in t in range [2, s.size() * 2], which corresponds to [0, s.size()-1] in s.

Step 2. Calculate r array

Let r[i] be the number of palindromes centered at t[i] (aka. the radius of the longest palindrome centered at t[i]).

So we have:

  • the longest palindrome centered at t[i] is t[i-r[i]+1 .. i+r[i]-1].

  • For any k < r[i], substring t[i-k .. i+k] is a palindrome.

  • For any k >= r[i], substring t[i-k .. i+k] is not a palindrome.

Example:

For center t[6] = 'c', we have r[6] = 4:

  • The longest palindrome centered at t[6] is t[3 .. 9] = "*b*c*b*"

  • For any k < 4, substring t[6-k .. 6+k] is a palindrome. E.g. for k = 2, t[4 .. 8] = "b*c*b" is a palindrome.

  • For any k >= 4, substring t[6-k .. 6+k] is not a palindrome. E.g. for k = 4, t[2 .. 10] = "a*b*c*b*d" is not a palindrome.

To calculate r[i] efficiently, we need to leverage the computed r[j] (j < i) values.

Let j < i be the center with the furthest reach j + r[j].

For i-1, since r[i-1] >= 1, the corresponding reach i-1 + r[i-1] is at least i.

So, j + r[j] must be >= i.

Case 1: j + r[j] == i:

Since the palindrome t[j-r[j]+1 .. j+r[j]-1] doesn't cover t[i], we have no information to leverage, and have to expand from t[i] in a brute force way.

Example:

For t[6] = 'c', the corresponding j could be 4 and r[j] = 2. Since j + r[j] == i == 6, we expand at t[6] brute-forcely. And we will get r[6] = 4.

Since i + r[i] = 6 + 4 = 10 > j + r[j] = 6, we will make i = 6 the new center j.

Case 2: j + r[j] > i:

Now we can leverage some symmetry information.

Assume the symmetry point of i relative to j is k = 2*j-i. We need to consider 3 subcases about whether the range (j-r[j], j+r[j]) covers (k-r[k], k+r[k]).

Case 2a: k-r[k] > j-r[j]:

Example:

In this case, we must have r[i] = r[k] = r[2*j-i] due to symmetry.

Case 2b: k-r[k] < j-r[j]:

Example:

Since k-r[k] < j-r[j], not the entire range for k can be symmetrised relative to j. We can only symmetrise the range [j-r[j]+1, 2*k-j+r[j]-1]. So r[i] is at least k - (j-r[j]+1) + 1 = k-j+r[j] = j+r[j]-i.

Assume:

  • the range for j is surrounded by index p and s.

  • p and q are symmetric relative to k.

  • r and s are symmetric relative to i.

We know:

  • t[p] != t[s] and t[q] == t[r] because of r[j].

  • t[p] == t[q] because of r[k].

So, t[p] == t[q] == t[r] != t[s]. Thus, r[i] is exactly j+r[j]-i.

Case 2c: k-r[k] == j-r[j]:

Example:

This is similar to case 2b that we know r[i] is at least j+r[j]-i, with the exception that t[p] != t[q] now.

Given t[p] != t[s], t[q] == t[r] and t[p] != t[q], we can't know whether t[r] == t[s] or not.

So, we have to do brute force expansion with r[i] being at least j+r[j]-i.

Summary of the cases:

  • Case 1: j + r[j] == i, we do brute force expansion starting from r[i] = 1.

  • Case 2: j + r[j] > i.

    • Case 2a: k-r[k] > j-r[j], we have r[i] = r[k] = r[2*j-i] < j+r[j]-i

    • Case 2b: k-r[k] < j-r[j], we have r[i] = j+r[j]-i < r[k].

    • Case 2c: k-r[k] == j-r[j], we do brute force expansion starting from r[i] = j+r[j]-i = r[k].

Note that for the 3 subcases in case 2, r[i] is either exactly or at least the minimum of r[2*j-i] and j+r[j]-i.

Merging the cases:

We don't need to implement these cases separately. Instead, we use the following implementation to cover all the cases.

Step 3. Get the longest palindromic substrings in s

Get the length:

The length of the longest palindrome centering at t[i] is i+r[i]-(i-r[i])-1 = 2*r[i]-1.

The edges of this palindrome must be *. The number of * must be 1 plus the number of valid characters.

Thus, the length of the longest palindrome corresponding to r[i] must be r[i]-1.

Get the starting index:

The longest palindrome centering at t[i] starts at index i - floor((2*r[i] - 1) / 2). Since 2*r[i]-1 must be an odd number, i - floor((2*r[i] - 1) / 2) = i - (2*r[i] - 2) / 2 = i - r[i] + 1.

Since the palindrome must start with *, the first valid character starts at i - r[i] + 2.

Based on the aforementioned equation indexInT / 2 - 1 = indexInS, the palindrome starts at (i - r[i] + 2) / 2 - 1 = (i - r[i]) / 2 in s.

Implementation

Time Complexity

It seems like that we might do brute force expansion at every i so the time complexity is O(N^2), but actually we won't do that many expansions.

Whenever we successfully do an expansion with k steps, we will update j increasing j + r[j] by k. So, the amount of expansion is the same as the increase to j + r[j]. Since j + r[j] at most increases by t.size(), the amount of expansion we did must be at most t.size() times, which takes O(N) time.

Thus, the overall time complexity of Manacher is O(N).

Reference

Problems

Last updated

Was this helpful?