The longest repeated subsequence (LRS) problem is the problem of finding the longest subsequences of a string that occurs at least twice.

*The problem differs from problem of finding common substrings. Unlike substrings, subsequences are not required to occupy consecutive positions within the original sequences.*

For example, consider the sequence ATACTCGGA

Length of Longest Repeating Subsequence is 4

Longest Repeating Subsequence is ATCG

**A T **A **C **T C **G **G A

A T **A **C **T C **G **G **A

**Note that repeated characters holds different index in the input string.**

The Longest Repeating Subsequence problem is classic variation of Longest Common Subsequence (LCS) problem. The idea is to find LCS of given string with itself i.e. call LCS(X, X) and excluding the cases when index are same (i = j) as repeated characters holds different index in the input string.

The LRS problem has an **optimal substructure**. That means the problem can be broken down into smaller, simple “subproblems”, which can be broken down into yet simpler subproblems, and so on, until, finally, the solution becomes trivial. The problem can be recursively defined as –

| 0 (if i = 0 or j = 0)

LRS[i][j] = | LRS[i-1][j-1] + 1 (if X[i-1] = X[j-1] & i!=j)

| min (LRS[i-1][j] + 1, LRS[i][j-1] + 1) (if X[i-1] != X[j-1])

Below solution finds the length of longest repeated Subsequence of sequence X recursively by using optimal substructure property of LRS problem.

**C++ implementation –**

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
#include <bits/stdc++.h> using namespace std; // Function to find the length of Longest repeated Subsequence // of substring X[0..m-1] and X[0..n-1] int LRSLength(string X, int m, int n) { // return if we have reached the end of either string if (m == 0 || n == 0) return 0; // if characters at index m and n matches and index is different if (X[m - 1] == X[n - 1] && m != n) return LRSLength(X, m - 1, n - 1) + 1; // else if characters at index m and n don't match return max (LRSLength(X, m, n - 1), LRSLength(X, m - 1, n)); } // main function int main() { string X = "ATACTCGGA"; int m = X.length(); cout << "Length of Longest Repeating Subsequence is " << LRSLength(X, m, m); return 0; } |

**Output: **

Length of Longest Repeating Subsequence is 4

The worst case time complexity of above solution is O(2^{n}) and auxiliary space used by the program is O(1). The worst case happens when there is no repeated character present in X (i.e. LRS length is 0) and each recursive call will end up in two recursive calls.

The LRS problem exhibits **overlapping subproblems**. A problem is said to have overlapping subproblems if the recursive algorithm for the problem solves the same subproblem over and over rather than always generating new subproblems.

Let us consider recursion tree for sequence of length 5 having all distinct characters whose LRS length is 0.

As we can see, the same sub-problems (highlighted in same color) are getting computed again and again. We know that problems having optimal substructure and overlapping subproblems can be solved by dynamic programming, in which subproblem solutions are memoized rather than computed again and again. This method is illustrated below –

**Memoized C++ implementation –**

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
#include <bits/stdc++.h> using namespace std; // Function to find the length of Longest repeated Subsequence // of substring X[0..m-1] and X[0..n-1] int LRSLength(string X, int m, int n, auto &lookup) { // return if we have reached the end of either string if (m == 0 || n == 0) return 0; // construct a unique map key from dynamic elements of the input string key = to_string(m) + "|" + to_string(n); // if sub-problem is seen for the first time, solve it and // store its result in a map if (lookup.find(key) == lookup.end()) { // if characters at index m and n matches and index is different if (X[m - 1] == X[n - 1] && m != n) lookup[key] = LRSLength(X, m - 1, n - 1, lookup) + 1; else // else if characters at index m and n don't match lookup[key] = max (LRSLength(X, m, n - 1, lookup), LRSLength(X, m - 1, n, lookup)); } // return the subproblem solution from the map return lookup[key]; } // main function int main() { string X = "ATACTCGGA"; int m = X.length(); // create a map to store solutions of subproblems unordered_map<string, int> lookup; cout << "Length of Longest Repeating Subsequence is " << LRSLength(X, m, m, lookup); return 0; } |

**Output: **

Length of Longest Repeating Subsequence is 4

The time complexity of above solution is O(n^{2}) and auxiliary space used by the program is O(n^{2}).

Above *Memo*ized version follows the top-down approach, since we first break the problem into subproblems and then calculate and store values. We can also solve this problem in bottom-up manner. In the bottom-up approach, we calculate the smaller values of LRS(i, j) first, then build larger values from them.

**C++ implementation –**

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 |
#include <bits/stdc++.h> using namespace std; // Function to find the length of Longest repeated Subsequence // of substring X[0..n-1] int LRSLength(string X, int n) { // lookup table stores solution to already computed sub-problems // i.e. lookup[i][j] stores the length of LRS of substring // X[0..i-1] and X[0..j-1] int lookup[n + 1][n + 1]; // first column of the lookup table will be all 0 for (int i = 0; i <= n; i++) lookup[i][0] = 0; // first row of the lookup table will be all 0 for (int j = 0; j <= n; j++) lookup[0][j] = 0; // fill the lookup table in bottom-up manner for (int i = 1; i <= n; i++) { for (int j = 1; j <= n; j++) { // if characters at index i and j matches // and the index is different if (X[i - 1] == X[j - 1] && i != j) lookup[i][j] = lookup[i - 1][j - 1] + 1; // else if characters at index i and j are different else lookup[i][j] = max(lookup[i - 1][j], lookup[i][j - 1]); } } // Uncomment below code to print the lookup table /* for (int i = 0; i <= m; i++) { for (int j = 0; j <= n; j++) cout << lookup[i][j] << " "; cout << endl; }*/ // LRS will be last entry in the lookup table return lookup[n][n]; } // main function int main() { string X = "ATACTCGGA"; int n = X.length(); cout << "Length of Longest Repeating Subsequence is " << LRSLength(X, n); return 0; } |

**Output: **

Length of Longest Repeating Subsequence is 4

The time complexity of above solution is O(n^{2}) and auxiliary space used by the program is O(n^{2}). The space complexity of above solution can be improved to O(n) as calculating LRS of a row of the LRS table requires only the solutions to the current row and the previous row.

**How to extend above solution for printing Longest Repeating Subsequence?**

The idea is very similar to printing Longest Common Subsequence of two strings. Refer this post for details.

**C++ implementation –**

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 |
#include <bits/stdc++.h> using namespace std; // Maximum string length #define N 20 // lookup[i][j] stores the length of LRS of substring X[0..i-1], X[0..j-1] int lookup[N][N]; // Function to find LRS of substrings X[0..m-1], X[0..n-1] string LRS(string X, int m, int n) { // if we have reached the end of either sequence // return empty string if (m == 0 || n == 0) return string(""); // if characters at index m and n matches and index is different if (X[m - 1] == X[n - 1] && m != n) return LRS(X, m - 1, n - 1) + X[m - 1]; else // else if characters at index m and n don't match { if (lookup[m - 1][n] > lookup[m][n - 1]) return LRS(X, m - 1, n); else return LRS(X, m, n - 1); } } // Function to fill lookup table by finding the length of LRS // of substring X[0..n-1] void LRSLength(string X, int n) { // first row and first column of the lookup table // are already 0 as lookup[][] is globally declared // fill the lookup table in bottom-up manner for (int i = 1; i <= n; i++) { for (int j = 1; j <= n; j++) { // if characters at index i and j matches // and the index is different if (X[i - 1] == X[j - 1] && i != j) lookup[i][j] = lookup[i - 1][j - 1] + 1; // else if characters at index i and j are different else lookup[i][j] = max(lookup[i - 1][j], lookup[i][j - 1]); } } // Uncomment below code to print the lookup table /* for (int i = 0; i <= m; i++) { for (int j = 0; j <= n; j++) cout << lookup[i][j] << " "; cout << endl; }*/ } // main function int main() { string X = "ATACTCGGA"; int n = X.length(); // fill lookup table LRSLength(X, n); // find Longest Repeating Subsequence cout << "The Longest Repeating Subsequence is " << LRS(X, n, n); return 0; } |

**Output: **

The Longest Repeating Subsequence is ATCG

**Exercise:** Write space optimized code for iterative version.

**Thanks for reading.**

Please use ideone or C++ Shell or any other online compiler link to post code in comments.

Like us? Please spread the word and help us grow. Happy coding 🙂