Longest Common Subsequence of K-sequences

The longest common subsequence (LCS) problem is the problem of finding the longest subsequence that is present in given two sequences in the same order. i.e. find a longest sequence which can be obtained from the first original sequence by deleting some items, and from the second original sequence by deleting other items.


 

The problem differs from problem of finding common substrings. Unlike substrings, subsequences are not required to occupy consecutive positions within the original sequences.
 

Prerequisite: Longest Common Subsequence | Introduction & LCS Length

 
In previous post, we have discussed Longest Common Subsequence of two sequences. In this post, we will extend the solution to three sequences. The proposed solution can be easily generalized to handle more than three sequences as well.

For example, consider the three following sequences X, Y and Z


X: ABCBDAB
Y: BDCABA
Z: BADACB

 
The length of LCS is 4 and LCS is BDAB

 


 

We have seen in the previous post, LCS problem has an optimal substructure which means that the main problem can be broken down into smaller, simple “subproblems”, which can further be divided into yet simpler subproblems..
 
1. Let us consider given sequences X, Y and Z of length i, j and k that both end in the same element. To find their LCS, shorten each sequence by removing the last element, find the LCS of the shortened sequences, and to that LCS append the removed element. So if X[i] = Y[j] = Z[k], we can say that,

LCS(X[1..i], Y[1..j], Z[1..k]) = LCS(X[1..i-1], Y[1..j-1], Z[1..k-1]) + X[i]

 
2. Now suppose that the given sequences do not end in the same symbol. Then the LCS of X, Y, Z is the longer of the given sequences

LCS(X[1..i-1], Y[1..j], Z[1..k]),
LCS(X[1..i], Y[1..j-1], Z[1..k])
and
LCS(X[1..i], Y[1..j], Z[1..k-1])

.
 

Below solution finds the length of LCS of sequences X[0..m-1], Y[0..n-1] and Z[0..o-1] recursively by using optimal substructure property of LCS problem.

 
C++ implementation –
 

Download   Run Code

Output:

The length of LCS is 4

The worst case time complexity of above solution is O(3(m+n+o)). The worst case happens when there is no common subsequence present in X, Y, Z (i.e. LCS length is 0) and each recursive call will end up in three recursive calls.

 


 

The LCS problem exhibits overlapping subproblems. A problem is said to have overlapping subproblems if the recursive algorithm for the problem solves the same subproblem over and over rather than always generating new subproblems. We know that problems having optimal substructure and overlapping subproblems can be solved by dynamic programming, in which subproblem solutions are memoized rather than computed again and again. This method is illustrated below –

 
Memoized C++ implementation –
 

Download   Run Code

Output:

The length of LCS is 4

The time complexity of above solution is O(mno) and auxiliary space used by the program is O(mno).

 


 

Above Memoized version follows the top-down approach, since we first break the problem into subproblems and then calculate and store values. We can also solve this problem in bottom-up manner. In the bottom-up approach, we calculate the smaller values of LCS(i, j, k) first, then build larger values from them.


               | 0                                    if i == 0 or j == 0 or k == 0
LCS[i][j][k] = | LCS[i-1][j-1][k-1] + 1               if X[i-1] == Y[j-1] == Z[k-1]
               | longest(LCS[i-1][j][k], LCS[i][j-1][k],    otherwise
                        LCS[i][j][k-1])

 
C++ implementation –
 

Download   Run Code

Output:

The length of LCS is 4

 
The time complexity of above solution is O(mno) and auxiliary space used by the program is O(mno).

 
References: https://en.wikipedia.org/wiki/Longest_common_subsequence_problem

 
Thanks for reading.




Please use ideone or C++ Shell or any other online compiler link to post code in comments.
Like us? Please spread the word and help us grow. Happy coding 🙂
 





Leave a Reply

Notify of
avatar
wpDiscuz