Shortest Common Supersequence | Finding all SCS

The shortest common supersequence (SCS) is the problem of finding the shortest supersequence Z of given sequences X and Y such that both X & Y are subsequences of Z.

 

For example,


Consider the two following sequences X and Y
X: ABCBDAB
Y: BDCABA

The length of SCS is 9
SCS are ABCBDCABA, ABDCABDAB and ABDCBDABA

 


 

1. Finding a Shortest Common Supersequence –

 
The idea is to fill lookup table of SCS by finding the length of SCS of given sequence X[0..m-1] and Y[0..n-1] in bottom-up manner and then recursively find the SCS in top-down manner using values in the lookup table i.e. we follow the path from the bottom right corner of lookup table to its top left corner when reading out an SCS.

  • If the current characters of X and Y are equal, then they are part of the SCS and we move diagonally in the lookup table (i.e. we recurse to find SCS of sequence X[0..m-2], Y[0..n-2]).
  •  

  • If the current characters of X and Y are different, we go up or left, depending on which cell has a lower number.
    1. if the top cell of current cell has less value than the left cell, then we include current character of the sequence X in SCS and recurse to find SCS of sequence X[0..m-2], Y[0..n-1]
    2. if the left cell of current cell has less value than the top cell, then we include current character of the sequence Y in SCS and recurse to find SCS of sequence X[0..m-1], Y[0..n-2]
       
  • If we have reached the end of either sequence, then the other sequence forms part of the SCS

C++

Download   Run Code

Output:

The Shortest Common Supersequence of ABCBDAB and BDCABA is ABCBDCABA

The time complexity of above solution is O(nm) and auxiliary space used by the program is O(mn).

 


 

2. Finding All Shortest Common Supersequence –

The Shortest Common Supersequence for two strings is not unique and above code doesn’t generate all SCS of the given sequences. It will always print one SCS. The problem with the above code is when the last character of X and Y are different, we either go up or left in the lookup table depending upon which side has smaller value.
 

How can we modify the code so it prints all possible SCS?

 

The idea remains the same except if the top cell is equal to the left cell, then we consider both top and left directions else we go in direction of smaller value.

In other words, if the last characters of X and Y are not same, then SCS can be constructed from either top side or from the left side of lookup table depending upon which side has smaller value. If both the values are equal, then SCS can be constructed from both directions in the lookup table. So we go in direction of smaller value or go in both directions if the values are equal.

C++

Download   Run Code

Output:

All Shortest Common Supersequence of ABCBDAB and BDCABA are –

ABCBDCABA
ABDCABDAB
ABDCBDABA

 
Thanks for reading.




Please use ideone or C++ Shell or any other online compiler link to post code in comments.
Like us? Please spread the word and help us grow. Happy coding 🙂
 





Leave a Reply

Notify of
avatar
Sort by:   newest | oldest | most voted
prince vishal
prince vishal

what is the complexity of this code

wpDiscuz