In the previous lecture and practical, we introduced a computational problem called the shortest common superstring problem: given a set of input reads, we look for the shortest string, called a superstring, that contains all of the input strings as substrings. This formulation had a downside, though, which is that no efficient solutions to this problem are known. We saw a solution, but it was very slow because it involved trying every possible ordering of the n input strings, and there are n factorial such orderings.

So in this lecture we'll see an alternative that's actually much faster. It's called greedy shortest common superstring. We call this algorithm greedy because it makes a series of decisions, and at each decision point it chooses the option that reduces the length of the eventual superstring the most. This seems like a good strategy: the shorter the superstring, the closer we are to the shortest common superstring. However, as we'll see, making the greedy decision at each point in the algorithm does not necessarily mean that we'll get to an optimal solution.

We can visualize the greedy shortest common superstring algorithm using an overlap graph. So let's start with this overlap graph here. Remember that the nodes of the overlap graph correspond to reads, and the edges of the overlap graph correspond to overlaps, or suffix-prefix matches, between pairs of reads. Each edge here is labeled with a number that gives the length of the overlap, that is, the length of the suffix-prefix match.

So the principle behind the greedy shortest common superstring algorithm is that we're going to proceed in rounds. In each round, we're going to pick the edge that represents the longest remaining overlap in the graph. In other words, we're going to pick the edge that has the greatest number as its label. And then we're going to merge the nodes on either side of that edge.
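
The round-by-round procedure described above can be sketched in Python roughly as follows. This is a minimal illustration, not the lecture's own code: the function names `overlap`, `pick_maximal_overlap`, and `greedy_scs`, and the minimum overlap length `k`, are assumptions introduced here. Ties between equally long overlaps are broken by whichever pair is encountered first.

```python
from itertools import permutations


def overlap(a, b, min_length=1):
    """Return the length of the longest suffix of a that matches a prefix
    of b, provided it is at least min_length long; otherwise return 0."""
    start = 0
    while True:
        # Look for the first min_length characters of b somewhere in a.
        start = a.find(b[:min_length], start)
        if start == -1:
            return 0
        # Check whether the suffix of a starting here is a prefix of b.
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1


def pick_maximal_overlap(reads, k):
    """Return the ordered pair of reads with the longest suffix-prefix
    overlap (the edge with the greatest label) and that overlap length."""
    best_a, best_b, best_len = None, None, 0
    for a, b in permutations(reads, 2):
        olen = overlap(a, b, min_length=k)
        if olen > best_len:
            best_a, best_b, best_len = a, b, olen
    return best_a, best_b, best_len


def greedy_scs(reads, k=1):
    """Greedy shortest common superstring: in each round, merge the two
    reads joined by the longest remaining overlap, until none is left."""
    reads = list(reads)  # work on a copy
    read_a, read_b, olen = pick_maximal_overlap(reads, k)
    while olen > 0:
        reads.remove(read_a)
        reads.remove(read_b)
        # Merging the two nodes: append the non-overlapping part of read_b.
        reads.append(read_a + read_b[olen:])
        read_a, read_b, olen = pick_maximal_overlap(reads, k)
    # Any reads left have no overlaps of length >= k; just concatenate them.
    return ''.join(reads)
```

As a small check of the caveat about optimality, `greedy_scs(['ABCD', 'CDBC', 'BCDA'])` returns a 9-character superstring, `'CDBCABCDA'`, even though the 8-character superstring `'ABCDBCDA'` also contains all three reads. The greedy choice of the length-3 overlap in the first round rules out the better arrangement.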