zoj 1076 Gene Assembly

Gene Assembly


Time Limit: 2 Seconds      Memory Limit: 65536 KB


Statement of the Problem

With the large amount of genomic DNA sequence data being made available, it is becoming more important to find genes (parts of the genomic DNA which are responsible for the synthesis of proteins) in these sequences. It is known that for eukaryotes (in contrast to prokaryotes) the process is more complicated, because of the presence of junk DNA that interrupts the coding region of genes in the genomic sequence. That is, a gene is composed of several pieces (called exons) of coding regions. It is known that the order of the exons is maintained in the protein synthesis process, but the number of exons and their lengths can be arbitrary.

Most gene finding algorithms have two steps: in the first they search for possible exons; in the second they try to assemble a largest possible gene, by finding a chain with the largest possible number of exons. This chain must obey the order in which the exons appear in the genomic sequence. We say that exon i appears before exon j if the end of i precedes the beginning of j.
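The ordering rule above can be sketched as a small predicate plus a chain checker. This is only an illustrative sketch: the names Interval, precedes and isValidChain are hypothetical (they do not come from the problem statement), and "precedes" is read here as a strict comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one candidate exon (names are illustrative).
struct Interval {
    int start;
    int end;
};

// True when exon a appears before exon b, i.e. the end of a
// strictly precedes the beginning of b (assumed strict reading).
bool precedes(const Interval& a, const Interval& b) {
    return a.end < b.start;
}

// True when every consecutive pair in the chain respects the rule.
// `chain` holds 1-based exon numbers, in the same style as the output.
bool isValidChain(const std::vector<Interval>& exons,
                  const std::vector<int>& chain) {
    for (std::size_t k = 0; k < chain.size(); ++k) {
        // Every number must refer to an exon that exists.
        assert(chain[k] >= 1 && chain[k] <= static_cast<int>(exons.size()));
        if (k > 0 && !precedes(exons[chain[k - 1] - 1], exons[chain[k] - 1]))
            return false;
    }
    return true;
}
```

A checker like this is handy because any longest chain is accepted, so comparing program output to the sample text character by character is not a reliable test.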

The objective of this problem is, given a set of possible exons, to find the chain with the largest possible number of exons that could be assembled to generate a gene.

Input Format

Several input instances are given. Each instance begins with the number 0 < n < 1000 of possible exons in the sequence. Then, each of the next n lines contains a pair of integer numbers that represent the positions at which the exon starts and ends in the genomic sequence. You can suppose that the genomic sequence has at most 50000 bases. The input ends with a line with a single 0.
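Reading this format can be sketched as a helper that consumes one instance at a time from any stream. The function name readInstance and the Exon layout are assumptions made for illustration; the 1-based index is stored because the output refers to exons by their input order.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical record: positions plus the 1-based input order.
struct Exon {
    int start;
    int end;
    int index;
};

// Reads one instance into `exons`.
// Returns false on the terminating 0, on end of input, or on malformed data.
bool readInstance(std::istream& in, std::vector<Exon>& exons) {
    exons.clear();
    int n = 0;
    if (!(in >> n) || n <= 0)
        return false;
    exons.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        Exon e = {0, 0, i + 1};
        if (!(in >> e.start >> e.end))
            return false;
        exons.push_back(e);
    }
    return true;
}
```

A caller would typically loop with `while (readInstance(std::cin, exons)) { ... }`, which stops at the line containing the single 0.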

Output Format

For each input instance your program should print in one line the chain with the largest possible number of exons, by enumerating the exons in the chain. If there is more than one chain with the same number of exons, your program can print any one of them.

Sample Input

6
340 500
220 470
100 300
880 943
525 556
612 776
3
705 773
124 337
453 665
0

Sample Output

3 1 5 6 4
2 3 1

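This is similar to the activity-selection (interval scheduling) problem: sort the exons first, then make a single pass over them.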

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

using namespace std;

struct Exon
{
    int start, end;
    int index; // 1-based position in the input
};

// Order exons by end position; ties are broken by start position.
bool cmp(const Exon& e1, const Exon& e2)
{
    if (e1.end == e2.end)
        return e1.start < e2.start;

    return e1.end < e2.end;
}

int main()
{
    int n;

    while (cin >> n && n)
    {
        vector<Exon> vec;
        for (int i = 0; i < n; ++i)
        {
            Exon exon;
            cin >> exon.start >> exon.end;
            exon.index = i + 1;
            vec.push_back(exon);
        }
        sort(vec.begin(), vec.end(), cmp);

        // Greedy pass: among the exons that start after the last chosen
        // exon ends, the one that ends earliest is always taken next.
        vector<int> chain;
        int lastEnd = 0;
        for (size_t i = 0; i < vec.size(); ++i)
        {
            // Strict comparison: the end of the previous exon must
            // precede the beginning of the next one.
            if (chain.empty() || vec[i].start > lastEnd)
            {
                chain.push_back(vec[i].index);
                lastEnd = vec[i].end;
            }
        }

        // Print the chain on one line, without a trailing space.
        for (size_t i = 0; i < chain.size(); ++i)
        {
            if (i > 0)
                cout << " ";
            cout << chain[i];
        }
        cout << endl;
    }

    return 0;
}
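Since any longest chain is accepted, a solution is easier to validate by chain length than by exact output. The following is a minimal sketch of a slower O(n^2) dynamic-programming reference that returns only the optimal length. The function name longestChainLength is hypothetical, and the sketch assumes each exon satisfies start <= end and that "precedes" is strict.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Compares (start, end) pairs by end position.
static bool byEnd(const std::pair<int, int>& a, const std::pair<int, int>& b) {
    return a.second < b.second;
}

// Length of the longest chain in which each exon ends strictly before
// the next one starts. Each pair is (start, end); taken by value so the
// caller's vector is not reordered.
int longestChainLength(std::vector<std::pair<int, int> > exons) {
    // After sorting by end, any valid predecessor of exon i has a smaller index.
    std::sort(exons.begin(), exons.end(), byEnd);

    std::vector<int> best(exons.size(), 1); // best[i]: longest chain ending at i
    int answer = 0;
    for (std::size_t i = 0; i < exons.size(); ++i) {
        for (std::size_t j = 0; j < i; ++j) {
            if (exons[j].second < exons[i].first)
                best[i] = std::max(best[i], best[j] + 1);
        }
        answer = std::max(answer, best[i]);
    }
    return answer;
}
```

For the two sample instances this gives 5 and 3. A small case worth trying against any greedy solution is (1,2), (3,100), (4,5), (6,7), where the optimum is 3 and a long early exon can trap strategies that extend chains too eagerly.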
Time: 2024-12-11 17:23:27
