kuangbin Topic 16: KMP && Extended KMP, POJ3080 Blue Jeans

The Genographic Project is a research partnership between IBM and The National Geographic Society that is analyzing DNA from hundreds of thousands of contributors to map how the Earth was populated.

As an IBM researcher, you have been tasked with writing a
program that will find commonalities amongst given snippets of DNA that
can be correlated with individual survey information to identify new
genetic markers.

A DNA base sequence is noted by listing the nitrogen bases in
the order in which they are found in the molecule. There are four
bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A 6-base
DNA sequence could be represented as TAGACC.

Given a set of DNA base sequences, determine the longest series of bases that occurs in all of the sequences.

Input

Input to this problem will begin with a line containing a single
integer n indicating the number of datasets. Each dataset consists of
the following components:

  • A single positive integer m (2 <= m <= 10) indicating the number of base sequences in this dataset.
  • m lines each containing a single base sequence consisting of 60 bases.

Output

For each dataset in the input, output the longest base
subsequence common to all of the given base sequences. If the longest
common subsequence is less than three bases in length, display the
string "no significant commonalities" instead. If multiple subsequences
of the same longest length exist, output only the subsequence that comes
first in alphabetical order.

Sample Input

3
2
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
3
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
GATACTAGATACTAGATACTAGATACTAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
GATACCAGATACCAGATACCAGATACCAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
3
CATCATCATCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ACATCATCATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AACATCATCATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

Sample Output

no significant commonalities
AGATAC
CATCATCAT

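My first impression was that brute force would pass, but I did not write it right away. I wanted to use KMP but did not know where to start, so I studied how others solved it.

The brute-force approach enumerates every substring of the first string and checks whether it occurs in each of the other strings. One useful trick is to enumerate lengths from longest to shortest, so the search can stop at the first length that yields a common substring.

Brute force: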

#include <cstdio>
#include <iostream>
#include <set>
#include <string>
#include <vector>
using namespace std;

vector<string> t;   // the sequences of the current dataset
set<string> ss;     // common substrings of the current length, kept sorted
string s;
int _, n;

string fun() {
    ss.clear();
    const string& str = t[0];
    // Try lengths from longest (60 bases) to shortest; the first length with a hit wins.
    for (int len = 60; len >= 3; len--) {
        for (int ix = 0; ix <= 60 - len; ix++) {
            string temp = str.substr(ix, len);
            bool flag = true;
            for (size_t k = 1; k < t.size(); k++) {
                if (t[k].find(temp) == string::npos) {   // not found in sequence k
                    flag = false;
                    break;
                }
            }
            if (flag) ss.insert(temp);
        }
        // A set iterates in sorted order, so begin() is the alphabetically first hit.
        if (!ss.empty()) return *ss.begin();
    }
    return "no significant commonalities";
}

int main() {
    // freopen("in", "r", stdin);
    for (scanf("%d", &_); _; _--) {
        scanf("%d", &n);
        for (int i = 0; i < n; i++) {
            cin >> s;
            t.push_back(s);
        }
        cout << fun() << endl;
        t.clear();
    }
    return 0;
}
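The output rule from the problem statement (longest first, alphabetical order on ties, and a minimum of three bases) can be looked at on its own. The following is only a minimal sketch: `chooseAnswer` is a hypothetical helper that does not appear in the original solutions, and it assumes the caller has already collected substrings that are common to every sequence.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the original solutions): applies the
// problem's output rule to substrings already known to occur in every sequence.
std::string chooseAnswer(const std::vector<std::string>& candidates) {
    std::string best;
    for (const std::string& c : candidates) {
        // Prefer the longer string; on equal length prefer the alphabetically smaller one.
        if (c.size() > best.size() || (c.size() == best.size() && c < best)) {
            best = c;
        }
    }
    // Fewer than three bases does not count as a significant commonality.
    if (best.size() < 3) return "no significant commonalities";
    return best;
}
```

For instance, given the candidates "GATACC" and "AGATAC", both of length six, the helper returns "AGATAC" because it sorts first. The brute-force listing gets the same effect from the sorted order of `std::set`.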

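KMP idea: there is no need to enumerate every substring of the first string. It is enough to enumerate each of its suffixes and match that suffix against the other strings. Every substring is a prefix of some suffix, so while matching a suffix KMP also reports the longest prefix of it that occurs in the other string, which covers all substrings starting at that position.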

#include <algorithm>
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int _, n, Next[61];   // one slot past the longest pattern (60 bases)
string s, strans;
vector<string> t;

// Build the (optimized) failure table of pattern s into Next[0..len].
void prekmp(const string& s) {
    int len = s.size();
    int i = 0, j;
    j = Next[0] = -1;
    while (i < len) {
        while (j != -1 && s[i] != s[j]) j = Next[j];
        // s[len] reads as '\0' for std::string, so s[++i] is safe when i reaches len.
        if (s[++i] == s[++j]) Next[i] = Next[j];
        else Next[i] = j;
    }
}

// Return the length of the longest prefix of p that occurs anywhere in text.
// Next must already hold the table built for p.
int kmp(const string& p, const string& text) {
    int len = text.size();
    int i = 0, j = 0, res = -1;
    while (i < len) {
        while (j != -1 && text[i] != p[j]) j = Next[j];
        ++i; ++j;
        res = max(res, j);
    }
    return res;
}

int main() {
    // freopen("in", "r", stdin);
    for (scanf("%d", &_); _; _--) {
        scanf("%d", &n);
        for (int i = 0; i < n; i++) {
            cin >> s;
            t.push_back(s);
        }
        int ans = -1;
        string str = t[0];
        for (int i = 0; i < 60; i++) {
            string temp = str.substr(i, 60 - i);   // suffix of the first string starting at i
            prekmp(temp);
            int maxx = 60;
            // The shortest match over all other strings is what every string shares.
            for (size_t j = 1; j < t.size(); j++) {
                maxx = min(maxx, kmp(temp, t[j]));
            }
            if (maxx > ans) {
                strans = temp.substr(0, maxx);
                ans = maxx;
            } else if (maxx == ans) {
                // Same length: keep the alphabetically smaller candidate.
                string anstemp = temp.substr(0, maxx);
                if (anstemp < strans) strans = anstemp;
            }
        }
        if (strans.size() < 3) cout << "no significant commonalities" << '\n';
        else cout << strans << '\n';
        t.clear();
    }
    return 0;
}
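The matching step that the KMP listing relies on, "how long a prefix of this suffix occurs in that string", can be isolated and tried on small inputs. The following is only a sketch under stated assumptions: it uses the standard (unoptimized) failure table rather than the optimized `Next` array above, and the names `buildFailure` and `longestPrefixMatch` are illustrative, not taken from the original post.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Standard failure table (an assumption; the listing above uses an optimized
// variant): fail[i] is the length of the longest proper border of
// pattern[0..i), with fail[0] = -1 as a sentinel.
std::vector<int> buildFailure(const std::string& pattern) {
    std::vector<int> fail(pattern.size() + 1, 0);
    fail[0] = -1;
    int j = -1;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        while (j != -1 && pattern[i] != pattern[j]) j = fail[j];
        fail[i + 1] = ++j;
    }
    return fail;
}

// Length of the longest prefix of `pattern` that occurs somewhere in `text`.
int longestPrefixMatch(const std::string& pattern, const std::string& text) {
    if (pattern.empty()) return 0;
    const std::vector<int> fail = buildFailure(pattern);
    const int m = static_cast<int>(pattern.size());
    int j = 0, best = 0;
    for (char c : text) {
        if (j == m) j = fail[j];   // whole pattern matched: fall back before comparing
        while (j != -1 && c != pattern[j]) j = fail[j];
        ++j;                       // j is now the matched prefix length ending at c
        best = std::max(best, j);
    }
    return best;
}
```

For example, `longestPrefixMatch("CATG", "AACATT")` evaluates to 3, because "CAT" occurs in the text but "CATG" does not. Taking the minimum of this value over all other sequences gives the longest substring, starting at that suffix, that every sequence shares.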

Original article: https://www.cnblogs.com/ACMerszl/p/10290154.html

Time: 2024-10-08 11:32:05
