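String Processing: Brute Force and KMP

I. A brief introduction to strings

Example: POJ1488   http://poj.org/problem?id=1488

Problem: replace the double quotes in the text. Each opening quote becomes `` and each closing quote becomes ''.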

#include <iostream>
#include <cstring>
#include <cstdio>
using namespace std;

int main()
{
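    int c;            // int, not char, so EOF can be told apart from real characters
    bool flag=true;   // true: the next double quote is an opening quote
    //freopen("Atext.in","r",stdin);
    while((c=getchar())!=EOF){
        if(c=='"'){printf("%s",(flag? "``" : "''"));flag=!flag;}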
        else    printf("%c",c);
    }
    return 0;
}

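II. Pattern matching: a brief introduction to Brute Force and KMP

1. Brute Force algorithm

Example: POJ3080  http://poj.org/problem?id=3080 (enumeration, BF)

New function: strstr(str1,str2) checks whether the string str2 is a substring of str1. If it is, the function returns a pointer to the first occurrence of str2 in str1; otherwise it returns NULL.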
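A minimal sketch of what strstr does, next to a hand-written brute-force search that does the same job. The helper names strstrIndex and bruteForceFind are illustrative assumptions, not part of the solution below.

```cpp
#include <cstring>

// Hypothetical wrapper: turn the pointer returned by strstr into an index,
// or -1 when pat does not occur in text.
int strstrIndex(const char* text, const char* pat) {
    const char* hit = std::strstr(text, pat);
    return hit ? (int)(hit - text) : -1;
}

// Hypothetical helper: brute-force search. Try every start position i in
// text and compare pat character by character from there.
int bruteForceFind(const char* text, const char* pat) {
    int n = (int)std::strlen(text), m = (int)std::strlen(pat);
    for (int i = 0; i + m <= n; i++) {
        int j = 0;
        while (j < m && text[i + j] == pat[j]) j++;
        if (j == m) return i;   // all m characters matched at offset i
    }
    return -1;                  // no start position worked
}
```

For example, both strstrIndex("GATACCAGATA", "ACCA") and bruteForceFind("GATACCAGATA", "ACCA") give 3, and both give -1 for the pattern "TTT". The brute-force version needs up to n*m character comparisons in the worst case, which is no problem for 60-base sequences.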

Blue Jeans

Description

The Genographic Project is a research partnership between IBM and The National Geographic Society that is analyzing DNA from hundreds of thousands of contributors to map how the Earth was populated.

As an IBM researcher, you have been tasked with writing a program that will find commonalities amongst given snippets of DNA that can be correlated with individual survey information to identify new genetic markers.

A DNA base sequence is noted by listing the nitrogen bases in the order in which they are found in the molecule. There are four bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A 6-base DNA sequence could be represented as TAGACC.

Given a set of DNA base sequences, determine the longest series of bases that occurs in all of the sequences.

Input

Input to this problem will begin with a line containing a single integer n indicating the number of datasets. Each dataset consists of the following components:

  • A single positive integer m (2 <= m <= 10) indicating the number of base sequences in this dataset.
  • m lines each containing a single base sequence consisting of 60 bases.

Output

For each dataset in the input, output the longest base subsequence common to all of the given base sequences. If the longest common subsequence is less than three bases in length, display the string "no significant commonalities" instead. If multiple subsequences of the same longest length exist, output only the subsequence that comes first in alphabetical order.

Sample Input

3
2
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
3
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
GATACTAGATACTAGATACTAGATACTAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
GATACCAGATACCAGATACCAGATACCAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
3
CATCATCATCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ACATCATCATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AACATCATCATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

Sample Output

no significant commonalities
AGATAC
CATCATCAT

Source

South Central USA 2006

#include <iostream>
#include <cstring>
#include <cstdio>
using namespace std;

int main()
{
    //freopen("Atext.in","r",stdin);
    int n,m,len;
    char ans[70],s[15][65],tmp[65];
    cin >> n;
    while(n--){
        cin >> m;
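        int k=3,flag=0;        // k: length of the substring being enumerated
        ans[0]='\0',tmp[0]='\0',len=0;
        for(int i=0;i<m;i++)
            cin >> s[i];       // read the whole 60-base sequence; this null-terminates it, which strstr requires
        while(k<=60){   // try every substring length from 3 to 60
            for(int i=0;i<=60-k;i++){    // enumerate the start position of the length-k substring
                memset(tmp,0,sizeof(tmp));// the buffer must be cleared on every pass
                for(int j=i,t=0;j<i+k;j++)// the bound is i+k, not k (an easy mistake that is hard to spot)
                    tmp[t++]=s[0][j];
                for(int j=1;j<m;j++){
                    if(strstr(s[j],tmp)==NULL){flag=1;break;}// not a common substring: set the flag and stop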
                }
                if(flag==0){
                    if(k>len){strcpy(ans,tmp);len=k;}
                    else if(k==len&&strcmp(ans,tmp)>0){strcpy(ans,tmp);len=k;}
                }
                flag=0;
            }
            k++;
        }
        if(len!=0){
            for(int i=0;i<len;i++)
                cout << ans[i] ;
        }
        else
            cout << "no significant commonalities" ;
        cout << endl;
    }
    return 0;
}

2. KMP algorithm

 Example: POJ3461 Oulipo

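#include <iostream>
#include <cstdio>
#include <cstring>
#include <string>
const int maxn=10005;
using namespace std;
string s,t;     // s: text, t: pattern
int n,m;        // n: pattern length, m: text length
int nex[maxn];  // nex[j]: length of the longest proper prefix of t[0..j-1] that is also its suffix
// Build the next table from the pattern alone.
void getnex(){
    int j=0,k=-1;
    nex[0]=-1;
    while(j<n){
        if(k==-1||t[j]==t[k]){   // compare the pattern with itself, not with the text
            nex[++j]=++k;
        }else
            k=nex[k];
    }
}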
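// Count the occurrences of pattern t in text s (overlapping matches included).
int kmp(){
    int i=0,j=0,cnt=0;
    getnex();
    while(i<m){
        if(j==-1||s[i]==t[j]){
            i++;j++;
        }else
            j=nex[j];
        if(j==n){       // full match: count it, then fall back so overlapping matches are found
            cnt++;
            j=nex[j];
        }
    }
    return cnt;
}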
int main()
{
    int c;
    //freopen("Atext.in","r",stdin);
    ios::sync_with_stdio(false);    // with stdio synchronization turned off the solution passes; without this line it times out
    cin >> c;
    while(c--){
        int ans=0;
        cin >> t >> s;
        n=t.size();
        m=s.size();
        ans=kmp();
        cout << ans << endl;
    }
    return 0;
}
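To make the role of the next table concrete, here is a small self-contained sketch of the same idea, with the table built from the pattern alone. The names buildNext and countMatches are illustrative assumptions; they are not taken from the submission above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: next[j] is the length of the longest proper prefix of
// pat[0..j-1] that is also a suffix of it; next[0] = -1 acts as a sentinel.
std::vector<int> buildNext(const std::string& pat) {
    int n = (int)pat.size();
    std::vector<int> next(n + 1);
    int j = 0, k = -1;
    next[0] = -1;
    while (j < n) {
        if (k == -1 || pat[j] == pat[k]) next[++j] = ++k;  // extend the current border
        else k = next[k];                                  // fall back to a shorter border
    }
    return next;
}

// Hypothetical helper: count occurrences of pat in text, overlaps included.
int countMatches(const std::string& text, const std::string& pat) {
    std::vector<int> next = buildNext(pat);
    int n = (int)pat.size(), m = (int)text.size();
    int i = 0, j = 0, cnt = 0;
    while (i < m) {
        if (j == -1 || text[i] == pat[j]) { i++; j++; }
        else j = next[j];                    // mismatch: shift the pattern, keep i
        if (j == n) { cnt++; j = next[j]; }  // full match: count it and keep going
    }
    return cnt;
}
```

For the pattern "ABAB", buildNext yields -1, 0, 0, 1, 2. With the text "AZAZAZA" and the pattern "AZA", countMatches returns 3, because the matches at offsets 0, 2 and 4 overlap. The text index i never moves backwards, so the search runs in O(n + m) time.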

Original post: https://www.cnblogs.com/Cloud-king/p/8488514.html

Time: 2024-11-06 07:41:52
