Description
String matching is an important problem in computer science research and finds applications in bioinformatics, data mining, pattern recognition, Internet security, and many other areas.
The problem we consider here is a smaller version of it. You are given a string M and N other strings smaller in length than M. You have to find whether each of these N strings is a substring of M. All strings consist of only alphanumeric characters.
You are required to write C/C++ code to solve the problem.
Input
Input to the program consists of a series of lines. The first line contains the string M (no more than 100000 characters long). The next line contains an integer N (<1000), the number of query strings. Each of the next N lines contains a string S (no more than 2000 characters long).
Output
Output should consist of N lines, each with a single character 'Y' or 'N' indicating whether the corresponding string S is a substring of M.
Sample Input
Input:
abghABCDE
2
abAB
ab
Output:
N
Y
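For cross-checking answers on small inputs, a brute-force reference can be handy. The sketch below is not the intended solution (it can be too slow at the stated limits); it simply applies std::string::find to each query. The function name brute_force_answers is a hypothetical helper introduced here for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: answers[i] is 'Y' if queries[i] occurs somewhere
// in text, and 'N' otherwise. Intended only as a slow reference checker.
std::vector<char> brute_force_answers(const std::string& text,
                                      const std::vector<std::string>& queries) {
    std::vector<char> answers;
    answers.reserve(queries.size());
    for (const std::string& q : queries) {
        // std::string::find returns npos when q does not occur in text.
        answers.push_back(text.find(q) != std::string::npos ? 'Y' : 'N');
    }
    return answers;
}
```

For instance, with the text "abghABCDE" the query "ab" gives Y, while "abAB" gives N because the match is case-sensitive and the characters must be contiguous.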
Note: The test data for this problem consists not only of the official test cases from the contest but also of some cases of my own.
A test case was added on 25.7.2010; after rejudging, 3 users lost their Accepted status.
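•Problem: given a text string of length M (M ≤ 100000) and N (N ≤ 1000) pattern strings, each of length at most 2000, determine for each pattern whether it occurs in the text.
•Almost identical to the first example in Monday's lecture slides.
•Insert the patterns into an AC (Aho-Corasick) automaton and run the text string through it.
•1. Two pattern strings may be identical (a bit of a trap).
•2. One pattern may be a suffix of another pattern. In other words, if a node's fail pointer points to a "dangerous node" (a node where some pattern ends), then that node is itself a "dangerous node".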
#include <cstdio>
using namespace std;

const int maxn = 1000050;    // node pool size
const int sigma_size = 62;   // 26 upper + 26 lower + 10 digits (the statement says alphanumeric)

int ID[1010], tot;           // ID[i]: id assigned to the i-th pattern; tot: number of distinct patterns
char text[100050], word[2111];
bool flag[1010];             // flag[id]: the pattern with this id occurs in the text
int son[maxn][sigma_size], val[maxn], f[maxn], last[maxn], q[maxn], sz;

// Map an alphanumeric character to 0..61.
inline int idx(char c) {
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c <= 'Z') return c - 'A';
    return c - 'a' + 26;
}

// Insert a pattern and return its id. Identical patterns end at the same
// node and therefore share one id (pitfall 1).
int Insert(char *s) {
    int u = 0;
    for (int i = 0; s[i]; i++) {
        int v = idx(s[i]);
        if (!son[u][v]) son[u][v] = ++sz;
        u = son[u][v];
    }
    if (!val[u]) val[u] = ++tot;
    return val[u];
}

// Build fail links by BFS. last[v] is the nearest node on v's fail chain
// at which some pattern ends (pitfall 2).
void get_fail() {
    int rear = 0;
    f[0] = 0;
    for (int c = 0; c < sigma_size; c++) {
        int u = son[0][c];
        if (u) f[u] = last[u] = 0, q[rear++] = u;
    }
    for (int _i = 0; _i < rear; _i++) {
        int u = q[_i];
        for (int c = 0; c < sigma_size; c++) {
            int v = son[u][c];
            if (!v) { son[u][c] = son[f[u]][c]; continue; }
            q[rear++] = v;
            int x = f[u];
            while (x && !son[x][c]) x = f[x];
            f[v] = son[x][c];
            last[v] = val[f[v]] ? f[v] : last[f[v]];
        }
    }
}

// Mark every pattern that ends at u or at a node reachable through last links.
void print(int u) {
    while (u) {
        flag[val[u]] = true;
        u = last[u];
    }
}

// Run the text through the automaton.
void Find(char *s) {
    int j = 0;
    for (int i = 0; s[i]; i++) {
        int c = idx(s[i]);
        while (j && !son[j][c]) j = f[j];
        j = son[j][c];
        print(j);
    }
}

int main() {
    scanf("%100049s", text);   // replaces gets(), which was removed from the language
    int n;
    scanf("%d", &n);
    for (int i = 1; i <= n; i++) {
        scanf("%2110s", word);
        ID[i] = Insert(word);
    }
    get_fail();                // must run before Find(); the original never called it
    Find(text);
    for (int i = 1; i <= n; i++) {
        if (flag[ID[i]]) puts("Y");
        else puts("N");
    }
    return 0;
}
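The two pitfalls noted above can also be handled without the last links. The following is a minimal sketch of that variant, not the author's method: it marks every automaton node the text visits, then pushes those marks along fail links from deeper nodes to shallower ones, so a pattern that is a suffix of a visited prefix is also reported. Duplicate patterns simply share an end node. The names symbol, Node and match_all are hypothetical, and the sketch assumes non-empty, purely alphanumeric strings.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Map an alphanumeric character to 0..61; -1 for anything else.
int symbol(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    if (c >= 'a' && c <= 'z') return c - 'a' + 36;
    return -1;
}

struct Node {
    std::array<int, 62> next{};  // 0 = no child (until the BFS fills the gaps)
    int fail = 0;                // fail link; the root is node 0
    bool seen = false;           // set when the text reaches this node
};

// found[i] is true iff patterns[i] occurs in text.
std::vector<bool> match_all(const std::string& text,
                            const std::vector<std::string>& patterns) {
    std::vector<Node> trie(1);
    std::vector<int> end_node(patterns.size());

    // Build the trie. Identical patterns end at the same node (pitfall 1).
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        int u = 0;
        for (char ch : patterns[i]) {
            int c = symbol(ch);
            assert(c >= 0);  // assumption: alphanumeric input only
            if (trie[u].next[c] == 0) {
                int v = static_cast<int>(trie.size());
                trie.emplace_back();
                trie[u].next[c] = v;
            }
            u = trie[u].next[c];
        }
        end_node[i] = u;
    }

    // BFS: compute fail links and turn missing edges into fallback edges.
    std::vector<int> order;  // nodes in non-decreasing depth
    std::queue<int> bfs;
    for (int c = 0; c < 62; ++c)
        if (trie[0].next[c]) bfs.push(trie[0].next[c]);
    while (!bfs.empty()) {
        int u = bfs.front();
        bfs.pop();
        order.push_back(u);
        for (int c = 0; c < 62; ++c) {
            int v = trie[u].next[c];
            if (v == 0) {
                trie[u].next[c] = trie[trie[u].fail].next[c];
            } else {
                trie[v].fail = trie[trie[u].fail].next[c];
                bfs.push(v);
            }
        }
    }

    // Scan the text once, marking each state reached.
    int state = 0;
    for (char ch : text) {
        int c = symbol(ch);
        assert(c >= 0);
        state = trie[state].next[c];
        trie[state].seen = true;
    }

    // Deepest first: a visited node implies its fail target was matched too
    // (pitfall 2: a pattern that is a suffix of another string in the trie).
    for (auto it = order.rbegin(); it != order.rend(); ++it)
        if (trie[*it].seen) trie[trie[*it].fail].seen = true;

    std::vector<bool> found(patterns.size());
    for (std::size_t i = 0; i < patterns.size(); ++i)
        found[i] = trie[end_node[i]].seen;
    return found;
}
```

With the text "xabc" and the patterns "abc", "bc", "c", "bcd", "bc", the scan itself only reaches the node for "abc"; the propagation step is what marks "bc" and "c", and both copies of "bc" read the same node. A std::vector of nodes is used here so memory grows with the actual trie size rather than being reserved up front.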
Problem A SPOJ SUB_PROB