HDU - 3407 - String-Matching Automata

First, the problem statement:

String-Matching Automata

Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 215    Accepted Submission(s): 140

Problem Description

The finite state automaton (FSA) is an important model of behavioral investigations in computer science, linguistics and many other areas. An FSA can typically be modeled as a string pattern recognizer described by a quintuple <Σ, S, s0, δ, F>, where:

Σ is the input alphabet (a finite nonempty set of symbols).
S is a finite nonempty set of states.
s0 is an element in S designated as the initial state.
δ is a function δ: S × Σ → S known as the transition function.
F is a (possibly empty) subset of S whose elements are designated as the final states.

An FSA with the above description operates as follows:

At the beginning, the automaton starts in the initial state s0.
The automaton continuously reads symbols from its input, one symbol at a time, and transits between states according to the transition function δ. To be specific, let s be the current state and w the symbol just read, the automaton moves to the state given by δ(s, w).
When the automaton reaches the end of the input, if the current state belongs to F, the string consisting sequentially of the symbols read by the automaton is declared accepted, otherwise it is declared rejected.

Just as the name implies, string-matching automata are FSAs used for string matching, and they are very efficient: they examine each text character exactly once, taking constant time per character. The matching time used (after the automaton is built) is therefore Θ(n). However, the time to build the automaton can be large.

Precisely, there is a string-matching automaton for every pattern P that you search for in a given text string T. For a given pattern of length m, the corresponding automaton has (m + 1) states {q0, q1, …, qm}: q0 is the start state, qm is the only final state, and for each i in {0, 1, …, m}, if the automaton reaches state qi, it means the length of the longest prefix of P that is also a suffix of the input string is i. When we reach state qm, P is a suffix of the input read so far, which means we have found an occurrence of P.

The following graph shows a string-matching automaton for the pattern “ababaca”, and illustrates how the automaton works given an input string “abababacaba”.


Apparently, the matching process using string-matching automata is quite simple (also efficient). However, building the automaton efficiently seems to be tough, and that’s your task in this problem.

Input

Several lines, each containing one pattern consisting only of lowercase alphabetic characters. The length of the longest pattern is 10000. The input ends with a separate line of ‘0’.

Output

For each pattern, the output should contain (m + 1) lines (m is the length of the pattern). The nth line describes how the automaton changes its state from state (n-1) after reading a character. It starts with the state number (n – 1), and then 26 state numbers follow. The 1st state number p1 indicates that when the automaton is in state (n-1), it will transit to state p1 after reading the character ‘a’. The 2nd state number p2 indicates that when the automaton is in state (n-1), it will transit to state p2 after reading the character ‘b’, and so on.

Sample Input

ababaca
0

Sample Output

0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

2 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

3 1 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

4 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

5 1 4 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

6 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

7 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Source

2010 National Programming Invitational Contest Host by ZSTU

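  Problem summary: given a pattern string, for every matched position (that is, every state of the automaton), output the state the automaton moves to when the next character read is each of 'a' through 'z'.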

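  One solution is KMP plus a little extra work. Another is to write an Aho-Corasick automaton directly: after building it, walk along the inserted word in the trie and, for each node on that path, print that node's transitions next[node]['a'..'z'].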
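The KMP-based variant is not spelled out in the post, so the following is only a sketch of one way it could look, not the author's code. The function name `buildAutomaton` and the `delta` table are hypothetical. It assumes the pattern contains only lowercase letters and relies on the standard observation that the row of state q equals the row of its failure state, with the single forward edge on p[q] overwritten.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the (m+1) x 26 transition table of the
// string-matching automaton for pattern p (lowercase letters only).
std::vector<std::vector<int>> buildAutomaton(const std::string& p) {
    const int m = static_cast<int>(p.size());
    // fail[q] = length of the longest proper prefix of p[0..q) that is
    // also a suffix of it (the KMP failure function); fail[0] = fail[1] = 0.
    std::vector<int> fail(m + 1, 0);
    std::vector<std::vector<int>> delta(m + 1, std::vector<int>(26, 0));
    if (m == 0) return delta;

    // State 0: only the first pattern character moves forward.
    delta[0][p[0] - 'a'] = 1;

    for (int q = 1; q <= m; ++q) {
        // For q >= 2 the failure state is found by running the already
        // completed row of fail[q-1] on the character p[q-1].
        if (q >= 2) fail[q] = delta[fail[q - 1]][p[q - 1] - 'a'];
        // On a mismatch, state q behaves exactly like its failure state.
        delta[q] = delta[fail[q]];
        // On a match, it advances to q + 1 (there is no such edge from m).
        if (q < m) delta[q][p[q] - 'a'] = q + 1;
    }
    return delta;
}
```

Each row is produced by one 26-element copy, so the table takes O(26·m) time and memory. For the sample pattern "ababaca", row 5 comes out as 1, 4, 6 in columns 'a', 'b', 'c', in line with the sample output.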

The code:

#include <cstdio>
#include <cstring>
#include <queue>
#define MAX 10002
using namespace std;

// Aho-Corasick automaton holding a single pattern. Because only one word is
// inserted, node k is the state "k characters matched", and after build()
// next[][] is the full transition table of the string-matching automaton.
struct Trie{
    int next[MAX][26],fail[MAX],end[MAX];
    int root,L;

    // Allocate a fresh node with no children.
    int newnode(){
        for(int i=0;i<26;i++) next[L][i]=-1;
        end[L++]=0;
        return L-1;
    }
    void init(){
        L=0;    root=newnode();
    }

    // Insert the pattern as a single path from the root.
    void insert(char buf[]){
        int len=strlen(buf);
        int now=root;
        for(int i=0;i<len;i++){
            if(next[now][buf[i]-'a']==-1){
                next[now][buf[i]-'a']=newnode();
            }
            now=next[now][buf[i]-'a'];
        }
        end[now]++;
    }

    // BFS over the trie: set the fail links and replace every missing edge
    // with the corresponding edge of the fail node.
    void build(){
        queue<int> Q;
        fail[root]=root;
        for(int i=0;i<26;i++){
            if(next[root][i]==-1) next[root][i]=root;
            else{
                fail[next[root][i]]=root;
                Q.push(next[root][i]);
            }
        }
        while(!Q.empty()){
            int now=Q.front();
            Q.pop();
            for(int i=0;i<26;i++){
                if(next[now][i]==-1) next[now][i]=next[fail[now]][i];
                else{
                    fail[next[now][i]]=next[fail[now]][i];
                    Q.push(next[now][i]);
                }
            }
        }
    }

    // Walk along the pattern and print one table row per state.
    void print(char buf[]){
        int len=strlen(buf);
        int now=root;
        for(int i=0;i<=len;i++){
            printf("%d",i);
            for(int j=0;j<26;j++) printf(" %d",next[now][j]);
            printf("\n");
            // Do not advance past the last state: buf[len] is '\0', and
            // indexing with it would read outside the array.
            if(i<len) now=next[now][buf[i]-'a'];
        }
    }
};

Trie ac;
char s[MAX];

int main()
{
    //freopen("data.txt","r",stdin);
    // Stop on the terminating "0" line, and also on EOF or a read error.
    while(scanf("%s",s)==1 && strcmp(s,"0")!=0){
        ac.init();
        ac.insert(s);
        ac.build();
        ac.print(s);
    }
    return 0;
}
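To see how a finished table is used for matching, here is a small sketch that hardcodes the transitions from the sample output for "ababaca" and feeds it the text "abababacaba" from the problem statement. The names `kSample` and `traceStates` are hypothetical, and only the columns for 'a', 'b' and 'c' are stored because every other column in the sample is 0.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Columns 'a', 'b', 'c' of the sample output for "ababaca"; every other
// letter leads back to state 0, so it is not stored here.
static const int kSample[8][3] = {
    {1, 0, 0}, {1, 2, 0}, {3, 0, 0}, {1, 4, 0},
    {5, 0, 0}, {1, 4, 6}, {7, 0, 0}, {1, 2, 0}};

// Hypothetical helper: returns the state reached after each character of
// text. State 7 means an occurrence of "ababaca" ends at that position.
std::vector<int> traceStates(const std::string& text) {
    std::vector<int> states;
    int state = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const int c = text[i] - 'a';
        // One table lookup per text character: constant time each.
        state = (c >= 0 && c < 3) ? kSample[state][c] : 0;
        states.push_back(state);
    }
    return states;
}
```

For "abababacaba" the visited states are 1 2 3 4 5 4 5 6 7 2 3. State 7 is reached after the character at 0-based index 8, so the pattern occurs starting at index 2.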

/*3407*/

Time: 2024-10-18 01:13:07
