Aho-Corasick automaton: Searching the String

ZOJ   3228

Problem URL: http://acm.hust.edu.cn/vjudge/problem/viewProblem.action?id=16401

Description

Little jay really hates to deal with string. But moondy likes it very much, and she's so mischievous that she often gives jay some dull problems related to string. And one day, moondy gave jay another problem, poor jay finally broke out and cried, "Who can help me? I'll bg him!"

So what is the problem this time?

First, moondy gave jay a very long string A. Then she gave him a sequence of very short substrings, and asked him to find how many times each substring appeared in string A. What's more, she would denote whether or not founded appearances of this substring are allowed to overlap.

At first, jay just read string A from begin to end to search all appearances of each given substring. But he soon felt exhausted and couldn't go on any more, so he gave up and broke out this time.

I know you're a good guy and will help with jay even without bg, won't you?

Input

Input consists of multiple cases( <= 20 ) and terminates with end of file.

For each case, the first line contains string A ( length <= 10^5 ). The second line contains an integer N ( N <= 10^5 ), which denotes the number of queries. The next N lines, each with an integer type and a string a ( length <= 6 ), type = 0 denotes substring a is allowed to overlap and type = 1 denotes not. Note that all input characters are lowercase.

There is a blank line between two consecutive cases.

Output

For each case, output the case number first ( based on 1 , see Samples ).

Then for each query, output an integer in a single line denoting the maximum times you can find the substring under certain rules.

Output an empty line after each case.

Sample Input

ab
2
0 ab
1 ab

abababac
2
0 aba
1 aba

abcdefghijklmnopqrstuvwxyz
3
0 abc
1 def
1 jmn

Sample Output

Case 1
1
1

Case 2
3
2

Case 3
1
1
0

Hint

In Case 2, you can find the first substring starting in position (indexed from 0) 0,2,4, since they're allowed to overlap. The second substring starts in position 0 and 4, since they're not allowed to overlap.

For C++ users, kindly use scanf to avoid TLE for huge inputs.

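Problem summary: you are given one long string and N pattern strings, and must count how many times each pattern occurs in the long string. Each pattern comes with a type: 0 means its occurrences may overlap, 1 means they may not.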
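The two counting rules can be pinned down with a brute-force scan before bringing in the automaton. The sketch below is only an illustration (the name `countOccurrences` is hypothetical and not part of either solution in this post); it is far too slow for the real limits, but handy for checking small cases by hand.

```cpp
#include <cassert>
#include <string>

// Brute-force count of `pat` inside `text` (hypothetical helper, for checking only).
// allowOverlap == true  -> type 0: every occurrence counts.
// allowOverlap == false -> type 1: an occurrence counts only if it does not
//                          overlap the previously counted occurrence.
int countOccurrences(const std::string& text, const std::string& pat, bool allowOverlap) {
    assert(!pat.empty());
    const int n = static_cast<int>(text.size());
    const int len = static_cast<int>(pat.size());
    int count = 0;
    int lastEnd = -1;  // index of the last character of the last counted occurrence
    for (int end = len - 1; end < n; ++end) {
        // Does an occurrence of pat end at index `end`?
        if (text.compare(end - len + 1, len, pat) != 0) continue;
        if (allowOverlap || end - lastEnd >= len) {
            ++count;
            lastEnd = end;
        }
    }
    return count;
}
```

On the second sample, `countOccurrences("abababac", "aba", true)` gives 3 and `countOccurrences("abababac", "aba", false)` gives 2, matching the expected output.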

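Idea: for type 0, occurrences may overlap, so the standard AC automaton template applies unchanged. For type 1, store in the struct of the pattern's last character the position in the long string where this pattern was last counted. When the match reaches the end of the pattern again, subtract that stored position from the current position; if the difference is at least the pattern's length, increment the count and update the stored position. Note: the same string with the same type may appear more than once, e.g. 0 aba, 0 aba.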

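So the node struct can record both the type-0 and the type-1 occurrence counts, and the input patterns are stored as given. When printing the answers in order, look up each pattern in the trie together with its type and output the recorded type-0 or type-1 count.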
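As a compact illustration of the idea above, here is a minimal array-based sketch. It is not the code of either solution in this post, and the names `Automaton`, `Node`, `len`, `last` and `cnt` are all hypothetical. It assumes lowercase input, as the problem guarantees. Equal patterns end at the same node, so duplicates share one pair of counters, and each query is answered from the node index returned by `insert`.

```cpp
#include <array>
#include <cassert>
#include <queue>
#include <string>
#include <vector>

struct Automaton {
    struct Node {
        std::array<int, 26> next;
        int fail = 0;
        int len = 0;                // pattern length if a pattern ends here, else 0
        int last = -1;              // end index of the last counted type-1 match
        long long cnt[2] = {0, 0};  // cnt[0]: overlapping, cnt[1]: non-overlapping
        Node() { next.fill(-1); }
    };
    std::vector<Node> t;
    Automaton() : t(1) {}  // node 0 is the root

    // Inserts a pattern and returns its end node; equal patterns share a node.
    int insert(const std::string& s) {
        int u = 0;
        for (char c : s) {
            assert(c >= 'a' && c <= 'z');
            int k = c - 'a';
            if (t[u].next[k] == -1) {
                t[u].next[k] = static_cast<int>(t.size());
                t.emplace_back();
            }
            u = t[u].next[k];
        }
        t[u].len = static_cast<int>(s.size());
        return u;
    }

    // BFS: sets fail links and fills missing transitions so scan() never backtracks.
    void build() {
        std::queue<int> q;
        for (int k = 0; k < 26; ++k) {
            int v = t[0].next[k];
            if (v == -1) t[0].next[k] = 0;
            else { t[v].fail = 0; q.push(v); }
        }
        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int k = 0; k < 26; ++k) {
                int v = t[u].next[k];
                if (v == -1) t[u].next[k] = t[t[u].fail].next[k];
                else { t[v].fail = t[t[u].fail].next[k]; q.push(v); }
            }
        }
    }

    // Feeds the text once, updating both counters of every pattern ending at i.
    void scan(const std::string& text) {
        int u = 0;
        for (int i = 0; i < static_cast<int>(text.size()); ++i) {
            u = t[u].next[text[i] - 'a'];
            // Walk the whole fail chain; nodes that are not pattern ends are skipped.
            for (int v = u; v != 0; v = t[v].fail) {
                if (t[v].len == 0) continue;
                ++t[v].cnt[0];
                if (i - t[v].last >= t[v].len) { ++t[v].cnt[1]; t[v].last = i; }
            }
        }
    }
};
```

A typical use would be to call `insert` once per query and remember the returned node, then call `build()` and `scan(A)`, and print `t[node].cnt[type]` for each query in input order. Walking the fail chain at every position stays cheap here because patterns have length at most 6, so the chain has at most 6 nodes.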

Reference code from another author:

#include <stdio.h>
#include <string.h>
#include <memory.h>
struct node{
    node *fail;
    node *next[26];
    int id;
    node(){
        fail=NULL;
        id=0;
        memset(next,0,sizeof(next));
    }
}*q[770000],*root;
int head,tail;
char str1[100003][7],str2[100003];
int A[100003],cnt[100003][2],pos[100003],Len[100003],n;
void insert_Trie(char *str,int num1){
    node *p=root;
    int i=0,id;
    while(str[i]){
        id=str[i]-'a';
        if(p->next[id]==NULL) p->next[id]=new node();
        p=p->next[id];i++;
    }
    p->id=num1;
}
int search_1(char *str){
    node *p=root;
    int m,i=0;
    while(str[i]){
        m=str[i]-'a';
        if(p->next[m]==NULL) return -1;
        p=p->next[m];
        i++;
    }
    return p->id;
}
void setfail() ///initialize the fail pointers with a BFS
{
    q[tail++]=root;
    while(head!=tail)
    {
        node *p=q[head++];
        node *temp=NULL;
        for(int i=0;i<26;i++)
        if(p->next[i]!=NULL)
        {
            if(p==root) ///children of the root always fail back to the root
            p->next[i]->fail=root;
            else
            {
                temp=p->fail; ///start from the parent's fail pointer
                while(temp!=NULL) ///stops in two cases: chain exhausted or a match found
                {
                    if(temp->next[i]!=NULL) ///match found
                    {
                        p->next[i]->fail=temp->next[i];
                        break;
                    }
                    temp=temp->fail;
                }
                if(temp==NULL) ///no match: restart from the root
                    p->next[i]->fail=root;
            }
            q[tail++]=p->next[i]; ///enqueue
        }
    }
}

void query(){
    int i=0;
    node *p=root,*temp;
    while(str2[i]){
        int id=str2[i]-'a';
        while(p->next[id]==NULL&&p!=root) p=p->fail;
        p=p->next[id];
        p=(p==NULL)?root:p;
        temp=p;
        while(temp!=root){
            if(temp->id){
                cnt[temp->id][0]++;
            }
            temp=temp->fail;
        }
        i++;
    }
}
void query1(){
    int i=0;
    node *p=root,*temp;
    while(str2[i]){
        int id=str2[i]-'a';
        while(p->next[id]==NULL&&p!=root) p=p->fail;
        p=p->next[id];
        p=(p==NULL)?root:p;
        temp=p;
        while(temp!=root){
            if(temp->id&&i-pos[temp->id]>=Len[temp->id]){
                pos[temp->id]=i;
                cnt[temp->id][1]++;
            }
            temp=temp->fail;
        }
        i++;
    }
}
int query_num(char *str,int aa){
    int i=0;
    node *p=root;
    while(str[i]){
        int id=str[i]-'a';
        p=p->next[id];
        i++;
    }
    return cnt[p->id][aa];
}
void del(node *p){
     if(p==NULL)return ;
     for(int i=0;i<26;i++)del(p->next[i]);
     delete p;
}
int main(){
    int t=1;
    while(scanf("%s",str2)!=-1){
        scanf("%d",&n);
        head=0;
        tail=0;
        root=new node();
        memset(cnt,0,sizeof(cnt));
        memset(pos,-1,sizeof(pos));
        for(int i=1;i<=n;i++){
            scanf("%d%s",&A[i],str1[i]);
            Len[i]=strlen(str1[i]);
            insert_Trie(str1[i],i);
        }
        setfail();
        query();
        query1();
        printf("Case %d\n",t++);
        for(int i=1;i<=n;i++){
            int ttt=query_num(str1[i],A[i]);
            printf("%d\n",ttt);
        }
        printf("\n");
        del(root);
    }
    return 0;
}

My own code is below. (It reads clearly to me and passes every sample I tried, but the submission still gets WA.)

#include<iostream>
#include<algorithm>
#include<cstdio>
#include<cstring>
using namespace std;
#define N 770010
char str[110010],keyword[110010][10];
int head,tail,key[110010];

struct node
{
    node *fail;
    node *next[26];
    int f;
    int count1;
    int count2;
    int b,l;
    node()
    {
        fail=NULL;
        count1=0;
        count2=0;
        f=-1;
        b=0;
        l=0;
        for(int i=0;i<26;i++)
        next[i]=NULL;
    }
}*q[N];
node *root;

int insert(char *str,int x) ///build the trie
{
    int temp,len;
    node *p=root;
    len=strlen(str);
    for(int i=0;i<len;i++)
    {
        temp=str[i]-'a';
        if(p->next[temp]==NULL)
           p->next[temp]=new node();
        p=p->next[temp];
    }
    p->f++;
    p->l=len;
    if(!x) return p->count1;
    else   return p->count2;
}

void setfail() ///initialize the fail pointers with a BFS
{
    q[tail++]=root;
    while(head!=tail)
    {
        node *p=q[head++];
        node *temp=NULL;
        for(int i=0;i<26;i++)
        if(p->next[i]!=NULL)
        {
            if(p==root) ///children of the root always fail back to the root
            p->next[i]->fail=root;
            else
            {
                temp=p->fail; ///start from the parent's fail pointer
                while(temp!=NULL) ///stops in two cases: chain exhausted or a match found
                {
                    if(temp->next[i]!=NULL) ///match found
                    {
                        p->next[i]->fail=temp->next[i];
                        break;
                    }
                    temp=temp->fail;
                }
                if(temp==NULL) ///no match: restart from the root
                    p->next[i]->fail=root;
            }
            q[tail++]=p->next[i]; ///enqueue
        }
    }
}

void query()
{
    int index,len;
    node *p=root;
    len=strlen(str);
    for(int i=0;i<len;i++)
    {
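        index=str[i]-'a';
        while(p->next[index]==NULL&&p!=root) ///follow fail pointers until a transition exists
        p=p->fail;
        p=p->next[index];
        if(p==NULL)
        p=root;
        node *temp=p; ///p stays put; temp walks the suffixes along the fail chain
        ///this loop stops at the first chain node that is not a pattern end (f==-1),
        ///so patterns ending further up the chain are skipped: with text "abc" and
        ///patterns "abcd" and "c", "c" is never counted; this may be what the judge rejects
        while(temp!=root&&temp->f!=-1)
        {
            temp->count1++; ///type 0: every occurrence counts
            if(temp->b==0||(i-temp->b)>=(temp->l)) ///type 1: count only if it does not overlap the last counted match
            {
                temp->count2++;
                temp->b=i;
            }
            temp=temp->fail;
        }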
    }
}
void free_(node *r)
{
    for(int i=0; i<26; i++)
    {
        if(r->next[i])
        free_(r->next[i]);
    }
    delete r; ///nodes are allocated with new, so release them with delete
}

int main()
{
    int num,Case=1;
    while(~scanf("%s",str))
    {
        head=tail=0;
        memset(key,0,sizeof(key));
        root = new node();
        scanf("%d", &num);
        for(int i=1;i<=num;i++)
        {
            scanf("%d %s",&key[i],keyword[i]);
            insert(keyword[i],i);
        }
        setfail();
        query();
        printf("Case %d\n",Case++);
        for(int i=1;i<=num;i++)
        {
            printf("%d\n",insert(keyword[i],key[i]));
        }
        printf("\n");
        free_(root);
    }
    return 0;
}
Time: 2024-10-11 10:49:50
