HDOJ Problem 4416 Good Article Good sentence (suffix array: counting the distinct substrings of string a that do not occur in any string b)



Good Article Good sentence

Time Limit: 6000/3000 MS (Java/Others)    Memory Limit: 32768/32768 K (Java/Others)

Total Submission(s): 2784    Accepted Submission(s): 785

Problem Description

In middle school, teachers used to encourage us to pick up pretty sentences so that we could apply those sentences in our own articles. One of my classmates ZengXiao Xian, wanted to get sentences which are different from that of others, because he thought the distinct pretty sentences might benefit him a lot to get a high score in his article.

Assume that all of the sentences came from some articles. ZengXiao Xian intended to pick from Article A. The number of his classmates is n. The i-th classmate picked from Article Bi. Now ZengXiao Xian wants to know how many different sentences she could pick from Article A which don't belong to either of her classmates' Article. To simplify the problem, ZengXiao Xian wants to know how many different strings, which is the substring of string A, but is not substring of either of string Bi. Of course, you will help him, won't you?

Input

The first line contains an integer T, the number of test data.

For each test data

The first line contains an integer n, the number of classmates.

The second line is the string A. Each of the next n lines contains one string; the i-th of them is Bi.

The length of the string A does not exceed 100,000 characters. The sum of the lengths of all strings Bi does not exceed 100,000. All strings consist only of lowercase characters 'a' to 'z'.

Output

For each case, print the case number and the number of substrings that ZengXiao Xian can find.

Sample Input

3
2
abab
ab
ba
1
aaa
bbb
2
aaaa
aa
aaa

Sample Output

Case 1: 3
Case 2: 3
Case 3: 1

Source

2012 ACM/ICPC Asia Regional Hangzhou Online

Recommend

liuyiding

Given one string a and several strings b, count the distinct substrings of a that do not occur in any of the b strings.
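Before turning to the suffix array, a brute-force reference can help pin down exactly what is being counted. The sketch below is not from the original post; `countBrute` and `checkSamples` are hypothetical names, and the approach is only usable on tiny inputs. It is meant for cross-checking a fast solution, not for submission.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Brute-force reference (assumption: inputs are tiny).
// Enumerates every substring of a and keeps the distinct ones
// that are not a substring of any string in bs.
long long countBrute(const std::string& a, const std::vector<std::string>& bs) {
    std::set<std::string> kept;  // the set removes duplicate substrings of a
    for (size_t i = 0; i < a.size(); ++i) {
        for (size_t len = 1; i + len <= a.size(); ++len) {
            std::string sub = a.substr(i, len);
            bool inB = false;
            for (const std::string& b : bs) {
                if (b.find(sub) != std::string::npos) { inB = true; break; }
            }
            if (!inB) kept.insert(sub);
        }
    }
    return static_cast<long long>(kept.size());
}

// Usage: the three sample cases from the problem statement.
void checkSamples() {
    assert(countBrute("abab", {"ab", "ba"}) == 3);
    assert(countBrute("aaa", {"bbb"}) == 3);
    assert(countBrute("aaaa", {"aa", "aaa"}) == 1);
}
```

For the first sample, the distinct substrings of "abab" are a, b, ab, ba, aba, bab and abab; the first four occur in some Bi, leaving 3.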

The idea comes from 爱酱 (acm_cxlove): http://blog.csdn.net/acm_cxlove/article/details/8013942

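Substring problems like this one fairly obviously call for a suffix array, but at the time the data size made me think I could not concatenate all the strings together. Sigh.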

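Concatenate all the strings, separating them with pairwise distinct characters. Then build the suffix array and the height array once over the combined string.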
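The concatenation step can be sketched as follows. This mirrors the layout used in the full program further down (letters mapped to 1..26, separator 27 after A, separator 27+i after Bi, and a final 0 sentinel), but `buildText` itself is a hypothetical helper name introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the integer text: A, sep 27, B1, sep 28, B2, sep 29, ..., then 0.
// Every separator is unique, so no common prefix can extend across one.
std::vector<int> buildText(const std::string& a, const std::vector<std::string>& bs) {
    std::vector<int> t;
    auto append = [&](const std::string& s) {
        for (char ch : s) {
            assert(ch >= 'a' && ch <= 'z');  // the problem guarantees lowercase letters
            t.push_back(ch - 'a' + 1);       // 'a'..'z' -> 1..26
        }
    };
    append(a);
    int sep = 27;
    t.push_back(sep++);          // separator after A
    for (const std::string& b : bs) {
        append(b);
        t.push_back(sep++);      // a fresh separator after each Bi
    }
    t.push_back(0);              // smallest symbol, terminates the whole text
    return t;
}
```

Using integers instead of `char` is what makes this work: with up to 100,000 strings Bi there are far more separators than a byte can hold.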

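Then, for each suffix of A, find its longest common prefix (LCP) with any suffix that comes from B. If that value is L, then L of the substrings starting at this suffix (its prefixes of length 1 to L) occur in B.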

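One forward pass and one backward pass over height are enough. Note that what we want is the LCP between A and B: when two consecutive sa entries both belong to A, keep a running minimum of height so that the value is still the LCP with the nearest B suffix.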
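The two passes can be sketched in isolation like this. To keep the example self-contained it sorts the suffixes naively, which is only a stand-in for the doubling construction in the full program; `naiveSuffixArray` and `lcpWithB` are hypothetical names, and the input is assumed to be an integer text laid out as A, separator, B1, separator, and so on.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Naive stand-in for build_sa/getHeight (small inputs only).
// height[i] is the LCP of the suffixes at sa[i-1] and sa[i].
static void naiveSuffixArray(const std::vector<int>& t,
                             std::vector<int>& sa, std::vector<int>& height) {
    int n = (int)t.size();
    sa.resize(n);
    for (int i = 0; i < n; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return std::lexicographical_compare(t.begin() + a, t.end(),
                                            t.begin() + b, t.end());
    });
    height.assign(n, 0);
    for (int i = 1; i < n; ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < n && b + k < n && t[a + k] == t[b + k]) ++k;
        height[i] = k;
    }
}

// For each start position p < lenA, returns the longest prefix of the
// suffix at p that also occurs in some B string.
std::vector<int> lcpWithB(const std::vector<int>& t, int lenA) {
    assert(lenA >= 0 && lenA <= (int)t.size());
    int n = (int)t.size();
    std::vector<int> best(lenA, 0);
    if (n == 0) return best;
    std::vector<int> sa, height;
    naiveSuffixArray(t, sa, height);

    // Forward: LCP with the nearest non-A suffix above in sa order.
    // Start at 0 if the very first suffix is from A (nothing from B above it).
    int run = (sa[0] < lenA) ? 0 : INT_MAX;
    for (int i = 1; i < n; ++i) {
        if (sa[i] < lenA) {
            run = std::min(run, height[i]);           // running minimum over the A run
            best[sa[i]] = std::max(best[sa[i]], run);
        } else {
            run = INT_MAX;                            // reset at every non-A suffix
        }
    }
    // Backward: the same idea for the nearest non-A suffix below.
    run = (sa[n - 1] < lenA) ? 0 : INT_MAX;
    for (int i = n - 1; i >= 1; --i) {
        if (sa[i - 1] < lenA) {
            run = std::min(run, height[i]);
            best[sa[i - 1]] = std::max(best[sa[i - 1]], run);
        } else {
            run = INT_MAX;
        }
    }
    return best;
}
```

For the first sample (A = "abab", B = "ab", "ba") the integer text is 1 2 1 2 27 1 2 28 2 1 29 0, and the result is 2, 2, 2, 1 for start positions 0 to 3. Suffixes that start at a separator count as "not A" here; their LCP with any A suffix is 0, so they do no harm.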

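However, the problem asks for distinct substrings of A, so duplicates must be removed as well: traverse sa once more, and whenever two consecutive entries both belong to A, raise the earlier suffix's count to at least the height between them.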
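Put together, the count can be written as a single formula. The notation is mine rather than the original post's: b_p is the longest LCP of the suffix of A starting at p with any suffix from B, and h_p is the height shared with the next entry in sa order when that entry is also a suffix of A (0 otherwise). In the program below, pos[p] ends up holding max(b_p, h_p).

```latex
\[
\text{ans}
= \sum_{p=0}^{|A|-1} \Bigl( (|A| - p) - \max(b_p,\, h_p) \Bigr)
= \frac{|A|\,(|A|+1)}{2} - \sum_{p=0}^{|A|-1} \max(b_p,\, h_p)
\]
```

For the first sample, |A| = 4, b = (2, 2, 2, 1) and h = (2, 0, 0, 0), so ans = 10 - (2 + 2 + 2 + 1) = 3. Taking the maximum instead of the sum is what avoids subtracting a prefix twice when it is both a duplicate within A and present in B. Since |A| can reach 100,000, the first term is about 5 * 10^9 and needs a 64-bit integer.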

#include<stdio.h>
#include<string.h>
#include<algorithm>
#include<iostream>
#define min(a,b) ((a)>(b)?(b):(a))
#define max(a,b) ((a)>(b)?(a):(b))
using namespace std;
#define INF 0x3f3f3f3f
char str[303030];
int sa[303030],Rank[303030],rank2[303030],height[303030],c[303030],*x,*y,s[300030];
int n;
// Counting sort: orders the indices in y by key x[y[i]] and writes the result to sa.
void cmp(int n,int sz)
{
    int i;
    memset(c,0,sizeof(c));
    for(i=0;i<n;i++)
        c[x[y[i]]]++;
    for(i=1;i<sz;i++)
        c[i]+=c[i-1];
    for(i=n-1;i>=0;i--)
        sa[--c[x[y[i]]]]=y[i];
}
// Prefix-doubling suffix array construction over s[0..n-1] with alphabet size sz.
void build_sa(int *s,int n,int sz)
{
    x=Rank,y=rank2;
    int i,j;
    for(i=0;i<n;i++)
        x[i]=s[i],y[i]=i;
    cmp(n,sz);
    int len;
    for(len=1;len<n;len<<=1)
    {
        int yid=0;
        for(i=n-len;i<n;i++)
        {
            y[yid++]=i;
        }
        for(i=0;i<n;i++)
            if(sa[i]>=len)
                y[yid++]=sa[i]-len;
        cmp(n,sz);
        swap(x,y);
        x[sa[0]]=yid=0;
        for(i=1;i<n;i++)
        {
            if(y[sa[i-1]]==y[sa[i]]&&sa[i-1]+len<n&&sa[i]+len<n&&y[sa[i-1]+len]==y[sa[i]+len])
                x[sa[i]]=yid;
            else
                x[sa[i]]=++yid;
        }
        sz=yid+1;
        if(sz>=n)
            break;
    }
    for(i=0;i<n;i++)
        Rank[i]=x[i];
}
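// height[i] = LCP of the suffixes at sa[i-1] and sa[i]; the unique separators stop the comparison.
void getHeight(int *s,int n)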
{
    int k=0;
    for(int i=0;i<n;i++)
    {
        if(Rank[i]==0)
            continue;
        k=max(0,k-1);
        int j=sa[Rank[i]-1];
        while(s[i+k]==s[j+k])
            k++;
        height[Rank[i]]=k;
    }
}
int pos[303030];
int main()
{
	int t,c=0;
	scanf("%d",&t);
	while(t--)
	{
		int n;
		scanf("%d",&n);
		scanf("%s",str);
		int i,j;
		int len=strlen(str);
		int num=27;
		for(i=0;i<len;i++)
		{
			s[i]=str[i]-'a'+1;
		}
		int m=len;
		s[m++]=num;
		for(i=1;i<=n;i++)
		{
			scanf("%s",str);
			int len=strlen(str);
			for(j=0;j<len;j++)
			{
				s[m++]=str[j]-'a'+1;
			}
			s[m++]=num+i;
		}
		s[m]=0;
	//	printf("%d\n",m);
		build_sa(s,m+1,num+n+1);
		getHeight(s,m);
		memset(pos,0,sizeof(pos));
		// Forward pass: LCP of each A suffix with the nearest B suffix above it in sa order.
		int temp=INF;
		for(i=1;i<=m;i++)
		{
			if(sa[i]<len)
			{
				if(height[i]<temp)
					temp=height[i];
				if(pos[sa[i]]<temp)
					pos[sa[i]]=temp;
			}
			else
				temp=INF;
		}
		// Backward pass: the same for the nearest B suffix below it.
		temp=INF;
		for(i=m;i>=1;i--)
		{
			if(sa[i-1]<len)
			{
				if(height[i]<temp)
					temp=height[i];
				if(pos[sa[i-1]]<temp)
					pos[sa[i-1]]=temp;
			}
			else
				temp=INF;
		}
		// Deduplication: adjacent A suffixes share height[i] prefixes; count them only once.
		for(i=1;i<=m;i++)
		{
			if(sa[i]<len&&sa[i-1]<len)
			{
				if(pos[sa[i-1]]<height[i])
				{
					pos[sa[i-1]]=height[i];
				}
			}
		}
		__int64 ans=(__int64)len*(len+1)/2;// forgetting the 64-bit cast cost me one WA
	//	printf("======%I64d\n",ans);
		for(i=0;i<len;i++)
		{
	//		printf("******%d\n",pos[i]);
			ans-=pos[i];
	//		printf("======%I64d\n",ans);
		}
		printf("Case %d: %I64d\n",++c,ans);
	}
}

Copyright notice: this is an original article by the blogger and may not be reposted without the blogger's permission.

Time: 2024-10-15 22:38:50
