SPOJ Problem 687 Repeats (suffix array + RMQ: finding the repeat count of the most-repeated consecutive substring)

REPEATS - Repeats


A string s is called a (k,l)-repeat if s is obtained by concatenating k>=1 copies of some seed string t with length l>=1. For example, the string

s = abaabaabaaba

is a (4,3)-repeat with t = aba as its seed string. That is, the seed string t is 3 characters long, and the whole string s is obtained by repeating t 4 times.
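
The definition can be checked mechanically. The following is a minimal sketch only; the names `isRepeat` and `checkStatementExample` are introduced here for illustration and are not part of the problem or of the solution further down.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if s is a (k,l)-repeat, i.e. s consists
// of exactly k >= 1 copies of its own first l >= 1 characters.
bool isRepeat(const std::string& s, int k, int l) {
    if (k < 1 || l < 1) return false;
    if (s.size() != static_cast<std::size_t>(k) * static_cast<std::size_t>(l)) return false;
    // Every character must equal the one l positions earlier.
    for (std::size_t i = static_cast<std::size_t>(l); i < s.size(); ++i)
        if (s[i] != s[i - l]) return false;
    return true;
}

// Checks the example from the statement: s = abaabaabaaba with seed aba.
void checkStatementExample() {
    assert(isRepeat("abaabaabaaba", 4, 3));
}
```

Comparing each character with the one l positions earlier avoids building the seed string explicitly.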

Write a program for the following task: Your program is given a long string u consisting of characters ‘a’ and/or ‘b’ as input. Your program must find some (k,l)-repeat that occurs as substring within u with k as large as possible.
For example, the input string

u = babbabaabaabaabab

contains the (4,3)-repeat s = abaabaabaaba starting at position 5. Since u contains no other contiguous substring with more than 4 repeats, your program must output the maximum k, which is 4.

Input

The first line of the input contains H, the number of test cases (H <= 20). H test cases follow. The first line of each test case is n, the length of the input string (n <= 50000). The next n lines contain the input string, one character
(either ‘a’ or ‘b’) per line, in order.

Output

For each test case, you should write exactly one integer k on a line: the maximum repeat count.

Example

Input:
1
17
b
a
b
b
a
b
a
a
b
a
a
b
a
a
b
a
b

Output:
4

since a (4, 3)-repeat is found starting at the 5th character of the input string.
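
For checking small inputs by hand, a quadratic brute force is enough. This is a sketch under the assumption that the whole string is already in memory; the names `maxRepeatBruteForce` and `checkSample` are made up here, and this is not the method used by the accepted code below (it would be too slow for n = 50000).

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical reference solver, O(n^2): for every seed length l, find the
// longest run of positions i with u[i] == u[i + l]. A run of r such positions
// is a substring of length r + l with period l, i.e. r / l + 1 full copies.
int maxRepeatBruteForce(const std::string& u) {
    const int n = static_cast<int>(u.size());
    int best = n > 0 ? 1 : 0;  // any single character is a (1,1)-repeat
    for (int l = 1; l < n; ++l) {
        int run = 0;
        for (int i = 0; i + l < n; ++i) {
            if (u[i] == u[i + l]) {
                ++run;
                best = std::max(best, run / l + 1);
            } else {
                run = 0;
            }
        }
    }
    return best;
}

// Checks the sample from the statement, whose expected output is 4.
void checkSample() {
    assert(maxRepeatBruteForce("babbabaabaabaabab") == 4);
}
```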

Accepted code

#include<stdio.h>
#include<string.h>
#include<algorithm>
#include<iostream>
// min/max resolve to std::min/std::max from <algorithm>; no macros needed
using namespace std;
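// Input string; str[n] holds the terminating '\0', which serves as the smallest sentinel.
char str[53030];
// sa: suffix array, Rank: rank of each suffix, height[i]: LCP of suffixes sa[i-1] and sa[i],
// c: counting-sort buckets, x/y: key arrays swapped during prefix doubling.
int sa[53030],Rank[53030],rank2[53030],height[53030],c[53030],*x,*y,s[53030];
int n;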
// Stable counting sort of the suffixes listed in y[] by key x[], result in sa[].
void cmp(int n,int sz)
{
    int i;
    memset(c,0,sizeof(c));
    for(i=0;i<n;i++)
        c[x[y[i]]]++;
    for(i=1;i<sz;i++)
        c[i]+=c[i-1];
    for(i=n-1;i>=0;i--)
        sa[--c[x[y[i]]]]=y[i];
}
// Builds the suffix array of s[0..n-1] by prefix doubling; sz is the alphabet size.
void build_sa(char *s,int n,int sz)
{
    x=Rank,y=rank2;
    int i,j;
    for(i=0;i<n;i++)
        x[i]=s[i],y[i]=i;
    cmp(n,sz);
    int len;
    for(len=1;len<n;len<<=1)
    {
        int yid=0;
        for(i=n-len;i<n;i++)
        {
            y[yid++]=i;
        }
        for(i=0;i<n;i++)
            if(sa[i]>=len)
                y[yid++]=sa[i]-len;
        // re-sort by the first key; this runs once per doubling step, not inside the loop above
        cmp(n,sz);
        swap(x,y);
        x[sa[0]]=yid=0;
        for(i=1;i<n;i++)
        {
            if(y[sa[i-1]]==y[sa[i]]&&sa[i-1]+len<n&&sa[i]+len<n&&y[sa[i-1]+len]==y[sa[i]+len])
                x[sa[i]]=yid;
            else
                x[sa[i]]=++yid;
        }
        sz=yid+1;
        if(sz>=n)
            break;
    }
    for(i=0;i<n;i++)
        Rank[i]=x[i];
}
// Computes height[] in O(n): height[Rank[i]] is the LCP of suffix i and its predecessor in sa.
void getHeight(char *s,int n)
{
    int k=0;
    for(int i=0;i<n;i++)
    {
        if(Rank[i]==0)
            continue;
        k=max(0,k-1);
        int j=sa[Rank[i]-1];
        while(s[i+k]==s[j+k])
            k++;
        height[Rank[i]]=k;
    }
}
int minv[53010][20],lg[53030];
void init_lg()
{
    int i;
    lg[1]=0;
    for(i=2;i<52020;i++)
    {
        lg[i]=lg[i>>1]+1;
    }
}
// Sparse table over height[1..n]: minv[k][j] = min of height[k .. k+2^j-1].
void init_RMQ(int n)
{
    int i,j,k;
    for(i=1;i<=n;i++)
    {
        minv[i][0]=height[i];
    }
    for(j=1;j<=lg[n];j++)
    {
        for(k=1;k+(1<<j)-1<=n;k++)
        {
            minv[k][j]=min(minv[k][j-1],minv[k+(1<<(j-1))][j-1]);
        }
    }
}
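// LCP of the suffixes starting at text positions l and r (l != r):
// the minimum of height[] over the ranks strictly after the smaller rank up to the larger one.
int lcp(int l,int r)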
{
    l=Rank[l];
    r=Rank[r];
    if(l>r)
        swap(l,r);
    l++;
    int k=lg[r-l+1];
    return min(minv[l][k],minv[r-(1<<k)+1][k]);
}
int main()
{
	int t;
	scanf("%d",&t);
	while(t--)
	{
		int n;
		scanf("%d",&n);
		int i,j;
		for(i=0;i<n;i++)
		{
			scanf("%s",str+i);
		}
		build_sa(str,n+1,128);
		getHeight(str,n);
		init_lg();
		init_RMQ(n);
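		// maxn = best (number of copies - 1) over all seed lengths i
		int maxn=0;
		for(i=1;i<n;i++)
		{
			// a repeat with seed length i and at least 2 copies must cover
			// two consecutive sample positions j and j+i (j a multiple of i)
			for(j=0;j+i<n;j+=i)
			{
				int k=lcp(j,j+i);    // s[j..j+i+k) has period i
				int now=k/i;         // full extra copies found starting at j
				// shift the start back by i-k%i so the match length becomes a multiple of i
				int tj=j-(i-k%i);
				if(tj>=0)
				{
					// if the match also holds from tj, it reaches j and gains one more copy
					if(lcp(tj,tj+i)>=i-k%i)
						now++;
				}
				if(now>maxn)
					maxn=now;
			}
		}
		// +1 for the first copy of the seed
		printf("%d\n",maxn+1);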
	}
}
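
The enumeration in `main` can be read on its own, without the suffix array. The sketch below is a simplified illustration, not the accepted solution: it keeps the same sampling idea but replaces the suffix array + RMQ query with a naive character-by-character LCP, so each query costs O(n) instead of O(1). The names `naiveLcp` and `maxRepeatSampled` are introduced here for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Naive longest common prefix of the suffixes starting at i and j
// (stand-in for the O(1) suffix array + RMQ query).
int naiveLcp(const std::string& s, int i, int j) {
    assert(i >= 0 && j >= 0);
    const int n = static_cast<int>(s.size());
    int k = 0;
    while (i + k < n && j + k < n && s[i + k] == s[j + k]) ++k;
    return k;
}

// For each seed length len, only positions 0, len, 2*len, ... are probed.
int maxRepeatSampled(const std::string& s) {
    const int n = static_cast<int>(s.size());
    if (n == 0) return 0;
    int best = 1;
    for (int len = 1; len < n; ++len) {
        for (int j = 0; j + len < n; j += len) {
            int k = naiveLcp(s, j, j + len);  // s[j .. j+len+k) has period len
            int copies = k / len + 1;
            // The repeat may have started before j: move back just far
            // enough that the matched length becomes a multiple of len.
            int back = len - k % len;
            int start = j - back;
            if (start >= 0 && naiveLcp(s, start, start + len) >= back) ++copies;
            best = std::max(best, copies);
        }
    }
    return best;
}
```

With an O(1) LCP query the two loops perform about n/1 + n/2 + ... + n/(n-1) probes, which is O(n log n) in total; that is why the accepted code builds the suffix array and sparse table first.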

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.

Time: 2024-08-28 17:58:56
