POJ3261 (Suffix Array + Binary Search)

Milk Patterns

Time Limit: 5000MS   Memory Limit: 65536K
Total Submissions: 12972   Accepted: 5769
Case Time Limit: 2000MS

Description

Farmer John has noticed that the quality of milk given by his cows varies from day to day. On further investigation, he discovered that although he can't predict the quality of milk from one day to the next, there are some regular patterns in the daily milk quality.

To perform a rigorous study, he has invented a complex classification scheme by which each milk sample is recorded as an integer between 0 and 1,000,000 inclusive, and has recorded data from a single cow over N (1 ≤ N ≤ 20,000) days. He wishes to find the longest pattern of samples which repeats identically at least K (2 ≤ K ≤ N) times. This may include overlapping patterns -- 1 2 3 2 3 2 3 1 repeats 2 3 2 3 twice, for example.

Help Farmer John by finding the longest repeating subsequence in the sequence of samples. It is guaranteed that at least one subsequence is repeated at least K times.
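
The overlapping-occurrence rule in the description can be checked by brute force. The following is a minimal sketch, not part of the original solution: countOccurrences is a hypothetical helper name, and it simply tries every start position, so it only suits small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the original solution): counts how many
// start positions of `seq` begin a copy of `pattern`. Overlapping copies count.
int countOccurrences(const std::vector<int>& seq, const std::vector<int>& pattern)
{
    assert(!pattern.empty());
    int count = 0;
    for (std::size_t start = 0; start + pattern.size() <= seq.size(); ++start)
    {
        bool same = true;
        for (std::size_t j = 0; j < pattern.size() && same; ++j)
            same = (seq[start + j] == pattern[j]);
        if (same) ++count;
    }
    return count;
}
```

Called on the sequence 1 2 3 2 3 2 3 1 with the pattern 2 3 2 3, it counts the two copies starting on days 2 and 4, which overlap in the middle.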

Input

Line 1: Two space-separated integers: N and K 
Lines 2..N+1: N integers, one per line, the quality of the milk on day i appears on the ith line.

Output

Line 1: One integer, the length of the longest pattern which occurs at least K times

Sample Input

8 2
1
2
3
2
3
2
3
1

Sample Output

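4

Approach: build the suffix array and its LCP array, then binary search on the pattern length L. A length L is feasible when at least K-1 consecutive entries satisfy lcp[i] >= L, because K suffixes that are adjacent in the suffix array then share a common prefix of length L.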
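
To make the idea concrete on small inputs, here is a hedged sketch of the same approach. It assumes a deliberately naive suffix sort (std::sort with std::lexicographical_compare) in place of a prefix-doubling construction, and the name longestRepeated is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Sketch only: a naive O(n^2 log n) suffix sort stands in for prefix doubling,
// so this is meant for small inputs, not for worst cases with N = 20,000.
int longestRepeated(const std::vector<int>& a, int k)
{
    const int n = static_cast<int>(a.size());
    assert(k >= 2 && k <= n);

    // Suffix array: start positions sorted by the suffixes they begin.
    std::vector<int> sa(n);
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int x, int y) {
        return std::lexicographical_compare(a.begin() + x, a.end(),
                                            a.begin() + y, a.end());
    });

    // lcp[i] = length of the common prefix of suffixes sa[i] and sa[i+1].
    std::vector<int> lcp(n - 1, 0);
    for (int i = 0; i + 1 < n; ++i)
    {
        int x = sa[i], y = sa[i + 1], h = 0;
        while (x + h < n && y + h < n && a[x + h] == a[y + h]) ++h;
        lcp[i] = h;
    }

    // A length is feasible if some run of k-1 consecutive lcp entries is >= it.
    auto feasible = [&](int length) {
        int run = 0;
        for (int v : lcp)
        {
            run = (v >= length) ? run + 1 : 0;
            if (run >= k - 1) return true;
        }
        return false;
    };

    // Feasibility is monotone in the length, so binary search the largest one.
    int lo = 1, hi = n, best = 0;
    while (lo <= hi)
    {
        int mid = (lo + hi) / 2;
        if (feasible(mid)) { best = mid; lo = mid + 1; }
        else hi = mid - 1;
    }
    return best;
}
```

For the sample sequence 1 2 3 2 3 2 3 1 the sorted suffixes start at positions 7, 0, 5, 3, 1, 6, 4, 2 (0-based), giving adjacent LCP values 1 0 2 4 0 1 3. With K = 2 a single entry of 4 is enough, so the answer is 4.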
#include<cstdio>
#include<cstring>
#include<algorithm>
using namespace std;
const int MAXN=1000005;
int buf[MAXN];
int sa[MAXN];
int rnk[MAXN];
int tmp[MAXN];
int lcp[MAXN];
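int len,k;  // len: sequence length N; k: current doubling step used by comp()
int t;      // required number of occurrences (K in the problem statement)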

// Orders suffixes i and j by the pair (rank of first k items, rank of next k items).
bool comp(int i,int j)
{
    if(rnk[i]!=rnk[j])    return rnk[i]<rnk[j];
    else
    {
        int ri=(i+k<=len)?rnk[i+k]:-1;
        int rj=(j+k<=len)?rnk[j+k]:-1;
        return ri<rj;
    }
}

// Builds the suffix array by prefix doubling; sa[0] is always the empty suffix at index len.
void getsa()
{
    memset(sa,0,sizeof(sa));
    memset(rnk,0,sizeof(rnk));
    memset(tmp,0,sizeof(tmp));

    for(int i=0;i<len;i++)
    {
        sa[i]=i;
        rnk[i]=buf[i];
    }
    sa[len]=len;
    rnk[len]=-1;

    for(k=1;k<=len;k*=2)
    {
        sort(sa,sa+len+1,comp);

        tmp[sa[0]]=0;
        for(int i=1;i<=len;i++)
        {
            tmp[sa[i]]=tmp[sa[i-1]]+(comp(sa[i-1],sa[i])?1:0);
        }

        for(int i=0;i<=len;i++)
        {
            rnk[i]=tmp[i];
        }
    }

}

// Builds lcp[], where lcp[i] is the common prefix length of suffixes sa[i] and sa[i+1].
void getlcp()
{
    getsa();
    memset(rnk,0,sizeof(rnk));
    memset(lcp,0,sizeof(lcp));
    for(int i=0;i<=len;i++)
    {
        rnk[sa[i]]=i;
    }

    int h=0;
    lcp[0]=h;
    for(int i=0;i<len;i++)
    {
        int j=sa[rnk[i]-1];
        if(h>0)    h--;
        for(;i+h<len&&j+h<len;h++)
        {
            if(buf[i+h]!=buf[j+h])    break;
        }
        lcp[rnk[i]-1]=h;
    }

}

void debug()
{
    for(int i=0;i<=len;i++)
    {
        int l=sa[i];
        if(l==len)
        {
            printf("0\n");
        }
        else
        {
            for(int j=sa[i];j<len;j++)
            {
                printf("%d ",buf[j]);
            }
            printf("     %d\n",lcp[i]);
        }
    }

}

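// Returns true if some pattern of length l occurs at least t times,
// i.e. at least t-1 consecutive lcp entries are >= l.
bool judge(int l)
{
    int  cnt=0;
    for(int i=1;i<len;i++)
    {
        if(lcp[i]>=l)// extend the current run of adjacent suffixes sharing a prefix of length >= l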
        {
            cnt++;
        }
        else
            cnt=0;
        if(cnt==t-1)    return true;
    }
    return false;
}

void solve()
{

    int l=1,r=len;
    int ans=0;
    while(l<=r)
    {
        int mid=(l+r)>>1;
        if(judge(mid))// binary search on the pattern length
        {
            ans=max(ans,mid);
            l=mid+1;
        }
        else    r=mid-1;
    }
    printf("%d\n",ans);
}

int main()
{
    while(scanf("%d%d",&len,&t)!=EOF)
    {
        for(int i=0;i<len;i++)
            scanf("%d",&buf[i]);
        getlcp();
    //    debug()
        solve();
    }
    return 0;
}
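
As a side note, the binary search is not strictly required once the LCP array exists: the answer equals the largest window minimum over all windows of K-1 consecutive LCP entries. The sketch below is an alternative to judge() and solve(), not the method used above. It assumes the LCP array covers only the non-empty suffixes (so it has N-1 entries), and bestWindowMin is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical alternative to judge()+solve(): returns the largest value of
// min(lcp[i..i+w-1]) over all windows of w consecutive entries, with w = K-1.
int bestWindowMin(const std::vector<int>& lcp, int w)
{
    const int m = static_cast<int>(lcp.size());
    assert(w >= 1 && w <= m);

    std::deque<int> dq; // indices in the window; their lcp values increase front to back
    int best = 0;
    for (int i = 0; i < m; ++i)
    {
        // Drop entries that can never be a window minimum again.
        while (!dq.empty() && lcp[dq.back()] >= lcp[i]) dq.pop_back();
        dq.push_back(i);
        if (dq.front() <= i - w) dq.pop_front(); // front index left the window
        if (i >= w - 1) best = std::max(best, lcp[dq.front()]);
    }
    return best;
}
```

With the sample's LCP values 1 0 2 4 0 1 3, a window size of 1 (K = 2) gives 4 and a window size of 2 (K = 3) gives 2. Each index enters and leaves the deque at most once, so the scan is linear in the length of the LCP array.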
				
Time: 2024-09-30 11:00:22
