POJ 3261 Milk Patterns (suffix array && length of the longest overlappable substring occurring at least K times)

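Problem: Given a sequence of length N and an integer K, find the length of the longest substring that occurs at least K times, where the occurrences are allowed to overlap.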

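Analysis: This is a standard suffix array problem. Binary search on the answer length, and for each candidate length check feasibility by partitioning the Height array into groups, using that length as the threshold. If any group contains at least K suffixes, the length is feasible.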

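The grouping relies on one fact: for any suffix, the suffix that gives the longest LCP with it is always one of its neighbours in suffix array rank order.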
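
The fact above can be checked by brute force on small inputs. The following is only a sketch: `naiveSuffixArray`, `lcp` and `neighbourIsBest` are hypothetical helper names that do not appear in the solution below, and the suffix array is built by plain sorting rather than by the doubling algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Length of the longest common prefix of the suffixes starting at a and b.
int lcp(const std::string& s, int a, int b) {
    int n = (int)s.size(), k = 0;
    while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
    return k;
}

// Naive suffix array: starting positions sorted by the suffix they begin.
std::vector<int> naiveSuffixArray(const std::string& s) {
    std::vector<int> sa(s.size());
    for (int i = 0; i < (int)sa.size(); ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}

// True if, for every suffix, the best LCP over all other suffixes
// is already reached by one of its two rank neighbours.
bool neighbourIsBest(const std::string& s) {
    std::vector<int> sa = naiveSuffixArray(s);
    int n = (int)sa.size();
    for (int i = 0; i < n; ++i) {
        int best = 0, adjacent = 0;
        for (int j = 0; j < n; ++j)
            if (j != i) best = std::max(best, lcp(s, sa[i], sa[j]));
        if (i > 0) adjacent = std::max(adjacent, lcp(s, sa[i], sa[i - 1]));
        if (i + 1 < n) adjacent = std::max(adjacent, lcp(s, sa[i], sa[i + 1]));
        if (best != adjacent) return false;
    }
    return true;
}
```

Calling `neighbourIsBest` on any short string should return true, because the LCP of two suffixes equals the minimum Height value between their ranks and so cannot exceed the LCP with a direct neighbour.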

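At first I misunderstood the grouping and came up with a wrong approach:

Scan the Height array and count every value ≥ the current binary-search length (cnt++); if cnt ≥ K at the end, declare the length feasible. Unsurprisingly, this fails, because it adds up matches that belong to different groups.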
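
A small counterexample makes the failure concrete. This is a sketch under assumptions: `naiveHeight`, `wrongCheck` and `groupedCheck` are hypothetical names, ranks are 0-based here (the solution below uses 1-based ranks with a sentinel), and `aim` is assumed to be at least 2.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Naive height array with 0-based ranks: height[i] is the LCP of the
// suffixes ranked i-1 and i, and height[0] = 0.
std::vector<int> naiveHeight(const std::vector<int>& s) {
    int n = (int)s.size();
    std::vector<int> sa(n);
    for (int i = 0; i < n; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return std::lexicographical_compare(s.begin() + a, s.end(),
                                            s.begin() + b, s.end());
    });
    std::vector<int> height(n, 0);
    for (int i = 1; i < n; ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
        height[i] = k;
    }
    return height;
}

// The mistaken check: counts every height >= len, never resetting.
bool wrongCheck(const std::vector<int>& height, int len, int aim) {
    int cnt = 1;
    for (size_t i = 1; i < height.size(); ++i)
        if (height[i] >= len && ++cnt >= aim) return true;
    return false;
}

// The grouped check: the counter restarts whenever a height < len
// closes the current group.
bool groupedCheck(const std::vector<int>& height, int len, int aim) {
    int cnt = 1;
    for (size_t i = 1; i < height.size(); ++i) {
        if (height[i] >= len) { if (++cnt >= aim) return true; }
        else cnt = 1;
    }
    return false;
}
```

For the sequence 1 2 1 2 3 4 3 4 with length 2 and K = 3, the pairs "1 2" and "3 4" each occur only twice, yet `wrongCheck` accepts because it sums the two unrelated matches, while `groupedCheck` rejects.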

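Here is how I understand the Height grouping. Take the figure from the original post and suppose the current length threshold is 2. What does the first group mean? In other words, why do those particular suffixes belong together? Every Height value inside the first group is at least 2, which means that all suffixes in the group share a common prefix of length 2 (no shorter than the threshold), namely "aa". So the substring "aa" occurs, with overlaps allowed, cnt times, where cnt is the number of suffixes in the first group. By now the purpose of the grouping should be getting clear.

Could some suffix start with "aa" and still be left out of the first group? Recall the fact stated above (highlighted in red in the original post). Combine it with the fact that Height is indexed by rank: scanning Height from the smallest index to the largest visits the suffixes in rank order, so Height[i] belongs to the suffix ranked i. Together these guarantee that this cannot happen, and that each group is a contiguous block: no member of a group can appear anywhere outside it.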
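
The grouping can be reproduced on a small string. The sketch below assumes the example string is "aabaaaab"; that is a guess, chosen only because its first group shares the prefix "aa" as described. `naiveHeight` and `groupSizes` are hypothetical names, and ranks are 0-based.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Naive height array with 0-based ranks: height[i] is the LCP of the
// suffixes ranked i-1 and i, and height[0] = 0.
std::vector<int> naiveHeight(const std::string& s) {
    int n = (int)s.size();
    std::vector<int> sa(n);
    for (int i = 0; i < n; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    std::vector<int> height(n, 0);
    for (int i = 1; i < n; ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
        height[i] = k;
    }
    return height;
}

// Sizes of the contiguous groups, in rank order: a suffix joins the
// current group when its height is >= len, otherwise it opens a new one.
std::vector<int> groupSizes(const std::vector<int>& height, int len) {
    std::vector<int> sizes;
    for (size_t i = 0; i < height.size(); ++i) {
        if (i > 0 && height[i] >= len) ++sizes.back();
        else sizes.push_back(1);
    }
    return sizes;
}
```

For "aabaaaab" with length 2 the group sizes are 4, 2, 1, 1: the first group holds the four suffixes starting with "aa" (positions 0, 3, 4, 5) and the second holds the two starting with "ab".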

#include<stdio.h>
#include<string.h>
#include<algorithm>
using namespace std;
const int maxn = 1e6 + 10;

int sa[maxn], s[maxn], wa[maxn], Ws[maxn], wv[maxn], wb[maxn];
int Rank[maxn], height[maxn];

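// Two suffixes get the same rank in this round iff both halves of their keys match.
bool cmp(int r[], int a, int b, int l){ return r[a] == r[b] && r[a+l] == r[b+l]; }
// Doubling algorithm: sorts the n suffixes of r (values in [0, m)) into sa.
// r[n-1] must be a sentinel strictly smaller than every other value.
void da(int r[], int sa[], int n, int m)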
{
    int i, j, p, *x = wa, *y = wb;
    for (i = 0; i < m; ++i) Ws[i] = 0;
    for (i = 0; i < n; ++i) Ws[x[i]=r[i]]++;
    for (i = 1; i < m; ++i) Ws[i] += Ws[i-1];
    for (i = n-1; i >= 0; --i) sa[--Ws[x[i]]] = i;
    for (j = 1, p = 1; p < n; j *= 2, m = p)
    {
        for (p = 0, i = n - j; i < n; ++i) y[p++] = i;
        for (i = 0; i < n; ++i) if (sa[i] >= j) y[p++] = sa[i] - j;
        for (i = 0; i < n; ++i) wv[i] = x[y[i]];
        for (i = 0; i < m; ++i) Ws[i] = 0;
        for (i = 0; i < n; ++i) Ws[wv[i]]++;
        for (i = 1; i < m; ++i) Ws[i] += Ws[i-1];
        for (i = n-1; i >= 0; --i) sa[--Ws[wv[i]]] = y[i];
        for (std::swap(x, y), p = 1, x[sa[0]] = 0, i = 1; i < n; ++i)
            x[sa[i]] = cmp(y, sa[i-1], sa[i], j) ? p-1 : p++;
    }
}
// Fills Rank and height: height[i] is the LCP of the suffixes ranked i-1 and i (ranks 1..n).
void calheight(int r[], int sa[], int n)
{
    int i, j, k = 0;
    for (i = 1; i <= n; ++i) Rank[sa[i]] = i;
    for (i = 0; i < n; height[Rank[i++]] = k)
        for (k?k--:0, j = sa[Rank[i]-1]; r[i+k] == r[j+k]; k++);
}

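// Can some substring of length len occur at least aim times?
// Consecutive ranks whose height is >= len form one group sharing a prefix of length len.
bool IsOk(int len, int n, int aim)
{
    int cnt = 1;
//    for(int i=2; i<=n; i++){ // Wrong! Counts across groups without resetting.
//        if(height[i] >= len)
//            if(++cnt >= aim)
//                return true;
//    }return false;
    for(int i=2; i<=n; i++){
        if(height[i] >= len){ if(++cnt >= aim) return true; }
        else cnt = 1; // a new group starts here
    }
    return false;
}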

int arr[maxn];
int main(void)
{
    int N, K;
    while(~scanf("%d %d", &N, &K)){

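        for(int i=0; i<N; i++){
            scanf("%d", &arr[i]);
            arr[i]++;   // shift values up by 1 so that 0 is reserved for the sentinel
        }
        arr[N] = 0;     // sentinel, reset for every test case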

        da(arr, sa, N+1, 1000005);
        calheight(arr, sa, N);

        int L = 0, R = N, ans = -1;
        while(L <= R){
            int mid = L + ((R-L)>>1);
            if(IsOk(mid, N, K)) ans = mid, L = mid + 1;
            else R = mid - 1;
        }
        ans==-1? puts("0") : printf("%d\n", ans);
    }
    return 0;
}

Time: 2024-10-07 18:42:32
