SPOJ - PHRASES Relevant Phrases of Annihilation (suffix array)

You are the King of Byteland. Your agents have just intercepted a batch of encrypted enemy messages concerning the date of the planned attack on your island. You immediately send for the Bytelandian Cryptographer, but he is currently busy eating popcorn and claims that he may only decrypt the most important part of the text (since the rest would be a waste of his time). You decide to select the fragment of the text which the enemy has strongly emphasised, evidently regarding it as the most important. So, you are looking for a fragment of text which appears in all the messages disjointly at least twice. Since you are not overfond of the cryptographer, try to make this fragment as long as possible.

Input

The first line of input contains a single positive integer t<=10, the number of test cases. t test cases follow. Each test case begins with integer n (n<=10), the number of messages. The next n lines contain the messages, consisting only of between 2 and 10000 characters 'a'-'z', possibly with some additional trailing white space which should be ignored.

Output

For each test case output the length of the longest string which appears disjointly at least twice in all of the messages.

Example

Input:
1
4
abbabba
dabddkababa
bacaba
baba

Output:
2

(in the example above, the longest substring which fulfills the requirements is 'ba')

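Problem summary:

Find the length of the longest substring that occurs at least twice, without overlapping, in every string.

Approach:

Concatenate the strings with distinct separators and build the suffix array, then binary search on the answer. For a candidate length mid, split the suffix array into groups of consecutive suffixes whose height values are all at least mid. Within each group, record for every original string the minimum and maximum starting positions of its suffixes, then check whether the maximum minus the minimum is at least mid for every string.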
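The non-overlap test (largest start minus smallest start is at least the candidate length) can be cross-checked on small inputs with a brute-force sketch. The helpers `feasibleBrute` and `longestBrute` below are hypothetical names introduced only for illustration. They are not part of the original solution and are far too slow for the full limits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: true if some substring of length `len` occurs at least
// twice without overlapping in every message.
bool feasibleBrute(const std::vector<std::string>& msgs, int len) {
    assert(!msgs.empty());
    if (len == 0) return true;
    // good[sub] = number of messages in which `sub` has two disjoint occurrences
    std::map<std::string, int> good;
    for (const std::string& m : msgs) {
        // substring -> (smallest start, largest start) inside this message
        std::map<std::string, std::pair<int, int>> span;
        for (int i = 0; i + len <= (int)m.size(); i++) {
            std::string sub = m.substr(i, len);
            auto it = span.find(sub);
            if (it == span.end()) span[sub] = {i, i};
            else it->second.second = i;  // i only grows, so it is the new maximum
        }
        for (const auto& kv : span) {
            // two occurrences are disjoint iff their starts differ by at least len
            if (kv.second.second - kv.second.first >= len) good[kv.first]++;
        }
    }
    for (const auto& kv : good) {
        if (kv.second == (int)msgs.size()) return true;
    }
    return false;
}

// Hypothetical helper: tries every length up to half of the shortest message.
int longestBrute(const std::vector<std::string>& msgs) {
    assert(!msgs.empty());
    std::size_t shortest = msgs[0].size();
    for (const std::string& m : msgs) shortest = std::min(shortest, m.size());
    int best = 0;
    for (int len = 1; 2 * len <= (int)shortest; len++) {
        if (feasibleBrute(msgs, len)) best = len;
    }
    return best;
}
```

Feasibility is monotone: if a length works, every shorter length works too, because the prefixes of the two occurrences are still disjoint. That is why the real solution can binary search instead of scanning every length.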

#include<iostream>
#include<algorithm>
#include<vector>
#include<stack>
#include<queue>
#include<map>
#include<set>
#include<cstdio>
#include<cstring>
#include<cmath>
#include<ctime>

#define fuck(x) cerr<<#x<<" = "<<x<<endl;
#define debug(a, x) cerr<<#a<<"["<<x<<"] = "<<a[x]<<endl;
#define ls (t<<1)
#define rs ((t<<1)|1)
using namespace std;
typedef long long ll;
typedef unsigned long long ull;
const int maxn = 100186;
const int maxm = 100086;
const int inf = 0x3f3f3f3f;
const ll Inf = 999999999999999999;
const int mod = 1000000007;
const double eps = 1e-6;
const double pi = acos(-1);

int s[maxn];
int len, Rank[maxn], sa[maxn], tlen, tmp[maxn];

bool compare_sa(int i, int j) {
    if (Rank[i] != Rank[j]) { return Rank[i] < Rank[j]; }
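    // If the second half, starting at i + tlen, runs past the end of the string, use -inf as its rank.
    // When everything before it is equal, the shorter string is lexicographically smaller.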
    int ri = i + tlen <= len ? Rank[i + tlen] : -inf;
    int rj = j + tlen <= len ? Rank[j + tlen] : -inf;
    return ri < rj;
}

void construct_sa() {
    // The initial rank of each suffix is the code of its first character
    for (int i = 0; i <= len; i++) {
        sa[i] = i;
        Rank[i] = i < len ? s[i] : -inf;
    }
    for (tlen = 1; tlen <= len; tlen *= 2) {
        sort(sa, sa + len + 1, compare_sa);
        tmp[sa[0]] = 0;
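        // Compute the new ranks into tmp.
        // The lexicographically smallest suffix gets rank 0.
        // sa is sorted, so compare each suffix with the one before it: if it is larger, its rank is the previous rank plus one.
        // Otherwise it shares the previous rank.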
        for (int i = 1; i <= len; i++) {
            tmp[sa[i]] = tmp[sa[i - 1]] + (compare_sa(sa[i - 1], sa[i]) ? 1 : 0);
        }
        for (int i = 0; i <= len; i++) {
            Rank[i] = tmp[i];

        }
    }
}

int height[maxn];

void construct_lcp() {
//    for(int i=0;i<=n;i++){Rank[sa[i]]=i;}
    int h = 0;
    height[0] = 0;
    for (int i = 0; i < len; i++) {// i is the starting position of the suffix in the text
        int j = sa[Rank[i] - 1];// the suffix immediately before suffix i in sorted order
        if (h > 0)h--;
        for (; j + h < len && i + h < len; h++) {
            if (s[j + h] != s[i + h])break;
        }
        height[Rank[i]] = h;
    }
}

int st[maxn][20];

void rmq_init() {
    for (int i = 1; i <= len; i++) {
        st[i][0] = height[i];
    }
    int l = 2;
    for (int i = 1; l <= len; i++) {
        for (int j = 1; j + l / 2 <= len; j++) {
            st[j][i] = min(st[j][i - 1], st[j + l / 2][i - 1]);
        }
        l <<= 1;
    }
}

int ask_min(int i, int j) {
    int k = int(log(j - i + 1.0) / log(2.0));
    return min(st[i][k], st[j - (1 << k) + 1][k]);
}

int lcp(int a, int b)// the arguments are indices into the original string
{
    a = Rank[a], b = Rank[b];
    if (a > b)
        swap(a, b);
    return ask_min(a + 1, b);
}

char str[maxn];
int intr[maxn];
int mx[105];
int mn[105];
int n;
bool check(int mid){
    if(mid==0){ return  true;}
    memset(mx,0,sizeof(mx));
    memset(mn,0x3f,sizeof(mn));
    for(int i=1;i<=len;i++){
        int prestart =  lower_bound(intr+1,intr+1+n,sa[i-1])-intr;
        int start = lower_bound(intr+1,intr+1+n,sa[i])-intr;
//        cout<<height[i]<<endl;
        if(height[i]>=mid){
            mn[start]=min(mn[start],sa[i]);
            mx[start]=max(mx[start],sa[i]);
            mn[prestart]=min(mn[prestart],sa[i-1]);
            mx[prestart]=max(mx[prestart],sa[i-1]);
        }else{
            bool flag=true;
//            fuck(n)
            for(int j=1;j<=n;j++){
//                cerr<<mx[j]<<" "<<mn[j]<<endl;
                if(mx[j]-mn[j]<mid){flag=false;}
            }
            if(flag){return true;}
            memset(mx,0,sizeof(mx));
            memset(mn,0x3f,sizeof(mn));
        }
    }
    return false;
}

int main() {
//    ios::sync_with_stdio(false);
//    freopen("in.txt", "r", stdin);

    int cases=0;

    int T;
    scanf("%d",&T);
    while (T--){
        scanf("%d",&n);
        cases++;
        len=0;
        int lenx = 0;
        for(int i=1;i<=n;i++){
            scanf("%s",str);
            int l=strlen(str);
            lenx = max(lenx,l);
            for(int j=0;j<l;j++){
                s[len++]=(int)str[j]-'a'+1;
            }
            s[len++]=200+i;
            intr[i]=len-1;
        }

        construct_sa();
        construct_lcp();

//        fuck(check(4));
        int l=0,r=lenx;
        int ans=0;
        while (r>=l){
            int mid=(l+r)/2;
//            cout<<mid<<endl;
//            cout<<l<<" "<<r<<endl;
            if(check(mid)){
                ans=mid;
                l=mid+1;
            }else{
                r=mid-1;
            }
        }
        printf("%d\n",ans);
    }

    return 0;
}
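The solution above relies on two bookkeeping steps: joining the messages with unique separators, and mapping a suffix start back to its message with `lower_bound` over the separator positions (the `intr` array). The sketch below isolates those two steps. It uses 0-based message indices, and `Concat`, `concatenate` and `ownerOf` are names assumed for illustration rather than taken from the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Assumed container for the joined text and the position of each separator.
struct Concat {
    std::vector<int> text;    // letters mapped to 1..26, separators to 201, 202, ...
    std::vector<int> sepPos;  // sepPos[k] = index of the separator ending message k
};

// Joins the messages, appending a distinct separator after each one so that
// no common prefix of two suffixes can extend across a message boundary.
Concat concatenate(const std::vector<std::string>& msgs) {
    Concat c;
    for (std::size_t k = 0; k < msgs.size(); k++) {
        for (char ch : msgs[k]) c.text.push_back(ch - 'a' + 1);
        c.text.push_back(200 + (int)k + 1);  // unique, larger than any letter code
        c.sepPos.push_back((int)c.text.size() - 1);
    }
    return c;
}

// Returns the 0-based index of the message that contains position `pos`:
// the first separator at or after `pos` belongs to that message.
int ownerOf(const std::vector<int>& sepPos, int pos) {
    assert(!sepPos.empty() && pos >= 0 && pos <= sepPos.back());
    return (int)(std::lower_bound(sepPos.begin(), sepPos.end(), pos) - sepPos.begin());
}
```

Because the separator positions are stored in increasing order, each lookup costs O(log n), which is negligible next to the suffix array construction.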

Original article: https://www.cnblogs.com/ZGQblogs/p/11181752.html

Time: 2024-08-24 23:18:44
