POJ 2774 Long Long Message (Suffix Array & Suffix Automaton)

Problem:

The little cat is majoring in physics in the capital of Byterland. A piece of sad news comes to him these days: his mother is getting ill. Being worried about spending so much on railway tickets (Byterland is such a big country, and he has to spend 16 hours on the train to his hometown), he decided only to send SMS with his mother.

The little cat lives in an unrich family, so he frequently comes to the mobile service center, to check how much money he has spent on SMS. Yesterday, the computer of service center was broken, and printed two very long messages. The brilliant little cat soon found out:

1. All characters in messages are lowercase Latin letters, without punctuations and spaces. 
2. All SMS has been appended to each other – (i+1)-th SMS comes directly after the i-th one – that is why those two messages are quite long. 
3. His own SMS has been appended together, but possibly a great many redundancy characters appear leftwards and rightwards due to the broken computer. 
E.g: if his SMS is “motheriloveyou”, either long message printed by that machine, would possibly be one of “hahamotheriloveyou”, “motheriloveyoureally”, “motheriloveyouornot”, “bbbmotheriloveyouaaa”, etc. 
4. For these broken issues, the little cat has printed his original text twice (so there appears two very long messages). Even though the original text remains the same in two printed messages, the redundancy characters on both sides would be possibly different.

You are given those two very long messages, and you have to output the length of the longest possible original text written by the little cat.

Background:

The SMS in Byterland mobile service are charging in dollars-per-byte. That is why the little cat is worrying about how long could the longest original text be.

Why ask you to write a program? There are four reasons: 
1. The little cat is so busy these days with physics lessons; 
2. The little cat wants to keep what he said to his mother secret; 
3. POJ is such a great Online Judge; 
4. The little cat wants to earn some money from POJ, and try to persuade his mother to see the doctor :(

Input

Two strings with lowercase letters on two of the input lines individually. Number of characters in each one will never exceed 100000.

Output

A single line with a single integer number – what is the maximum length of the original text written by the little cat.

Sample Input

yeshowmuchiloveyoumydearmotherreallyicannotbelieveit
yeaphowmuchiloveyoumydearmother

Sample Output

27

Task:

Find the length of the longest common substring of the two strings.
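
For small inputs, the answer can be cross-checked with the classic quadratic dynamic programming over pairs of end positions. This is only a reference sketch and not one of the solutions in this post: it is far too slow for strings of length 100000, and the name `lcsNaive` is made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Reference only: O(|a| * |b|) time, O(|b|) memory.
// cur[j] = length of the longest common suffix of a[0..i) and b[0..j).
int lcsNaive(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            // Extend the match ending at (i-1, j-1), or reset on a mismatch.
            cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1 : 0;
            best = std::max(best, cur[j]);
        }
        std::swap(prev, cur);  // cur[0] stays 0, the rest is overwritten next round
    }
    return best;
}
```

On the sample input the common substring is "howmuchiloveyoumydearmother", which has 27 characters.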

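Approach:

Suffix automaton: build the automaton from the first string, then match the second string against it directly. I also used this as a chance to write the suffix automaton from memory once more.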

#include<cstdio>
#include<cstring>
#include<algorithm>
using namespace std;
// A suffix automaton built from a string of length n has at most 2n-1 states,
// so 200010 is enough for n<=100000 with 1-indexed states.
const int maxn=200010;
char chr[maxn];
struct SAM
{
    // ch: transitions, fa: suffix link, maxlen: length of the longest substring in a state
    int ch[maxn][26],fa[maxn],maxlen[maxn],Last,sz;
    void init()
    {
        sz=Last=1;    fa[1]=maxlen[1]=0;   // state 1 is the root, 0 means "no state"
        memset(ch[1],0,sizeof(ch[1]));
    }
    void add(int x)
    {
        int np=++sz,p=Last;Last=np;
        memset(ch[np],0,sizeof(ch[np]));
        maxlen[np]=maxlen[p]+1;
        while(p&&!ch[p][x]) ch[p][x]=np,p=fa[p];
        if(!p) fa[np]=1;
        else {
            int q=ch[p][x];
            if(maxlen[p]+1==maxlen[q]) fa[np]=q;
            else {
                // split q: nq keeps the substrings of length <= maxlen[p]+1
                int nq=++sz;
                memcpy(ch[nq],ch[q],sizeof(ch[q]));
                maxlen[nq]=maxlen[p]+1;
                fa[nq]=fa[q];
                fa[q]=fa[np]=nq;
                while(p&&ch[p][x]==q) ch[p][x]=nq,p=fa[p];
            }
        }
    }
    // Reads the second string and matches it against the automaton.
    void solve()
    {
        if(scanf("%s",chr)!=1) return;
        int L=strlen(chr),x,tmp=0,ans=0;Last=1;
        for(int i=0;i<L;i++){
            x=chr[i]-'a';
            if(ch[Last][x]) tmp++,Last=ch[Last][x];
            else {
                // on a mismatch, follow suffix links until a transition on x exists
                while(Last&&!ch[Last][x]) Last=fa[Last];
                if(!Last) tmp=0,Last=1;
                else tmp=maxlen[Last]+1,Last=ch[Last][x];
            }
            ans=max(ans,tmp);
        }
        printf("%d\n",ans);
    }
};
SAM Sam;
int main()
{
    int i,L;
    while(scanf("%s",chr)==1){
        Sam.init();
        L=strlen(chr);
        for(i=0;i<L;i++) Sam.add(chr[i]-'a');
        Sam.solve();
    }
    return 0;
}
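
To test the matching idea without going through standard input, the same technique can be wrapped in a function that takes both strings as arguments. The sketch below is an assumption-laden variant, not the submitted code: `State` and `lcsBySam` are names made up here, states live in a `std::vector`, and the input is assumed to contain only lowercase letters. The two branches of the matching loop are folded into one `min` expression, which yields the same values.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

struct State {
    std::array<int, 26> next{};  // 0 means "no transition"
    int link = 0;                // suffix link
    int len = 0;                 // length of the longest substring in this state
};

// Builds a suffix automaton of a, then walks b through it.
int lcsBySam(const std::string& a, const std::string& b) {
    std::vector<State> st(2);    // index 0 is unused, index 1 is the root
    st.reserve(2 * a.size() + 2);
    int last = 1;
    for (char c : a) {
        const int x = c - 'a';
        const int np = static_cast<int>(st.size());
        st.emplace_back();
        st[np].len = st[last].len + 1;
        int p = last;
        last = np;
        while (p && !st[p].next[x]) { st[p].next[x] = np; p = st[p].link; }
        if (!p) { st[np].link = 1; continue; }
        const int q = st[p].next[x];
        if (st[p].len + 1 == st[q].len) { st[np].link = q; continue; }
        State clone = st[q];     // copy transitions and suffix link of q
        clone.len = st[p].len + 1;
        const int nq = static_cast<int>(st.size());
        st.push_back(clone);
        st[q].link = st[np].link = nq;
        while (p && st[p].next[x] == q) { st[p].next[x] = nq; p = st[p].link; }
    }

    int cur = 1, matched = 0, best = 0;
    for (char c : b) {
        const int x = c - 'a';
        // Drop to shorter suffixes until one of them can be extended by c.
        while (cur && !st[cur].next[x]) cur = st[cur].link;
        if (!cur) {
            cur = 1;
            matched = 0;
        } else {
            // If suffix links were followed, the match shrinks to st[cur].len.
            matched = std::min(matched, st[cur].len) + 1;
            cur = st[cur].next[x];
        }
        best = std::max(best, matched);
    }
    return best;
}
```

Calling `lcsBySam` with the two sample strings should give the sample answer of 27.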

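Suffix array: concatenate the two strings with a special separator character between them that occurs in neither string. Because the separator appears only once, the common prefix of a suffix starting in the first string and a suffix starting in the second string can never run across the boundary. So whenever two adjacent suffixes in the suffix array start in different original strings, their height (LCP) value is the length of a genuine common substring, and the answer is the largest such value.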
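
The role of the separator is easiest to see on small inputs. The sketch below sorts the suffixes of the concatenation naively with `std::sort` instead of building a real suffix array, so it only illustrates the idea and is not suitable for the 100000-character limit. The function name `lcsBySortedSuffixes` and the choice of `'{'` (the ASCII character right after `'z'`) as separator are assumptions made for this example.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Illustration only: naive suffix sorting, quadratic or worse in the worst case.
int lcsBySortedSuffixes(const std::string& a, const std::string& b) {
    // '{' occurs in neither input (lowercase letters only) and sorts after 'z'.
    const std::string s = a + '{' + b;
    const int n = static_cast<int>(s.size());
    const int la = static_cast<int>(a.size());

    std::vector<int> sa(n);
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int x, int y) {
        return s.compare(x, std::string::npos, s, y, std::string::npos) < 0;
    });

    int best = 0;
    for (int i = 1; i < n; ++i) {
        const int x = sa[i - 1], y = sa[i];
        // Only neighbours that start on different sides of the separator
        // can witness a substring common to a and b.
        if ((x < la) == (y < la)) continue;
        int lcp = 0;
        while (x + lcp < n && y + lcp < n && s[x + lcp] == s[y + lcp]) ++lcp;
        best = std::max(best, lcp);
    }
    return best;
}
```

Checking only adjacent pairs is enough: the LCP of two arbitrary suffixes is the minimum of the adjacent LCP values between them in sorted order, and somewhere on that stretch the origin switches from one string to the other.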

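Performance comparison:

The suffix automaton ran in 94 ms and the suffix array in 782 ms. For a basic problem like this I still prefer to write the suffix automaton, but the extra practice never hurts.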

#include<cstdio>
#include<cstring>
#include<algorithm>
using namespace std;
// The combined string has at most 100000+1+100000 characters; ch[L+1] must also exist.
const int maxn=200010;
char str1[maxn],str2[maxn];
// ch[1..L]: letters mapped to 1..26, separator mapped to 27; ch[0] and ch[L+1] stay 0
int L,ch[maxn];
struct SA
{
    int cntA[maxn],cntB[maxn],A[maxn],B[maxn];
    int rank[maxn],sa[maxn],tsa[maxn],ht[maxn];
    // Prefix doubling with counting sort; sa and rank are 1-indexed.
    void sort()
    {
         // initial order by first character
         for (int i = 0; i < 28; i ++) cntA[i] = 0;
         for (int i = 1; i <= L; i ++) cntA[ch[i]] ++;
         for (int i = 1; i < 28; i ++) cntA[i] += cntA[i - 1];
         for (int i = L; i; i --) sa[cntA[ch[i]] --] = i;
         rank[sa[1]] = 1;
         for (int i = 2; i <= L; i ++){
              rank[sa[i]] = rank[sa[i - 1]];
              if (ch[sa[i]] != ch[sa[i - 1]]) rank[sa[i]] ++;
         }
         // double the compared length until all ranks are distinct
         for (int l = 1; rank[sa[L]] < L; l <<= 1){
              for (int i = 0; i <= L; i ++) cntA[i] = 0;
              for (int i = 0; i <= L; i ++) cntB[i] = 0;
              for ( int i = 1; i <= L; i ++){
                  cntA[A[i] = rank[i]] ++;
                  cntB[B[i] = (i + l <= L) ? rank[i + l] : 0] ++;
              }
              // radix sort: second key (B) first, then first key (A)
              for (int i = 1; i <= L; i ++) cntB[i] += cntB[i - 1];
              for (int i = L; i; i --) tsa[cntB[B[i]] --] = i;
              for (int i = 1; i <= L; i ++) cntA[i] += cntA[i - 1];
              for (int i = L; i; i --) sa[cntA[A[tsa[i]]] --] = tsa[i];
              rank[sa[1]] = 1;
              for (int i = 2; i <= L; i ++){
                   rank[sa[i]] = rank[sa[i - 1]];
                   if (A[sa[i]] != A[sa[i - 1]] || B[sa[i]] != B[sa[i - 1]]) rank[sa[i]] ++;
              }
         }
    }
    // ht[i] = LCP of the suffixes sa[i-1] and sa[i]; ht[1] is 0.
    void getht()
    {
         for (int i = 1, j = 0; i <= L; i ++){
              if (j) j --;
              while (ch[i + j] == ch[sa[rank[i] - 1] + j]) j ++;
              ht[rank[i]] = j;
        }
    }
};
SA Sa;
int main()
{
    scanf("%s",str1+1);
    scanf("%s",str2+1);
    int L1=strlen(str1+1);
    int L2=strlen(str2+1);
    for(int i=1;i<=L1;i++) ch[i]=str1[i]-'a'+1;
    ch[L1+1]=27;   // separator, larger than every letter
    for(int i=1;i<=L2;i++) ch[i+L1+1]=str2[i]-'a'+1;
    L=L1+L2+1;
    Sa.sort();
    Sa.getht();
    int ans=0;
    // start at 2: ht[1] has no predecessor to compare with
    for(int i = 2; i <= L; i++)
    {
        // adjacent suffixes that start in different original strings
        if((Sa.sa[i]<=L1)!=(Sa.sa[i-1]<=L1))
            ans = max(ans, Sa.ht[i]);
    }
    printf("%d\n",ans);
    return 0;
}
Time: 2024-10-11 12:59:10
