FZU-2075 Substring (Suffix Array)

Description

Given a string, find a substring that occurs exactly n times in the original string.

Input

There are several cases. The first line of each case contains an integer n. The second line contains a string no longer than 100000 characters.

Output

If no such substring exists, output "impossible"; otherwise output the substring that appears exactly n times in the original string. If there are multiple solutions, output the lexicographically smallest one.

Sample Input

2
ababba

Sample Output

ab

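Problem summary: given a string, find the lexicographically smallest substring that occurs exactly n times.

Analysis: sort all suffixes (suffix array order) and define an n-block as n adjacent suffixes in that order. If the LCP (longest common prefix) of an n-block is strictly longer than the LCP of every (n+1)-block that contains it, then the LCP of that n-block is a substring that occurs exactly n times.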
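The n-block criterion can be sketched on small inputs without a suffix array. The following is a minimal, naive illustration (roughly O(n^2 log n)), not the solution used in this post: the names `lcpLen` and `exactlyNTimesNaive` are hypothetical, and the choice to return the shortest prefix that is longer than both neighboring LCPs is an assumption based on the fact that a proper prefix sorts before the longer string.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Length of the longest common prefix of two strings.
static int lcpLen(const std::string& a, const std::string& b) {
    int k = 0;
    while (k < (int)a.size() && k < (int)b.size() && a[k] == b[k]) ++k;
    return k;
}

// Naive sketch of the n-block idea; intended for short strings only.
std::string exactlyNTimesNaive(const std::string& s, int n) {
    int len = (int)s.size();
    if (n < 1 || n > len) return "impossible";

    // Sort all suffixes explicitly (a suffix array would store indices only).
    std::vector<std::string> suf;
    for (int i = 0; i < len; ++i) suf.push_back(s.substr(i));
    std::sort(suf.begin(), suf.end());

    for (int i = 0; i + n - 1 < len; ++i) {
        // LCP of the block = LCP of its first and last suffix in sorted order.
        int inner = (n == 1) ? (int)suf[i].size()
                             : lcpLen(suf[i], suf[i + n - 1]);
        // LCPs with the suffixes just outside the block on each side.
        int left  = (i > 0)       ? lcpLen(suf[i - 1], suf[i])         : 0;
        int right = (i + n < len) ? lcpLen(suf[i + n - 1], suf[i + n]) : 0;
        int outer = std::max(left, right);
        // Prefixes with length in (outer, inner] occur exactly n times.
        if (inner > outer) return suf[i].substr(0, outer + 1);
    }
    return "impossible";
}
```

For the sample, the sorted suffixes of "ababba" are a, ababba, abba, ba, babba, bba; the block {ababba, abba} has LCP 2 while its neighbors share at most 1 character, so `exactlyNTimesNaive("ababba", 2)` yields "ab".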

The code is as follows:
//# define AC

# ifndef AC

# include<iostream>
# include<cstdio>
# include<cstring>
# include<vector>
# include<queue>
# include<list>
# include<cmath>
# include<set>
# include<map>
# include<string>
# include<cstdlib>
# include<algorithm>
using namespace std;

const int N=100000;

int SA[N+5];
int tSA[N+5];
int rk[N+5];
int cnt[N+5];
int height[N+5];
int *x,*y;

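// True if suffixes i and j agree on their first 2k characters, given the
// ranks y[] of their first k characters; running past the end is a mismatch.
bool same(int i,int j,int k,int n)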
{
    if(y[i]!=y[j]) return false;
    if(i+k<n&&j+k>=n) return false;
    if(i+k>=n&&j+k<n) return false;
    return y[i+k]==y[j+k];
}

// Builds the suffix array SA of s by prefix doubling with counting sort.
void buildSA(char* s)
{
    x=rk,y=tSA;
    int m=130;
    int n=strlen(s);
    for(int i=0;i<m;++i) cnt[i]=0;
    for(int i=0;i<n;++i) ++cnt[x[i]=s[i]];
    for(int i=1;i<m;++i) cnt[i]+=cnt[i-1];
    for(int i=n-1;i>=0;--i) SA[--cnt[x[i]]]=i;

    for(int k=1;k<=n;k<<=1){
        int p=0;
        for(int i=n-k;i<n;++i) y[p++]=i;
        for(int i=0;i<n;++i) if(SA[i]>=k) y[p++]=SA[i]-k;

        for(int i=0;i<m;++i) cnt[i]=0;
        for(int i=0;i<n;++i) ++cnt[x[y[i]]];
        for(int i=1;i<m;++i) cnt[i]+=cnt[i-1];
        for(int i=n-1;i>=0;--i) SA[--cnt[x[y[i]]]]=y[i];

        p=1;
        swap(x,y);
        x[SA[0]]=0;
        for(int i=1;i<n;++i)
            x[SA[i]]=same(SA[i],SA[i-1],k,n)?p-1:p++;
        if(p>=n) break;
        m=p;
    }
}

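// Kasai's algorithm: height[i] is the LCP of suffixes SA[i-1] and SA[i];
// rk[] is refilled with the final rank of every suffix.
void getHeight(char* s)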
{
    int n=strlen(s);
    int k=0;
    for(int i=0;i<n;++i) rk[SA[i]]=i;
    for(int i=0;i<n;++i){
        if(rk[i]==0)
            height[rk[i]]=k=0;
        else{
            if(k) --k;
            int j=SA[rk[i]-1];
            while(s[i+k]==s[j+k]) ++k;
            height[rk[i]]=k;
        }
    }
}

char s[N+5];
int st[N+5][20];

// Sparse table over height[1..n-1]: st[i][k] = min of height[i..i+2^k-1].
void spare_table()
{
    int n=strlen(s);
    for(int i=1;i<n;++i)
        st[i][0]=height[i];
    for(int k=1;(1<<k)<=n;++k){
        for(int i=1;i+(1<<k)-1<n;++i){
            st[i][k]=min(st[i][k-1],st[i+(1<<(k-1))][k-1]);
        }
    }
}

// Minimum of height[l..r], i.e. the LCP of suffixes SA[l-1] and SA[r] (l<=r).
int getST(int l,int r)
{
    int k=0;
    while((1<<(k+1))<=r-l+1) ++k;
    return min(st[l][k],st[r-(1<<k)+1][k]);
}

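// Scans every block of m adjacent suffixes in sorted order and returns the
// lexicographically smallest substring that occurs exactly m times.
string solve(int m)
{
    string res="";
    int n=strlen(s);
    if(m<1) return "impossible";
    int a,b,c;
    for(int i=0;i+m-1<n;++i){
        a=b=c=0;
        // a: LCP of the block SA[i..i+m-1]; a single suffix is its own LCP.
        if(m==1) a=n-SA[i];
        else a=getST(i+1,i+m-1);
        // b, c: LCP of the (m+1)-blocks extended to the right / to the left.
        if(i+m<n) b=getST(i+1,i+m);
        if(i>0) c=getST(i,i+m-1);
        if(a>b&&a>c){
            // Every prefix longer than max(b,c) and no longer than a occurs
            // exactly m times; the shortest one is lexicographically smallest.
            for(int j=SA[i];j<SA[i]+max(b,c)+1;++j) res+=s[j];
            return res;
        }
    }
    return "impossible";
}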

int main()
{
    int m;
    while(~scanf("%d",&m))
    {
        scanf("%s",s);
        buildSA(s);
        getHeight(s);
        spare_table();
        cout<<solve(m)<<endl;
    }
    return 0;
}

# endif
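To sanity-check a suffix array solution like the one above on short strings, a brute-force reference can help. The following is a minimal sketch, assuming plain byte-wise comparison for the lexicographic order; the name `bruteForceExactlyN` is hypothetical, and the function stores every substring, so it is only practical for strings of a few hundred characters.

```cpp
#include <map>
#include <string>

// Counts every substring explicitly and returns the lexicographically
// smallest one that occurs exactly n times, or "impossible" if none does.
std::string bruteForceExactlyN(const std::string& s, int n) {
    std::map<std::string, int> occ;  // std::map iterates keys in sorted order
    for (size_t i = 0; i < s.size(); ++i)
        for (size_t len = 1; i + len <= s.size(); ++len)
            ++occ[s.substr(i, len)];  // one occurrence per start position

    for (const auto& kv : occ)
        if (kv.second == n) return kv.first;  // first hit is the smallest
    return "impossible";
}
```

Comparing its output against the suffix array program on random short strings over a small alphabet (for example only 'a' and 'b') tends to expose off-by-one errors in the block boundaries quickly.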

  

Time: 2024-10-13 15:12:11
