Lexicographical Substring Search SPOJ - SUBLEX （后缀数组）

Lexicographical Substrings Search

\[
Time Limit: 149 ms \quad Memory Limit: 1572864 kB
\]

题意

给出一个字符串，求出这个字符串上字典序第 \(k\) 小的子串。

思路

对于给出的字符串，求出后缀数组，根据后缀数组的 \(height\) 数组，我们可以很容易得到一个字符的总子串个数是 \(\sum_{i=1}^{n} (n-sa[i]+1+height[i])\)，利用这个式子，就可以求出第 \(k\) 小的子串了。

/***************************************************************
    > File Name    : a.cpp
    > Author       : Jiaaaaaaaqi
    > Created Time : 2019年05月22日 星期三 18时17分02秒
 ***************************************************************/

#include <map>
#include <set>
#include <list>
#include <ctime>
#include <cmath>
#include <stack>
#include <queue>
#include <cfloat>
#include <string>
#include <vector>
#include <cstdio>
#include <bitset>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <algorithm>
#define  lowbit(x)  x & (-x)
#define  mes(a, b)  memset(a, b, sizeof a)
#define  fi         first
#define  se         second
#define  pii        pair<int, int>

typedef unsigned long long int ull;
typedef long long int ll;
const int    maxn = 1e5 + 10;
const int    maxm = 1e5 + 10;
const ll     mod  = 1e9 + 7;
const ll     INF  = 1e18 + 100;
const int    inf  = 0x3f3f3f3f;
const double pi   = acos(-1.0);
const double eps  = 1e-8;
using namespace std;

int n, m;
int cas, tol, T;

char s[maxn];
int a[maxn], sa[maxn], rk[maxn], tax[maxn], height[maxn], tp[maxn];

void rsort(int n, int m) {
    for(int i=0; i<=m; i++) tax[i] = 0;
    for(int i=1; i<=n; i++) tax[rk[tp[i]]]++;
    for(int i=1; i<=m; i++) tax[i] += tax[i-1];
    for(int i=n; i>=1; i--) sa[tax[rk[tp[i]]]--] = tp[i];
}

int cmp(int *f, int x, int y, int w) {
    return f[x]==f[y] && f[x+w]==f[y+w];
}

void SA(int *a, int n, int m) {
    for(int i=1; i<=n; i++) rk[i] = a[i], tp[i] = i;
    rsort(n, m);
    for(int w=1, p=1, i; p<n; w<<=1, m=p) {
        for(p=0, i=n-w+1;i<=n; i++) tp[++p] = i;
        for(i=1; i<=n; i++) if(sa[i]>w) tp[++p] = sa[i]-w;
        rsort(n, m), swap(tp, rk);
        rk[sa[1]] = p = 1;
        for(i=2; i<=n; i++) rk[sa[i]] = cmp(tp, sa[i], sa[i-1], w) ? p : ++p;
    }
    int j, k=0;
    for(int i=1; i<=n; height[rk[i++]] = k)
        for(k=k ? k-1 : k, j=sa[rk[i]-1]; a[i+k]==a[j+k]; k++);
}

int main() {
    scanf("%s", s+1);
    n = strlen(s+1);
    for(int i=1; i<=n; i++) {
        a[i] = s[i];
    }
    SA(a, n, 260);
    // for(int i=1; i<=n; i++) {
    //     printf("sa[%d] = %d height[%d] = %d\n", i, sa[i], i, height[i]);
    // }
    scanf("%d", &T);
    while(T--) {
        scanf("%d", &m);
        for(int i=1; i<=n; i++) {
            if(m > n-sa[i]+1-height[i]) {
                m -= (n-sa[i]+1-height[i]);
            } else {
                int last = sa[i] + height[i] - 1;
                last += m;
                // printf("%d %d\n", sa[i], last);
                for(int j=sa[i]; j<=last; j++) {
                    printf("%c", s[j]);
                }
                printf("\n");
                break;
            }
        }
    }
    return 0;
}

原文地址：https://www.cnblogs.com/Jiaaaaaaaqi/p/10907821.html

时间： 2024-11-03 12:20:06

Lexicographical Substring Search SPOJ - SUBLEX （后缀数组）的相关文章

Lexicographical Substring Search SPOJ - SUBLEX （后缀自动机）

Lexicographical Substrings Search \[ Time Limit: 149 ms \quad Memory Limit: 1572864 kB \] 题意给出一个字符串,求出这个字符串上字典序第 \(k\) 小的子串. 思路先对给出的字符串构建后缀自动机,因为后缀自动机上从根节点走到任意节点都是原串的一个子串,所以我们可以 \(dfs\) 求出节点 \(i\) 往后存在多少个子串. 对于查询第 \(k\) 小的子串时,在用一个 \(dfs\) 来求,对于当前节点

SPOJ SUBLEX 7258. Lexicographical Substring Search

看起来像是普通的SAM+dfs...但SPOJ太慢了......倒腾了一个晚上不是WA 就是RE ..... 最后换SA写了...... Lexicographical Substring Search Time Limit: 1000MS Memory Limit: Unknown 64bit IO Format: %lld & %llu [Submit] [Go Back] [Status] Description Little Daniel loves to play wi

SPOJ SUBLEX - Lexicographical Substring Search

SUBLEX - Lexicographical Substring Search no tags Little Daniel loves to play with strings! He always finds different ways to have fun with strings! Knowing that, his friend Kinan decided to test his skills so he gave him a string S and asked him Q q

SPOJ 220后缀数组：求每个字符串至少出现两次且不重叠的最长子串

思路:也是n个串连接成一个串,中间用没出现过的字符隔开,然后求后缀数组. 因为是不重叠的,所以和POJ 1743判断一样,只不过这里是多个串,每个串都要判断里面的最长公共前缀有没有重叠,所以用数组存下来就得了,然后再判断. #include<iostream> #include<cstdio> #include<cstring> #include<algorithm> #include<map> #include<queue> #in

【SPOJ】7258. Lexicographical Substring Search（后缀自动机）

http://www.spoj.com/problems/SUBLEX/ 后缀自动机系列完成QAQ...撒花..明天or今晚写个小结? 首先得知道:后缀自动机中,root出发到任意一个状态的路径对应一个子串,而且不重复.(原因似乎是逆序后缀树? 所以我们在自动机上预处理每一个状态的子串数目,然后从小到大枚举字符. 子串数目可以这样预处理出:s[x]=sum{s[y]}+1, y是x出发的下一个点,意思就是说,以x开头的子串有那么多个(即将孩子的所有子串前边都加上x),然后x单独算一个子串. 然后

SPOJ PHRASES 后缀数组

题目链接:http://www.spoj.com/problems/PHRASES/en/ 题意:给定n个字符串,求一个最长的子串至少在每个串中的不重叠出现次数都不小于2.输出满足条件的最长子串长度思路:根据<<后缀数组——处理字符串的有力工具>>的思路,先将 n个字符串连起来, 中间用不相同的且没有出现在字符串中的字符隔开, 求后缀数组. 然后二分答案, 再将后缀分组.判断的时候, 要看是否有一组后缀在每个原来的字符串中至少出现两次, 并且在每个原来的字符串中, 后缀的起始位置

SPOJ DISUBSTR 后缀数组

题目链接:http://www.spoj.com/problems/DISUBSTR/en/ 题意:给定一个字符串,求不相同的子串个数. 思路:直接根据09年oi论文<<后缀数组——出来字符串的有力工具>>的解法. 还有另一种思想:总数为n*(n-1)/2,height[i]是两个后缀的最长公共前缀,所以用总数-height[i]的和就是答案 #define _CRT_SECURE_NO_DEPRECATE #include<iostream> #include<

SPOJ SUBST1 后缀数组

题目链接:http://www.spoj.com/problems/SUBST1/en/ 题意:给定一个字符串,求不相同的子串个数. 思路:直接根据09年oi论文<<后缀数组——出来字符串的有力工具>>的解法. 此题和SPOJ DISUBSTR一样,至少数据范围变大了. #define _CRT_SECURE_NO_DEPRECATE #include<iostream> #include<cstdio> #include<cstring> #i

SPOJ REPEATS 后缀数组

题目链接:http://www.spoj.com/problems/REPEATS/en/ 题意:首先定义了一个字符串的重复度.即一个字符串由一个子串重复k次构成.那么最大的k即是该字符串的重复度.现在给定一个长度为n的字符串,求最大重复次数. 思路:根据<<后缀数组——处理字符串的有力工具>>的思路,先穷举长度L,然后求长度为L 的子串最多能连续出现几次.首先连续出现1 次是肯定可以的,所以这里只考虑至少2 次的情况.假设在原字符串中连续出现2 次,记这个子字符串为S,那么S 肯