HDU - 5008 Boring String Problem (Suffix Array + Binary Search + RMQ)

Problem Description

In this problem, you are given a string s and q queries.

For each query, you should answer that when all distinct substrings of string s were sorted lexicographically, which one is the k-th smallest.

A substring $s_{i \ldots j}$ of the string $s = a_1 a_2 \ldots a_n$ ($1 \le i \le j \le n$) is the string $a_i a_{i+1} \ldots a_j$. Two substrings $s_{x \ldots y}$ and $s_{z \ldots w}$ are considered to be distinct if $s_{x \ldots y} \ne s_{z \ldots w}$.

Input

The input consists of multiple test cases. Please process till EOF.

Each test case begins with a line containing a string s ($|s| \le 10^5$) with only lowercase letters.

The next line contains a positive integer q ($1 \le q \le 10^5$), the number of questions.

q queries are given in the next q lines. Every line contains an integer v. You should calculate k by $k = (l \oplus r \oplus v) + 1$ (l, r are the output of the previous question; at the beginning of each case l = r = 0; $0 < k < 2^{63}$; "$\oplus$" denotes exclusive or).

Output

For each test case, the output consists of q lines; the i-th line contains two integers l, r, the answer to the i-th query. (The answer l, r satisfies that $s_{l \ldots r}$ is the k-th smallest, and if several l, r are available, output the l, r with the smallest l. If no l, r satisfies this, output "0 0". Note that $s_{1 \ldots n}$ is the whole string.)

Sample Input

aaa
4
0
2
3
5

Sample Output

1 1
1 3
1 2
0 0

Source

2014 ACM/ICPC Asia Regional Xi'an Online

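Problem summary: find the k-th smallest distinct substring in lexicographic order and output its left and right endpoints, choosing the smallest possible left endpoint.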
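For short strings the task can be checked against a brute-force reference. The sketch below is not part of the original solution: the name `solveBrute` is hypothetical, and it enumerates all O(n^2) substrings, so it is only meant for validating a faster program on small inputs. It also shows how each query value v is decoded with the previous answer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Brute-force reference (hypothetical helper), practical only for short strings.
// Returns the 1-based (l, r) answer for every query value v, in order.
std::vector<std::pair<int, int>> solveBrute(const std::string& s,
                                            const std::vector<long long>& vs) {
    // std::map keeps its keys sorted lexicographically.
    std::map<std::string, std::pair<int, int>> first;
    int n = (int)s.size();
    for (int i = 0; i < n; i++)        // increasing i, so the first insert is the leftmost
        for (int j = i; j < n; j++)    // emplace does not overwrite an existing key
            first.emplace(s.substr(i, j - i + 1), std::make_pair(i + 1, j + 1));

    std::vector<std::pair<int, int>> sorted;
    for (const auto& kv : first) sorted.push_back(kv.second);

    std::vector<std::pair<int, int>> out;
    long long l = 0, r = 0;            // l = r = 0 at the start of each test case
    for (long long v : vs) {
        long long k = (l ^ r ^ v) + 1; // decode the query as the statement specifies
        if (k > (long long)sorted.size()) {
            l = r = 0;                 // no such substring: answer "0 0"
        } else {
            l = sorted[k - 1].first;
            r = sorted[k - 1].second;
        }
        out.push_back(std::make_pair((int)l, (int)r));
    }
    return out;
}
```

Calling `solveBrute("aaa", {0, 2, 3, 5})` reproduces the sample: the decoded values of k are 1, 3, 2 and 7, and the last one exceeds the three distinct substrings of "aaa".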

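Approach: first, we can count the distinct substrings. As the well-known suffix array paper shows, the suffix of rank i contributes n-sa[i]-height[i] new distinct substrings. Accumulating these values into prefix sums tells us how many distinct substrings are produced by the first i suffixes in sorted order, so a binary search finds the suffix sa[i] that contains the k-th substring as a prefix. Next we determine the range of ranks on the right whose suffixes also contain this substring: a suffix sa[j] contains it whenever its LCP with suffix sa[i] is at least len. An RMQ over that range then finds the smallest sa[j]. The pitfall is the case mid == id in the binary search, which needs one extra check.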
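The counting idea can be sketched on its own with a naively sorted suffix array. This is an illustration under assumptions, not the solution below: the function name `kthSubstring` is hypothetical, ranks are 0-based with no sentinel suffix, and a linear scan stands in for the binary search and RMQ, so it only suits short strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Returns the 1-based (l, r) of the leftmost occurrence of the k-th smallest
// distinct substring, or (0, 0) if there is none. Hypothetical helper.
std::pair<int, int> kthSubstring(const std::string& s, long long k) {
    int n = (int)s.size();
    std::vector<int> sa(n);
    for (int i = 0; i < n; i++) sa[i] = i;
    // Naive suffix sort: compares whole suffixes, fine for short inputs only.
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });

    // Length of the common prefix of the suffixes starting at a and b.
    auto lcpLen = [&](int a, int b) {
        int len = 0;
        while (a + len < n && b + len < n && s[a + len] == s[b + len]) len++;
        return len;
    };

    // height[i] = LCP of the suffixes ranked i-1 and i; height[0] = 0.
    std::vector<int> height(n, 0);
    for (int i = 1; i < n; i++) height[i] = lcpLen(sa[i - 1], sa[i]);

    // cnt[i] = number of distinct substrings contributed by ranks 0..i-1.
    std::vector<long long> cnt(n + 1, 0);
    for (int i = 0; i < n; i++) cnt[i + 1] = cnt[i] + (n - sa[i] - height[i]);
    if (k < 1 || k > cnt[n]) return std::make_pair(0, 0);

    // First rank whose running total reaches k.
    int id = int(std::lower_bound(cnt.begin() + 1, cnt.end(), k) - cnt.begin()) - 1;
    int len = height[id] + int(k - cnt[id]);

    // Later ranks whose LCP with rank id is >= len share the substring;
    // keep the smallest starting position among them.
    int best = sa[id];
    for (int j = id + 1; j < n && lcpLen(sa[id], sa[j]) >= len; j++)
        best = std::min(best, sa[j]);
    return std::make_pair(best + 1, best + len);
}
```

For "abab" the sorted suffixes start at 2, 0, 3, 1 with height values 0, 2, 0, 1, so the per-rank contributions are 2, 2, 1, 2 and there are 7 distinct substrings in total.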

#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <cmath>
#include <queue>
// __int64 with "%I64d" is the MSVC/MinGW spelling; elsewhere use long long with "%lld".
//typedef long long ll;
typedef __int64 ll;
using namespace std;
const int maxn = 100010;

int sa[maxn];
int t1[maxn], t2[maxn], c[maxn];
int rank[maxn], height[maxn];

void build_sa(int s[], int n, int m) {
    int i, j, p, *x = t1, *y = t2;
    for (i = 0; i < m; i++) c[i] = 0;
    for (i = 0; i < n; i++) c[x[i] = s[i]]++;
    for (i = 1; i < m; i++) c[i] += c[i-1];
    for (i = n-1; i >= 0; i--) sa[--c[x[i]]] = i;

    for (j = 1; j <= n; j <<= 1) {
        p = 0;
        for (i = n-j; i < n; i++) y[p++] = i;
        for (i = 0; i < n; i++)
            if (sa[i] >= j)
                y[p++] = sa[i] - j;
        for (i = 0; i < m; i++) c[i] = 0;
        for (i = 0; i < n; i++) c[x[y[i]]]++;
        for (i = 1; i < m; i++) c[i] += c[i-1];
        for (i = n-1; i >= 0; i--) sa[--c[x[y[i]]]] = y[i];

        swap(x, y);
        p = 1, x[sa[0]] = 0;
        for (i = 1; i < n; i++)
            x[sa[i]] = y[sa[i-1]] == y[sa[i]] && y[sa[i-1]+j] == y[sa[i]+j] ? p-1 : p++;

        if (p >= n) break;
        m = p;
    }
}

void getHeight(int s[],int n) {
    int i, j, k = 0;
    for (i = 0; i <= n; i++)
        rank[sa[i]] = i;

    for (i = 0; i < n; i++) {
        if (k) k--;
        j = sa[rank[i]-1];
        while (s[i+k] == s[j+k]) k++;
        height[rank[i]] = k;
    }
}
int dp[maxn][30];
char str[maxn];
int r[maxn], ind[maxn][30];
ll b[maxn];

void initRMQ(int n) {
    int m = floor(log(n+0.0) / log(2.0));
    for (int i = 1; i <= n; i++)
        dp[i][0] = height[i];  

    for (int i = 1; i <= m; i++) {
        for (int j = n; j; j--) {
            dp[j][i] = dp[j][i-1];
            if (j+(1<<(i-1)) <= n)
                dp[j][i] = min(dp[j][i], dp[j+(1<<(i-1))][i-1]);
        }
    }
}

int lcp(int l, int r) {
    int a = rank[l], b = rank[r];
    if (a > b)
        swap(a,b);
    a++;
    int m = floor(log(b-a+1.0) / log(2.0));
    return min(dp[a][m], dp[b-(1<<m)+1][m]);
}

void init(int n) {
    int m = floor(log(n+0.0) / log(2.0));
    for (int i = 1; i <= n; i++)
        ind[i][0] = sa[i];  

    for (int i = 1; i <= m; i++) {
        for (int j = n; j; j--) {
            ind[j][i] = ind[j][i-1];
            if (j+(1<<(i-1)) <= n)
                ind[j][i] = min(ind[j][i], ind[j+(1<<(i-1))][i-1]);
        }
    }
}

int rmq(int a, int b) {
    int m = floor(log(b-a+1.0) / log(2.0));
    return min(ind[a][m], ind[b-(1<<m)+1][m]);
}

int main() {
    while (scanf("%s", str) != EOF) {
        int n = strlen(str);
        for (int i = 0; i <= n; i++)
            r[i] = str[i];
        build_sa(r, n+1, 128);
        getHeight(r, n);
        initRMQ(n);
        init(n);

        b[0] = 0;
        for (int i = 1; i <= n; i++)
            b[i] = b[i-1] + n - sa[i] - height[i];

        int m;
        scanf("%d", &m);
        ll k;
        int lastl = 0, lastr = 0;
        while (m--) {
            scanf("%I64d", &k);
            k = (k ^ lastl ^ lastr)  + 1;
            if (k > b[n]) {
                printf("0 0\n");
                lastl = 0;
                lastr = 0;
                continue;
            }
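            // First rank whose prefix sum reaches k: the answer is a prefix of suffix sa[id].
            int id = lower_bound(b+1, b+1+n, k) - b;
            k -= b[id-1];
            // The first height[id] prefixes were already counted at earlier ranks.
            int len = height[id] + k;
            int ll = id;
            int rr = id;
            int L = id, R = n;
            // Binary search for the last rank rr whose suffix still starts with this substring.
            while (L <= R) {
                int mid = (L + R) / 2;
                // mid == id is tested separately: lcp() needs two different ranks.
                if (sa[id] == sa[mid] || lcp(sa[id], sa[mid]) >= len) {
                    rr = mid;
                    L = mid + 1;
                }
                else R = mid - 1;
            }

            // Smallest starting position among ranks ll..rr, converted to 1-based.
            int ansl = rmq(ll, rr) + 1;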
            int ansr = ansl + len - 1;
            printf("%d %d\n", ansl, ansr);
            lastl = ansl;
            lastr = ansr;
        }
    }
    return 0;
}
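The code above derives the sparse-table level with floor(log(x) / log(2.0)). Floating-point rounding can in principle make that quotient land just below an integer, so a common alternative is a precomputed integer log table. The sketch below is an assumption-labeled variant, not the author's code: the struct name `SparseMin` is hypothetical and it uses 0-based inclusive indices rather than the 1-based arrays above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sparse table for range-minimum queries using an integer log2 table.
struct SparseMin {
    std::vector<int> lg;               // lg[x] = floor(log2(x)) for x >= 1
    std::vector<std::vector<int>> t;   // t[j][i] = min of a[i .. i + 2^j - 1]

    explicit SparseMin(const std::vector<int>& a) {
        int n = (int)a.size();
        lg.assign(n + 1, 0);
        for (int x = 2; x <= n; x++) lg[x] = lg[x / 2] + 1;
        t.assign(lg[n] + 1, a);        // level 0 is a itself; higher levels are filled below
        for (int j = 1; j <= lg[n]; j++)
            for (int i = 0; i + (1 << j) <= n; i++)
                t[j][i] = std::min(t[j - 1][i], t[j - 1][i + (1 << (j - 1))]);
    }

    // Minimum of a[l..r]; requires 0 <= l <= r < a.size().
    int query(int l, int r) const {
        int j = lg[r - l + 1];
        return std::min(t[j][l], t[j][r - (1 << j) + 1]);
    }
};
```

The same table could serve both tables in the solution, once over height for LCP queries and once over sa for the leftmost starting position.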
Time: 2024-10-12 16:08:58
