hdu4691---Front compression (suffix array + RMQ)

Front compression

Time Limit: 5000/5000 MS (Java/Others) Memory Limit: 102400/102400 K (Java/Others)

Total Submission(s): 1490 Accepted Submission(s): 553

Problem Description

Front compression is a type of delta encoding compression algorithm whereby common prefixes and their lengths are recorded so that they need not be duplicated. For example:

The size of the input is 43 bytes, while the size of the compressed output is 40. Here, every space and newline is also counted as 1 byte.

Given the input, each line of which is a substring of a long string, what are the sizes of the input and of the corresponding compressed output?

Input

There are multiple test cases. Process to the End of File.

The first line of each test case is a long string S made up of lowercase letters, whose length doesn’t exceed 100,000. The second line contains an integer 1 ≤ N ≤ 100,000, which is the number of lines in the input. Each of the following N lines contains two integers 0 ≤ A < B ≤ length(S), indicating that that line of the input is substring [A, B) of S.

Output

For each test case, output the sizes of the input and corresponding compressed output.

Sample Input

frcode
2
0 6
0 6
unitedstatesofamerica
3
0 6
0 12
0 21
myxophytamyxopodnabnabbednabbingnabit
6
0 9
9 16
16 19
19 25
25 32
32 37

Sample Output

14 12
42 31
43 40

Author

Zejun Wu (watashi)

Source

2013 Multi-University Training Contest 9


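This problem reduces to computing the LCP (longest common prefix) of each pair of consecutive substrings. Because the suffix array gives the LCP of two whole suffixes, the result must be capped at the shorter of the two substring lengths.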

/*************************************************************************
    > File Name: hdu4691.cpp
    > Author: ALex
    > Mail: [email protected]
    > Created Time: Saturday, April 18, 2015, 12:53:45
 ************************************************************************/

#include <functional>
#include <algorithm>
#include <iostream>
#include <fstream>
#include <cstring>
#include <cstdio>
#include <cmath>
#include <cstdlib>
#include <queue>
#include <stack>
#include <map>
#include <bitset>
#include <set>
#include <vector>

using namespace std;

const double pi = acos(-1.0);
const int inf = 0x3f3f3f3f;
const double eps = 1e-15;
typedef long long LL;
typedef pair <int, int> PLL;

class SuffixArray
{
    public:
        static const int N = 112000;
        int init[N];
        int X[N];
        int Y[N];
        int Rank[N];
        int sa[N];
        int height[N];
        int buc[N];
        int LOG[N];
        int dp[N][20];
        int size;

        void clear()
        {
            size = 0;
        }

        void insert(int n)
        {
            init[size++] = n;
        }

        bool cmp(int *r, int a, int b, int l)
        {
            return (r[a] == r[b] && r[a + l] == r[b + l]);
        }

        void getsa(int m = 256) // m is usually the maximum symbol value + 1
        {
            init[size] = 0;
            int l, p, *x = X, *y = Y, n = size + 1;
            for (int i = 0; i < m; ++i)
            {
                buc[i] = 0;
            }
            for (int i = 0; i < n; ++i)
            {
                ++buc[x[i] = init[i]];
            }
            for (int i = 1; i < m; ++i)
            {
                buc[i] += buc[i - 1];
            }
            for (int i = n - 1; i >= 0; --i)
            {
                sa[--buc[x[i]]] = i;
            }
            for (l = 1, p = 1; l <= n && p < n; m = p, l *= 2)
            {
                p = 0;
                for (int i = n - l; i < n; ++i)
                {
                    y[p++] = i;
                }
                for (int i = 0; i < n; ++i)
                {
                    if (sa[i] >= l)
                    {
                        y[p++] = sa[i] - l;
                    }
                }
                for (int i = 0; i < m; ++i)
                {
                    buc[i] = 0;
                }
                for (int i = 0; i < n; ++i)
                {
                    ++buc[x[y[i]]];
                }
                for (int i = 1; i < m; ++i)
                {
                    buc[i] += buc[i - 1];
                }
                for (int i = n - 1; i >= 0; --i)
                {
                    sa[--buc[x[y[i]]]] = y[i];
                }
                int i;

                for (swap(x, y), x[sa[0]] = 0, p = 1, i = 1; i < n; ++i)
                {
                    x[sa[i]] = cmp(y, sa[i - 1], sa[i], l) ? p - 1 : p++;
                }
            }
        }

        void getheight()
        {
            int h = 0, n = size;
            for (int i = 0; i <= n; ++i)
            {
                Rank[sa[i]] = i;
            }
            height[0] = 0;
            for (int i = 0; i < n; ++i)
            {
                if (h > 0)
                {
                    --h;
                }
                int j = sa[Rank[i] - 1];
                for (; i + h < n && j + h < n && init[i + h] == init[j + h]; ++h);
                height[Rank[i] - 1] = h;
            }
        }   

        // Precompute floor(log2(i)) for every i; used by the RMQ as a constant-factor optimization
        void initLOG()
        {
            LOG[0] = -1;
            for (int i = 1; i < N; ++i)
            {
                LOG[i] = (i & (i - 1)) ? LOG[i - 1] : LOG[i - 1] + 1;
            }
        }

        void initRMQ()
        {
            initLOG();
            int n = size;
            int limit;
            for (int i = 0; i < n; ++i)
            {
                dp[i][0] = height[i];
            }
            for (int j = 1; j <= LOG[n]; ++j)
            {
                limit = (n - (1 << j));
                for (int i = 0; i <= limit; ++i)
                {
                    dp[i][j] = min(dp[i][j - 1], dp[i + (1 << (j - 1))][j - 1]);
                }
            }
        }

        int LCP(int a, int b)
        {
            int t;
            a = Rank[a];
            b = Rank[b];
            if (a > b)
            {
                swap(a, b);
            }
            // height[k] is the LCP of sa[k] and sa[k + 1], so query [a, b - 1].
            --b;
            t = LOG[b - a + 1];
            return min(dp[a][t], dp[b - (1 << t) + 1][t]);
        }
}SA;
char str[100100];

int main()
{
    while (~scanf("%s", str))
    {
        int n;
        scanf("%d", &n);
        SA.clear();
        int len = strlen(str);
        for (int i = 0; i < len; ++i)
        {
            SA.insert(str[i] - 'a' + 1);
        }
        SA.getsa(30);
        SA.getheight();
        SA.initRMQ(); // also rebuilds the LOG table via initLOG()
        LL ans1 = 0, ans2 = 0;
        int lastl = 0, lastr = 0; // previous line as an inclusive range [lastl, lastr]
        for (int i = 1; i <= n; ++i)
        {
            int l, r;
            scanf("%d%d", &l, &r);
            ans1 += (r - l + 1); // substring length (r - l) plus the newline
            if (i == 1)
            {
                ans2 += (r - l + 1) + 2; // "0", a space, the whole line, and the newline
                lastl = l;
                lastr = r - 1;
                continue;
            }
            --r;
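            // LCP() needs two distinct ranks, so identical start positions are
            // handled separately: -1 marks "the two suffixes are the same".
            int lcp = -1;
            if (l != lastl)
            {
                lcp = SA.LCP(l, lastl);
            }
            int len1 = lastr - lastl + 1; // length of the previous line
            int len2 = r - l + 1;         // length of the current line
            lastr = r;
            lastl = l;
            // The suffix LCP may run past either substring, so cap it.
            if (lcp == -1 || lcp > min(len1, len2))
            {
                lcp = min(len1, len2);
            }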
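            ++ans2;                 // the space after the LCP number
            ans2 += len2 - lcp + 1; // remaining characters plus the newline
            if (lcp == 0)
            {
                ++ans2;             // "0" still takes one digit
            }
            else
            {
                while (lcp)         // count the decimal digits of the LCP
                {
                    ++ans2;
                    lcp /= 10;
                }
            }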
        }
        printf("%lld %lld\n", ans1, ans2);
    }
    return 0;
}
Time: 2024-10-13 01:00:04
